h1. Sandbox Frequently Asked Questions

{{>toc}}

h2. Installation of applications and software

h3. How do I install external libraries (e.g. HDF5)?

Libraries and associated binaries (e.g. h5dump) shall be made available via yum:

<pre>
[user@sb ~] sudo yum search hdf
</pre>

This will list all packages related to HDF.
We will install the hdf5 libraries and binaries with:

<pre>
[user@sb ~] sudo yum install hdf5.x86_64
</pre>
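
To verify the installation, you can query the installed package and look for its binaries (a quick check, assuming the hdf5 package ships the h5dump binary mentioned above):

<pre>
[user@sb ~] rpm -q hdf5
[user@sb ~] which h5dump
</pre>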

h3. Several jobs of my workflow use the same software, where is it installed?

When several jobs use the same software (e.g. the NEST toolbox or CDAT), the location to install these packages is:

<pre>
/application/share/<software name>
</pre>

instead of installing it several times under

<pre>
/application/job1/
/application/job2/
</pre>

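Each job can then point to the shared location; for example (a sketch, with a hypothetical CDAT install under /application/share/cdat):

<pre><code class="ruby">
# reference the shared install (hypothetical path) from any job's environment
export CDAT_HOME=/application/share/cdat
export PATH=$CDAT_HOME/bin:$PATH
</code></pre>
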
h3. Am I a sudoer?

Yes, you are, but only for the command listed below:

* yum - yum allows you to install packages on your sandbox

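You can check exactly what sudo allows you to run with:

<pre>
[user@sb ~] sudo -l
</pre>
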
h2. Parameters and inputs

h3. How do I manage the inputs to a job?

There are several ways to pass inputs to a job:

* Local inputs - local files will use the file:// protocol and are defined in the workflow as follows:

<pre><code class="xml">
<workflow id="somename">
  <workflowVersion>1.0</workflowVersion>
  <node id="somenodeid">
    <job id="ceda-collect"></job>
    <sources>
      <source refid="file:urls">/application/input.urls</source>
    </sources>
  </node>
</workflow>
</code></pre>

and the file _input.urls_ contains the references to the local files:

<pre><code class="ruby">
[user@sb ~] cat /application/input.urls
file:///home/user/somefile1
file:///home/user/somefile2
file:///home/user/somefile3
</code></pre>

Then the job executable can use ciop-copy to copy the files locally if needed:

<pre><code class="ruby">
# each line read from stdin is a file:// URL;
# ciop-copy reads it from stdin (-) and copies it to the current directory (-o ./)
while read inputfile
do
  echo "$inputfile" | ciop-copy -o ./ -
done
</code></pre>
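
When testing the script by hand, the same loop can be fed directly from the URL list (a sketch, assuming the _input.urls_ file shown above):

<pre><code class="ruby">
# feed the URL list from the file instead of the framework-provided stdin
while read inputfile
do
  echo "$inputfile" | ciop-copy -o ./ -
done < /application/input.urls
</code></pre>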

* Values

Passing values to a job follows the same approach as above.

<pre><code class="xml">
<workflow id="somename">
  <workflowVersion>1.0</workflowVersion>
  <node id="somenodeid">
    <job id="ceda-collect"></job>
    <sources>
      <source refid="file:urls">/application/inputparams</source>
    </sources>
  </node>
</workflow>
</code></pre>

and the file _inputparams_ contains the list of values:

<pre><code class="ruby">
[user@sb ~] cat /application/inputparams
-10,-10,10,10
10,10,20,20
</code></pre>

In the example above, the executable manages the parameters (bounding boxes) with:

<pre><code class="ruby">
# each line read from stdin is one comma-separated bounding box
while read bbox
do
  echo "processing bounding box $bbox"
done
</code></pre>
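
If the individual coordinates are needed, the comma-separated values can be split while reading (a sketch; the coordinate order is an assumption about how the bounding boxes were written):

<pre><code class="ruby">
# split each bounding box into four fields (assumed order: xmin,ymin,xmax,ymax)
while IFS=, read xmin ymin xmax ymax
do
  echo "processing box from ($xmin,$ymin) to ($xmax,$ymax)"
done
</code></pre>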

* Products available in the Sandbox internal catalogue

During the sandbox definition and creation you may have selected a list of data products; the references to these products are available in the sandbox internal catalogue.
The workflow is defined as follows:

<pre><code class="xml">
<workflow id="testVomir">
  <workflowVersion>1.0</workflowVersion>
  <node id="Vimage"> <!-- workflow node unique id -->
    <job id="imager"></job> <!-- job defined above -->
    <sources>
      <source refid="cas:serie">ATS_TOA_1P</source>
    </sources>
    <parameters> <!-- parameters of the job -->
      <parameter id="volcano_db"></parameter>
    </parameters>
  </node>
</workflow>
</code></pre>

As an example, the job executable would contain the lines below to copy the data products locally:

<pre><code class="ruby">
# each line read from stdin is a catalogue reference to a data product
while read product
do
  echo "$product" | ciop-copy -o ./ -
done
</code></pre>

h2. Jobs

h3. What environment variables can I use in my jobs?

CCBox provides the following environment variables:
* _CIOP_APPLICATION_PATH is the path to the application.xml file and all the underlying folders. Its value is /application
> Note: do not use its literal value in the executable scripts, always use $_CIOP_APPLICATION_PATH
* _JOB_DIR
* TMPDIR is the temporary directory for the task
* _JOB_ID contains the job id
* _JOB_LOCAL_DIR is the job-specific shared scratch space
* _TASK_ID is the task id
* _TASK_LOCAL_DIR is the task-specific scratch space
* _TASK_NUM contains the number of tasks
* _TASK_INDEX is the index of the task

The best way to get acquainted with the values of the environment variables is to log them in a job with:

<pre>
ciop-log "DEBUG" "TMPDIR = $TMPDIR"
ciop-log "DEBUG" "_JOB_ID = ${_JOB_ID}"
ciop-log "DEBUG" "_JOB_LOCAL_DIR = ${_JOB_LOCAL_DIR}"
ciop-log "DEBUG" "_TASK_ID = ${_TASK_ID}"
ciop-log "DEBUG" "_TASK_LOCAL_DIR = ${_TASK_LOCAL_DIR}"
ciop-log "DEBUG" "_TASK_NUM = ${_TASK_NUM}"
ciop-log "DEBUG" "_TASK_INDEX = ${_TASK_INDEX}"
</pre>
169 | |||
170 | h3. How do I test a single job of a workflow? |
||
171 | |||
172 | For that you have to know the nodeid of the job in the workflow. |
||
173 | |||
<pre><code class="xml">
<workflow id="testVomir">
  <workflowVersion>1.0</workflowVersion>
  <node id="Vimage">
  ...
  </node>
</workflow>
</code></pre>
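
If you don't remember the node id, you can list the node ids defined in the application descriptor (a quick grep, assuming the descriptor sits at ${_CIOP_APPLICATION_PATH}/application.xml as described above):

<pre>
[user@sb ~] grep "node id" ${_CIOP_APPLICATION_PATH}/application.xml
</pre>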

With that value, simply run:

<pre>
[user@sb ~] ciop-simjob -f Vimage
</pre>

h2. Workflows

h3. How do I test a workflow?

Simply run the command:

<pre><code class="ruby">
[user@sb ~] ciop-simwf
</code></pre>

h3. How do I access the details of my workflow run?

When you run _ciop-simwf_, you'll see in your terminal window the output shown in the image below; the link to the details page is highlighted.
Copy and paste the URL into your browser and navigate through the pages to find details about the workflow execution.

!workflow_url.png!

h3. How do I access the results of my workflow?

After a successful run of your workflow, your results, including the logs, can be found in the folder:

<pre>
/share/tmp/sandbox/<workflow name>
</pre>
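
For example, with the _testVomir_ workflow used above (assuming the folder name matches the workflow id), you would list the results with:

<pre>
[user@sb ~] ls /share/tmp/sandbox/testVomir
</pre>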