Sandbox FAQ » History » Version 1
Herve Caumont, 2013-06-17 10:26
h1. Sandbox Frequently Asked Questions

{{>toc}}

h2. Installation of applications and software

h3. How do I install external libraries (e.g. HDF5)?

Libraries and their associated binaries (e.g. h5dump) are made available via yum:

<pre>
[user@sb ~] sudo yum search hdf
</pre>

This lists all packages related to HDF.
Install the HDF5 libraries and binaries with:

<pre>
[user@sb ~] sudo yum install hdf5.x86_64
</pre>
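A quick way to confirm the installation worked is to check that the package's tools are on the PATH. A minimal sketch, assuming h5dump is among the binaries shipped with the hdf5 package:

```shell
# Check whether the h5dump tool from the hdf5 package is available
if command -v h5dump >/dev/null 2>&1; then
  echo "h5dump is available"
else
  echo "h5dump is missing: install it with 'sudo yum install hdf5'"
fi
```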

h3. Several jobs of my workflow use the same software, where is it installed?

When several jobs use the same software (e.g. the NEST toolbox or CDAT), the location to install these packages is:

<pre>
/application/share/<software name>
</pre>

instead of installing them several times under:

<pre>
/application/job1/
/application/job2/
</pre>
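A job script can then pick the shared software up by extending its PATH with the shared location. A minimal sketch, where "mysoft" is a hypothetical software name used only for illustration:

```shell
# "mysoft" is an illustrative software name, not a real package
SHARED_DIR="/application/share/mysoft"
# Make the shared binaries visible to this job script
export PATH="${SHARED_DIR}/bin:${PATH}"
echo "jobs will look for mysoft binaries in ${SHARED_DIR}/bin"
```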

h3. Am I a sudoer?

Yes, you are, but only for the command listed below:

* yum - allows you to install packages on your sandbox

h2. Parameters and inputs

h3. How do I manage the inputs to a job?

There are several ways to pass inputs to a job:

* Local inputs - local files use the file:// protocol and are defined in the workflow as follows:

<pre><code class="xml">
<workflow id="somename">
  <workflowVersion>1.0</workflowVersion>
  <node id="somenodeid">
    <job id="ceda-collect"></job>
    <sources>
      <source refid="file:urls">/application/input.urls</source>
    </sources>
  </node>
</workflow>
</code></pre>

and the file _input.urls_ contains the references to the local files:

<pre>
[user@sb ~] cat /application/input.urls
file:///home/user/somefile1
file:///home/user/somefile2
file:///home/user/somefile3
</pre>

The job executable can then use _ciop-copy_ to copy the files locally if needed:

<pre>
while read inputfile
do
  echo $inputfile | ciop-copy -o ./ -
done
</pre>
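The same read-loop pattern can be tried outside the sandbox with plain shell; in the sketch below the copy step is replaced by an echo, and a here-document stands in for the lines the job receives on standard input:

```shell
# Each line of the input list is handled in turn; the echo stands in
# for the ciop-copy call so the sketch runs anywhere
while read -r inputfile
do
  echo "would copy: $inputfile"
done <<'EOF'
file:///home/user/somefile1
file:///home/user/somefile2
EOF
```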

* Values

Passing values to a job follows the same approach as above:

<pre><code class="xml">
<workflow id="somename">
  <workflowVersion>1.0</workflowVersion>
  <node id="somenodeid">
    <job id="ceda-collect"></job>
    <sources>
      <source refid="file:urls">/application/inputparams</source>
    </sources>
  </node>
</workflow>
</code></pre>

and the file _inputparams_ contains the list of values:

<pre>
[user@sb ~] cat /application/inputparams
-10,-10,10,10
10,10,20,20
</pre>

In the example above, the executable manages the parameters (bounding boxes) with:

<pre>
while read bbox
do
  echo "processing bounding box $bbox"
done
</pre>
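When each line carries comma-separated values, the shell's IFS variable can split them into fields directly in the read. A sketch of this, with a here-document standing in for the lines the job receives:

```shell
# Split each bounding box line (minx,miny,maxx,maxy) into four fields
while IFS=',' read -r minx miny maxx maxy
do
  echo "min corner: ($minx,$miny) max corner: ($maxx,$maxy)"
done <<'EOF'
-10,-10,10,10
10,10,20,20
EOF
```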

* Products available in the Sandbox internal catalogue

During the sandbox definition and creation you may have selected a list of data products; the references to these products are available in the sandbox internal catalogue.
The workflow is defined as follows:

<pre><code class="xml">
<workflow id="testVomir">
  <workflowVersion>1.0</workflowVersion>
  <node id="Vimage"> <!-- workflow node unique id -->
    <job id="imager"></job> <!-- job defined above -->
    <sources>
      <source refid="cas:serie">ATS_TOA_1P</source>
    </sources>
    <parameters> <!-- parameters of the job -->
      <parameter id="volcano_db"></parameter>
    </parameters>
  </node>
</workflow>
</code></pre>

As an example, the job executable would contain the lines below to copy the data products locally:

<pre>
while read product
do
  echo $product | ciop-copy -o ./ -
done
</pre>

h2. Jobs

h3. What environment variables can I use in my jobs?

CCBox provides the following environment variables:
* _CIOP_APPLICATION_PATH is the path to the application.xml file and all other underlying folders. Its value is /application
> Note: do not hard-code its value in the executable scripts, always use $_CIOP_APPLICATION_PATH
* _JOB_DIR is the job directory
* TMPDIR is the temporary directory for the task
* _JOB_ID contains the job id
* _JOB_LOCAL_DIR is the job-specific shared scratch space
* _TASK_ID is the task id
* _TASK_LOCAL_DIR is the task-specific scratch space
* _TASK_NUM contains the number of tasks
* _TASK_INDEX is the index of the task

The best way to get acquainted with the values of the environment variables is to log them in a job with:

<pre>
ciop-log "DEBUG" "TMPDIR = $TMPDIR"
ciop-log "DEBUG" "_JOB_ID = ${_JOB_ID}"
ciop-log "DEBUG" "_JOB_LOCAL_DIR = ${_JOB_LOCAL_DIR}"
ciop-log "DEBUG" "_TASK_ID = ${_TASK_ID}"
ciop-log "DEBUG" "_TASK_LOCAL_DIR = ${_TASK_LOCAL_DIR}"
ciop-log "DEBUG" "_TASK_NUM = ${_TASK_NUM}"
ciop-log "DEBUG" "_TASK_INDEX = ${_TASK_INDEX}"
</pre>

h3. How do I test a single job of a workflow?

You need to know the node id of the job in the workflow:

<pre><code class="xml">
<workflow id="testVomir">
  <workflowVersion>1.0</workflowVersion>
  <node id="Vimage">
    ...
  </node>
</workflow>
</code></pre>

With that value, simply run:

<pre>
[user@sb ~] ciop-simjob -f Vimage
</pre>

h2. Workflows

h3. How do I test a workflow?

Simply run the command:

<pre>
[user@sb ~] ciop-simwf
</pre>

h3. How do I access the details of my workflow run?

When you run _ciop-simwf_, your terminal window shows the output captured in the image below; the link to the details is highlighted.
Copy and paste the URL into your browser and navigate through the pages to find details about the workflow execution.

!workflow_url.png!

h3. How do I access the results of my workflow?

After a successful run of your workflow, the results, including logs, can be found in the folder:

<pre>
/share/tmp/sandbox/<workflow name>
</pre>