Sandbox FAQ » History » Version 1
  Herve Caumont, 2013-06-17 10:26 
  
| 1 | 1 | Herve Caumont | h1. Sandbox Frequently Asked Questions  | 
|---|---|---|---|
| 2 | |||
| 3 | {{>toc}} | 
||
| 4 | |||
| 5 | h2. Installation of applications and software  | 
||
| 6 | |||
| 7 | h3. How do I install external libraries (e.g. HDF5)?  | 
||
| 8 | |||
| 9 | Libraries and associated binaries (e.g. h5dump) can be installed with yum. Search for the relevant packages first:  | 
||
| 10 | |||
| 11 | <pre>  | 
||
| 12 | [user@sb ~] sudo yum search hdf  | 
||
| 13 | </pre>  | 
||
| 14 | |||
| 15 | This lists all packages related to HDF.  | 
||
| 16 | Install the hdf5 libraries and binaries with:  | 
||
| 17 | |||
| 18 | <pre>  | 
||
| 19 | [user@sb ~] sudo yum install hdf5.x86_64  | 
||
| 20 | </pre>  | 
||
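When the installation is scripted, it can help to skip packages that are already present. The sketch below is one possible approach, not a sandbox-provided tool: the function name @install_if_missing@ is hypothetical, and it assumes that @rpm -q@ is usable without sudo to query installed packages.

```shell
# Hypothetical helper: install each package only if rpm does not
# already report it as installed.
install_if_missing() {
    local pkg
    for pkg in "$@"; do
        if rpm -q "$pkg" >/dev/null 2>&1; then
            echo "$pkg already installed"
        else
            # -y answers yum's confirmation prompt automatically
            sudo yum install -y "$pkg"
        fi
    done
}
```

It would be called as @install_if_missing hdf5.x86_64@.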
| 21 | |||
| 22 | h3. Several jobs of my workflow use the same software, where is it installed?  | 
||
| 23 | |||
| 24 | When several jobs use the same software (e.g. NEST toolbox or CDAT), install it once under:  | 
||
| 25 | |||
| 26 | <pre>  | 
||
| 27 | /application/share/<software name>  | 
||
| 28 | </pre>  | 
||
| 29 | |||
| 30 | instead of installing it several times under  | 
||
| 31 | |||
| 32 | <pre>  | 
||
| 33 | /application/job1/  | 
||
| 34 | /application/job2/  | 
||
| 35 | </pre>  | 
||
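A job script can then locate the shared installation without hard-coding the application path. This is a minimal sketch: the function name @shared_software_dir@ is hypothetical, and the fallback to /application only mirrors the value documented for $_CIOP_APPLICATION_PATH.

```shell
# Hypothetical helper: print the shared install directory for a package.
# Falls back to /application when _CIOP_APPLICATION_PATH is not set.
shared_software_dir() {
    local name="$1"
    echo "${_CIOP_APPLICATION_PATH:-/application}/share/${name}"
}
```

For example, @shared_software_dir nest@ prints the directory that every job using that (illustrative) package name would share.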
| 36 | |||
| 37 | h3. Am I a sudoer?  | 
||
| 38 | |||
| 39 | Yes, but only for the command listed below:  | 
||
| 40 | |||
| 41 | * yum - lets you install packages on your sandbox  | 
||
| 42 | |||
| 43 | h2. Parameters and inputs  | 
||
| 44 | |||
| 45 | h3. How do I manage the inputs to a job?  | 
||
| 46 | |||
| 47 | There are several ways to pass inputs to a job:  | 
||
| 48 | |||
| 49 | * Local inputs - local files use the file:// protocol and are referenced in the workflow as follows:  | 
||
| 50 | |||
| 51 | <pre><code class="xml">  | 
||
| 52 | <workflow id="somename">  | 
||
| 53 | <workflowVersion>1.0</workflowVersion>  | 
||
| 54 | <node id="somenodeid">  | 
||
| 55 | <job id="ceda-collect"></job>  | 
||
| 56 | <sources>  | 
||
| 57 | <source refid="file:urls" >/application/input.urls</source>  | 
||
| 58 | </sources>  | 
||
| 59 | </node>  | 
||
| 60 | </workflow>  | 
||
| 61 | </code></pre>  | 
||
| 62 | |||
| 63 | and the file _input.urls_ contains the references to the local files:  | 
||
| 64 | |||
| 65 | <pre><code class="ruby">  | 
||
| 66 | [ user@sb ~] cat /application/input.urls  | 
||
| 67 | file:///home/user/somefile1  | 
||
| 68 | file:///home/user/somefile2  | 
||
| 69 | file:///home/user/somefile3  | 
||
| 70 | </code></pre>  | 
||
| 71 | |||
| 72 | The job executable reads these references from standard input and can use ciop-copy to copy the files locally if needed.  | 
||
| 73 | |||
| 74 | <pre><code class="c">  | 
||
| 75 | while read inputfile  | 
||
| 76 | do  | 
||
| 77 | echo "$inputfile" | ciop-copy -o ./ -  | 
||
| 78 | done  | 
||
| 79 | </code></pre>  | 
||
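The loop above can be made a little more defensive. The sketch below skips blank lines and reports references that fail to copy. It is an illustration only: the function name @copy_inputs@ and the @COPY_CMD@ override are hypothetical, and it assumes ciop-copy exits with a non-zero status on failure.

```shell
# Hypothetical helper: copy every reference read from standard input
# into a target directory (default: current directory).
# COPY_CMD defaults to ciop-copy; the override exists only for dry runs.
copy_inputs() {
    local target="${1:-.}"
    local copy_cmd="${COPY_CMD:-ciop-copy}"
    local ref failed=0
    # The "|| [ -n ... ]" part handles a last line without a newline.
    while IFS= read -r ref || [ -n "$ref" ]; do
        # Skip blank lines so they do not trigger a copy.
        [ -z "$ref" ] && continue
        # Same invocation as in the loop above: reference on stdin, "-" as input.
        if ! printf '%s\n' "$ref" | "$copy_cmd" -o "$target" -; then
            echo "failed to copy $ref" >&2
            failed=1
        fi
    done
    return "$failed"
}
```

Inside a job executable it would simply be called as @copy_inputs ./@, with the references arriving on standard input.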
| 80 | |||
| 81 | * Values  | 
||
| 82 | |||
| 83 | Passing values to a job follows the same approach as above.  | 
||
| 84 | |||
| 85 | <pre><code class="xml">  | 
||
| 86 | <workflow id="somename">  | 
||
| 87 | <workflowVersion>1.0</workflowVersion>  | 
||
| 88 | <node id="somenodeid">  | 
||
| 89 | <job id="ceda-collect"></job>  | 
||
| 90 | <sources>  | 
||
| 91 | <source refid="file:urls" >/application/inputparams</source>  | 
||
| 92 | </sources>  | 
||
| 93 | </node>  | 
||
| 94 | </workflow>  | 
||
| 95 | </code></pre>  | 
||
| 96 | |||
| 97 | and the file _inputparams_ contains the list of values:  | 
||
| 98 | |||
| 99 | <pre><code class="ruby">  | 
||
| 100 | [ user@sb ~] cat /application/inputparams  | 
||
| 101 | -10,-10,10,10  | 
||
| 102 | 10,10,20,20  | 
||
| 103 | </code></pre>  | 
||
| 104 | |||
| 105 | In the example above, the executable manages the parameters (bounding boxes) with:  | 
||
| 106 | |||
| 107 | <pre><code class="ruby">  | 
||
| 108 | while read bbox  | 
||
| 109 | do  | 
||
| 110 | echo "processing bounding box $bbox"  | 
||
| 111 | done  | 
||
| 112 | </code></pre>  | 
||
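If the executable needs the individual values of each bounding box, the line can be split on commas. This is a sketch under assumptions: the function names are hypothetical, and the field order (min_x, min_y, max_x, max_y) is assumed for illustration, since the source does not specify it.

```shell
# Hypothetical helper: split one "a,b,c,d" line into four fields.
# The field names reflect an assumed ordering of the bounding box.
parse_bbox() {
    local min_x min_y max_x max_y extra
    IFS=',' read -r min_x min_y max_x max_y extra <<EOF
$1
EOF
    # Reject lines with fewer or more than four fields.
    if [ -z "$max_y" ] || [ -n "$extra" ]; then
        echo "invalid bounding box: $1" >&2
        return 1
    fi
    echo "min_x=$min_x min_y=$min_y max_x=$max_x max_y=$max_y"
}

# Read bounding boxes from standard input, one per line.
process_bboxes() {
    local line
    while IFS= read -r line || [ -n "$line" ]; do
        [ -z "$line" ] && continue
        parse_bbox "$line" || continue
    done
}
```

With the _inputparams_ file shown above on standard input, @process_bboxes@ prints one line of named fields per bounding box and skips malformed lines.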
| 113 | |||
| 114 | * Products available in the Sandbox internal catalogue  | 
||
| 115 | |||
| 116 | During the sandbox definition and creation you may have selected a list of data products. The references to these products are available in the sandbox internal catalogue.  | 
||
| 117 | The workflow is defined as follows:  | 
||
| 118 | |||
| 119 | <pre><code class="xml">  | 
||
| 120 | <workflow id="testVomir">  | 
||
| 121 | <workflowVersion>1.0</workflowVersion>  | 
||
| 122 | <node id="Vimage"> <!-- workflow node unique id -->  | 
||
| 123 | <job id="imager"></job> <!-- job defined above -->  | 
||
| 124 | <sources>  | 
||
| 125 | <source refid="cas:serie" >ATS_TOA_1P</source>  | 
||
| 126 | </sources>  | 
||
| 127 | <parameters> <!-- parameters of the job -->  | 
||
| 128 | <parameter id="volcano_db"></parameter>  | 
||
| 129 | </parameters>  | 
||
| 130 | </node>  | 
||
| 131 | </code></pre>  | 
||
| 132 | |||
| 133 | As an example, the job executable would contain the lines below to copy the data products locally:  | 
||
| 134 | |||
| 135 | <pre><code class="ruby">  | 
||
| 136 | while read product  | 
||
| 137 | do  | 
||
| 138 | echo "$product" | ciop-copy -o ./ -  | 
||
| 139 | done  | 
||
| 140 | </code></pre>  | 
||
| 141 | |||
| 142 | h2. Jobs  | 
||
| 143 | |||
| 144 | h3. What environment variables can I use in my jobs?  | 
||
| 145 | |||
| 146 | CCBox provides the following environment variables:  | 
||
| 147 | * _CIOP_APPLICATION_PATH is the path to the application.xml files and all other underlying folders. Its value is /application  | 
||
| 148 | > Note: do not hard-code its value in executable scripts; always use $_CIOP_APPLICATION_PATH  | 
||
| 149 | * _JOB_DIR  | 
||
| 150 | * TMPDIR is the temporary directory for the task  | 
||
| 151 | * _JOB_ID contains the job id  | 
||
| 152 | * _JOB_LOCAL_DIR is the job specific shared scratch space  | 
||
| 153 | * _TASK_ID is the task id  | 
||
| 154 | * _TASK_LOCAL_DIR is the task specific scratch space  | 
||
| 155 | * _TASK_NUM contains the number of tasks  | 
||
| 156 | * _TASK_INDEX  | 
||
| 157 | |||
| 158 | The best way to get acquainted with the values of the environment variables is to log them in a job with:  | 
||
| 159 | |||
| 160 | <pre>  | 
||
| 161 | ciop-log "DEBUG" "TMPDIR = $TMPDIR"  | 
||
| 162 | ciop-log "DEBUG" "_JOB_ID                 = ${_JOB_ID}"               | 
||
| 163 | ciop-log "DEBUG" "_JOB_LOCAL_DIR          = ${_JOB_LOCAL_DIR}"           | 
||
| 164 | ciop-log "DEBUG" "_TASK_ID                = ${_TASK_ID}"              | 
||
| 165 | ciop-log "DEBUG" "_TASK_LOCAL_DIR         = ${_TASK_LOCAL_DIR}"       | 
||
| 166 | ciop-log "DEBUG" "_TASK_NUM               = ${_TASK_NUM}"             | 
||
| 167 | ciop-log "DEBUG" "_TASK_INDEX             = ${_TASK_INDEX}" | 
||
| 168 | </pre>  | 
||
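The same logging can be written as a loop over the variable names, which avoids repeating one line per variable. This is only a sketch: the function names are hypothetical, and the fallback to standard error when ciop-log is not on the PATH is an assumption added so that the snippet also runs outside a sandbox.

```shell
# Log through ciop-log when available, otherwise print to stderr
# (the fallback is an assumption for running outside the sandbox).
log_debug() {
    if command -v ciop-log >/dev/null 2>&1; then
        ciop-log "DEBUG" "$1"
    else
        echo "[DEBUG] $1" >&2
    fi
}

# Hypothetical helper: log each documented variable, marking unset ones.
log_job_environment() {
    local name value
    for name in _CIOP_APPLICATION_PATH _JOB_DIR TMPDIR _JOB_ID \
                _JOB_LOCAL_DIR _TASK_ID _TASK_LOCAL_DIR _TASK_NUM _TASK_INDEX; do
        # Indirect lookup of the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        log_debug "$name = ${value:-<unset>}"
    done
}
```

Calling @log_job_environment@ at the start of a job executable records all values in one place.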
| 169 | |||
| 170 | h3. How do I test a single job of a workflow?  | 
||
| 171 | |||
| 172 | First, find the node id of the job in the workflow:  | 
||
| 173 | |||
| 174 | <pre><code class="xml">  | 
||
| 175 | <workflow id="testVomir">  | 
||
| 176 | <workflowVersion>1.0</workflowVersion>  | 
||
| 177 | <node id="Vimage">  | 
||
| 178 | ...  | 
||
| 179 | </node>  | 
||
| 180 | |||
| 181 | </code></pre>  | 
||
| 182 | |||
| 183 | Then pass that value to ciop-simjob:  | 
||
| 184 | |||
| 185 | <pre>  | 
||
| 186 | [user@sb ~] ciop-simjob -f Vimage  | 
||
| 187 | </pre>  | 
||
| 188 | |||
| 189 | h2. Workflows  | 
||
| 190 | |||
| 191 | h3. How do I test a workflow?  | 
||
| 192 | |||
| 193 | Simply run the command:  | 
||
| 194 | |||
| 195 | <pre><code class="ruby">  | 
||
| 196 | [ user@sb ~] ciop-simwf  | 
||
| 197 | </code></pre>  | 
||
| 198 | |||
| 199 | h3. How do I access the details of my workflow run?  | 
||
| 200 | |||
| 201 | When you run _ciop-simwf_, your terminal shows output like the image below. The link to the details is highlighted.  | 
||
| 202 | Copy and paste the URL into your browser and navigate through the pages to find details about the workflow execution.  | 
||
| 203 | |||
| 204 | !workflow_url.png!  | 
||
| 205 | |||
| 206 | h3. How do I access the results of my workflow?  | 
||
| 207 | |||
| 208 | After a successful run of your workflow, your results, including logs, can be found in the folder:  | 
||
| 209 | |||
| 210 | <pre>  | 
||
| 211 | /share/tmp/sandbox/<workflow name>  | 
||
| 212 | </pre>  |