GDAL Simple job¶
Introduction¶
This simple job demonstrates two aspects of typical sandbox usage:
- installing software packages
- configuring a simple application that takes GeoTIFF files available on a remote FTP site and converts them to a chosen format (e.g. PNG)
Pre-requisites¶
To follow this simple tutorial you need:
- a running sandbox
- access to the sandbox
- less than 30 minutes of time
Step 1: install gdal on the sandbox¶
gdal is installed with yum (Yellowdog Updater, Modified).
Run the command below on your sandbox:
[user@sb ~]$ sudo yum -y install gdal
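Before moving on, it can help to confirm that the GDAL command-line tools ended up on the PATH. The sketch below assumes the installed package provides gdal_translate (used later by the wrapper script) and gdalinfo; have_tool is a hypothetical helper name, not part of GDAL or the sandbox tooling.

```shell
#!/bin/bash
# Report whether a command is available on the PATH.
# have_tool is a hypothetical helper name used only for this check.
have_tool() {
  command -v "$1" >/dev/null 2>&1
}

# Assumption: the gdal package ships these two command-line tools.
for tool in gdal_translate gdalinfo; do
  if have_tool "$tool"; then
    echo "$tool: found"
  else
    echo "$tool: missing"
  fi
done
```

If both tools are found, gdalinfo --formats lists the available format drivers, which is a quick way to check that PNG and JPEG are supported by the installed build.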
Step 2: create the simplejob¶
The simple job will require a folder under /application:
[user@sb ~] cd /application
[user@sb application] mkdir simplejob
[user@sb application] cd simplejob
Create the wrapper file which handles the input parameters¶
[user@sb simplejob] vi run
Note: You can use any editor you like
Paste the code below in the run file:¶
#!/bin/bash
# Project: ${project.name}
# Author: $Author: stripodi $ (Terradue Srl)
# Last update: $Date: 2011-09-08 12:01:58 +0200 (Thu, 08 Sep 2011) $
# Element: ${project.name}
# Context: services/${project.artifactId}
# Version: ${project.version} (${implementation.build})
# Description: ${project.description}
#
# This document is the property of Terradue and contains information directly
# resulting from knowledge and experience of Terradue.
# Any changes to this code is forbidden without written consent from Terradue Srl
#
# Contact: info@terradue.com
# 2012-02-10 - NEST in jobConfig upgraded to version 4B-1.1

# source the ciop functions (e.g. ciop-log)
source ${ciop_job_include}

# define the exit codes
SUCCESS=0
ERR_NOINPUT=1
ERR_NOPARAMS=2
ERR_GDAL=4

# add a trap to exit gracefully
function cleanExit ()
{
  local retval=$?
  local msg=""
  case "$retval" in
    $SUCCESS) msg="Processing successfully concluded";;
    $ERR_NOINPUT) msg="Input file could not be retrieved";;
    $ERR_NOPARAMS) msg="Output format not defined";;
    $ERR_GDAL) msg="gdal_translate failed to convert the input file";;
    *) msg="Unknown error";;
  esac
  [ "$retval" != "0" ] && ciop-log "ERROR" "Error $retval - $msg, processing aborted" || ciop-log "INFO" "$msg"
  exit $retval
}
trap cleanExit EXIT

# retrieve the parameter value from the workflow or the job default value
format=$(ciop-getparam format)

# run a check on the format value
[ -z "$format" ] && exit $ERR_NOPARAMS

# loop through all geotiff URLs passed on stdin
while read inputfile
do
  # report activity in log
  ciop-log "INFO" "Retrieving $inputfile from storage"

  # retrieve the remote geotiff product to the local temporary folder
  retrieved=$(ciop-copy -o "$TMPDIR" "$inputfile")

  # check if the file was retrieved
  [ "$?" == "0" -a -e "$retrieved" ] || exit $ERR_NOINPUT

  # report activity
  ciop-log "INFO" "Retrieved $retrieved"

  # invoke gdal to convert the geotiff into the selected format
  gdal_translate -of "$format" "$retrieved" "$OUTPUTDIR/$(basename "$retrieved")"

  # check error code
  [ "$?" != "0" ] && exit $ERR_GDAL || ciop-log "INFO" "Processed $inputfile"
done

exit 0
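The wrapper relies on an EXIT trap: whatever exit code the script ends with, the trap function runs last and turns that code into a log message. The stand-alone sketch below reproduces the pattern without any ciop function, so it can be tried in any bash shell. demo_exit and report are hypothetical names, and echo stands in for ciop-log.

```shell
#!/bin/bash
# Stand-alone sketch of the EXIT-trap pattern; no ciop functions are needed.
SUCCESS=0
ERR_NOINPUT=1
ERR_NOPARAMS=2

# Hypothetical demo function: exits with the given code inside a subshell,
# so the trap fires without ending the calling shell.
demo_exit() {
  (
    report() {
      local retval=$?   # exit code that triggered the trap
      case "$retval" in
        "$SUCCESS")      echo "INFO: done" ;;
        "$ERR_NOINPUT")  echo "ERROR $retval: input missing" ;;
        "$ERR_NOPARAMS") echo "ERROR $retval: parameter missing" ;;
        *)               echo "ERROR $retval: unknown" ;;
      esac
      exit "$retval"    # keep the original exit code
    }
    trap report EXIT
    exit "$1"
  )
}

for code in 0 2 7; do
  demo_exit "$code" || echo "  (exit status $?)"
done
```

The trap function ends with its own exit so that the caller still sees the original code rather than the status of the last logging command.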
Create the application descriptor¶
Go up one level to /application
[user@sb simplejob] cd /application
[user@sb application] vi application.xml
Paste the XML content:
<?xml version="1.0" encoding="UTF-8"?>
<application id="example"> <!-- you can type any id you want -->
  <jobTemplates>
    <jobTemplate id="gdalformatconv"> <!-- this is the job name -->
      <streamingExecutable>/application/simplejob/run</streamingExecutable> <!-- this is the wrapper script -->
      <defaultParameters>
        <parameter id="format">PNG</parameter> <!-- this sets the default value for parameter format -->
      </defaultParameters>
    </jobTemplate>
  </jobTemplates>
  <workflow id="workflow"> <!-- Sample workflow -->
    <workflowVersion>1.0</workflowVersion>
    <workflowDescription>My simple workflow</workflowDescription> <!-- provide a description to the workflow -->
    <node id="gdal"> <!-- workflow node unique id -->
      <job id="gdalformatconv"></job> <!-- job defined above -->
      <sources>
        <source refid="file:urls">/home/fbrito/geotiff.urls</source> <!-- the geotiff URLs are provided on an ASCII file, set your username value -->
      </sources>
      <parameters></parameters>
    </node>
  </workflow>
</application>
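The descriptor points at the wrapper through the streamingExecutable element. Assuming the framework invokes that file directly, it probably needs the executable bit, which vi does not set. The sketch below reads the path out of a descriptor and checks it; wrapper_path is a hypothetical helper, and the demonstration works on a throw-away copy in a temporary folder rather than on /application.

```shell
#!/bin/bash
# Print the wrapper path declared in a descriptor.
# Assumes the <streamingExecutable> element sits on a single line.
wrapper_path() {
  sed -n 's:.*<streamingExecutable>\(.*\)</streamingExecutable>.*:\1:p' "$1"
}

# Demonstration on a throw-away copy (temporary paths, not /application).
workdir=$(mktemp -d)
printf '#!/bin/bash\necho ok\n' > "$workdir/run"
cat > "$workdir/application.xml" <<EOF
<application id="example">
  <jobTemplates>
    <jobTemplate id="gdalformatconv">
      <streamingExecutable>$workdir/run</streamingExecutable>
    </jobTemplate>
  </jobTemplates>
</application>
EOF

wrapper=$(wrapper_path "$workdir/application.xml")
if [ -x "$wrapper" ]; then
  echo "wrapper is executable: $wrapper"
else
  echo "wrapper is not executable, fixing: $wrapper"
  chmod +x "$wrapper"
fi
```

On the sandbox the same check would be run against /application/application.xml.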
Create the URL list file in your home directory:
[user@sb application] cd
[user@sb ~] vi geotiff.urls
Add a few URLs:
ftp://ftp.remotesensing.org/pub/geotiff/samples/spot/chicago/SP27GTIF.TIF
ftp://ftp.remotesensing.org/pub/geotiff/samples/spot/chicago/UTM2GTIF.TIF
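Each line of this file reaches the wrapper on standard input, one URL per read. The sketch below shows that reading loop in isolation. process_urls is a hypothetical stand-in that only echoes what it would fetch. It also skips blank lines and accepts a last line without a trailing newline, two precautions the wrapper itself does not take.

```shell
#!/bin/bash
# Hypothetical stand-in for the wrapper loop: echoes what it would fetch.
process_urls() {
  local count=0 url
  # "|| [ -n "$url" ]" keeps a final line that lacks a trailing newline
  while IFS= read -r url || [ -n "$url" ]; do
    if [ -z "$url" ]; then
      continue                      # skip blank lines
    fi
    count=$((count + 1))
    echo "input $count: $(basename "$url")"
  done
  echo "total: $count"
}

process_urls <<'EOF'
ftp://ftp.remotesensing.org/pub/geotiff/samples/spot/chicago/SP27GTIF.TIF

ftp://ftp.remotesensing.org/pub/geotiff/samples/spot/chicago/UTM2GTIF.TIF
EOF
```

With the two sample URLs the loop reports two inputs, SP27GTIF.TIF and UTM2GTIF.TIF, and ignores the empty line between them.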
Step 3: execute the job¶
Execute the simple job¶
Optionally list the nodes you can execute:
[user@sb ~] ciop-simjob -n
This will return:
gdal
Process it!
[user@sb ~] ciop-simjob gdal
The job is executed and a tracking URL is provided to follow the progress and access the execution logs. Open the URL in a browser.
TBC
Check the gdal node results¶
The generated files are published on a local filesystem:
[user@sb ~] cd /share/tmp/TBC
There should be two image files in the file format defined by default: PNG
Define another file format to test the node¶
Edit the application.xml file and add the following line inside the parameters element of the gdal node:
<parameter id="format">JPEG</parameter>
To obtain:
<?xml version="1.0" encoding="UTF-8"?>
<application id="example">
  <jobTemplates>
    <jobTemplate id="gdalformatconv">
      <streamingExecutable>/application/simplejob/run</streamingExecutable>
      <defaultParameters>
        <parameter id="format">PNG</parameter>
      </defaultParameters>
    </jobTemplate>
  </jobTemplates>
  <workflow id="workflow"> <!-- Sample workflow -->
    <workflowVersion>1.0</workflowVersion>
    <workflowDescription>My simple workflow</workflowDescription>
    <node id="gdal"> <!-- workflow node unique id -->
      <job id="gdalformatconv"></job> <!-- job defined above -->
      <sources>
        <source refid="file:urls">/home/fbrito/geotiff.urls</source>
      </sources>
      <parameters>
        <parameter id="format">JPEG</parameter>
      </parameters>
    </node>
  </workflow>
</application>
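The descriptor now carries two values for format: PNG in the job template defaults and JPEG on the workflow node. The tutorial expects the node value to win. A minimal sketch of that precedence, using a hypothetical resolve_param helper rather than the real ciop-getparam logic:

```shell
#!/bin/bash
# Hypothetical helper mirroring the expected precedence:
# a value set on the workflow node wins over the job template default.
resolve_param() {
  local default_value=$1 node_value=$2
  echo "${node_value:-$default_value}"
}

resolve_param "PNG" ""       # no node-level value: falls back to PNG
resolve_param "PNG" "JPEG"   # node-level value: JPEG is used
```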
Run the job again, this time with the ciop-simjob -f flag to delete the results of the previous run:
[user@sb ~] ciop-simjob -f gdal
The generated files are published on a local filesystem:
[user@sb ~] cd /share/tmp/TBC
There should be two image files in the file format defined at workflow level: JPEG
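Because the wrapper names each output after the basename of the retrieved file, the results keep their .TIF names whatever the target format is. One way to confirm what was actually written is to look at the first bytes of each file. The sketch below compares them with the well-known PNG, JPEG and TIFF signatures; image_format is a hypothetical helper, and the demonstration uses synthetic files that contain only a signature.

```shell
#!/bin/bash
# Hypothetical helper: guess the image format from the first bytes of a file.
image_format() {
  local sig
  # first 8 bytes as a lowercase hex string without separators
  sig=$(head -c 8 "$1" | od -An -tx1 | tr -d ' \n')
  case "$sig" in
    89504e470d0a1a0a)    echo "PNG" ;;
    ffd8ff*)             echo "JPEG" ;;
    49492a00*|4d4d002a*) echo "TIFF" ;;
    *)                   echo "unknown" ;;
  esac
}

# Demonstration on synthetic files holding only the signatures.
tmpdir=$(mktemp -d)
printf '\211PNG\r\n\032\n' > "$tmpdir/a.TIF"   # PNG signature
printf '\377\330\377\340' > "$tmpdir/b.TIF"     # JPEG start-of-image marker
for f in "$tmpdir"/*.TIF; do
  echo "$(basename "$f"): $(image_format "$f")"
done
```

On the sandbox, running gdalinfo on an output file gives the same answer with more detail, since it reports the driver used to open the file.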
Run the application as a workflow¶
Our workflow has a single job but it's still a workflow!
You can trigger the workflow with:
[user@sb ~] ciop-simwf
You can track the workflow execution in the shell. Wait for the workflow to finish.
Use the command below to get the latest workflow run:
[user@sb ~] ciop-simwf -l
0000000-130405042430716-oozie-oozi-W
Use the value returned above to check the workflow results:
[user@sb ~] ll tmp/sandbox/run/0000000-130405042430716-oozie-oozi-W/gdal/output
You can optionally delete the generated results:
Conclusion¶
With this simple job you have learned:
- how to install packages to run your application using yum
- how to create a job template in your sandbox
- how to write the job wrapper script
- how to define the application descriptor
- how to execute the job with the default parameter
- how to execute the job with a parameter value defined at workflow level
- how to execute the workflow
- how to check the generated results
Updated by Herve Caumont about 11 years ago · 3 revisions