
h1. GDAL Simple job 

 {{>toc}} 

 

 h2. Introduction 

 

This simple job demonstrates a few things about typical sandbox usage:

* installing software packages

* configuring a simple application that takes GeoTIFF files available on a remote FTP site and converts them to a chosen format (e.g. PNG); a single conversion of this kind is sketched just below
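
For reference, the conversion the job automates boils down to one gdal_translate call per input file. A minimal sketch, assuming GDAL is already installed and a local file named sample.tif exists (both names are illustrative):

<pre>
# convert one GeoTIFF to PNG with the GDAL command-line tools
gdal_translate -of PNG sample.tif sample.png
</pre>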

  

 h2. Pre-requisites 

 

 To follow this simple tutorial you need: 
 
 * a running sandbox 
 
 * access to the sandbox  
  
* less than 30 minutes of time

  

 h2. Step 1: install gdal on the sandbox 

 

The installation of gdal is done via yum (Yellowdog Updater, Modified).

 

 Run the command below on your sandbox: 

 

 <pre> 
 
 [user@sb ~]$ sudo yum -y install gdal 
 
 </pre> 
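
Optionally, verify the installation and check that the output drivers used later in this tutorial are available (a quick sanity check, assuming the gdal package installs the standard command-line tools):

<pre>
[user@sb ~]$ gdalinfo --version
[user@sb ~]$ gdalinfo --formats | grep -E "PNG|JPEG"
</pre>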

 

 h2. Step 2: create the simplejob 

 


 

 The simple job will require a folder under /application: 

 

 <pre> 
 
 [user@sb ~] cd /application 
 
 [user@sb application] mkdir simplejob 
 
 [user@sb application] cd simplejob 
 
 </pre> 

 

 h3. Create the wrapper file which handles the input parameters  

  

 <pre> 
 
 [user@sb simplejob] vi run 
 
 </pre> 

 

Note: you can use any editor you like.

 

 h3. Paste the code below in the run file: 

 

 <pre> 
 
 #!/bin/bash 

 

 # Project:         ${project.name} 
 
 # Author:          $Author: stripodi $ (Terradue Srl) 
 
 # Last update:     $Date: 2011-09-08 12:01:58 +0200 (Thu, 08 Sep 2011) $ 
 
 # Element:         ${project.name} 
 
 # Context:         services/${project.artifactId} 
 
 # Version:         ${project.version} (${implementation.build}) 
 
 # Description:     ${project.description} 
 
 # 
 
 # This document is the property of Terradue and contains information directly 
 
 # resulting from knowledge and experience of Terradue. 
 
 # Any changes to this code is forbidden without written consent from Terradue Srl 
 
 # 
 
 # Contact: info@terradue.com 
 
 # 2012-02-10 - NEST in jobConfig upgraded to version 4B-1.1 

 

 # source the ciop functions (e.g. ciop-log) 
 
 source ${ciop_job_include} 

 

 # define the exit codes 
 
 SUCCESS=0 
 
 ERR_NOINPUT=1 
 
 ERR_NOPARAMS=2 
 
 ERR_GDAL=4 

 

 # add a trap to exit gracefully 
 
 function cleanExit () 
 
 { 
    
    local retval=$? 
    
    local msg="" 
    
    case "$retval" in 
      
      $SUCCESS)        msg="Processing successfully concluded";; 
      
      $ERR_NOPARAMS) msg="Output format not defined";;
      
      $ERR_GDAL)      msg="gdal_translate processing failed (exit code $retval)";;
      
      *)               msg="Unknown error";; 
    
    esac 
    
    [ "$retval" != "0" ] && ciop-log "ERROR" "Error $retval - $msg, processing aborted" || ciop-log "INFO" "$msg" 
    
    exit $retval 
 
 } 
 
 trap cleanExit EXIT 

 

 # retrieve the parameter value from the workflow or the job default value
 
 format=`ciop-getparam format` 

 

 # run a check on the format value 
 
 [ -z "$format" ] && exit $ERR_NOPARAMS 

 

 # loop through all geotiff URLs passed as stdin 
 
 while read inputfile  
  
 do 
	 
	 # report activity in log 
	 
	 ciop-log "INFO" "Retrieving $inputfile from storage" 

	 

	 # retrieve the remote geotiff product to the local temporary folder 
	 
	 retrieved=`ciop-copy -o $TMPDIR $inputfile` 
	
	 
	
	 # check if the file was retrieved 
	 
	 [ "$?" == "0" -a -e "$retrieved" ] || exit $ERR_NOINPUT 
	
	 
	
	 # report activity 
	 
	 ciop-log "INFO" "Retrieved $retrieved" 
	
	 
	
	 # invoke gdal to convert the geotiff into selected format 
	 
	 gdal_translate -of $format $retrieved $OUTPUTDIR/`basename $retrieved` 	

	 	

	 # check error code 
	 
	 [ "$?" != "0" ] && exit $ERR_GDAL || ciop-log "INFO" "Processed $inputfile" 
 
 done 

 

 exit 0 
 
 </pre> 
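
Since the framework invokes this script directly as the streaming executable, it is worth making sure the run file has execute permissions (a small precaution; adjust the path if you created the file elsewhere):

<pre>
[user@sb simplejob] chmod +x /application/simplejob/run
</pre>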

 

 h3. Create the application descriptor  

  

 Go up one level to /application 

 

 <pre> 
 
 [user@sb ~] cd /application 
 
 [user@sb application] vi application.xml 
 
 </pre> 

 

 Paste the XML content: 

 

 <pre> 
 
 <?xml version="1.0" encoding="UTF-8"?> 
 
 <application id="example"> <!-- you can type any id you want -->  
	  
	 <jobTemplates> 
		 
		 <jobTemplate id="gdalformatconv"> <!-- this is the job name --> 
			 
			 <streamingExecutable>/application/simplejob/run</streamingExecutable> <!-- this is the wrapper script -->  
			  
			 <defaultParameters> 							
				 							
				 <parameter id="format">PNG</parameter> <!-- this sets the default value for parameter format --> 
			 
			 </defaultParameters> 
		 
		 </jobTemplate> 
	 
	 </jobTemplates> 
	 
	 <workflow id="workflow"> 	 <!-- Sample workflow --> 
		 
		 <workflowVersion>1.0</workflowVersion> 
		 
		 <workflowDescription>My simple workflow</workflowDescription> <!-- provide a description to the workflow --> 
		 
		 <node id="gdal">  		 <!-- workflow node unique id --> 
			 
			 <job id="gdalformatconv"></job> 		 <!-- job defined above --> 
			 
			 <sources> 
				 
				 <source refid="file:urls">/home/fbrito/geotiff.urls</source> <!-- the GeoTIFF URLs are provided in an ASCII file; replace the username with your own -->
			 
			 </sources> 
			 
			 <parameters></parameters> 
		 
		 </node> 
	 
	 </workflow> 
 
 </application> 
 
 </pre> 
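
A quick sanity check of the descriptor can save a failed run later. If xmllint is available on the sandbox (it may need to be installed via yum), you can verify that the file is well-formed XML:

<pre>
[user@sb application] xmllint --noout /application/application.xml
</pre>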

 

Create the URL list file in your home directory

  

 <pre> 
 
 [user@sb application] cd  
  
 [user@sb ~] vi geotiff.urls 
 
 </pre> 

 

 Add a few URLs: 

 

 <pre> 
 
 ftp://ftp.remotesensing.org/pub/geotiff/samples/spot/chicago/SP27GTIF.TIF 
 
 ftp://ftp.remotesensing.org/pub/geotiff/samples/spot/chicago/UTM2GTIF.TIF 
 
 </pre> 
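
Before running the job, you can optionally verify that the FTP site is reachable from the sandbox (a simple connectivity test, assuming curl is installed; wget would work equally well):

<pre>
[user@sb ~] curl -s -o /dev/null ftp://ftp.remotesensing.org/pub/geotiff/samples/spot/chicago/SP27GTIF.TIF && echo "OK"
</pre>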

 

 h2. Step 3: execute the job 

 

 h3. Execute the simple job 

 

 Optionally list the nodes you can execute: 

 

 <pre> 
 
 [user@sb ~] ciop-simjob -n 
 
 </pre> 

 

 This will return: 
 
 <pre> 
 
 gdal 
 
 </pre> 

 

 Process it! 

 

 <pre>  
  
 [user@sb ~] ciop-simjob gdal 
 
 </pre> 

 

The job is executed and a tracking URL is provided to follow the progress and access the execution logs. Open the URL in a browser.

 

 TBC 

 

 h3. Check the gdal node results 

 

 The generated files are published on a local filesystem: 

 

 <pre> 
 
 [user@sb ~] cd /share/tmp/TBC 
 
 </pre> 

 

There should be two image files in the format defined by the job default parameter: PNG
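
To confirm the conversion, you can inspect one of the generated files from that directory (a quick check; note that the wrapper keeps the original basenames, so the extension still reads .TIF even though the content is PNG, and the Driver line reported by gdalinfo should mention PNG):

<pre>
gdalinfo SP27GTIF.TIF | grep Driver
</pre>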

 

 h3. Define another file format to test the node 

 

Edit the application.xml file and add the line below to the parameters element of the gdal node:

 

 <pre> 
 
 <parameter id="format">JPEG</parameter> 
 
 </pre> 

 

 To obtain: 

 

 <pre> 
 
 <?xml version="1.0" encoding="UTF-8"?> 
 
 <application id="example"> 
	 
	 <jobTemplates> 
		 
		 <jobTemplate id="gdalformatconv"> 
			 
			 <streamingExecutable>/application/simplejob/run</streamingExecutable> 
			 
			 <defaultParameters> 							
				 							
				 <parameter id="format">PNG</parameter> 
			 
			 </defaultParameters> 
		 
		 </jobTemplate> 
	 
	 </jobTemplates> 
	 
	 <workflow id="workflow"> 							 <!-- Sample workflow --> 
		 
		 <workflowVersion>1.0</workflowVersion> 
		 
		 <workflowDescription>My simple workflow</workflowDescription> 
		 
		 <node id="gdal"> 							 <!-- workflow node unique id --> 
			 
			 <job id="gdalformatconv"></job> 					 <!-- job defined above --> 
			 
			 <sources> 
				 
				 <source refid="file:urls" >/home/fbrito/geotiff.urls</source> 
			 
			 </sources> 
			 
			 <parameters> 
				 
				 <parameter id="format">JPEG</parameter> 
			 
			 </parameters> 
		 
		 </node> 
	 
	 </workflow> 
 
 </application> 
 
 </pre> 

 

Run the job again, this time using the ciop-simjob -f flag to delete the results of the previous run:

 

 <pre>  
  
 [user@sb ~] ciop-simjob -f gdal 
 
 </pre> 

 

 The generated files are published on a local filesystem: 

 

 <pre> 
 
 [user@sb ~] cd /share/tmp/TBC 
 
 </pre> 

 

There should be two image files in the format defined at the workflow level: JPEG

 

 h3. Run the application as a workflow 

 

 Our workflow has a single job but it's still a workflow! 

 

 You can trigger the workflow with: 

 

 <pre> 
 
 [user@sb ~] ciop-simwf 
 
 </pre> 

 

You can track the workflow execution from the shell. Wait for the workflow to complete.

 

 Use the command below to get the latest workflow run: 
 
 <pre> 
 
 [user@sb ~] ciop-simwf -l 
 
 0000000-130405042430716-oozie-oozi-W 
 
 </pre> 

 

 Use the value returned above to check the workflow results: 

 

 <pre> 
 
 [user@sb ~] ll tmp/sandbox/run/0000000-130405042430716-oozie-oozi-W/gdal/output 
 
 </pre> 
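
As before, you can inspect the listed files to confirm the JPEG conversion (a quick check reusing the run identifier returned above; your identifier, and possibly the exact filenames published by the framework, will differ):

<pre>
[user@sb ~] file tmp/sandbox/run/0000000-130405042430716-oozie-oozi-W/gdal/output/*
</pre>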

 

You can optionally delete the generated results when you are done.



 



 h2. Conclusion 

 

 With this simple job you have learned: 
 
 * how to install packages to run your application using yum 
 
* how to create a job template in your sandbox
 
 * how to write the job wrapper script 
 
 * how to define the application descriptor 
 
 * how to execute the job with the default parameter 
 
* how to execute the job with a parameter value defined at the workflow level
  
 * how to execute the workflow 
 
 * how to check the generated results