h1. GMTSAR tutorial

{{>toc}}

GMTSAR: Generic Mapping Tools (GMT) for Synthetic Aperture Radar

h2. Sandbox prerequisites

*We assume that you already have a Sandbox ready and that the following items are completed:*
* *You have accessed your Sandbox as described in the Getting Started guide*

> It is not mandatory but strongly recommended to follow the [[Getting started]] tutorial before starting this one.
h2. 1. Application concepts and terminology

In order to ease the execution of the tutorial, it is important to understand the concept of an application and its terminology. This section describes the example application used throughout this guide. When a word is put in *+underlined bold+*, it is a terminology keyword that always designates the same concept.

h3. 1.1 The application workflow

Our example in this tutorial is an interferometry application that processes SAR data (Envisat ASAR Image Mode level 0) to generate interferograms between a master and one or more slave products. It is composed of 6 steps that run independently but in a specific order, each producing results that are inputs to the remaining steps.
The following figure illustrates the *+Workflow+* of our application as a directed acyclic graph (DAG). This is also how the CIOP framework handles the execution of processes in terms of parallel computing and orchestration of the processing steps.

_<<insert here the DAG that represents the workflow>>_

Each box represents a *+Job+*, which is a step of our application process. The arrows represent the data flow between the *+jobs+*. When a *+job+* is connected to another, it means that the *+output+* of this *+job+* is passed as *+input+* to the other.

It is important to keep in mind that in the CIOP framework, *+input+* and *+output+* are text references (e.g. to data). Indeed, when a *+job+* processes its *+input+*, it actually reads the references *line by line*, as described in the next figure.

!https://ciop.eo.esa.int/attachments/40/focus-align.png!

It is therefore important to define the inter-*+job+* references precisely.
h3. 1.2 The job

Each *+job+* has a set of basic characteristics:

* a unique *+Job name+* in the workflow (e.g. 'PreProc')
* zero, one or several *+sources+* that define the *+jobs+* interdependency. In the example, the *+job+* 'Interfere' has 2 dependencies: 'AlignSlave' and 'MakeTropo'.
* a maximum number of simultaneous *+tasks+* into which it can be forked. This is further explained in section [[Sandbox Application Integration Tutorial#1.3 The processing task|1.3 The processing task]].
* a *+processing trigger+*, a software executable of the *+job template+* that handles the *+input+*/*+output+* streaming; practically, the executable that reads the *+input+* lines and writes the *+output+* lines.

The job characteristics above are mandatory in the *+workflow+* definition.
If incomplete, the CIOP framework reports an error in the workflow.

For our tutorial example, here are the characteristics of the 'PreProc', 'AlignSlave' and 'Interfere' *+jobs+*:

* *PreProc*

PreProc is the first job in the workflow. It takes both SAR products, master and slave, and pre-processes them:
> It is a job with a single task: the _defaultJobconf_ _property_ (a CIOP property, not an application property) _*ciop.job.max.tasks*_ is set to *1*
> Its executable is located at /application/preproc/run
> It has a number of application parameters: SAT, master, num_patches, near_range, earth_radius, fd1, stop_on_error. The parameter values are not set in this job template.
<pre><code class="xml">
<jobTemplate id="preproc">
	<streamingExecutable>/application/preproc/run</streamingExecutable>	<!-- processing trigger -->
	<defaultParameters>	<!-- default parameters of the job -->
		<!-- Default values are specified here -->
		<parameter id="SAT"></parameter>	<!-- no default value -->
		<parameter id="master"></parameter>	<!-- no default value -->
		<parameter id="num_patches"></parameter>	<!-- no default value -->
		<parameter id="near_range"></parameter>	<!-- no default value -->
		<parameter id="earth_radius"></parameter>	<!-- no default value -->
		<parameter id="fd1">1</parameter>	<!-- default value: 1 -->
		<parameter id="stop_on_error">false</parameter>	<!-- don't stop on error by default -->
	</defaultParameters>
	<defaultJobconf>
		<property id="ciop.job.max.tasks">1</property>	<!-- maximum number of parallel tasks -->
	</defaultJobconf>
</jobTemplate>
</code></pre>
* *+Job name+*: *'AlignSlave'*
* *+sources+*: 'PreProc'
* maximum number of simultaneous *+tasks+*: unlimited
* *+processing trigger+*: /application/align/run

* *+Job name+*: *'Interfere'*
* *+sources+*: 'AlignSlave' and 'MakeTropo'
* maximum number of simultaneous *+tasks+*: 1
* *+processing trigger+*: /application/interfere/run
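
For illustration, here is what the corresponding 'align' *+job template+* could look like (a sketch only: the parameters are omitted, and we assume that omitting the _ciop.job.max.tasks_ property leaves the number of parallel tasks unlimited):

<pre><code class="xml">
<jobTemplate id="align">
	<streamingExecutable>/application/align/run</streamingExecutable>	<!-- processing trigger -->
	<defaultParameters>
		<!-- the 'align' parameters would be declared here -->
	</defaultParameters>
	<!-- no ciop.job.max.tasks property: assumed to mean unlimited parallel tasks -->
</jobTemplate>
</code></pre>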
h3. 1.3 The processing task

To exploit the parallelism offered by the CIOP framework, a *+job+* may process its *+input+* in several *+tasks+*. In principle, the CIOP framework will run those *+tasks+* in parallel. This is an important and sometimes complex paradigm that can be addressed in different ways.
The following questions & answers describe the parallelism paradigm of the CIOP framework.

* Is it *+task+* parallelism or *+job+* parallelism?

> In this section, we speak about task parallelism; job parallelism operates one level above. In the example, 'MakeTropo' and 'AlignSlave' are two *+jobs+* that run in parallel, regardless of the number of *+tasks+* each of them may trigger.
> 'AlignSlave' may be forked into an unlimited number of *+tasks+*. Practically, the framework automatically calculates the number *n* of available processing slots in the computing resource and starts the *+processing trigger+* *n* times (based on the number of *+inputs+*).

* How to divide a *+job+* into *+tasks+*?

> It is actually the application developer who chooses the granularity of the *+job+* division. The computing framework simply divides the *+input+* flow (*k* lines) into the *n* *+tasks+*. In the example provided in this tutorial, if the *+job+* 'PreProc' produces an *+output+* of 11 lines and the computing resource divides the *+job+* 'AlignSlave' into 4 *+tasks+*, the following division is done:

!https://ciop.eo.esa.int/attachments/41/parallelism.png!

* Where does the processing loop stand?

> The processing loop stands in the +*processing trigger*+. As shown in the example in this tutorial, the *+processing trigger+* implements a loop that reads the *+task+* input *line by line*.
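
As a minimal sketch (assuming a Bash processing trigger, as in section 5.3), such a loop reads the *+input+* references from _stdin_ and writes the *+output+* references to _stdout_:

<pre><code class="ruby">
#!/bin/bash
# Minimal processing-trigger skeleton (illustrative sketch only)
while read inputref        # one input reference per line
do
	# ... process the reference, e.g. copy it locally with ciop-copy ...
	echo "$inputref"       # write an output reference for the next job
done
</code></pre>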
h2. 2. Home Directory

The *+home+* directory of your Sandbox is attached to a separate savable disk and is thus persistent. This disk is mounted on the /home/{USERNAME} folder. In this folder you may upload, compile and manually test all your material. This is a free space where you can do anything BUT:

> *+Be careful to never link any element (e.g. executable, auxiliary data) that is critical for the application from the application directory to the home directory+*. Indeed, the *+home+* directory is not present when using CIOP in Runtime Environment mode, so any linked elements won't be available, causing the processing phase to fail.

h2. 3. Application Directory

The *+application directory+* of your Sandbox is attached to a separate savable disk and is thus persistent. This disk is mounted on the /application folder.
The application directory is the place where the integrated application resides.
It should be a clean environment and thus +*SHOULD NOT*+ contain any temporary files, nor be used for compilations and/or manual testing. Instead, it is used for the simulation of the application *+jobs+* and +*workflow*+.

At the instantiation of your Sandbox, the *+application directory+* contains the sample application example, unless you configured the Sandbox with one of your application disks previously saved in your application library.
In the next sections, the elements of the application directory are described.

h3. 3.1 Files and folders structure

The application directory follows some best practices in its folder and file structure to ease the subsequent deployment of the application to the CIOP Runtime Environment.
The folder structure of the application example, with the description of each item, is shown below:

!https://ciop.eo.esa.int/attachments/44/application_folder.png!

> Even if the names are quite similar in our tutorial example, *+job+* and +*job template*+ are not the same concept. _A *+job+* is an instance of a +*job template*+ in a given +*workflow*+._ This paradigm allows several *+jobs+* in a +*workflow*+ to point to the same *+job template+*. This is explained in more detail in the next section.
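
As a hedged illustration (the node ids are hypothetical; the syntax follows the workflow examples of section 4.3), two *+workflow+* nodes may instantiate the same *+job template+*:

<pre><code class="xml">
<node id="PreProcA">				<!-- first instance (hypothetical node id) -->
	<job id="preproc"></job>		<!-- points to the 'preproc' job template -->
	<!-- sources and parameters of the first instance -->
</node>
<node id="PreProcB">				<!-- second instance (hypothetical node id) -->
	<job id="preproc"></job>		<!-- same job template, instantiated twice -->
	<!-- sources and parameters of the second instance -->
</node>
</code></pre>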
h3. 3.2 The Application XML definition file

The Application XML definition file is the reference of your application for the CIOP computing framework. It contains all the characteristics of the *+job templates+* and the *+workflows+*.

The Application XML definition file is described in the page [[Application XML definition file]].
h2. 4 -- Using sample datasets -- (section under revision)

This section guides you through the tutorial example to introduce the data manipulation tools.

There are mainly two command line tools to discover and access data previously selected in your Sandbox ([[Getting_Started#2.1 Sandbox EO Data Services|see here]]):
* *ciop-catquery* to query the Sandbox catalogue containing all the metadata of the selected sample dataset
* *ciop-copy* to copy the data from a logical or physical location to a local directory of the Sandbox

Use
<pre>ciop-<command> -h</pre>
to display the CLI reference.

These commands can be used in the processing triggers of a +*job*+.
h3. 4.1 Query Sandbox catalogue

For the tutorial's purpose, the first test is to ensure that the test dataset needed for the application integration and testing is complete.

(to be updated)
h3. 4.2 Copy data

To copy data from a reference link as displayed in the previous section, just use the following command:

<pre><code class="ruby">[user@sb ~]$ ciop-copy http://localhost/catalogue/sandbox/ASA_IM__0P/ASA_IM__0CNPDE20040602_091147_000000152027_00222_11799_1335.N1/rdf</code></pre>

output:

<pre>
[INFO   ][ciop-copy][starting] url 'http://localhost/catalogue/sandbox/ASA_IM__0P/ASA_IM__0CNPDE20040602_091147_000000152027_00222_11799_1335.N1/rdf' > local '/application/'
[INFO   ][ciop-copy][success] got URIs 'https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20040602_091147_000000152027_00222_11799_1335.N1 '
[INFO   ][ciop-copy][starting] url 'https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20040602_091147_000000152027_00222_11799_1335.N1' > local '/application/'
[INFO   ][ciop-copy][success] url 'https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20040602_091147_000000152027_00222_11799_1335.N1' > local '/application/ASA_IM__0CNPDE20040602_091147_000000152027_00222_11799_1335.N1'
/home/user/ASA_IM__0CNPDE20040602_091147_000000152027_00222_11799_1335.N1
</pre>

The command displays information on _stderr_ by default and returns on _stdout_ the path of the copied data.
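
This makes it easy to capture the local path in a variable, as done later in this tutorial (a sketch; $INPUTDIR is any existing local directory and $url a data reference):

<pre><code class="ruby">
# copy the referenced data into $INPUTDIR, retrying 10 times on failure;
# the local path of the copied file is captured in $tmpFile
tmpFile=`ciop-copy -o "$INPUTDIR" -r 10 "$url"`
</code></pre>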
Many other data schemas are supported by the ciop-copy CLI, such as http, https, hdfs, etc.
There are also many other options, e.g. to specify the output directory or to unpack compressed data.
The complete reference is available at [[ciop-copy CLI reference|ciop-copy usage]] or via the inline help:
<pre><code class="ruby">[user@sb ~]$ ciop-copy -h</code></pre>
h3. 4.3 Using other sources of data in a job

So far we have introduced two types of data sources:
* data coming from a catalogue series
* data coming from a previous job in the workflow.

Both cases appear in the following workflow definition: the node 'Vimage' (the job 'imager') takes its input from the catalogue series 'ATS_TOA_1P', while the node 'Quarc' takes as input the output of the node 'Vimage':

<pre><code class="xml"><workflow id="testVomir">							<!-- Sample workflow -->
		<workflowVersion>1.0</workflowVersion>
		<node id="Vimage">							<!-- workflow node unique id -->
			<job id="imager"></job>					<!-- job defined above -->
			<sources>
				<source refid="cas:serie">ATS_TOA_1P</source>	<!-- data from a catalogue series -->
			</sources>
			<parameters>							<!-- parameters of the job -->
				<parameter id="volcano_db"></parameter>
			</parameters>
		</node>
		<node id="Quarc">
			<job id="quarcXML"/>
			<sources>
				<source refid="wf:node">Vimage</source>		<!-- data from a previous job -->
			</sources>
		</node>
</workflow></code></pre>
It may also be the case that the input data does not come from EO catalogues, and thus there is the need to define another source of data:

<pre><code class="xml"><workflow id="someworkflow">							<!-- Sample workflow -->
		<workflowVersion>1.0</workflowVersion>
		<node id="somenode">							<!-- workflow node unique id -->
			<job id="somejobid"></job>					<!-- job defined above -->
			<sources>
				<source refid="file:urls">/application/test.urls</source>
			</sources>
		</node>
</workflow></code></pre>

where the file test.urls contains the input lines that will be piped to the processing trigger executable.
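
For example (illustrative content only, reusing the sample data references shown in section 5.2), /application/test.urls could contain:

<pre>
http://localhost/catalogue/sandbox/ASA_IM__0P/ASA_IM__0CNPDE20090412_092436_000000162078_00079_37207_1556.N1/rdf
http://localhost/catalogue/sandbox/ASA_IM__0P/ASA_IM__0CNPAM20080427_092430_000000172068_00079_32197_3368.N1/rdf
</pre>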
h2. 5. Job integration

In this section, the *+job template+* 'align' and its instance, the *+job+* 'AlignSlave', are integrated using the tools previously introduced in this tutorial.
h3. 5.1 Installation and configuration of the GMTSAR toolbox on the Sandbox

The steps below install the GMTSAR toolbox on the Sandbox. They are specific to GMTSAR, but they show the common approach to follow when installing software on the Sandbox.
> The steps below are done in the +*home directory*+.

* Step - Download GMTSAR

The GMTSAR software is available on the University of California web server:

<pre><code class="ruby">
[user@sb ~]$ wget http://topex.ucsd.edu/gmtsar/tar/GMTSAR.tar
</code></pre>

Then the GMTSAR.tar archive is unpacked:

<pre><code class="ruby">
[user@sb ~]$ tar xvf GMTSAR.tar
</code></pre>

GMTSAR relies on GMT, with netCDF as a dependency; GMT is installed via yum:

<pre><code class="ruby">
[user@sb ~]$ cd GMTSAR
[user@sb GMTSAR]$ sudo yum search gmt
[user@sb GMTSAR]$ sudo yum install GMT-devel
[user@sb GMTSAR]$ sudo yum install netcdf-devel
[user@sb GMTSAR]$ make
</code></pre>

The steps above compile GMTSAR in the +*home directory*+. The required files (binaries, libraries, etc.) are then copied to the /application environment (remember that the +*home directory*+ is only available in the CIOP Sandbox mode, not in the Runtime Environment).
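
The copy itself could look like the sketch below (hedged: the exact binaries and the GMTSAR build layout may vary; the preproc/bin and preproc/lib folders match the PATH and LD_LIBRARY_PATH set by the processing trigger in section 5.3):

<pre><code class="ruby">
# illustrative only: copy the binaries and libraries needed by the 'preproc' job
# from the home directory build into the application directory
[user@sb GMTSAR]$ mkdir -p /application/preproc/bin /application/preproc/lib
[user@sb GMTSAR]$ cp bin/pre_proc_batch.csh /application/preproc/bin/
[user@sb GMTSAR]$ cp lib/*.so /application/preproc/lib/
</code></pre>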
h3. 5.2 Job template definition in the application.xml

The application.xml file has two main blocks: the job template section and the workflow template section.

The first part defines the *+job templates+* in the workflow XML application definition file.
Each processing block of the GMTSAR workflow needs a *+job template+*.

Here is the *+job template+* for the 'preproc' job:
<pre><code class="xml">
<jobTemplate id="preproc">
	<streamingExecutable>/application/preproc/run</streamingExecutable>	<!-- processing trigger -->
	<defaultParameters>	<!-- default parameters of the job -->
		<!-- Default values are specified here -->
		<parameter id="SAT"></parameter>	<!-- no default value -->
		<parameter id="master"></parameter>	<!-- no default value -->
		<parameter id="num_patches"></parameter>	<!-- no default value -->
		<parameter id="near_range"></parameter>	<!-- no default value -->
		<parameter id="earth_radius"></parameter>	<!-- no default value -->
		<parameter id="fd1">1</parameter>	<!-- default value: 1 -->
		<parameter id="stop_on_error">false</parameter>	<!-- don't stop on error by default -->
	</defaultParameters>
	<defaultJobconf>
		<property id="ciop.job.max.tasks">1</property>	<!-- maximum number of parallel tasks -->
	</defaultJobconf>
</jobTemplate>
</code></pre>
To test this +*job*+ with *ciop-simjob*, we need to fill the second part of the +*application.xml*+ to add a *+node+*:

<pre><code class="xml">
<node id="PreProc">	<!-- workflow node unique id -->
	<job id="preproc"></job>	<!-- job template defined before -->
	<sources>
		<!-- Source is the series of the data selection -->
		<source refid="cas:serie">ASA_IM__0P</source>
	</sources>
	<parameters>	<!-- parameters of the job -->
		<parameter id="SAT">ENV</parameter>
		<parameter id="master">http://localhost/catalogue/sandbox/ASA_IM__0P/ASA_IM__0CNPDE20040602_091147_000000152027_00222_11799_1335.N1/rdf</parameter>
		<parameter id="near_range">978992.922</parameter>
		<parameter id="earth_radius">6378000</parameter>
		<parameter id="stop_on_error">true</parameter>	<!-- during integration, preferably stop on error -->
	</parameters>
</node>
</code></pre>
> The complete application.xml is available here: TBW
> The application.xml and all its elements are described in detail in the page [[Application XML definition file]].
Since this processing is the first in the *+workflow+* chain, it has a special source: the series 'ASA_IM__0P'. Practically, it means that when the *+job+* is submitted for execution, the computing framework will query the Sandbox catalogue for the ASA_IM__0P data registered as sample dataset and will prepare a list of data references as *+input+* for the job 'PreProc'. In our example, the resulting list is:

<pre>
http://localhost/catalogue/sandbox/ASA_IM__0P/ASA_IM__0CNPDE20090412_092436_000000162078_00079_37207_1556.N1/rdf
http://localhost/catalogue/sandbox/ASA_IM__0P/ASA_IM__0CNPAM20080427_092430_000000172068_00079_32197_3368.N1/rdf
</pre>
h3. 5.3 Processing trigger script

In section 1.2, we have seen that each +*job*+ must have a processing trigger, which is specified in the <streamingExecutable> element of the +*job template*+. In our example, this executable is a shell script:
<pre><code class="ruby">
#!/bin/bash

# FIRST OF ALL, LOAD CIOP INCLUDES
source ${ciop_job_include}

# If you want complete debug information during implementation
ciop-enable-debug

# All return codes are predefined
SUCCESS=0
ERR_BADARG=2
ERR_MISSING_PREPROC_BIN=3
ERR_MISSING_NEAR_PARAM=4
ERR_MISSING_RADIUS_PARAM=5
ERR_MISSING_MASTER_PARAM=6
ERR_MISSING_SAT_PARAM=7
ERR_MISSING_FD1_PARAM=8
ERR_MISSING_NUMPATCH_PARAM=9
ERR_INPUT_DATA_COPY=18
ERR_PREPROC_ERROR=19
ERR_NOOUTPUT=20
DEBUG_EXIT=66

# This function handles the exit of the executable
# with the corresponding error codes and returns a short message
# with the termination reason. It is important to have a synthetic and brief
# message because it will be raised through many upper levels of the
# computing framework, up to the user interface
function cleanExit ()
{
   local retval=$?
   local msg=""
   case "$retval" in
      $SUCCESS)
         msg="Processing successfully concluded";;
      $ERR_BADARG)
         msg="function checklibs called with non-directory parameter, returning $retval";;
      $ERR_MISSING_PREPROC_BIN)
         msg="binary 'pre_proc' not found in path, returning $retval";;
      $ERR_MISSING_NEAR_PARAM)
         msg="parameter 'near_range' missing or empty, returning $retval";;
      $ERR_MISSING_RADIUS_PARAM)
         msg="parameter 'earth_radius' missing or empty, returning $retval";;
      $ERR_MISSING_MASTER_PARAM)
         msg="parameter 'master' missing or empty, returning $retval";;
      $ERR_MISSING_FD1_PARAM)
         msg="parameter 'fd1' missing or empty, returning $retval";;
      $ERR_MISSING_SAT_PARAM)
         msg="parameter 'sat' missing or empty, returning $retval";;
      $ERR_MISSING_NUMPATCH_PARAM)
         msg="parameter 'num_patch' missing or empty, returning $retval";;
      $ERR_INPUT_DATA_COPY)
         msg="Unable to retrieve an input file";;
      $ERR_PREPROC_ERROR)
         msg="Error during processing, aborting task [$retval]";;
      $ERR_NOOUTPUT)
         msg="No output results";;
      $DEBUG_EXIT)
         msg="Breaking at debug exit";;
      *)
         msg="Unknown error";;
   esac
   [ "$retval" != 0 ] && ciop-log "ERROR" "Error $retval - $msg, processing aborted" || ciop-log "INFO" "$msg"
   exit "$retval"
}

# trap an exit signal to exit properly
trap cleanExit EXIT

# Use ciop-log to log messages at different levels: INFO, WARN, DEBUG
ciop-log "DEBUG" '##########################################################'
ciop-log "DEBUG" '# Set of useful environment variables                    #'
ciop-log "DEBUG" '##########################################################'
ciop-log "DEBUG" "TMPDIR                  = $TMPDIR"                  # The temporary directory for the task
ciop-log "DEBUG" "_JOB_ID                 = ${_JOB_ID}"               # The job id
ciop-log "DEBUG" "_JOB_LOCAL_DIR          = ${_JOB_LOCAL_DIR}"        # The job specific shared scratch space
ciop-log "DEBUG" "_TASK_ID                = ${_TASK_ID}"              # The task id
ciop-log "DEBUG" "_TASK_LOCAL_DIR         = ${_TASK_LOCAL_DIR}"       # The task specific scratch space
ciop-log "DEBUG" "_TASK_NUM               = ${_TASK_NUM}"             # The number of tasks
ciop-log "DEBUG" "_TASK_INDEX             = ${_TASK_INDEX}"           # The id of the task within the job

# Get the processing trigger directory to link binaries and libraries
PREPROC_BASE_DIR=`dirname $0`
export PATH=$PREPROC_BASE_DIR/bin:$PATH
export LD_LIBRARY_PATH=$PREPROC_BASE_DIR/lib:$LD_LIBRARY_PATH

# set up the GMTSAR environment
${_CIOP_APPLICATION_PATH}/GMTSAR/gmtsar_config

# Test that all the necessary binaries are accessible;
# if not, exit with the corresponding error
PREPROC_BIN=`which pre_proc_batch.csh`
[ -z "$PREPROC_BIN" ] && exit $ERR_MISSING_PREPROC_BIN

# Processor environment:
# definition and creation of the input/output directories
OUTPUTDIR="$_TASK_LOCAL_DIR/output"    # results directory
INPUTDIR="$_TASK_LOCAL_DIR/input"      # data input directory
MASTERDIR="$_TASK_LOCAL_DIR/master"    # master product directory
mkdir -p $OUTPUTDIR $INPUTDIR $MASTERDIR

# Processing variables:
# retrieve the processing parameters
NUMPATCH=`ciop-getparam num_patches`
[ $? != 0 ] && exit $ERR_MISSING_NUMPATCH_PARAM
NEAR=`ciop-getparam near_range`
[ $? != 0 ] && exit $ERR_MISSING_NEAR_PARAM
RADIUS=`ciop-getparam earth_radius`
[ $? != 0 ] && exit $ERR_MISSING_RADIUS_PARAM
FD1=`ciop-getparam fd1`
[ $? != 0 ] && exit $ERR_MISSING_FD1_PARAM
SAT=`ciop-getparam SAT`
[ $? != 0 ] && exit $ERR_MISSING_SAT_PARAM
MASTER=`ciop-getparam master`
[ $? != 0 ] && exit $ERR_MISSING_MASTER_PARAM
STOPONERROR=`ciop-getparam stop_on_error`
[ $? != 0 ] && STOPONERROR=false

# Create the batch.config parameter file
cat >${_TASK_LOCAL_DIR}/batch.config << EOF
num_patches = $NUMPATCH
earth_radius = $RADIUS
near_range = $NEAR
fd1 = $FD1
EOF

# The parameter 'master' is a reference to a data file;
# we need to copy it for the rest of our processing.
# This parameter is at job level, so if another parallel task on the same
# computing resource has already copied it, we save a useless copy
masterFile=`ciop-copy -c -o "$MASTERDIR" -r 10 "$MASTER"`
[[ -s $masterFile ]] || {
	ciop-log "ERROR" "Unable to retrieve master input at $MASTER" ; exit $ERR_INPUT_DATA_COPY ;
}
ciop-log "INFO" "Retrieved master input at $masterFile"

echo $masterFile >${_TASK_LOCAL_DIR}/data.in

# Begin the processing loop:
# read the input line by line into the url variable
while read url
do
	# First we copy the data into the input directory
	ciop-log "INFO" "Copying data $url" "preproc"
	# ciop-copy $url into $INPUTDIR, retrying 10 times in case of failure;
	# the local path of the copied file is returned in the $tmpFile variable
	tmpFile=`ciop-copy -o "$INPUTDIR" -r 10 "$url"`
	[[ -s $tmpFile ]] || {
		ciop-log "ERROR" "Unable to retrieve inputfile at $url" ;
		[[ $STOPONERROR == true ]] && exit $ERR_INPUT_DATA_COPY ;
	}
	ciop-log "INFO" "Retrieved inputfile $tmpFile"

	# append the local path to the input list (note '>>': the master is already there)
	echo $tmpFile >>${_TASK_LOCAL_DIR}/data.in

done

# Here we start the processing of the stack of data
ciop-log "INFO" "Processing stack of data" "preproc"
ciop-log "DEBUG" "$PREPROC_BIN $SAT data.in batch.config"

$PREPROC_BIN $SAT ${_TASK_LOCAL_DIR}/data.in ${_TASK_LOCAL_DIR}/batch.config >$OUTPUTDIR/preproc.log 2>&1
rcpp=$?

if [ "$rcpp" != 0 ]; then
	ciop-log "ERROR" "$PREPROC_BIN failed to process, return code $rcpp" "preproc"
	cat $OUTPUTDIR/preproc.log >&2
	exit $ERR_PREPROC_ERROR
fi

ciop-log "INFO" "Processing complete" "preproc"

# The results are "published" for the next job.
# Practically, the output is published to a job shared space
# and the directory is referenced as a URL for the next job
ciop-publish $OUTPUTDIR/

exit 0
</code></pre>
> /\ !!! Keep in mind that the execution will take place in a non-interactive environment, so +error catching+ and +logging+ are very important. They enforce the robustness of your application and avoid losing time later in debugging !!!

Here is a summary of the framework tools used in this script, with their online help where available:
* *source ${ciop_job_include}* --> includes the library providing functions such as ciop-log, ciop-enable-debug and ciop-getparam
* *ciop-enable-debug* --> enables the DEBUG level for the logging system; otherwise just INFO and WARN messages are displayed
* *ciop-log* --> logs messages both in the interactive computing framework and in the processing stdout/err files. [[ciop-log CLI reference|ciop-log usage]]
* *ciop-getparam* --> retrieves a job parameter. [[ciop-getparam CLI reference|ciop-getparam usage]]
* *ciop-catquery* --> queries the EO Catalogue of the Sandbox. [[ciop-catquery CLI reference|ciop-catquery usage]]
* *ciop-copy* --> copies a remote file to a local directory. [[ciop-copy CLI reference|ciop-copy usage]]
* *ciop-publish* --> copies *+task+* result files into the +*workflow*+ shared space. [[ciop-publish CLI reference|ciop-publish usage]]
h3. 5.4 Simulating a single job of the workflow

*ciop-simjob* --> simulates the execution of one processing +*job*+ of the +*workflow*+. [[ciop-simjob CLI reference|ciop-simjob usage]]

We will use it to test the first processing block of GMTSAR:

<pre><code class="ruby">ciop-simjob -f PreProc</code></pre>

This will output to _stdout_ the URL of the Hadoop Map/Reduce job. Open the link to check whether the processing is correctly executed.
The command will show the progress messages:
<pre>
Deleted hdfs://sb-10-10-14-24.lab14.sandbox.ciop.int:8020/tmp/sandbox/sample/input.0
rmr: cannot remove /tmp/sandbox/sample/PreProc/logs: No such file or directory.
mkdir: cannot create directory /tmp/sandbox/sample/PreProc: File exists
Deleted hdfs://sb-10-10-14-24.lab14.sandbox.ciop.int:8020/tmp/sandbox/sample/workflow-params.xml
Submitting job 25764 ...
12/11/21 12:26:56 WARN streaming.StreamJob: -jobconf option is deprecated, please use -D instead.
packageJobJar: [/var/lib/hadoop-0.20/cache/emathot/hadoop-unjar5187515757952179540/] [] /tmp/streamjob7738227981987732817.jar tmpDir=null
12/11/21 12:26:58 WARN util.NativeCodeLoader: Unable to load native-hadoop library for your platform... using builtin-java classes where applicable
12/11/21 12:26:58 WARN snappy.LoadSnappy: Snappy native library not loaded
12/11/21 12:26:58 INFO mapred.FileInputFormat: Total input paths to process : 1
12/11/21 12:26:58 INFO streaming.StreamJob: getLocalDirs(): [/var/lib/hadoop-0.20/cache/emathot/mapred/local]
12/11/21 12:26:58 INFO streaming.StreamJob: Running job: job_201211101342_0045
12/11/21 12:26:58 INFO streaming.StreamJob: To kill this job, run:
12/11/21 12:26:58 INFO streaming.StreamJob: /usr/lib/hadoop-0.20/bin/hadoop job  -Dmapred.job.tracker=sb-10-10-14-24.lab14.sandbox.ciop.int:8021 -kill job_201211101342_0045
12/11/21 12:26:58 INFO streaming.StreamJob: Tracking URL: http://sb-10-10-14-24.lab14.sandbox.ciop.int:50030/jobdetails.jsp?jobid=job_201211101342_0045
12/11/21 12:26:59 INFO streaming.StreamJob:  map 0%  reduce 0%
12/11/21 12:27:06 INFO streaming.StreamJob:  map 17%  reduce 0%
12/11/21 12:27:07 INFO streaming.StreamJob:  map 33%  reduce 0%
12/11/21 12:27:13 INFO streaming.StreamJob:  map 67%  reduce 0%
12/11/21 12:27:18 INFO streaming.StreamJob:  map 83%  reduce 0%
12/11/21 12:27:19 INFO streaming.StreamJob:  map 100%  reduce 0%
12/11/21 12:27:24 INFO streaming.StreamJob:  map 100%  reduce 33%
12/11/21 12:27:27 INFO streaming.StreamJob:  map 100%  reduce 100%
12/11/21 12:28:02 INFO streaming.StreamJob:  map 100%  reduce 0%
12/11/21 12:28:05 INFO streaming.StreamJob:  map 100%  reduce 100%
12/11/21 12:28:05 INFO streaming.StreamJob: To kill this job, run:
12/11/21 12:28:05 INFO streaming.StreamJob: /usr/lib/hadoop-0.20/bin/hadoop job  -Dmapred.job.tracker=sb-10-10-14-24.lab14.sandbox.ciop.int:8021 -kill job_201211101342_0045
12/11/21 12:28:05 INFO streaming.StreamJob: Tracking URL: http://sb-10-10-14-24.lab14.sandbox.ciop.int:50030/jobdetails.jsp?jobid=job_201211101342_0045
12/11/21 12:28:05 ERROR streaming.StreamJob: Job not successful. Error: NA
12/11/21 12:28:05 INFO streaming.StreamJob: killJob...
Streaming Command Failed!
[INFO   ][log] All data, output and logs available at /share//tmp/sandbox/sample/PreProc
</pre>
At this point you can use your browser to open the _*Tracking URL*_:

!https://ciop.eo.esa.int/attachments/46/single_job_debug_1.png!

This is a single-thread job; see the reduce line in the table.
You can click on the kill job link in the reduce line. The page shows the task attempts, usually one.

!https://ciop.eo.esa.int/attachments/47/single_job_debug_2.png!

In this case, the job ended with exit code 19.

Click on the task link; the same info as before is shown, but with more details (e.g. the log in the last column).

You can click on _*all*_:

_stdout_ and _stderr_ appear; you can debug the processing job with this information.
h3. 5.5 -- Simulating a complete workflow -- (section under revision)
h2. 6. -- Application deployment -- (section under revision)

This section describes the procedure to deploy your application once it is ready and successfully integrated.
h3. 6.1 Deploy as a service
h3. 6.2 Test application in pre-operation
h3. 6.3 Plan the production