Herve Caumont, 2013-10-25 17:00

h1. ESA VA4 tutorial

{{>toc}}

This tutorial provides guidance on how to apply for and access ESA Virtual Archive 4 data.

h2. Introduction

Virtual Archives are online archives that provide easy access to Earth Observation (EO) data by coupling high bandwidth, large storage space and software. The Virtual Archive 4 provides a cloud-based service for storing and providing access to ESA Synthetic Aperture Radar (SAR) data.

This virtual archive is ESA's contribution to the supersites initiative. The SAR data it hosts (over thirty thousand products at the time of writing) is accessible to science communities dealing with interferometry, landslides and change detection.

The virtual archive is a cloud solution that provides Storage-as-a-Service for the data and is coupled with complementary services:

* User authentication and authorization
* Data discovery implementing "simple interfaces":http://eo-virtual-archive4.esa.int/help.html such as OpenSearch, with results in Atom, RDF and KML format
* Data access via common web protocols such as HTTP(S)

The virtual archive technical solution is based on research and development performed by "Terradue":http://www.terradue.com and partially funded by the European Commission Framework Programme 7 in the context of the "GENESI-DEC":http://www.genesi-dec.eu/ and "GEOWOW":http://www.geowow.eu projects.

h2. General Concept

Data discovery is based on the "OpenSearch":http://www.opensearch.org interface provided by the Virtual Archive 4 Catalogue Access Service (VA4-CAS).

For a more detailed discussion of the OpenSearch interface of the Virtual Archive, see [[opensearch|OpenSearch in Virtual Archive 4]].

h2. Catalogue Data and Output Formats

The VA4-CAS offers the following data series:

|_<. Series name |_<. Description |_<. Example URL (ATOM) |
| ASA_IM__0P | ASAR Image Mode source packets Level 0 | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom |
| ASA_IMP_1P | ENVISAT ASAR Image Mode Precision Image | http://eo-virtual-archive4.esa.int/search/ASA_IMP_1P/atom |
| ASA_IMS_1P | ENVISAT ASAR Image Mode Single Look Complex Image | http://eo-virtual-archive4.esa.int/search/ASA_IMS_1P/atom |
| ASA_WS__0P | ASAR Wide Swath Level 0 product | http://eo-virtual-archive4.esa.int/search/ASA_WS__0P/atom |
| ASA_WSM_1P | ENVISAT ASAR Wide Swath Mode | http://eo-virtual-archive4.esa.int/search/ASA_WSM_1P/atom |
| ASA_WSS_1P | ENVISAT ASAR Wide Swath Single Look Complex | http://eo-virtual-archive4.esa.int/search/ASA_WSS_1P/atom |
| ASA_XCA_AX | ENVISAT ASAR External calibration data | http://eo-virtual-archive4.esa.int/search/ASA_XCA_AX/atom |
| ER01_SAR_IM__0P | ERS-1 SAR Image Mode source packets Level 0 | http://eo-virtual-archive4.esa.int/search/ER01_SAR_IM__0P/atom |
| ER01_SAR_IMP_1P | ERS-1 SAR Precision Image Product | http://eo-virtual-archive4.esa.int/search/ER01_SAR_IMP_1P/atom |
| ER01_SAR_IMS_1P | ERS-1 SAR Single Look Complex Image Product | http://eo-virtual-archive4.esa.int/search/ER01_SAR_IMS_1P/atom |
| ER01_SAR_RAW_0P | ERS-1 SAR Image SAR Annotated Raw Data Product Level 0 | http://eo-virtual-archive4.esa.int/search/ER01_SAR_RAW_0P/atom |
| ER02_SAR_IM__0P | ERS-2 SAR Image Mode source packets Level 0 | http://eo-virtual-archive4.esa.int/search/ER02_SAR_IM__0P/atom |
| ER02_SAR_IMP_1P | ERS-2 SAR Precision Image Product | http://eo-virtual-archive4.esa.int/search/ER02_SAR_IMP_1P/atom |
| ER02_SAR_IMS_1P | ERS-2 SAR Single Look Complex Image Product | http://eo-virtual-archive4.esa.int/search/ER02_SAR_IMS_1P/atom |
| ER02_SAR_RAW_0P | ERS-2 SAR Image SAR Annotated Raw Data Product Level 0 | http://eo-virtual-archive4.esa.int/search/ER02_SAR_RAW_0P/atom |

The following output formats are available (the *ASA_IM__0P* series is used as an example in the remainder of this document, but everything explained below applies to the other series too):

|_<. Format |_<. URL (using ASA_IM__0P as example) |_<. Comment |
| RDF | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/rdf | The most universal format, contains the complete product metadata |
| ATOM | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom | ATOM feed |
| HTML | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/html | Web-optimised representation, useful for quick verifications with a web browser |
| KML | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/kml | Allows viewing product footprints in Google Earth |
| WKT | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/wkt | Returns product footprints as "Well-known text":http://en.wikipedia.org/wiki/Well-known_text |
| Only download URLs | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt | Returns a text document containing the download URLs of the products |

h2. Viewing Product Information

The best way to get familiar with the VA4-CAS is to make requests with a web browser. You can do this by choosing any of the above URLs (e.g. in HTML or ATOM format).

You can then refine your search by appending search criteria to the base URL as query string parameters. Only information about the matching products is then returned (see the Web Browser Examples section).

These are the search criteria you can specify:

|_<. Query string parameter |_<. Description |_<. Value format |_<. Value set |_<. Example |
| @bbox@ | *Rectangular area of interest* | @minlon,minlat,maxlon,maxlat@ | _none_ | @bbox=13.3,50.5,14.3,51.5@ |
| @geometry@ | *Arbitrary area of interest* (alternative to @bbox@) | "Well-known text":http://en.wikipedia.org/wiki/Well-known_text string | _none_ | @geometry=POLYGON%28%2812.5%2041%2C13%2042%2C12.5%2043%2C12%2042%2C12.5%2041%29%29@ (URL-encoded) |
|/2. @start@ |/2. *Start date* of the period to be covered | @YYYY-MM-DD@, or |/2. _none_ | @start=2010-09-12@ |
| date and time in format @YYYY-MM-DDThh:mm:ssZ@ | @start=2010-09-12T13:58:26Z@ |
|/2. @stop@ |/2. *End date* of the period to be covered | @YYYY-MM-DD@, or |/2. _none_ | @stop=2010-09-15@ |
| date and time in format @YYYY-MM-DDThh:mm:ssZ@ | @stop=2010-09-15T21:24:30Z@ |
|/5. @processingCenter@ |/5. *Processing centre* |/5. Text, asterisk wildcard can be used | @I-PAC@ |/5. @processingCenter=PDAS-*@ |
| @PDAS-F@ |
| @PDAS-M@ |
| @PDHS-E@ |
| @PDHS-K@ |
| @acquisitionStation@ | *Acquisition station* | Text | | @acquisitionStation=...@ |
|/2. @orbitDirection@ |/2. *Orbit direction* |/2. Text | @ASCENDING@ |/2. @orbitDirection=ASCENDING@ |
| @DESCENDING@ |
| @orbitNumber@ | *Orbit number* | Integer value or interval | _none_ | @orbitNumber=41923@ or @orbitNumber=[41920,41930]@ |
| @frame@ | *Frame* | | _none_ | @frame=2205@ or @frame=[2000,2500],[3000@ |
| @track@ | *Track* | Integer value or interval | _none_ | @track=129@ or track=[128@ |
| @count@ | *Number of records* to be returned (default is @20@) | Integer value | _none_ | @count=100@ |
| @startIndex@ | *Index of first result record* to be returned (starting with @0@, default is @0@) | Integer value | _none_ | @startIndex=20@ |
| @startPage@ | *Result page* (starting with @0@, default is @0@), each page contains @count@ records (alternative to @startIndex@) | Integer value | _none_ | @startPage=1@ |

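The @geometry@ value has to be URL-encoded before it is appended to the query string. The following is a minimal sketch of that step for a POSIX shell; the helper name @urlencode@ is hypothetical (it is not part of the VA4-CAS tooling) and it only handles ASCII input.

```shell
# urlencode STRING: percent-encode everything except unreserved characters.
# Hypothetical helper for illustration only; handles ASCII input.
urlencode() (
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}        # the string without its first character
        c=${s%"$rest"}     # the first character itself
        case $c in
            [a-zA-Z0-9.~_-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

# Build a search URL for a polygon given as Well-known text.
wkt="POLYGON((12.5 41,13 42,12.5 43,12 42,12.5 41))"
echo "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom?geometry=$(urlencode "$wkt")"
```

With @curl@, the same encoding can usually be delegated to the client by combining the @-G@ and @--data-urlencode@ options instead of encoding the value by hand.
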
By default, 20 products are returned. To obtain all matching products, you can do one of the following:

* Provide a sufficiently large value for the @count@ query string parameter, e.g. @count=100@
* Make sequential requests, providing the index of the first product to be returned, until the returned result is empty (i.e. the index lies beyond the total number of matching products), e.g. @startIndex=0@, @startIndex=20@, @startIndex=40@, etc. (note that the first product has the index @0@).

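The sequential-request approach can be sketched as a small shell loop. This is only an illustration under stated assumptions: the function names @page_url@ and @fetch_all@ are hypothetical, the search criteria are taken from the examples in this document, and the loop assumes that the @txt@ format returns an empty body once @startIndex@ lies beyond the last match.

```shell
base="http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt"
query="start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48"
count=20

# page_url INDEX: print the request URL for the page that starts at INDEX.
page_url() {
    printf '%s?%s&count=%s&startIndex=%s\n' "$base" "$query" "$count" "$1"
}

# fetch_all: request page after page until a page comes back empty.
fetch_all() (
    index=0
    while :; do
        page=$(curl -s "$(page_url "$index")") || exit 1
        [ -n "$page" ] || break
        printf '%s\n' "$page"
        index=$((index + count))
    done
)

# Usage (requires network access to the catalogue):
#   fetch_all > urls.txt
```

Advancing @startIndex@ by @count@ keeps the pages contiguous, so no product is skipped or returned twice.
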
h2. Web Browser Examples

* *RDF*, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/rdf?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING
  !example-web-rdf.png!

* *ATOM*, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING
  !example-web-atom.png!

* *HTML*, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/html?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING
  !example-web-html.png!

* *KML*, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/kml?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING (viewed in Google Earth)
  !example-web-kml.png!

h2. Discovering Data Using @curl@

Accessing the VA4-CAS via @curl@ is very similar to using the web browser as described in the previous section, except that not all formats offered by the VA4-CAS are useful in a machine-to-machine context. The output format you will most likely use is *@txt@*, which returns the download locations of the matching products.

The next section gives an example.

h3. Example (L'Aquila Earthquake) with @curl@

To obtain the download locations of ASA_IM__0P imagery for the April 2009 earthquake in L'Aquila (before and after), you can use the following @curl@ command:

<pre>curl "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?bbox=13,42,13,42&start=2009-03-01&stop=2009-06-01"</pre>

It returns 7 results (depicted in the map below):

<pre>
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092439_000000172079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092430_000000162079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090504_205036_000000162078_00401_37529_4202.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092436_000000162078_00079_37207_1556.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092426_000000162078_00079_37207_1556.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090330_205031_000000172077_00401_37028_4122.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090308_092431_000000162077_00079_36706_4070.N1
</pre>

!example-laquila.png!

To refine the result further so that all products have the same orbit direction, you can use this slightly changed @curl@ command:

<pre>curl "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?bbox=13,42,13,42&start=2009-03-01&stop=2009-06-01&orbitDirection=DESCENDING"</pre>

Now only 4 results are returned:

<pre>
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092439_000000172079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092430_000000162079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092436_000000162078_00079_37207_1556.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092426_000000162078_00079_37207_1556.N1
</pre>

!example-laquila-desc.png!

h2. Discovering Data Using @ciop-catquery@

The @ciop-catquery@ tool is a command-line client that further simplifies direct catalogue access.

Its usage is @ciop-catquery [options] <catalogue_url>@, where @<catalogue_url>@ is http://eo-virtual-archive4.esa.int/search/. The name of the series can either be appended to the URL or passed with the @-se@ option.

The following options are available for refining a query (see also @./ciop-catquery -h@ for help):

* *Spatial coverage*: @-b@ or @--bounding-box=@
  The option value must be a bounding box in @minlon,minlat,maxlon,maxlat@ format.

* *Temporal coverage*: @-tq@ or @--time-query=@
  The option value must be a string in the format @begin=YYYY-MM-DDThh:mm:ssZ;end=YYYY-MM-DDThh:mm:ssZ[;method=<method>]@ (@time:start@, @time:end@).
  The optional @method@ value specifies which of the sensing start/end times must be covered by the specified period (@sensing_interval@, @sensing_start@, @sensing_stop@), or whether the period refers to when the data was processed or inserted (@processing@, @insertion@).

* *Other attributes*: @-a@ or @--attribute=@ (can occur multiple times for different attributes)
  The option value must be a string in the format @attribute:condition@. The following attributes can be used:

  * @size@
  * @orbitNumber@
  * @processorVersion@
  * @processingCenter@
  * @acquisitionStation@

  The @condition@ can be a simple term, e.g. @-a orbitNumber:41923@, or an interval, e.g. @-a orbitNumber:[41920,41930]@


The following options change the output of the script:

* *Receive full RDF*: @-ox@ or @--output-xml@

* *Receive selected fields*: @-o@ or @outputfields=@, returned as a space-separated field list with one line per data set
  The option value is a comma-separated list of field names (the XML element names can be obtained using the @-ox@ switch, see above)

* *Receive only download locations of matching products*: @-o dclite4g:onlineResource@ or @outputfields=dclite4g:onlineResource@
  This returns the URLs of the matching data sets, one per line.

* *Redirection to a file*: @-O@ or @--outputfile=@
  The value must be the path to the output file

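Putting these options together, the L'Aquila query from the @curl@ example might look as follows with @ciop-catquery@. This is a sketch built only from the options described above and has not been checked against a real installation: the helper @catquery_cmd@ is hypothetical, and it is an assumption that the series name can be appended to the catalogue URL exactly as shown.

```shell
catalogue="http://eo-virtual-archive4.esa.int/search/ASA_IM__0P"
bbox="13,42,13,42"
period="begin=2009-03-01T00:00:00Z;end=2009-06-01T00:00:00Z"

# catquery_cmd: print the command line for review (hypothetical helper).
catquery_cmd() {
    printf 'ciop-catquery -b %s -tq "%s" -o dclite4g:onlineResource %s\n' \
        "$bbox" "$period" "$catalogue"
}

catquery_cmd

# Run the query only where the tool is actually installed.
if command -v ciop-catquery >/dev/null 2>&1; then
    ciop-catquery -b "$bbox" -tq "$period" -o dclite4g:onlineResource "$catalogue" > urls.txt
fi
```

The time query is quoted because it contains a semicolon, which the shell would otherwise treat as a command separator.
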
h2. Data Staging

Having obtained the list of download locations from the VA4-CAS, you can stage the datasets with either @curl@ or @ciop-copy@.

h3. Data Access Using @curl@

If the download URLs are available (e.g. from a call to @ciop-catquery -o dclite4g:onlineResource@), they can be used with @curl@ to download the data set files.

The difficulty of the file download lies in the fact that the downloadable resources are protected by ESA's Single Sign-on service, which is usually accessed by a web browser; an automated download therefore has to simulate a web browser session and handle several cookies. The following bash script shows how this can be achieved using @curl@; it assumes that the URLs are contained in a file named @urls.txt@, one URL per line.

<pre>
#!/bin/bash
# Replace the placeholders with your ESA UM-SSO credentials.
SSO_USERNAME="<YOUR-UMSSO-USERNAME>"
SSO_PASSWORD="<YOUR-UMSSO-PASSWORD>"
LOGIN_URL="https://eo-sso-idp.eo.esa.int/idp/umsso20/login?null"

# Start without cookies left over from a previous run.
rm -f cookie*.txt
cookies=()

while IFS= read -r url
do
    [ -z "$url" ] && continue
    filename=$(basename "$url")
    echo "Downloading $url"
    # First attempt; -k disables certificate verification, as in the original command.
    curl -k --cacert /etc/grid-security/certificates/cacert.crt -L "${cookies[@]}" -c cookie.txt -o "$filename" "$url"
    # From now on, send the session cookies back with every request.
    [ ${#cookies[@]} -eq 0 ] && cookies=(-b cookie.txt)
    # If the SSO login page was saved instead of the product, log in.
    if grep -q "<title>EO SSO</title>" "$filename"
    then
        poststring="cn=$SSO_USERNAME&password=$SSO_PASSWORD&idleTime=oneday&sessionTime=untilbrowserclose&loginButton=Login"
        curl -k -X POST -d "$poststring" -L -H "Content-type: application/x-www-form-urlencoded" "${cookies[@]}" -c cookie2.txt -o "$filename" "$LOGIN_URL"
        cookies+=(-b cookie2.txt)
        # An intermediate "Please wait..." page requires a second POST.
        if grep -q "Please wait..." "$filename"
        then
            poststring="cn=$SSO_USERNAME&password=$SSO_PASSWORD&idleTime=oneday&sessionTime=untilbrowserclose&loginFields=cn%40password&loginMethod=umsso"
            curl -k -X POST -d "$poststring" -L -H "Content-type: application/x-www-form-urlencoded" "${cookies[@]}" -o "$filename" "$LOGIN_URL"
        fi
    fi
done < urls.txt
</pre>

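Because a failed login leaves an HTML page under the product's file name, it can be worth checking the downloaded files afterwards. The following is a minimal sketch of such a check; the function names are hypothetical, and the check relies only on the @<title>EO SSO</title>@ marker that the script above already tests for.

```shell
# is_login_page FILE: succeed if FILE holds the SSO login page, not a product.
is_login_page() {
    grep -q "<title>EO SSO</title>" "$1"
}

# check_downloads FILE...: report every file that is still a login page.
# Returns a non-zero status if at least one such file was found.
check_downloads() {
    status=0
    for f in "$@"; do
        if is_login_page "$f"; then
            echo "not a product (login page): $f"
            status=1
        fi
    done
    return $status
}

# Usage: check_downloads *.N1
```
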
h3. Data Access Using @ciop-copy@

Alternatively, a product can be staged by passing its download URL to @ciop-copy@, for example:

@./ciop-copy -f -o ./ "https://eo-virtual-archive4.esa.int/supersites/ASA_WSS_1PNUPA20100125_025705_000000632086_00190_41326_0381.N1"@

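To stage every product listed in @urls.txt@, the same call can be wrapped in a loop. This is a sketch under stated assumptions: the function @stage_all@ and the @COPY_CMD@ variable are hypothetical, @ciop-copy@ is assumed to be on the @PATH@, and the options are copied unchanged from the example above.

```shell
# Command used for staging; override it (e.g. COPY_CMD=echo) for a dry run.
COPY_CMD=${COPY_CMD:-ciop-copy}

# stage_all FILE: stage every URL listed in FILE, one URL per line.
stage_all() {
    while IFS= read -r url; do
        [ -n "$url" ] || continue      # skip empty lines
        # </dev/null keeps the command from consuming the URL list on stdin.
        "$COPY_CMD" -f -o ./ "$url" </dev/null || echo "failed: $url" >&2
    done < "$1"
}

# Usage: stage_all urls.txt
```
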
h2. Conclusion

In this lesson you have learned how to:

* query ESA VA4 dataset series
* query and download ESA VA4 datasets