Herve Caumont, 2013-06-19 18:05
1 | 1 | Herve Caumont | h1. ESA VA4 tutorial |
---|---|---|---|
2 | |||
3 | This tutorial provides guidance on how to apply for and access ESA Virtual Archive 4 data. |
||
4 | |||
5 | h2. Introduction |
||
6 | |||
7 | Virtual Archives are online archives that provide easy access to EO data by coupling high bandwidth, large storage space and software. Virtual Archive 4 provides a cloud-based service for storing and providing access to ESA Synthetic Aperture Radar (SAR) data. |
||
8 | |||
9 | This virtual archive represents ESA's contribution to the supersites initiative. Its large SAR data holding (over thirty thousand products at the time of writing) is accessible to science communities working on interferometry, landslides and change detection. |
||
10 | |||
11 | The virtual archive is a Cloud solution providing Storage-as-a-Service for storing the data and is coupled with complementary services: |
||
12 | |||
13 | * User authentication and authorization |
||
14 | * Data discovery implementing "simple interfaces":http://eo-virtual-archive4.esa.int/help.html such as OpenSearch and results in Atom, RDF and KML format |
||
15 | * Data access via common web protocols such as HTTP(s). |
||
16 | |||
17 | The virtual archive technical solution is based on research and development performed by "Terradue":http://www.terradue.com and partially funded by the European Commission's Framework Programme 7 in the context of the "GENESI-DEC":http://www.genesi-dec.eu/ and "GEOWOW":http://www.geowow.eu projects. |
||
18 | |||
19 | h2. General Concept |
||
20 | |||
21 | The data discovery is based on the "OpenSearch":http://www.opensearch.org interface provided by the Virtual Archive 4 Catalogue Access Service (VA4-CAS). |
||
22 | |||
23 | For a more detailed discussion of the OpenSearch interface of the Virtual Archive, see here: [[opensearch|OpenSearch in Virtual Archive 4]]. |
||
24 | |||
25 | h2. Catalogue Data and Output Formats |
||
26 | |||
27 | The VA4-CAS offers the following data: |
||
28 | |||
29 | |_<. Series name |_<. Description |_<. Example URL (ATOM) | |
||
30 | | ASA_IM__0P | ASAR Image Mode source packets Level 0 | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom | |
||
31 | | ASA_IMP_1P | ENVISAT ASAR Image Mode Precision Image | http://eo-virtual-archive4.esa.int/search/ASA_IMP_1P/atom | |
||
32 | | ASA_IMS_1P | ENVISAT ASAR Image Mode Single Look Complex Image | http://eo-virtual-archive4.esa.int/search/ASA_IMS_1P/atom | |
||
33 | | ASA_WS__0P | ASAR Wide Swath Level 0 product | http://eo-virtual-archive4.esa.int/search/ASA_WS__0P/atom | |
||
34 | | ASA_WSM_1P | ENVISAT ASAR Wide Swath Mode | http://eo-virtual-archive4.esa.int/search/ASA_WSM_1P/atom | |
||
35 | | ASA_WSS_1P | ENVISAT ASAR Wide Swath Single Look Complex | http://eo-virtual-archive4.esa.int/search/ASA_WSS_1P/atom | |
||
36 | | ASA_XCA_AX | ENVISAT ASAR External calibration data | http://eo-virtual-archive4.esa.int/search/ASA_XCA_AX/atom | |
||
37 | | ER01_SAR_IM__0P | ERS-1 SAR Image Mode source packets Level 0 | http://eo-virtual-archive4.esa.int/search/ER01_SAR_IM__0P/atom | |
||
38 | | ER01_SAR_IMP_1P | ERS-1 SAR Precision Image Product | http://eo-virtual-archive4.esa.int/search/ER01_SAR_IMP_1P/atom | |
||
39 | | ER01_SAR_IMS_1P | ERS-1 SAR Single Look Complex Image Product | http://eo-virtual-archive4.esa.int/search/ER01_SAR_IMS_1P/atom | |
||
40 | | ER01_SAR_RAW_0P | ERS-1 SAR Image SAR Annotated Raw Data Product Level 0 | http://eo-virtual-archive4.esa.int/search/ER01_SAR_RAW_0P/atom | |
||
41 | | ER02_SAR_IM__0P | ERS-2 SAR Image Mode source packets Level 0 | http://eo-virtual-archive4.esa.int/search/ER02_SAR_IM__0P/atom | |
||
42 | | ER02_SAR_IMP_1P | ERS-2 SAR Precision Image Product | http://eo-virtual-archive4.esa.int/search/ER02_SAR_IMP_1P/atom | |
||
43 | | ER02_SAR_IMS_1P | ERS-2 SAR Single Look Complex Image Product | http://eo-virtual-archive4.esa.int/search/ER02_SAR_IMS_1P/atom | |
||
44 | | ER02_SAR_RAW_0P | ERS-2 SAR Image SAR Annotated Raw Data Product Level 0 | http://eo-virtual-archive4.esa.int/search/ER02_SAR_RAW_0P/atom | |
||
45 | |||
46 | The following output formats are available (the *ASA_IM__0P* series is used as an example in the remainder of this document, but everything explained below applies to the other series too): |
||
47 | |||
48 | |_<. Format |_<. URL (using ASA_IM__0P as example) |_<. Comment | |
||
49 | | RDF | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/rdf | The most universal format, contains the complete product metadata | |
||
50 | | ATOM | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom | ATOM feed | |
||
51 | | HTML | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/html | Web-optimised representation, useful for quick verifications with a web browser | |
||
52 | | KML | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/kml | Allows viewing product footprints in Google Earth | |
||
53 | | WKT | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/wkt | Returns product footprints as "Well-known text":http://en.wikipedia.org/wiki/Well-known_text | |
||
54 | | Only download URLs | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt | Returns a text document containing the download URLs of the products | |
||
55 | |||
56 | h2. Viewing Product Information |
||
57 | |||
58 | The best way to get familiar with the VA4-CAS is to make requests using a web browser. You can do this by choosing any of the above URLs (e.g. in HTML or ATOM format). |
||
59 | |||
60 | You can then refine your search by specifying search criteria, which are appended to the base URL as query string parameters. Only information about the matching products is then returned (an example is given further below). |
||
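As a minimal sketch of how such a URL can be assembled, the helper below joins a series name, an output format and any number of @name=value@ criteria. The function name @build_search_url@ and the variable @VA4_BASE@ are hypothetical and only serve this illustration; values are appended as given, without URL-encoding.

```shell
#!/bin/sh
# Base URL of the VA4-CAS search endpoint, as used throughout this tutorial.
VA4_BASE="http://eo-virtual-archive4.esa.int/search"

# build_search_url SERIES FORMAT [NAME=VALUE ...]
# Hypothetical helper: appends the criteria after '?' and joins them with '&'.
# Values are used as-is, so they must already be URL-safe.
build_search_url() {
    series=$1
    format=$2
    shift 2
    url="$VA4_BASE/$series/$format"
    sep="?"
    for param in "$@"; do
        url="$url$sep$param"
        sep="&"
    done
    printf '%s\n' "$url"
}

# Prints the ATOM search URL for the given criteria.
build_search_url ASA_IM__0P atom \
    start=2004-09-01 stop=2004-09-10 bbox=4,36,20,48 orbitDirection=ASCENDING
```

The printed URL can be pasted into a web browser or passed to @curl@.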
61 | |||
62 | These are the search criteria you can specify: |
||
63 | |||
64 | |_<. Query string parameter |_<. Description |_<. Value format |_<. Value set |_<. Example | |
||
65 | | @bbox@ | *Rectangular area of interest* | @minlon,minlat,maxlon,maxlat@ | _none_ | @bbox=13.3,50.5,14.3,51.5@ | |
||
66 | | @geometry@ | *Arbitrary area of interest* (alternative to @bbox@) | "Well-known text":http://en.wikipedia.org/wiki/Well-known_text string | _none_ | @geometry=POLYGON%28%2812.5%2041%2C13%2042%2C12.5%2043%2C12%2042%2C12.5%2041%29%29@ (URL-encoded) | |
||
67 | |/2. @start@ |/2. *Start date* of the period to be covered | @YYYY-MM-DD@, or |/2. _none_ | @start=2010-09-12@ | |
||
68 | | @YYYY-MM-DDThh:mm:ssZ@ | @start=2010-09-12T13:58:26Z@ | |
||
69 |/2. @stop@ |/2. *End date* of the period to be covered | @YYYY-MM-DD@, or |/2. _none_ | @stop=2010-09-15@ |
||
70 | date and time in format @YYYY-MM-DDThh:mm:ssZ@ | @stop=2010-09-15T21:24:30Z@ |
||
71 | |/5. @processingCenter@ |/5. *Processing centre* |/5. Text, asterisk wildcard can be used | @I-PAC@ |/5. @processingCenter=PDAS-*@ | |
||
72 | | @PDAS-F@ | |
||
73 | | @PDAS-M@ | |
||
74 | | @PDHS-E@ | |
||
75 | | @PDHS-K@ | |
||
76 | | @acquisitionStation@ | *Acquisition station* | Text | | @acquisitionStation=...@ | |
||
77 | |/2. @orbitDirection@ |/2. *Orbit direction* |/2. Text | @ASCENDING@ |/2. @orbitDirection=ASCENDING@ | |
||
78 | | @DESCENDING@ | |
||
79 | | @orbitNumber@ | *Orbit number* | Integer value or interval | _none_ | @orbitNumber=41923@ or @orbitNumber=[41920,41930]@ | |
||
80 | | @frame@ | *Frame* | | _none_ | @frame=2205@ or @frame=[2000,2500],[3000@ | |
||
81 | | @track@ | *Track* | Integer value or interval | _none_ | @track=129@ or track=[128@ | |
||
82 | | @count@ | *Number of records* to be returned (default is @20@) | Integer value | _none_ | @count=100@ | |
||
83 | | @startIndex@ | *Index of first result record* to be returned (starting with @0@, default is @0@) | Integer value | _none_ | @startIndex=20@ | |
||
84 | | @startPage@ | *Result page* (starting with @0@, default is @0@), each page contains @count@ records (alternative to @startIndex@) | Integer value | _none_ | @startPage=1@ | |
||
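The @geometry@ value has to be URL-encoded before it is placed in the query string. The sketch below shows one way to do this for Well-known text, assuming the string only contains spaces, parentheses and commas as special characters; the function name @encode_wkt@ is hypothetical.

```shell
#!/bin/sh
# encode_wkt WKT
# Hypothetical helper: percent-encodes the characters that typically occur
# in a WKT string (space, parentheses, comma). '%' is handled first so that
# the escapes added afterwards are not encoded twice.
encode_wkt() {
    printf '%s\n' "$1" | sed \
        -e 's/%/%25/g' \
        -e 's/ /%20/g' \
        -e 's/(/%28/g' \
        -e 's/)/%29/g' \
        -e 's/,/%2C/g'
}

# Prints a ready-to-use geometry criterion for the polygon from the table.
printf 'geometry=%s\n' "$(encode_wkt 'POLYGON((12.5 41,13 42,12.5 43,12 42,12.5 41))')"
```

A general-purpose encoder would need to cover more characters; this one is deliberately limited to what WKT polygons normally use.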
85 | |||
86 | |||
87 | By default the number of returned products is 20. To obtain all matching products, you can do the following: |
||
88 | |||
89 | * Provide a value for the @count@ query string parameter, e.g. @count=100@ |
||
90 | * Make sequential requests, each providing the index of the first product to be returned, until the returned result is empty (i.e. the index is beyond the total number of matching products), e.g. @startIndex=0@, @startIndex=20@, @startIndex=40@, etc. (note that the first product has the index @0@). |
||
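The sequential-request approach could be scripted roughly as follows. This is a sketch that assumes the @txt@ output format and that a request beyond the last matching product returns an empty body; the function names @fetch_page@ and @fetch_all@ are hypothetical.

```shell
#!/bin/sh
# fetch_page URL
# Retrieves one page of results. It is kept separate so that it can be
# replaced, e.g. for offline testing.
fetch_page() {
    curl -s "$1"
}

# fetch_all SEARCH_URL [COUNT]
# SEARCH_URL must already contain '?' and at least one search criterion.
# Requests pages of COUNT records (default 20) and stops at the first
# empty page.
fetch_all() {
    search_url=$1
    count=${2:-20}
    start=0
    while :; do
        page=$(fetch_page "$search_url&count=$count&startIndex=$start")
        if [ -z "$page" ]; then
            break
        fi
        printf '%s\n' "$page"
        start=$((start + count))
    done
}

# Usage (performs network requests):
# fetch_all "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?bbox=4,36,20,48" 100
```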
91 | |||
92 | h2. Web Browser Examples: |
||
93 | |||
94 | * *RDF*, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/rdf?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING |
||
95 | !example-web-rdf.png! |
||
96 | |||
97 | * *ATOM*, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING |
||
98 | !example-web-atom.png! |
||
99 | |||
100 | * *HTML*, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/html?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING |
||
101 | !example-web-html.png! |
||
102 | |||
103 | * *KML*, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/kml?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING (viewed in Google Earth) |
||
104 | !example-web-kml.png! |
||
105 | |||
106 | h2. Discovering Data Using @curl@ |
||
107 | |||
108 | Accessing the VA4-CAS via @curl@ is very similar to using a web browser as described in the previous section, except that not all formats offered by the VA4-CAS are useful in a machine-to-machine context. The output format you will most likely use is *@txt@*, which returns the download locations of matching products. |
||
109 | |||
110 | The next section gives an example. |
||
111 | |||
112 | h3. Example (L'Aquila Earthquake) with @curl@ |
||
113 | |||
114 | To obtain the download locations of ASA_IM__0P imagery for the April 2009 earthquake in L'Aquila (before and after), you can use the following @curl@ command: |
||
115 | <pre>curl "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?bbox=13,42,13,42&start=2009-03-01&stop=2009-06-01"</pre> |
||
116 | |||
117 | It returns 7 results (depicted in the map below): |
||
118 | |||
119 | <pre> |
||
120 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092439_000000172079_00079_37708_4238.N1 |
||
121 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092430_000000162079_00079_37708_4238.N1 |
||
122 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090504_205036_000000162078_00401_37529_4202.N1 |
||
123 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092436_000000162078_00079_37207_1556.N1 |
||
124 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092426_000000162078_00079_37207_1556.N1 |
||
125 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090330_205031_000000172077_00401_37028_4122.N1 |
||
126 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090308_092431_000000162077_00079_36706_4070.N1 |
||
127 | </pre> |
||
128 | |||
129 | !example-laquila.png! |
||
130 | |||
131 | To refine the result further so that all products share the same orbit direction, you can use this slightly changed @curl@ command: |
||
132 | |||
133 | <pre>curl "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?bbox=13,42,13,42&start=2009-03-01&stop=2009-06-01&orbitDirection=DESCENDING"</pre> |
||
134 | |||
135 | Now only 4 results are returned: |
||
136 | |||
137 | <pre> |
||
138 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092439_000000172079_00079_37708_4238.N1 |
||
139 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092430_000000162079_00079_37708_4238.N1 |
||
140 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092436_000000162078_00079_37207_1556.N1 |
||
141 | https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092426_000000162078_00079_37207_1556.N1 |
||
142 | </pre> |
||
143 | |||
144 | !example-laquila-desc.png! |
||
145 | |||
146 | h3. Discovering Data Using @ciop-catquery@ |
||
147 | |||
148 | The @ciop-catquery@ tool is a command-line client that further simplifies the already simple direct catalogue access. |
||
149 | |||
150 | It has the following usage: @ciop-catquery [options] <catalogue_url>@, where the @<catalogue_url>@ is http://eo-virtual-archive4.esa.int/search/. The name of the series can be either appended or added using the @-se@ option. |
||
151 | |||
152 | The following options are available for refining a query (see also @./ciop-catquery -h@ for help): |
||
153 | |||
154 | * *Spatial coverage*: @-b@ or @--bounding-box=@ |
||
155 | The option value must be a bounding box in @minlon,minlat,maxlon,maxlat@ format. |
||
156 | |||
157 | * *Temporal coverage*: @-tq@ or @--time-query=@ |
||
158 | The option value must be a string in the format @begin=YYYY-MM-DDThh:mm:ssZ;end=YYYY-MM-DDThh:mm:ssZ[;method=<method>]@. |
||
159 | The optional @method@ value specifies whether the given period must cover the sensing start/end times (@sensing_interval@, @sensing_start@, @sensing_stop@) or the time at which the data was processed or inserted (@processing@, @insertion@). |
||
160 | |||
161 | * *Other attributes*: @-a@ or @--attribute=@ (can occur multiple times for different attributes) |
||
162 | The option value must be a string in the format @attribute:condition@, the following attributes can be used: |
||
163 | |||
164 | * @size@ |
||
165 | * @orbitNumber@ |
||
166 | * @processorVersion@ |
||
167 | * @processingCenter@ |
||
168 | * @acquisitionStation@ |
||
169 | |||
170 | The @condition@ can be a simple term, e.g. @-a orbitNumber:41923@, or an interval, e.g. @-a orbitNumber:[41920,41930]@. |
||
171 | |||
172 | |||
173 | The following options change the output of the script: |
||
174 | |||
175 | * *Receive full RDF*: @-ox@ or @--output-xml@ |
||
176 | |||
177 | * *Receive selected fields*: @-o@ or @outputfields=@; the result is a space-separated field list with one line per data set |
||
178 | The option value is a comma-separated list of field names (the XML element names can be obtained using the @-ox@ switch, see above). |
||
179 | |||
180 | * *Receive only download locations of matching products*: @-o dclite4g:onlineResource@ or @outputfields=dclite4g:onlineResource@ |
||
181 | This returns URLs for matching data sets, one per line. |
||
182 | |||
183 | * *Redirection to a file*: @-O@ or @--outputfile=@ |
||
184 | The value must be the path to the output file |
||
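To illustrate how these options fit together, the sketch below composes a @ciop-catquery@ command line from a bounding box, a time query and a catalogue URL, and prints it instead of running it. The helper names @time_query@ and @catquery_command@ are hypothetical; only options listed above are used, in their long form where one is given.

```shell
#!/bin/sh
# time_query BEGIN END [METHOD]
# Hypothetical helper: builds the value for the time query option in the
# begin=...;end=...[;method=...] format described above.
time_query() {
    tq="begin=$1;end=$2"
    if [ -n "${3:-}" ]; then
        tq="$tq;method=$3"
    fi
    printf '%s\n' "$tq"
}

# catquery_command BBOX TIME_QUERY CATALOGUE_URL
# Prints (does not execute) a command that would list the download
# locations of matching products. The time query is single-quoted because
# it contains ';', which the shell would otherwise treat as a separator.
catquery_command() {
    printf "ciop-catquery --bounding-box=%s '--time-query=%s' -o dclite4g:onlineResource %s\n" \
        "$1" "$2" "$3"
}

period=$(time_query 2009-03-01T00:00:00Z 2009-06-01T00:00:00Z)
catquery_command 13,42,13,42 "$period" "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P"
```

Printing the command first makes it easy to check the quoting before running it against the catalogue.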
185 | |||
186 | h2. Data Staging |
||
187 | |||
188 | Having obtained the list of download locations from the VA4-CAS, it is possible to stage datasets by using either @curl@ or @ciop-copy@. |
||
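Before staging, it can help to store the download locations in a file with one URL per line. The following sketch filters catalogue output so that only URL lines remain and duplicates are dropped; the function name @keep_urls@ is hypothetical, as is the choice to remove duplicates.

```shell
#!/bin/sh
# keep_urls
# Reads text on standard input and prints only the lines that start with
# http:// or https://, keeping the first occurrence of each URL and
# preserving the original order.
keep_urls() {
    grep -E '^https?://' | awk '!seen[$0]++'
}

# Usage (performs a network request):
# curl -s "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?bbox=13,42,13,42&start=2009-03-01&stop=2009-06-01" | keep_urls > urls.txt
```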
189 | |||
190 | h3. Data Access Using @curl@ |
||
191 | |||
192 | If the download URLs are available (e.g. from a call to @ciop-catquery -o dclite4g:onlineResource@), they can be used with @curl@ to download the data set files. |
||
193 | |||
194 | The difficulty of the file download lies in the fact that the downloadable resources are protected by ESA's Single Sign-on service, which is usually accessed by a web browser; therefore an automated download has to simulate a web browser session and handle several cookies. The following bash script shows how this can be achieved using @curl@; it is assumed that the URLs are contained in a file named @urls.txt@, one URL per line. |
||
195 | |||
<pre>
#!/bin/bash
# Downloads every URL listed in urls.txt (one per line) through ESA's Single Sign-On.
SSO_USERNAME="<YOUR-UMSSO-USERNAME>"
SSO_PASSWORD="<YOUR-UMSSO-PASSWORD>"

# Start with a clean set of cookie files.
rm -f cookie*.txt
cookies=
while IFS= read -r url
do
  [ -n "$url" ] || continue
  filename=$(basename "$url")
  echo "Downloading $url"
  # First attempt: follow redirects (-L) and store received cookies (-c).
  # $cookies is intentionally unquoted so that it expands to separate options.
  curl -k --cacert /etc/grid-security/certificates/cacert.crt -L $cookies -c cookie.txt -o "$filename" "$url"
  [ -z "$cookies" ] && cookies="-b cookie.txt"
  # If the SSO login page was returned instead of the product, submit the credentials.
  if grep -q "<title>EO SSO</title>" "$filename"
  then
    poststring="cn=$SSO_USERNAME&password=$SSO_PASSWORD&idleTime=oneday&sessionTime=untilbrowserclose&loginButton=Login"
    curl -k -X POST -d "$poststring" -L -H "Content-type: application/x-www-form-urlencoded" $cookies -c cookie2.txt -o "$filename" "https://eo-sso-idp.eo.esa.int/idp/umsso20/login?null"
    cookies="$cookies -b cookie2.txt"
    # An intermediate "Please wait..." page requires a second POST.
    if grep -q "Please wait..." "$filename"
    then
      poststring="cn=$SSO_USERNAME&password=$SSO_PASSWORD&idleTime=oneday&sessionTime=untilbrowserclose&loginFields=cn%40password&loginMethod=umsso"
      curl -k -X POST -d "$poststring" -L -H "Content-type: application/x-www-form-urlencoded" $cookies -o "$filename" "https://eo-sso-idp.eo.esa.int/idp/umsso20/login?null"
    fi
  fi
done < urls.txt
</pre>
||
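Because a failed login leaves an HTML page on disk under the product's file name, it may be worth checking each file after the download. The sketch below looks for the same two page markers that the download script tests for; the function name @is_login_page@ is hypothetical.

```shell
#!/bin/sh
# is_login_page FILE
# Succeeds (exit status 0) if FILE contains one of the SSO page markers,
# i.e. if it is most likely an HTML login page and not a product.
# -F treats the patterns as fixed strings, so the dots match literally.
is_login_page() {
    grep -q -F -e '<title>EO SSO</title>' -e 'Please wait...' "$1"
}

# Usage:
# for f in *.N1; do
#     if is_login_page "$f"; then echo "Login page instead of product: $f"; fi
# done
```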
221 | |||
222 | h3. Data Access Using @ciop-copy@ |
||
223 | |||
224 | @./ciop-copy -f -o ./ "https://eo-virtual-archive4.esa.int/supersites/ASA_WSS_1PNUPA20100125_025705_000000632086_00190_41326_0381.N1"@ |
||
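For a list of download locations, the same call can be repeated once per URL. The sketch below only prints the commands so that they can be reviewed first; the function name @print_copy_commands@ is hypothetical, and the @-f -o ./@ options are copied verbatim from the command above without assuming anything further about them.

```shell
#!/bin/sh
# print_copy_commands FILE
# Reads one URL per line from FILE and prints the corresponding ciop-copy
# command for each non-empty line. Nothing is downloaded.
print_copy_commands() {
    while IFS= read -r url || [ -n "$url" ]; do
        [ -n "$url" ] || continue
        printf './ciop-copy -f -o ./ "%s"\n' "$url"
    done < "$1"
}

# Usage:
# print_copy_commands urls.txt
```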
225 | |||
226 | |||
227 | h2. Conclusion |
||
228 | |||
229 | With this lesson you have learned: |
||
230 | - to query ESA VA4 dataset series |
||
231 | - to query and download ESA VA4 datasets |
||
232 |