ESA VA4 tutorial¶
This tutorial provides guidance on how to apply for and access ESA Virtual Archive 4 data.
Introduction¶
Virtual Archives are online archives that provide easy access to EO data by coupling high bandwidth, large storage space and software. The Virtual Archive 4 provides a cloud-based service for storing and providing access to ESA Synthetic Aperture Radar (SAR) data.
This virtual archive represents ESA's contribution to the supersites initiative. Its large collection of SAR data (today over thirty thousand products are hosted on Virtual Archive 4) is accessible to science communities dealing with interferometry, landslides and change detection.
The virtual archive is a Cloud solution providing Storage-as-a-Service for storing the data and is coupled with complementary services:
- User authentication and authorization
- Data discovery through simple interfaces such as OpenSearch, with results in Atom, RDF and KML format
- Data access via common web protocols such as HTTP.
The virtual archive technical solution is based on research and development performed by Terradue and partially funded by European Commission Framework Programme 7 in the context of the GENESI-DEC and GEOWOW projects.
General Concept¶
The data discovery is based on the OpenSearch interface provided by the Virtual Archive 4 Catalogue Access Service (VA4-CAS).
For a more detailed discussion of the OpenSearch interface of the Virtual Archive, see here: OpenSearch in Virtual Archive 4.
Catalogue Data and Output Formats¶
The VA4-CAS offers the following data:
The following output formats are available (the ASA_IM__0P series is used as an example in the remainder of this document, but everything explained below applies to the other series too):
Format | URL (using ASA_IM__0P as example) | Comment |
---|---|---|
RDF | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/rdf | The most universal format, contains the complete product metadata |
ATOM | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom | ATOM feed |
HTML | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/html | Web-optimised representation, useful for quick verifications with a web browser |
KML | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/kml | Allows viewing product footprints in Google Earth |
WKT | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/wkt | Returns product footprints as Well-known text |
Only download URLs | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt | Returns a text document containing the download URLs of the products |
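The URL pattern shown in the table can be captured in a small shell helper. This is only a sketch: the function name va4_search_url and the variable VA4_BASE are hypothetical, and the accepted format suffixes are copied from the table above.

```shell
# Base URL of the VA4-CAS search interface (taken from the table above).
VA4_BASE="http://eo-virtual-archive4.esa.int/search"

# va4_search_url SERIES FORMAT
# Prints the search URL of a series in one of the formats listed above.
# (Hypothetical helper; only the URL pattern comes from the table.)
va4_search_url() {
    case "$2" in
        rdf|atom|html|kml|wkt|txt)
            printf '%s/%s/%s\n' "$VA4_BASE" "$1" "$2" ;;
        *)
            echo "unknown format: $2" >&2
            return 1 ;;
    esac
}

# Example: print the URL of the KML view of the ASA_IM__0P series.
va4_search_url ASA_IM__0P kml
```

Rejecting unknown suffixes early avoids sending a request that the catalogue cannot answer.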
Viewing Product Information¶
The best way to get familiar with the VA4-CAS is to make requests using a web browser. You can do this by choosing any of the above URLs (e.g. HTML or ATOM format).
At this point, you may refine your search by specifying search criteria. This is done by appending query string parameters to the base URL, so that only information about the matching products is returned (an example is given below).
These are the search criteria you can specify:
Query string parameter | Description | Value format | Value set | Example |
---|---|---|---|---|
bbox | Rectangular area of interest | minlon,minlat,maxlon,maxlat | none | bbox=13.3,50.5,14.3,51.5 |
geometry | Arbitrary area of interest (alternative to bbox) | Well-known text string | none | geometry=POLYGON%28%2812.5%2041%2C13%2042%2C12.5%2043%2C12%2042%2C12.5%2041%29%29 (URL-encoded) |
start | Start date of the period to be covered | YYYY-MM-DD, or date and time in format YYYY-MM-DDThh:mm:ssZ | none | start=2010-09-12 or start=2010-09-12T13:58:26Z |
stop | End date of the period to be covered | YYYY-MM-DD, or date and time in format YYYY-MM-DDThh:mm:ssZ | none | stop=2010-09-15 or stop=2010-09-15T21:24:30Z |
processingCenter | Processing centre | Text, asterisk wildcard can be used | I-PAC, PDAS-F, PDAS-M, PDHS-E, PDHS-K | processingCenter=PDAS-* |
acquisitionStation | Acquisition station | Text | | acquisitionStation=... |
orbitDirection | Orbit direction | Text | ASCENDING, DESCENDING | orbitDirection=ASCENDING |
orbitNumber | Orbit number | Integer value or interval | none | orbitNumber=41923 or orbitNumber=[41920,41930] |
frame | Frame | | none | frame=2205 or frame=[2000,2500],[3000 |
track | Track | Integer value or interval | none | track=129 or track=[128@ |
count | Number of records to be returned (default is 20) | Integer value | none | count=100 |
startIndex | Index of first result record to be returned (starting with 0, default is 0) | Integer value | none | startIndex=20 |
startPage | Result page (starting with 0, default is 0), each page contains count records (alternative to startIndex) | Integer value | none | startPage=1 |
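Values such as the Well-known text string for geometry have to be URL-encoded before they are appended to the base URL. The following sketch shows one way to do this in plain shell; the function name urlencode is hypothetical, the encoder assumes ASCII input, and the date values are simply taken from the table examples.

```shell
# urlencode STRING
# Percent-encodes every character except the unreserved ones of RFC 3986.
# Assumes plain ASCII input; the body runs in a subshell so that the
# helper variables do not leak into the calling shell.
urlencode() (
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [a-zA-Z0-9.~_-]) out="$out$c" ;;
            *) out="$out$(printf '%%%02X' "'$c")" ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
)

# Compose a query URL for the HTML output using parameters from the table.
wkt='POLYGON((12.5 41,13 42,12.5 43,12 42,12.5 41))'
echo "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/html?geometry=$(urlencode "$wkt")&start=2010-09-12&stop=2010-09-15"
```

For the polygon used here, the encoded value is identical to the geometry example in the table.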
By default the number of returned products is 20. To obtain all matching products, you can do the following:
- Provide a value for the count query string parameter, e.g. count=100
- Do sequential requests providing the index of the first product to be returned until the returned result is empty (beyond the total number of matching products), e.g. startIndex=0, startIndex=20, startIndex=40, etc. (note that the first product has the index 0).
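The sequential-request approach can be sketched as a shell loop. The function names fetch_page and fetch_all are hypothetical, and the sketch assumes that the txt format returns an empty body once startIndex lies beyond the last matching product, as described above.

```shell
# fetch_page URL
# Prints the body returned for one request. Kept as a separate function
# so that it can be replaced, e.g. by a stub when no network is available.
fetch_page() {
    curl -s "$1"
}

# fetch_all QUERY_URL [PAGE_SIZE]
# QUERY_URL must already contain a "?" and at least one parameter.
# Requests consecutive pages until an empty body is returned.
fetch_all() (
    base=$1
    count=${2:-20}
    index=0
    while :; do
        page=$(fetch_page "${base}&count=${count}&startIndex=${index}")
        [ -z "$page" ] && break
        printf '%s\n' "$page"
        index=$((index + count))
    done
)

# Usage (performs network requests, so it is left commented out):
# fetch_all "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48" 100 > urls.txt
```

Note that the loop only terminates on an empty response; if the service answered with an error page instead, an additional stop condition would be needed.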
Web Browser Examples:¶
- RDF, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/rdf?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING
- ATOM, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING
- HTML, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/html?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING
- KML, test URL: http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/kml?start=2004-09-01&stop=2004-09-10&bbox=4,36,20,48&orbitDirection=ASCENDING (viewed in Google Earth)
Discovering Data Using curl¶
Accessing the VA4-CAS via curl is very similar to using the web browser as described in the previous section, except that not all formats offered by the VA4-CAS are useful in a machine-to-machine context. The output format you will most likely use is txt, which returns the download locations of matching products.
The next section gives an example.
Example (L'Aquila Earthquake) with curl¶
To obtain the download locations of ASA_IM__0P imagery for the April 2009 earthquake in L'Aquila (before and after), you can use the following curl command:
curl "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?bbox=13,42,13,42&start=2009-03-01&stop=2009-06-01"
It returns 7 results (depicted in the map below):
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092439_000000172079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092430_000000162079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090504_205036_000000162078_00401_37529_4202.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092436_000000162078_00079_37207_1556.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092426_000000162078_00079_37207_1556.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090330_205031_000000172077_00401_37028_4122.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090308_092431_000000162077_00079_36706_4070.N1
To restrict the result to products with the same orbit direction, you can use this slightly changed curl command:
curl "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?bbox=13,42,13,42&start=2009-03-01&stop=2009-06-01&orbitDirection=DESCENDING"
Now only 4 results are returned:
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092439_000000172079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092430_000000162079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092436_000000162078_00079_37207_1556.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092426_000000162078_00079_37207_1556.N1
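For interferometry it can be useful to check that the returned products share a track. The sketch below reads the track from the file name. It assumes that the file names follow the Envisat product naming convention, in which the third-to-last underscore-separated field is the relative orbit (track) and the second-to-last the absolute orbit. This is an assumption, although the four DESCENDING results above are consistent with it (all carry 00079). The function name track_of is hypothetical.

```shell
# track_of URL
# Prints the third-to-last underscore-separated field of the file name,
# assumed here to be the relative orbit (track) number.
track_of() {
    basename "$1" | awk -F_ '{ print $(NF-2) }'
}

# Prints 00079 for the first product of the result list above.
track_of "https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092439_000000172079_00079_37708_4238.N1"

# Count the products per track in a list of download URLs (one per line):
# while read -r url; do track_of "$url"; done < urls.txt | sort | uniq -c
```

Counting from the end of the name keeps the helper independent of how many underscores the product identifier itself contains.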
Discovering Data Using ciop-catquery¶
The ciop-catquery tool is a command-line client that further simplifies direct catalogue access.
It has the following usage: ciop-catquery [options] <catalogue_url>, where the <catalogue_url> is http://eo-virtual-archive4.esa.int/search/. The name of the series can be either appended to the URL or specified using the -se option.
The following options are available for refining a query (see also ./ciop-catquery -h for help):
- Spatial coverage: -b or --bounding-box=
  The option value must be a bounding box in minlon,minlat,maxlon,maxlat format.
- Temporal coverage: -tq or --time-query=
  The option value must be a string in the format begin=YYYY-MM-DDThh:mm:ssZ;end=YYYY-MM-DDThh:mm:ssZ[;method=<method>]
  ;|/2.time:start, time:end.
  The method value (optional) specifies which of the sensing start/end times must be covered by the specified period (sensing_interval, sensing_start, sensing_stop), or selects products by when the data was processed or inserted (processing, insertion).
- Other attributes: -a or --attribute= (can occur multiple times for different attributes)
  The option value must be a string in the format attribute:condition. The following attributes can be used:
  - size
  - orbitNumber
  - processorVersion
  - processingCenter
  - acquisitionStation
  The condition can be a simple term, e.g. -a orbitNumber:41923, or an interval, e.g. -a orbitNumber:[41920,41930]
The following options change the output of the script:
- Receive full RDF: -ox or --output-xml
- Receive selected fields: -o or outputfields=
  The option value is a comma-separated list of field names (the XML element names can be obtained using the -ox switch, see above). The output is a space-separated field list with one line per data set.
- Receive only download locations of matching products: -o dclite4g:onlineResource or outputfields=dclite4g:onlineResource
  This returns URLs for matching data sets, one per line.
- Redirection to a file: -O or --outputfile=
  The value must be the path to the output file.
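As an illustration of how these options might be combined, the sketch below builds a time-query value and shows a possible invocation. The helper name time_query is hypothetical, and the invocation is an untested assumption based only on the option descriptions above, in particular that short options take their value as a separate argument.

```shell
# time_query BEGIN END [METHOD]
# Prints a value for -tq in the format described above.
time_query() {
    if [ -n "$3" ]; then
        printf 'begin=%s;end=%s;method=%s\n' "$1" "$2" "$3"
    else
        printf 'begin=%s;end=%s\n' "$1" "$2"
    fi
}

tq=$(time_query 2009-03-01T00:00:00Z 2009-06-01T00:00:00Z sensing_start)
echo "$tq"

# Possible invocation (not run here; requires ciop-catquery to be installed).
# The quotes keep the shell from interpreting ";" and "[...]".
# ./ciop-catquery -b 13,42,13,42 -tq "$tq" \
#     -a "orbitNumber:[37000,38000]" \
#     -o dclite4g:onlineResource -O urls.txt \
#     http://eo-virtual-archive4.esa.int/search/ASA_IM__0P
```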
Data Staging¶
Having obtained the list of download locations from the VA4-CAS, it is possible to stage datasets by using either curl or ciop-copy.
Data Access Using curl¶
If the download URLs are available (e.g. from a call to ciop-catquery -o dclite4g:onlineResource), they can be used with curl to download the data set files.
The difficulty of the file download lies in the fact that the downloadable resources are protected by ESA's Single Sign-on service, which is usually accessed by a web browser; therefore an automated download has to simulate a web browser session and handle several cookies. The following bash script shows how this can be achieved using curl; it is assumed that the URLs are contained in a file named urls.txt, one URL per line.
    #!/bin/bash
    # Replace the placeholders with your UM-SSO credentials.
    SSO_USERNAME="<YOUR-UMSSO-USERNAME>"
    SSO_PASSWORD="<YOUR-UMSSO-PASSWORD>"
    LOGIN_URL="https://eo-sso-idp.eo.esa.int/idp/umsso20/login?null"

    rm -f cookie*.txt
    cookies=()    # curl cookie options; empty until the first request is done

    while read -r url
    do
        [ -z "$url" ] && continue
        filename=$(basename "$url")
        echo "Downloading $url"
        # Try the download; without a valid session the SSO login page is returned instead.
        curl -k --cacert /etc/grid-security/certificates/cacert.crt -L "${cookies[@]}" -c cookie.txt -o "$filename" "$url"
        [ ${#cookies[@]} -eq 0 ] && cookies=(-b cookie.txt)
        if grep -q "<title>EO SSO</title>" "$filename"
        then
            # The login page was returned: submit the login form.
            poststring="cn=$SSO_USERNAME&password=$SSO_PASSWORD&idleTime=oneday&sessionTime=untilbrowserclose&loginButton=Login"
            curl -k -X POST -d "$poststring" -L -H "Content-type: application/x-www-form-urlencoded" "${cookies[@]}" -c cookie2.txt -o "$filename" "$LOGIN_URL"
            cookies+=(-b cookie2.txt)
            if grep -qF "Please wait..." "$filename"
            then
                # An intermediate page was returned: repeat the login with the UM-SSO method fields.
                poststring="cn=$SSO_USERNAME&password=$SSO_PASSWORD&idleTime=oneday&sessionTime=untilbrowserclose&loginFields=cn%40password&loginMethod=umsso"
                curl -k -X POST -d "$poststring" -L -H "Content-type: application/x-www-form-urlencoded" "${cookies[@]}" -o "$filename" "$LOGIN_URL"
            fi
        fi
    done < urls.txt
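Because a failed login leaves an HTML page saved under the product file name, it can be worth checking the downloaded files afterwards. The following sketch reuses the page title that the script above looks for; the function name is_sso_login_page is hypothetical, and the check only recognises that particular login page.

```shell
# is_sso_login_page FILE
# Succeeds if FILE contains the title of the EO SSO login page,
# i.e. the download returned HTML instead of the product.
is_sso_login_page() {
    grep -q "<title>EO SSO</title>" "$1"
}

# Report every .N1 file in the current directory that is really a login page.
for f in ./*.N1; do
    [ -e "$f" ] || continue
    if is_sso_login_page "$f"; then
        echo "Not a product (login page): $f"
    fi
done
```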
Data Access Using ciop-copy¶
./ciop-copy -f -o ./ "https://eo-virtual-archive4.esa.int/supersites/ASA_WSS_1PNUPA20100125_025705_000000632086_00190_41326_0381.N1"
Conclusion¶
With this lesson you have learned:
- to query ESA VA4 dataset series
- to query and download ESA VA4 datasets
Updated by Herve Caumont about 11 years ago · 3 revisions