Herve Caumont, 2013-06-19 18:05


ESA VA4 tutorial

This tutorial provides guidance on how to apply for and access ESA Virtual Archive 4 data.

Introduction

Virtual Archives are online archives that provide easy access to Earth Observation (EO) data by coupling high bandwidth, large storage space and software. Virtual Archive 4 provides a Cloud-based service for storing and providing access to ESA Synthetic Aperture Radar (SAR) data.

This virtual archive represents ESA's contribution to the supersites initiative. Its SAR data holdings (today over thirty thousand products are hosted on Virtual Archive 4) are accessible to science communities dealing with interferometry, landslides and change detection.

The virtual archive is a Cloud solution providing Storage-as-a-Service for storing the data and is coupled with complementary services:

  • User authentication and authorization
  • Data discovery through simple interfaces such as OpenSearch, with results in Atom, RDF and KML formats
  • Data access via common web protocols such as HTTP.

The technical solution behind the virtual archive is based on research and development performed by Terradue and partially funded by the European Commission Framework Programme 7 in the context of the GENESI-DEC and GEOWOW projects.

General Concept

The data discovery is based on the OpenSearch interface provided by the Virtual Archive 4 Catalogue Access Service (VA4-CAS).

For a more detailed discussion of the OpenSearch interface of the Virtual Archive, see OpenSearch in Virtual Archive 4.

Catalogue Data and Output Formats

The VA4-CAS offers the following data:

| Series name | Description | Example URL (ATOM) |
|---|---|---|
| ASA_IM__0P | ASAR Image Mode source packets Level 0 | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom |
| ASA_IMP_1P | ENVISAT ASAR Image Mode Precision Image | http://eo-virtual-archive4.esa.int/search/ASA_IMP_1P/atom |
| ASA_IMS_1P | ENVISAT ASAR Image Mode Single Look Complex Image | http://eo-virtual-archive4.esa.int/search/ASA_IMS_1P/atom |
| ASA_WS__0P | ASAR Wide Swath Level 0 product | http://eo-virtual-archive4.esa.int/search/ASA_WS__0P/atom |
| ASA_WSM_1P | ENVISAT ASAR Wide Swath Mode | http://eo-virtual-archive4.esa.int/search/ASA_WSM_1P/atom |
| ASA_WSS_1P | ENVISAT ASAR Wide Swath Single Look Complex | http://eo-virtual-archive4.esa.int/search/ASA_WSS_1P/atom |
| ASA_XCA_AX | ENVISAT ASAR External calibration data | http://eo-virtual-archive4.esa.int/search/ASA_XCA_AX/atom |
| ER01_SAR_IM__0P | ERS-1 SAR Image Mode source packets Level 0 | http://eo-virtual-archive4.esa.int/search/ER01_SAR_IM__0P/atom |
| ER01_SAR_IMP_1P | ERS-1 SAR Precision Image Product | http://eo-virtual-archive4.esa.int/search/ER01_SAR_IMP_1P/atom |
| ER01_SAR_IMS_1P | ERS-1 SAR Single Look Complex Image Product | http://eo-virtual-archive4.esa.int/search/ER01_SAR_IMS_1P/atom |
| ER01_SAR_RAW_0P | ERS-1 SAR Image SAR Annotated Raw Data Product Level 0 | http://eo-virtual-archive4.esa.int/search/ER01_SAR_RAW_0P/atom |
| ER02_SAR_IM__0P | ERS-2 SAR Image Mode source packets Level 0 | http://eo-virtual-archive4.esa.int/search/ER02_SAR_IM__0P/atom |
| ER02_SAR_IMP_1P | ERS-2 SAR Precision Image Product | http://eo-virtual-archive4.esa.int/search/ER02_SAR_IMP_1P/atom |
| ER02_SAR_IMS_1P | ERS-2 SAR Single Look Complex Image Product | http://eo-virtual-archive4.esa.int/search/ER02_SAR_IMS_1P/atom |
| ER02_SAR_RAW_0P | ERS-2 SAR Image SAR Annotated Raw Data Product Level 0 | http://eo-virtual-archive4.esa.int/search/ER02_SAR_RAW_0P/atom |

The following output formats are available (the ASA_IM__0P series is used as an example in the remainder of this document, but everything explained below applies to the other series too):

| Format | URL (using ASA_IM__0P as example) | Comment |
|---|---|---|
| RDF | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/rdf | The most universal format, contains the complete product metadata |
| ATOM | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/atom | ATOM feed |
| HTML | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/html | Web-optimised representation, useful for quick verifications with a web browser |
| KML | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/kml | Allows viewing product footprints in Google Earth |
| WKT | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/wkt | Returns product footprints as Well-known text |
| Only download URLs | http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt | Returns a text document containing the download URLs of the products |

Viewing Product Information

The best way to get familiar with the VA4-CAS is to make requests using a web browser. You can do this by choosing any of the above URLs (e.g. HTML or ATOM format).

At this point, you may refine your search by specifying search criteria. This is done by appending query string parameters to the base URL, so that only the information for matching products is returned.

These are the search criteria you can specify:

| Query string parameter | Description | Value format | Value set | Example |
|---|---|---|---|---|
| bbox | Rectangular area of interest | minlon,minlat,maxlon,maxlat | none | bbox=13.3,50.5,14.3,51.5 |
| geometry | Arbitrary area of interest (alternative to bbox) | Well-known text string | none | geometry=POLYGON%28%2812.5%2041%2C13%2042%2C12.5%2043%2C12%2042%2C12.5%2041%29%29 (URL-encoded) |
| start | Start date of the period to be covered | YYYY-MM-DD, or date and time in format YYYY-MM-DDThh:mm:ssZ | none | start=2010-09-12 or start=2010-09-12T13:58:26Z |
| stop | End date of the period to be covered | YYYY-MM-DD, or date and time in format YYYY-MM-DDThh:mm:ssZ | none | stop=2010-09-15 or stop=2010-09-15T21:24:30Z |
| processingCenter | Processing centre | Text, asterisk wildcard can be used | I-PAC, PDAS-F, PDAS-M, PDHS-E, PDHS-K | processingCenter=PDAS-* |
| acquisitionStation | Acquisition station | Text | | acquisitionStation=... |
| orbitDirection | Orbit direction | Text | ASCENDING, DESCENDING | orbitDirection=ASCENDING |
| orbitNumber | Orbit number | Integer value or interval | none | orbitNumber=41923 or orbitNumber=[41920,41930] |
| frame | Frame | | none | frame=2205 or frame=[2000,2500],[3000 |
| track | Track | Integer value or interval | none | track=129 or track=[128@ |
| count | Number of records to be returned (default is 20) | Integer value | none | count=100 |
| startIndex | Index of first result record to be returned (starting with 0, default is 0) | Integer value | none | startIndex=20 |
| startPage | Result page (starting with 0, default is 0), each page contains count records (alternative to startIndex) | Integer value | none | startPage=1 |
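
Appending the criteria by hand is error-prone once values contain spaces or parentheses, as in the geometry parameter. The following bash sketch assembles a query URL from a series name, an output format and a list of key=value pairs. The function names urlencode and va4_search_url are hypothetical helpers introduced for illustration; commas are deliberately left unencoded, which is an assumption about what the service accepts (the bbox example in the table above uses literal commas).

```shell
#!/bin/bash
# Base URL of the VA4-CAS search interface, as used in the tables above.
VA4_BASE="http://eo-virtual-archive4.esa.int/search"

# urlencode VALUE (hypothetical helper): percent-encodes every ASCII character
# except letters, digits, ".", "~", "_", "-" and ",".
urlencode() {
    local s="$1" out="" c i
    for (( i = 0; i < ${#s}; i++ )); do
        c="${s:i:1}"
        case "$c" in
            [a-zA-Z0-9.~_,-]) out+="$c" ;;
            *) out+=$(printf '%%%02X' "'$c") ;;
        esac
    done
    printf '%s' "$out"
}

# va4_search_url SERIES FORMAT [KEY=VALUE ...] (hypothetical helper):
# prints the query URL with every value percent-encoded.
va4_search_url() {
    local series="$1" format="$2" sep="?" url kv
    shift 2
    url="$VA4_BASE/$series/$format"
    for kv in "$@"; do
        url+="${sep}${kv%%=*}=$(urlencode "${kv#*=}")"
        sep="&"
    done
    printf '%s\n' "$url"
}
```

For example, va4_search_url ASA_IM__0P txt "geometry=POLYGON((12.5 41,13 42,12.5 43,12 42,12.5 41))" count=100 prints a URL that can be passed directly to a browser or to curl.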

By default the number of returned products is 20. To obtain all matching products, you can do the following:

  • Provide a value for the count query string parameter, e.g. count=100
  • Make sequential requests, each providing the index of the first product to be returned, until the returned result is empty (that is, the index lies beyond the total number of matching products), e.g. startIndex=0, startIndex=20, startIndex=40, etc. (note that the first product has the index 0).
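
The second approach can be sketched in bash as follows. This is a minimal illustration, not part of the original tooling: fetch_page and fetch_all are hypothetical function names, the txt output format is assumed, and the loop relies on the service returning an empty body once startIndex is beyond the last match, as described above. A fixed upper bound on the number of requests guards against looping forever if that assumption does not hold.

```shell
#!/bin/bash
# fetch_page URL (hypothetical helper): prints the response body.
# Kept as a separate function so it can be replaced, e.g. by an offline stub.
fetch_page() {
    curl -s "$1"
}

# fetch_all QUERY_URL [PAGE_SIZE] (hypothetical helper)
# QUERY_URL must already contain "?" and at least one parameter.
# Requests successive pages via count/startIndex until a page is empty.
fetch_all() {
    local base="$1" size="${2:-20}" index=0 page n
    for (( n = 0; n < 1000; n++ )); do   # safety bound on the number of requests
        page=$(fetch_page "${base}&count=${size}&startIndex=${index}")
        [ -z "$page" ] && break
        printf '%s\n' "$page"
        index=$(( index + size ))
    done
}
```

A call such as fetch_all "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?start=2009-03-01&stop=2009-06-01" 100 > urls.txt would then collect all download URLs in one file.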

Web Browser Examples:

Discovering Data Using curl

Accessing the VA4-CAS via curl is very similar to using the web browser as described in the previous section, except that not all formats offered by the VA4-CAS are useful in a machine-to-machine context. The output format you will most likely use is txt, which returns the download locations of matching products.

The next section gives an example.

Example (L'Aquila Earthquake) with curl

To obtain the download locations of ASA_IM__0P imagery for the April 2009 earthquake in L'Aquila (before and after), you can use the following curl command:

curl "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?bbox=13,42,13,42&start=2009-03-01&stop=2009-06-01"

It returns 7 results (depicted in the map below):

https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092439_000000172079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092430_000000162079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090504_205036_000000162078_00401_37529_4202.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092436_000000162078_00079_37207_1556.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092426_000000162078_00079_37207_1556.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090330_205031_000000172077_00401_37028_4122.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090308_092431_000000162077_00079_36706_4070.N1

To further refine the result in order to have the same orbit direction, you can use this slightly changed curl command:

curl "http://eo-virtual-archive4.esa.int/search/ASA_IM__0P/txt?bbox=13,42,13,42&start=2009-03-01&stop=2009-06-01&orbitDirection=DESCENDING"

This returns only 4 results:

https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092439_000000172079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPAM20090517_092430_000000162079_00079_37708_4238.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092436_000000162078_00079_37207_1556.N1
https://eo-virtual-archive4.esa.int/supersites/ASA_IM__0CNPDE20090412_092426_000000162078_00079_37207_1556.N1
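
For a before/after comparison it can help to label each returned URL relative to the event date (the L'Aquila earthquake occurred on 6 April 2009). The bash sketch below is an illustration only: acq_date and split_by_event are hypothetical helpers, and the date extraction assumes that the product file name contains the sensing start as YYYYMMDD_hhmmss directly after the leading alphabetic prefix, which matches the file names listed above.

```shell
#!/bin/bash
# acq_date URL (hypothetical helper): prints the 8-digit date (YYYYMMDD)
# found in the file name, assuming the pattern <prefix>YYYYMMDD_hhmmss_<rest>.
acq_date() {
    basename "$1" | sed -E 's/^.*[A-Z]([0-9]{8})_[0-9]{6}_.*$/\1/'
}

# split_by_event EVENT_DATE (hypothetical helper): reads URLs on standard
# input and prefixes each with "pre" or "post" relative to EVENT_DATE (YYYYMMDD).
split_by_event() {
    local event="$1" url d
    while IFS= read -r url || [ -n "$url" ]; do
        [ -z "$url" ] && continue
        d=$(acq_date "$url")
        if [ "$d" -lt "$event" ]; then
            printf 'pre  %s\n' "$url"
        else
            printf 'post %s\n' "$url"
        fi
    done
}
```

Piping the output of the curl command above into split_by_event 20090406 labels each of the returned products accordingly.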

Discovering Data Using ciop-catquery

The ciop-catquery tool is a command-line client that further simplifies the already simple direct catalogue access.

It has the following usage: ciop-catquery [options] <catalogue_url>, where the <catalogue_url> is http://eo-virtual-archive4.esa.int/search/. The name of the series can be either appended or added using the -se option.

The following options are available for refining a query (see also ./ciop-catquery -h for help):

  • Spatial coverage: -b or --bounding-box=
    The option value must be a bounding box in minlon,minlat,maxlon,maxlat format.
  • Temporal coverage: -tq or --time-query=
    The option value must be a string in the format begin=YYYY-MM-DDThh:mm:ssZ;end=YYYY-MM-DDThh:mm:ssZ[;method=<method>];|/2. time:start, time:end.
    The optional method value specifies which of the sensing start/end times must be covered by the specified period (sensing_interval, sensing_start, sensing_stop), or whether the period refers to when the data was processed or inserted (processing, insertion).
  • Other attributes: -a or --attribute= (can occur multiple times for different attributes)
    The option value must be a string in the format attribute:condition. The following attributes can be used:
    • size
    • orbitNumber
    • processorVersion
    • processingCenter
    • acquisitionStation

    The condition can be a simple term, e.g. -a orbitNumber:41923, or an interval, e.g. -a orbitNumber:[41920,41930]

The following options change the output of the script:

  • Receive full RDF: -ox or --output-xml
  • Receive selected fields: -o or outputfields=; the result is a space-separated field list with one line per data set
    The option value is a comma-separated list of field names (the XML element names can be obtained using the -ox switch, see above)
  • Receive only download locations of matching products: -o dclite4g:onlineResource or outputfields=dclite4g:onlineResource
    This returns URLs for matching data sets, one per line.
  • Redirection to a file: -O or --outputfile=
    The value must be the path to the output file
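
Putting several of these options together, the following bash sketch prints the arguments of a ciop-catquery call that asks only for download locations within a bounding box and a time period. It uses only the options described above; catquery_args is a hypothetical helper, and appending the series name to the catalogue URL without a trailing format is an assumption based on the usage line.

```shell
#!/bin/bash
# catquery_args SERIES BBOX BEGIN END (hypothetical helper)
# Prints, one per line, the arguments for a ciop-catquery call that returns
# only the download locations of matching products.
# BEGIN and END are expected in the format YYYY-MM-DDThh:mm:ssZ.
catquery_args() {
    local series="$1" bbox="$2" begin="$3" end="$4"
    printf '%s\n' \
        -b "$bbox" \
        -tq "begin=${begin};end=${end}" \
        -o dclite4g:onlineResource \
        "http://eo-virtual-archive4.esa.int/search/${series}"
}

# Possible use (requires ciop-catquery in the current directory):
#   mapfile -t args < <(catquery_args ASA_IM__0P 13,42,13,42 \
#       2009-03-01T00:00:00Z 2009-06-01T00:00:00Z)
#   ./ciop-catquery "${args[@]}" > urls.txt
```

Building the argument list separately keeps the semicolon and brackets in the option values away from further shell interpretation.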

Data Staging

Having obtained the list of download locations from the VA4-CAS, it is possible to stage datasets by using either curl or ciop-copy.

Data Access Using curl

If the download URLs are available (e.g. from a call to ciop-catquery -o dclite4g:onlineResource), they can be used with curl to download the data set files.

The difficulty of the file download lies in the fact that the downloadable resources are protected by ESA's Single Sign-on service, which is usually accessed by a web browser; therefore an automated download has to simulate a web browser session and handle several cookies. The following bash script shows how this can be achieved using curl; it is assumed that the URLs are contained in a file named urls.txt, one URL per line.

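#!/bin/bash
# Replace the placeholders with your UM-SSO credentials. They are quoted so
# that the shell does not treat < and > as redirections. Values containing
# special characters such as & must be URL-encoded.
SSO_USERNAME='<YOUR-UMSSO-USERNAME>'
SSO_PASSWORD='<YOUR-UMSSO-PASSWORD>'
LOGIN_URL="https://eo-sso-idp.eo.esa.int/idp/umsso20/login?null"

# Start without cookies from a previous run.
rm -f cookie*.txt
cookies=()
while IFS= read -r url
do
    [ -z "$url" ] && continue
    filename=$(basename "$url")
    echo "Downloading $url"
    # First attempt: saves either the product or the SSO login page.
    # Note that -k disables certificate verification.
    curl -k --cacert /etc/grid-security/certificates/cacert.crt -L "${cookies[@]}" -c cookie.txt -o "$filename" "$url"
    [ "${#cookies[@]}" -eq 0 ] && cookies=(-b cookie.txt)
    if grep -q "<title>EO SSO</title>" "$filename"
    then
        # The login page was returned: submit the credentials.
        poststring="cn=$SSO_USERNAME&password=$SSO_PASSWORD&idleTime=oneday&sessionTime=untilbrowserclose&loginButton=Login"
        curl -k -X POST -d "$poststring" -L -H "Content-type: application/x-www-form-urlencoded" "${cookies[@]}" -c cookie2.txt -o "$filename" "$LOGIN_URL"
        cookies+=(-b cookie2.txt)
        if grep -qF "Please wait..." "$filename"
        then
            # An intermediate page was returned: repeat the login POST with the loginFields and loginMethod fields.
            poststring="cn=$SSO_USERNAME&password=$SSO_PASSWORD&idleTime=oneday&sessionTime=untilbrowserclose&loginFields=cn%40password&loginMethod=umsso"
            curl -k -X POST -d "$poststring" -L -H "Content-type: application/x-www-form-urlencoded" "${cookies[@]}" -o "$filename" "$LOGIN_URL"
        fi
    fi
done < urls.txt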

Data Access Using ciop-copy

./ciop-copy -f -o ./ "https://eo-virtual-archive4.esa.int/supersites/ASA_WSS_1PNUPA20100125_025705_000000632086_00190_41326_0381.N1"
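
To stage every product listed in urls.txt, the same call can be wrapped in a loop. The bash sketch below is an assumption-laden illustration: stage_all is a hypothetical helper, the -f and -o options are simply reused from the command above, and the COPY_CMD variable is introduced here only so that the command can be swapped (for instance with echo for a dry run).

```shell
#!/bin/bash
# stage_all LIST_FILE [TARGET_DIR] (hypothetical helper)
# Calls ciop-copy once per non-empty line of LIST_FILE, using the same
# options as the single-product example. COPY_CMD overrides the command.
stage_all() {
    local list="$1" target="${2:-./}" url
    # The list is read on file descriptor 3 so that the copy command
    # cannot consume it through standard input.
    while IFS= read -r url <&3 || [ -n "$url" ]; do
        [ -z "$url" ] && continue
        "${COPY_CMD:-./ciop-copy}" -f -o "$target" "$url" \
            || echo "Failed: $url" >&2
    done 3< "$list"
}
```

Running COPY_CMD=echo stage_all urls.txt prints the commands' arguments without downloading anything, which is a cheap way to check the list before starting a long transfer.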

Conclusion

With this lesson you have learned:
- to query ESA VA4 dataset series
- to query and download ESA VA4 datasets
