Esgf-client » History » Version 7

Francesco Barchetta, 2014-01-16 15:58

h1. ESGFClient

{{>toc}}

h2. Overview

ESGFClient is a command-line tool written in C# that downloads products from the "ESGF Data Search". It takes as input either the location of an RDF file or a list of individual product URLs.

h2. Installation

To install the ESGFClient, run:

<pre><code class="XML">
yum install esgf-tools
</code></pre>

Then test the installation with:

<pre><code class="XML">
ESGFClient --help
</code></pre>

h2. How it works

ESGFClient is designed to perform data *search* and *access* on the Earth System Grid Federation (ESGF).

The search step queries the "GEOWOW Terradue ESGF Catalogue":https://support.terradue.com/projects/devel-cloud-sb/wiki/Cas-esgf-cmip5 at http://geowow.terradue.com/catalogue/esgf/rdf .
The query string is passed directly to the ESGFClient through the parameter --uri.

Example:

<pre>
http://geowow.terradue.com/catalogue/esgf/thetao/rdf?time_frequency=mon&experiment=rcp85&ensemble=r1i1p1&institute=MRI&count=1
</pre>
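As an illustration only, a query URL of this shape could be assembled as in the Python sketch below. The helper name @build_search_url@ is hypothetical, and the assumption that the variable name (here thetao) is a path segment before /rdf is inferred from the example URL above rather than from any catalogue documentation.

```python
from urllib.parse import urlencode

# Catalogue endpoint taken from the example URL above.
CATALOGUE = "http://geowow.terradue.com/catalogue/esgf"


def build_search_url(variable, **filters):
    """Return an RDF search URL for a variable and a set of filter parameters.

    Hypothetical helper: the URL layout is inferred from the example above.
    """
    return f"{CATALOGUE}/{variable}/rdf?{urlencode(filters)}"


url = build_search_url(
    "thetao",
    time_frequency="mon",
    experiment="rcp85",
    ensemble="r1i1p1",
    institute="MRI",
    count=1,
)
print(url)
```

The resulting string is what would be passed to the --uri option.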
As you can see by opening the link in a web browser, this query returns RDF metadata about the subset of data selected by the given parameters (time_frequency, experiment, ensemble and institute).
The ESGFClient parses the response and retrieves the list of OPeNDAP _online resources_.

Example:

<pre>
http://dapp2p.cccma.ec.gc.ca/thredds/dodsC/cmip5.output1.CCCma.CanESM2.rcp85.mon.ocean.Omon.r1i1p1.thetao.20120407.aggregation
</pre>
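The extraction step might look roughly like the following Python sketch. It is not the client's actual parser: the sample RDF document and its element names are invented for illustration, and OPeNDAP links are recognised only by the /dodsC/ path segment seen in the example URL above, which is an assumption about how the servers expose them.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified RDF response; real element names may differ.
SAMPLE_RDF = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/ns#">
  <ex:DataSet>
    <ex:onlineResource rdf:resource="http://dapp2p.cccma.ec.gc.ca/thredds/dodsC/cmip5.output1.CCCma.CanESM2.rcp85.mon.ocean.Omon.r1i1p1.thetao.20120407.aggregation"/>
    <ex:onlineResource rdf:resource="http://example.org/thredds/fileServer/some/file.nc"/>
  </ex:DataSet>
</rdf:RDF>"""


def opendap_urls(rdf_text, marker="/dodsC/"):
    """Collect attribute values and element texts that look like OPeNDAP URLs."""
    root = ET.fromstring(rdf_text)
    found = []
    for element in root.iter():
        candidates = list(element.attrib.values())
        if element.text:
            candidates.append(element.text.strip())
        for value in candidates:
            # Keep each matching URL once, in document order.
            if value.startswith("http") and marker in value and value not in found:
                found.append(value)
    return found


urls = opendap_urls(SAMPLE_RDF)
print(urls)
```

Scanning every attribute and text node avoids depending on a particular RDF vocabulary, at the cost of being less precise than a parser that knows the schema.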
Starting from this list, the ESGFClient builds a new list of query strings by adding the rest of the options (such as --dtstart, --zmax, etc.).
Since OPeNDAP needs array indexes rather than absolute values, the ESGFClient converts the option values into indexes.
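The conversion from absolute values to indexes can be sketched as follows, assuming the coordinate axes are sorted in ascending order. The axis values and function names are hypothetical; only the bracketed [start:stride:stop] form is the standard OPeNDAP (DAP2) constraint syntax. A real request would also carry the latitude and longitude ranges, which are left out here for brevity.

```python
from bisect import bisect_left, bisect_right


def index_range(axis, lower, upper):
    """Return (first, last) indexes of the sorted axis values inside [lower, upper]."""
    first = bisect_left(axis, lower)
    last = bisect_right(axis, upper) - 1
    if first > last:
        raise ValueError("no axis value falls inside the requested range")
    return first, last


def constraint(variable, *ranges):
    """Build a DAP2 hyperslab expression such as thetao[0:1:2][3:1:4]."""
    return variable + "".join(f"[{a}:1:{b}]" for a, b in ranges)


# Hypothetical axes: depth levels in metres, time as days since a reference date.
levels = [5.0, 50.0, 200.0, 500.0, 600.0, 700.0, 1000.0]
times = [15.5, 45.0, 74.5, 105.0, 135.5]

z = index_range(levels, 500, 700)    # levels between --zmin and --zmax
t = index_range(times, 40.0, 110.0)  # time steps between --dtstart and --dtend
query = constraint("thetao", t, z)
print(query)
```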
Finally, the list is "injected" into a revised version of the wget script provided by the ESGF Portal, and wget itself is launched.

h2. Usage

The available options are:

<pre><code class="XML">
Options:
  -u, --uri=VALUE            RDF's URI to parse and download
  -o, --output=VALUE         Output folder
  -r, --resource=VALUE       Resource to download
  -O, --openid=VALUE         OpenId used to access.
  -p, --password=VALUE       Password for OpenId used to access.
  -s, --dtstart=VALUE        The beginning of the time query  to restrict
                               the subset of data to download. YYYY-MM-
                               DDTHH:mm:ssZ.
  -e, --dtend=VALUE          The end of the time query  to restrict the
                               subset of data to download. YYYY-MM-DDTHH:mm:ssZ.
  --zM, --zmax=VALUE         Maximum level (z)
  --zm, --zmin=VALUE         Minimum level (z). By default it's equal to 0
  -h, --help                 Show this message and exit
</code></pre>

h2. Download from an RDF URL

The client downloads only the OPeNDAP online resources.
It tries to retrieve from the RDF file all the parameters needed for a query: a start time and a stop time (for temporal queries), the level range (the z dimension, if the variable provides it) and the variable to query.

Example:

<pre><code class="XML">
ESGFClient -u "http://geowow.terradue.com/catalogue/esgf/thetao/rdf?time_frequency=mon&experiment=rcp85&ensemble=r1i1p1&institute=MRI&count=1" -O "https://pcmdi9.llnl.gov/esgf-idp/openid/user" -p password -s 2006-01-17 -e 2008-12-16 --zmin 500 --zmax 700 -o "./tmp/"
</code></pre>
To build the correct OPeNDAP query, the time and level search parameters are converted into the set of indexes needed for a spatial and temporal query.

The download relies on a revised wget script built from a template called *wgetTemplate.sh*, which is filled with the URLs to fetch.
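The template-filling idea can be illustrated with the Python sketch below. The contents of wgetTemplate.sh are not shown on this page, so the template text, the placeholder name @download_urls@ and the example.org URLs are all invented stand-ins, not the real script.

```python
from string import Template

# Hypothetical stand-in for wgetTemplate.sh; the real template and its
# placeholder name are not documented here. "$$" yields a literal "$".
WGET_TEMPLATE = Template("""#!/bin/bash
# URLs injected by the client, one per line
while read -r url; do
    wget "$$url"
done <<'EOF'
$download_urls
EOF
""")


def fill_template(urls):
    """Return the script text with the URL list substituted into the template."""
    return WGET_TEMPLATE.substitute(download_urls="\n".join(urls))


script = fill_template([
    "http://example.org/thredds/dodsC/dataset.dods?thetao[1:1:3][3:1:5]",
    "http://example.org/thredds/dodsC/dataset.das",
])
print(script)
```

Passing the URLs through a quoted here-document keeps the square brackets of the OPeNDAP constraint from being interpreted by the shell.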
Then the client runs the wget script, which saves the files to the file system.

All these files are protected by OpenID, so the user has to provide credentials through the --openid and --password options. The security certificates, by contrast, are downloaded automatically by the wget script.

After the OPeNDAP files (.dods, .das, .lev, .lat, etc.) are downloaded, an internal script converts the .dods binary file to a NetCDF file, according to the .das Dataset Attribute Structure description file. Once the NetCDF file is generated, the script also deletes the downloaded temporary files.

For more details on the OPeNDAP formats, refer to http://docs.opendap.org/index.php/UserGuideOPeNDAPMessages

h2. OPeNDAP limits

Since this version of the ESGFClient relies exclusively on OPeNDAP servers, every download is subject to a size limit. Within the ESGF federation, this limit is set to the default value (500 MBytes); if a query reaches it, the server returns a 403 error and the download fails. You can stay under the limit by choosing suitable temporal and level slices for your sub-datasets.
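One way to choose such slices is to estimate the request size up front and split the time range accordingly. The Python sketch below assumes 4-byte values and reads "500 MBytes" as 500 x 1024 x 1024 bytes; both are assumptions, the grid dimensions are hypothetical, and protocol overhead in the response is ignored.

```python
LIMIT_BYTES = 500 * 1024 * 1024  # 500 MBytes limit quoted above (binary MB assumed)
BYTES_PER_VALUE = 4              # assumption: 32-bit floating point values


def request_bytes(n_time, n_level, n_lat, n_lon):
    """Approximate payload size of a time x level x lat x lon hyperslab."""
    return n_time * n_level * n_lat * n_lon * BYTES_PER_VALUE


def split_time_range(first, last, n_level, n_lat, n_lon):
    """Split the [first, last] time indexes into chunks that stay below the limit."""
    per_step = request_bytes(1, n_level, n_lat, n_lon)
    steps = (LIMIT_BYTES - 1) // per_step  # strictly below the limit
    if steps == 0:
        raise ValueError("one time step exceeds the limit; reduce the level range")
    return [(start, min(start + steps - 1, last))
            for start in range(first, last + 1, steps)]


# Hypothetical case: 36 monthly steps on a 368 x 360 grid.
small = request_bytes(36, 3, 368, 360)           # 3 levels: fits in one request
chunks = split_time_range(0, 35, 50, 368, 360)   # 50 levels: needs splitting
print(small, chunks)
```

Each (first, last) pair can then be used as the time index range of a separate request.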