Esgf-client » History » Version 7
Francesco Barchetta, 2014-01-16 15:58
h1. ESGFClient

{{>toc}}

h2. Overview

ESGFClient is a command-line tool written in C# that downloads products from the "ESGF Data Search", taking as input either the location of an RDF file or a list of individual product URLs.

h2. Installation

Install ESGFClient with yum:

<pre><code class="XML">
yum install esgf-tools
</code></pre>

Then test the installation with:

<pre><code class="XML">
ESGFClient --help
</code></pre>

h2. How it works

ESGFClient is designed to perform data *search* and *access* against the Earth System Grid Federation (ESGF).

The search step queries the "GEOWOW Terradue ESGF Catalogue":https://support.terradue.com/projects/devel-cloud-sb/wiki/Cas-esgf-cmip5 at http://geowow.terradue.com/catalogue/esgf/rdf .
The query string is passed directly to ESGFClient through the --uri parameter.

Example:

<pre>
http://geowow.terradue.com/catalogue/esgf/thetao/rdf?time_frequency=mon&experiment=rcp85&ensemble=r1i1p1&institute=MRI&count=1
</pre>

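The example query above can be read as the catalogue endpoint plus a set of filter parameters. The following is a minimal sketch of assembling such a URL; the helper name build_catalogue_query is hypothetical, and placing the variable name (thetao) as a path segment is inferred from the example rather than from any documented rule.

```python
from urllib.parse import urlencode

# Base endpoint taken from the example above.
CATALOGUE_BASE = "http://geowow.terradue.com/catalogue/esgf"


def build_catalogue_query(variable, **filters):
    """Return an RDF search URL for one variable and a set of filters.

    Hypothetical helper: the path layout is inferred from the example URL.
    """
    # Keyword arguments keep their order, so the URL is easy to compare by eye.
    return f"{CATALOGUE_BASE}/{variable}/rdf?{urlencode(filters)}"


query = build_catalogue_query(
    "thetao",
    time_frequency="mon",
    experiment="rcp85",
    ensemble="r1i1p1",
    institute="MRI",
    count=1,
)
print(query)
```

The resulting string is what would be passed to ESGFClient through the --uri parameter.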
Opening the link in a web browser shows that this query returns RDF metadata for the subset of data selected by the given parameters (time_frequency, experiment, ensemble, institute and so on).
ESGFClient parses the response and retrieves the list of OPeNDAP _online resources_.

Example:

<pre>
http://dapp2p.cccma.ec.gc.ca/thredds/dodsC/cmip5.output1.CCCma.CanESM2.rcp85.mon.ocean.Omon.r1i1p1.thetao.20120407.aggregation
</pre>

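As a rough illustration of this parsing step, the sketch below scans an RDF/XML document for values that look like THREDDS OPeNDAP endpoints. It rests on two assumptions: that OPeNDAP resources can be recognised by the /dodsC/ path segment seen in the example above, and that the response is well-formed XML. The sample document and its ex: vocabulary are invented for the illustration and do not reproduce the catalogue's actual RDF schema, nor the way ESGFClient itself parses it.

```python
import xml.etree.ElementTree as ET

# Invented sample: the real catalogue response uses its own RDF vocabulary,
# which is not reproduced here.
SAMPLE_RDF = """<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:ex="http://example.org/hypothetical#">
  <ex:dataset rdf:about="http://example.org/dataset/1">
    <ex:onlineResource rdf:resource="http://dapp2p.cccma.ec.gc.ca/thredds/dodsC/cmip5.output1.CCCma.CanESM2.rcp85.mon.ocean.Omon.r1i1p1.thetao.20120407.aggregation"/>
    <ex:onlineResource rdf:resource="http://example.org/thredds/fileServer/some_file.nc"/>
  </ex:dataset>
</rdf:RDF>
"""


def find_opendap_urls(rdf_text):
    """Collect attribute or text values that contain the /dodsC/ path segment."""
    urls = []
    for element in ET.fromstring(rdf_text).iter():
        candidates = list(element.attrib.values())
        if element.text:
            candidates.append(element.text.strip())
        for value in candidates:
            # Skip duplicates so each endpoint is listed once.
            if "/dodsC/" in value and value not in urls:
                urls.append(value)
    return urls


opendap_urls = find_opendap_urls(SAMPLE_RDF)
print(opendap_urls)
```

Matching on the URL shape instead of on element names keeps the sketch independent of the unknown schema, at the cost of being less precise than a schema-aware parser.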
Starting from this list, ESGFClient builds a new list of query strings by adding the remaining options (such as --dtstart, --zmax etc.).
Because OPeNDAP expects array "indexes" rather than absolute values, ESGFClient converts the option values to indexes.
Finally, the list is "injected" into a revised version of the wget script provided by the ESGF Portal, and the wget script is launched.

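To make the value-to-index conversion concrete, here is a minimal sketch that maps an inclusive value range onto the indexes of a coordinate axis. It assumes the axis values are sorted in increasing order; the function name and the sample axes are invented, and the conversion ESGFClient performs internally may differ.

```python
from bisect import bisect_left, bisect_right
from datetime import date


def value_range_to_indexes(axis, lower, upper):
    """Map an inclusive [lower, upper] value range onto a sorted axis.

    Returns (first, last) inclusive indexes, or None when no axis point
    falls inside the range.
    """
    first = bisect_left(axis, lower)       # first point >= lower
    last = bisect_right(axis, upper) - 1   # last point <= upper
    if first > last:
        return None
    return first, last


# Invented depth levels in metres; a real axis would come from the dataset.
levels = [5.0, 50.0, 150.0, 300.0, 500.0, 600.0, 700.0, 1000.0]
print(value_range_to_indexes(levels, 500, 700))   # (4, 6)

# Invented monthly time axis: the 16th of each month, 2006 through 2008.
months = [date(2006 + m // 12, m % 12 + 1, 16) for m in range(36)]
print(value_range_to_indexes(months, date(2006, 1, 17), date(2008, 12, 16)))   # (1, 35)
```

A binary search keeps the lookup cheap even for long time axes, and returning None makes an empty selection explicit instead of producing an invalid index pair.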
h2. Usage

ESGFClient accepts the following options:

<pre><code class="XML">
Options:
  -u, --uri=VALUE            RDF's URI to parse and download
  -o, --output=VALUE         Output folder
  -r, --resource=VALUE       Resource to download
  -O, --openid=VALUE         OpenId used to access.
  -p, --password=VALUE       Password for OpenId used to access.
  -s, --dtstart=VALUE        The beginning of the time query to restrict
                               the subset of data to download. YYYY-MM-DDTHH:mm:ssZ.
  -e, --dtend=VALUE          The end of the time query to restrict the
                               subset of data to download. YYYY-MM-DDTHH:mm:ssZ.
      --zM, --zmax=VALUE     Maximum level (z)
      --zm, --zmin=VALUE     Minimum level (z). By default it's equal to 0
  -h, --help                 Show this message and exit
</code></pre>

h2. Download from RDF URL

The client downloads only OPeNDAP online resources.
It tries to retrieve from the RDF file all the parameters needed for a query: a start time and a stop time (for temporal queries), the level range (the z dimension, if the variable provides one) and the variable to query.

Example:

<pre><code class="XML">
ESGFClient -u "http://geowow.terradue.com/catalogue/esgf/thetao/rdf?time_frequency=mon&experiment=rcp85&ensemble=r1i1p1&institute=MRI&count=1" -O "https://pcmdi9.llnl.gov/esgf-idp/openid/user" -p password -s 2006-01-17 -e 2008-12-16 --zmin 500 --zmax 700 -o "./tmp/"
</code></pre>

To build a correct OPeNDAP query, the time and level search parameters are converted to the set of indexes needed for spatial and temporal subsetting.

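Once the indexes are known, an OPeNDAP (DAP2) request expresses the subset as one [start:stride:stop] range per dimension appended to the variable name. The sketch below assembles such a request URL; the helper names, the dimension order (time, level, latitude, longitude) and the grid sizes are assumptions made for illustration, not values read from the dataset or taken from ESGFClient.

```python
def hyperslab(first, last, stride=1):
    """Format one inclusive index range in DAP2 array-subset syntax."""
    return f"[{first}:{stride}:{last}]"


def build_dods_url(dataset_url, variable, index_ranges):
    """Append a .dods suffix and a constraint expression to a dataset URL.

    index_ranges holds one (first, last) pair per dimension, in the
    variable's own dimension order.
    """
    constraint = variable + "".join(hyperslab(a, b) for a, b in index_ranges)
    return f"{dataset_url}.dods?{constraint}"


dataset = (
    "http://dapp2p.cccma.ec.gc.ca/thredds/dodsC/"
    "cmip5.output1.CCCma.CanESM2.rcp85.mon.ocean.Omon.r1i1p1.thetao.20120407.aggregation"
)
# Assumed order: time, level, latitude, longitude (grid sizes are invented).
ranges = [(1, 35), (4, 6), (0, 191), (0, 255)]
print(build_dods_url(dataset, "thetao", ranges))
```

Swapping the .dods suffix for .das or .dds in the same way would request the attribute or structure description instead of the binary data.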
The download is based on a revised wget script built from a template called *wgetTemplate.sh*, which is filled with the URLs to fetch.

The client then runs the wget script, which saves the files to the file system.

All these files are protected by OpenID, so the user has to provide credentials with the --openid and --password options. The security certificates, by contrast, are downloaded automatically by the wget script.

After the OPeNDAP files (.dods, .das, .lev, .lat etc.) are downloaded, an internal script converts the .dods binary file to a NetCDF file, according to the .das Dataset Attribute Structure description file. Once the NetCDF file is generated, the script also deletes the downloaded temporary files.

For more details on the OPeNDAP formats, refer to http://docs.opendap.org/index.php/UserGuideOPeNDAPMessages

h2. OPeNDAP limits

Since this version of ESGFClient relies exclusively on OPeNDAP servers, every download is subject to a size limit. Within the ESGF federation this limit is set to the default value (500 MBytes); if a query reaches it, the server returns a 403 error and the download fails. You can work within this limit by choosing temporal and level slices that keep each sub-dataset below it.
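One way to choose slices that stay under the limit is to estimate the size of a request from its index ranges and split the time range accordingly. The sketch below does this under several assumptions: values are 4-byte floats, 500 MBytes is read as 500 * 1024 * 1024 bytes, protocol overhead is ignored, and the grid dimensions are invented. The function names are hypothetical and not part of ESGFClient.

```python
LIMIT_BYTES = 500 * 1024 * 1024  # 500 MBytes, read here as MiB (assumption)


def request_bytes(n_time, n_lev, n_lat, n_lon, bytes_per_value=4):
    """Estimate the payload size of a subset, ignoring protocol overhead."""
    return n_time * n_lev * n_lat * n_lon * bytes_per_value


def split_time_range(first, last, n_lev, n_lat, n_lon, limit=LIMIT_BYTES):
    """Split inclusive time indexes [first, last] into chunks below the limit."""
    per_step = request_bytes(1, n_lev, n_lat, n_lon)
    # Use limit - 1 so that a chunk never reaches the limit exactly.
    steps = (limit - 1) // per_step
    if steps < 1:
        raise ValueError("one time step exceeds the limit; narrow the level range")
    chunks = []
    start = first
    while start <= last:
        end = min(start + steps - 1, last)
        chunks.append((start, end))
        start = end + 1
    return chunks


# Invented grid: 50 levels, 368 x 360 horizontal points, 60 monthly steps.
print(split_time_range(0, 59, n_lev=50, n_lat=368, n_lon=360))
# [(0, 18), (19, 37), (38, 56), (57, 59)]
```

Each resulting pair can be issued as a separate download; reducing the level range first lets each chunk cover more time steps.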