Esgf-client » History » Version 7
Francesco Barchetta, 2014-01-16 15:58
| 1 | 1 | Herve Caumont | h1. ESGFClient |
|---|---|---|---|
| 2 | |||
| 3 | 2 | Herve Caumont | {{>toc}} |
| 4 | |||
| 5 | 1 | Herve Caumont | h2. Overview |
| 6 | |||
| 7 | 5 | Herve Caumont | ESGFClient is a command-line tool written in C# that downloads products from the "ESGF Data Search". It takes as input either the location of an RDF file or a list of individual product URLs. |
| 8 | 1 | Herve Caumont | |
| 9 | h2. Installation |
||
| 10 | |||
| 11 | The user can install the ESGFClient following these easy steps: |
||
| 12 | |||
| 13 | <pre><code class="XML"> |
||
| 14 | yum install esgf-tools |
||
| 15 | </code></pre> |
||
| 16 | |||
| 17 | and test the installation with |
||
| 18 | |||
| 19 | <pre><code class="XML"> |
||
| 20 | ESGFClient --help |
||
| 21 | </code></pre> |
||
| 22 | |||
| 23 | 7 | Francesco Barchetta | h2. How it works |
| 24 | |||
| 25 | ESGFClient is designed to perform data *search* and *access* from the Earth System Grid Federation. |
||
| 26 | |||
| 27 | The first task, search, consists of querying the "GEOWOW Terradue ESGF Catalogue":https://support.terradue.com/projects/devel-cloud-sb/wiki/Cas-esgf-cmip5 at http://geowow.terradue.com/catalogue/esgf/rdf . |
||
| 28 | The query string is passed directly to the ESGFClient through the --uri option. |
||
| 29 | |||
| 30 | Example: |
||
| 31 | |||
| 32 | <pre> |
||
| 33 | http://geowow.terradue.com/catalogue/esgf/thetao/rdf?time_frequency=mon&experiment=rcp85&ensemble=r1i1p1&institute=MRI&count=1 |
||
| 34 | </pre> |
||
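A query URL of this shape can be assembled from its filters. The following is a minimal Python sketch, not part of ESGFClient itself: the helper name @build_catalogue_query@ is hypothetical, and the endpoint and parameter values are simply those of the example above.

```python
from urllib.parse import urlencode

# Catalogue endpoint for the thetao variable, copied from the example above.
CATALOGUE_RDF = "http://geowow.terradue.com/catalogue/esgf/thetao/rdf"


def build_catalogue_query(base_url, **filters):
    """Return base_url followed by the search filters as a query string.

    Hypothetical helper for illustration; filters keep the order in which
    they are passed.
    """
    return f"{base_url}?{urlencode(filters)}"


query_url = build_catalogue_query(
    CATALOGUE_RDF,
    time_frequency="mon",
    experiment="rcp85",
    ensemble="r1i1p1",
    institute="MRI",
    count=1,
)
print(query_url)
```

The resulting string is what would be handed to the --uri option.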
| 35 | |||
| 36 | As you can see by opening the link in a web browser, this query returns RDF metadata for the subset of data selected by the given parameters: time_frequency, experiment, ensemble and institute. |
||
| 37 | The ESGFClient parses the response and retrieves the list of OPeNDAP _online resources_. |
||
| 38 | |||
| 39 | Example: |
||
| 40 | |||
| 41 | <pre> |
||
| 42 | http://dapp2p.cccma.ec.gc.ca/thredds/dodsC/cmip5.output1.CCCma.CanESM2.rcp85.mon.ocean.Omon.r1i1p1.thetao.20120407.aggregation |
||
| 43 | </pre> |
||
| 44 | |||
| 45 | Starting from this list, the ESGFClient builds a new list of query strings by adding the rest of the options (such as --dtstart, --zmax etc.). |
||
| 46 | Since OPeNDAP expects array indexes rather than absolute values, the ESGFClient converts the option values to indexes. |
||
| 47 | Finally, the list is injected into a revised version of the wget script provided by the ESGF Portal, and the script is launched. |
||
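The conversion from absolute values to indexes can be illustrated with a short Python sketch. It is an assumption about how such a conversion may be done, not the ESGFClient implementation: the function names, the axis values and the axis sizes are all hypothetical. Only the bracketed @[start:stop]@ hyperslab notation is standard OPeNDAP (DAP2) constraint syntax.

```python
from bisect import bisect_left, bisect_right


def value_range_to_index_range(coords, lower, upper):
    """Map an inclusive value range onto inclusive indexes of a sorted axis.

    coords must be sorted in ascending order. Returns None when no
    coordinate falls inside [lower, upper].
    """
    first = bisect_left(coords, lower)
    last = bisect_right(coords, upper) - 1
    if first > last:
        return None
    return first, last


def hyperslab(variable, *index_ranges):
    """Build a DAP2 array constraint, one [start:stop] pair per dimension."""
    return variable + "".join(f"[{lo}:{hi}]" for lo, hi in index_ranges)


# Hypothetical level axis; a real client would read it from the server.
levels = [5.0, 50.0, 150.0, 300.0, 500.0, 600.0, 700.0, 1000.0]
level_range = value_range_to_index_range(levels, 500, 700)

# Assumed index ranges for the other dimensions (time, lat, lon).
time_range = (0, 35)
lat_range = (0, 159)
lon_range = (0, 319)

constraint = hyperslab("thetao", time_range, level_range, lat_range, lon_range)
print(constraint)  # thetao[0:35][4:6][0:159][0:319]
```

A constraint like this is appended after a @?@ to the OPeNDAP resource URL to request only the selected slice.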
| 48 | |||
| 49 | |||
| 50 | |||
| 51 | 1 | Herve Caumont | h2. Usage |
| 52 | |||
| 53 | The available options are: |
||
| 54 | <pre><code class="XML"> |
||
| 55 | Options: |
||
| 56 | 5 | Herve Caumont | -u, --uri=VALUE RDF's URI to parse and download |
| 57 | -o, --output=VALUE Output folder |
||
| 58 | -r, --resource=VALUE Resource to download |
||
| 59 | -O, --openid=VALUE OpenId used to access. |
||
| 60 | -p, --password=VALUE Password for OpenId used to access. |
||
| 61 | -s, --dtstart=VALUE The beginning of the time query to restrict |
||
| 62 | the subset of data to download. |
||
| 63 | YYYY-MM-DDTHH:mm:ssZ. |
||
| 64 | -e, --dtend=VALUE The end of the time query to restrict the |
||
| 65 | subset of data to download. YYYY-MM-DDTHH:mm:ssZ. |
||
| 66 | --zM, --zmax=VALUE Maximum level (z) |
||
| 67 | --zm, --zmin=VALUE Minimum level (z). By default it's equal to 0 |
||
| 68 | -h, --help Show this message and exit |
||
| 69 | 1 | Herve Caumont | </code></pre> |
| 70 | |||
| 71 | h2. Download from RDF URL |
||
| 72 | |||
| 73 | 6 | Herve Caumont | The client downloads only the OPeNDAP online resources. |
| 74 | 5 | Herve Caumont | It tries to retrieve from the RDF file all the parameters needed for a query: a start time and an end time (for temporal queries), the level range (z dimension, if the variable provides one) and the variable to query. |
| 75 | 1 | Herve Caumont | |
| 76 | Example: |
||
| 77 | <pre><code class="XML"> |
||
| 78 | 5 | Herve Caumont | ESGFClient -u "http://geowow.terradue.com/catalogue/esgf/thetao/rdf?time_frequency=mon&experiment=rcp85&ensemble=r1i1p1&institute=MRI&count=1" -O "https://pcmdi9.llnl.gov/esgf-idp/openid/user" -p password -s 2006-01-17 -e 2008-12-16 --zmin 500 --zmax 700 -o "./tmp/" |
| 79 | 1 | Herve Caumont | </code></pre> |
| 80 | |||
| 81 | 6 | Herve Caumont | To build the correct OPeNDAP queries, the time and level parameters are converted to the set of indexes needed for a spatial and temporal query. |
| 82 | 1 | Herve Caumont | |
| 83 | 5 | Herve Caumont | The download is based on a revised wget script built from a template called *wgetTemplate.sh*, which is filled in with the URLs to download. |
| 84 | 1 | Herve Caumont | |
| 85 | 5 | Herve Caumont | The client then runs the wget script, which saves the files to the file system. |
| 86 | 1 | Herve Caumont | |
| 87 | 5 | Herve Caumont | All these files are protected by OpenID, so the user must supply credentials with the --openid and --password options. The security certificates, by contrast, are downloaded automatically by the wget script. |
| 88 | 1 | Herve Caumont | |
| 89 | 6 | Herve Caumont | After the OPeNDAP files (.dods, .das, .lev, .lat etc.) are downloaded, an internal script converts the .dods Binary Format file to a NetCDF file, according to the .das Dataset Attribute Structure description file. Once the NetCDF file is generated, the script also deletes the downloaded temporary files. |
| 90 | 1 | Herve Caumont | |
| 91 | 6 | Herve Caumont | For more details on the OPeNDAP formats, refer to http://docs.opendap.org/index.php/UserGuideOPeNDAPMessages |
| 92 | 1 | Herve Caumont | |
| 93 | 6 | Herve Caumont | h2. OPeNDAP limits |
| 94 | |||
| 95 | Since this version of the ESGFClient is based exclusively on the OPeNDAP servers, there is a size limit for every download. Within the ESGF federation this limit is set to the default value (500 MBytes); if a query reaches it, the server returns a 403 error and the download fails. You can stay under the limit by choosing narrower temporal and level slices for your sub-datasets. |
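One way to choose the temporal slices is to estimate the size of a single time step and split the time axis accordingly. The Python sketch below is only an illustration under stated assumptions: the function name is hypothetical, values are assumed to be 4-byte floats, the limit is taken as 500 binary megabytes, response overhead is ignored, and the grid sizes in the usage lines are made up.

```python
# 500 MBytes, assumed here to mean binary megabytes.
LIMIT_BYTES = 500 * 1024 * 1024


def split_time_indexes(n_time, n_lev, n_lat, n_lon,
                       bytes_per_value=4, limit=LIMIT_BYTES):
    """Split time indexes 0..n_time-1 into inclusive ranges below the limit.

    The size estimate counts raw values only (assumed 4 bytes each) and
    ignores any protocol overhead in the server response.
    """
    bytes_per_step = n_lev * n_lat * n_lon * bytes_per_value
    if bytes_per_step >= limit:
        raise ValueError("one time step reaches the limit; narrow the level range")
    # Largest number of time steps whose total size stays strictly below the limit.
    steps_per_request = (limit - 1) // bytes_per_step
    return [
        (start, min(start + steps_per_request, n_time) - 1)
        for start in range(0, n_time, steps_per_request)
    ]


# Hypothetical grid: 1140 monthly steps, 50 levels, 160 x 320 points.
chunks = split_time_indexes(n_time=1140, n_lev=50, n_lat=160, n_lon=320)
print(len(chunks), chunks[0], chunks[-1])  # 23 (0, 50) (1122, 1139)
```

Each resulting range could then be translated into its own --dtstart and --dtend pair, so that every request stays below the server limit.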