Prev Up Next Index
Go backward to 1.1 Why Use OPeNDAP to Read Data?
Go up to 1.1 Why Use OPeNDAP to Read Data?
Go forward to 1.1.2 An Example: Using OPeNDAP

1.1.1 An Example: Using ftp

The advent of the WWW has made possible simple data browsers that allow sophisticated interactive sampling of on-line datasets. Using a web browser and ftp , a user can sample any of several large oceanographic datasets available on the Internet. However, there are several problems with these data search engines that may only become apparent when a user actually tries to use the data.

Among the problems that can arise are those that appear when a user tries to use the results of one dataset to search a second dataset. Suppose that a user wishes to choose a sea-surface temperature image from the NOAA/NASA Pathfinder AVHRR archive at:

http://podaac-www.jpl.nasa.gov/mcsst/mcsst_subset.html

using the results of a time-series generated from the COADS Climatology archive at:

http://ferret.wrc.noaa.gov/fbin/climate_server   

The steps are theoretically straightforward:

  1. Create the time series from the COADS Climatology archive. This is done by answering the menu of options on the COADS web page.
  2. Import the time series from step 1 to the user's local data analysis system. Note that this step may itself require several steps:
    1. The data must be down-loaded, using ftp or a similar program.
    2. Once down-loaded, the data may have to be converted into a format that can be read by the data analysis program.
  3. Examine the data and formulate a request to the AVHRR archive. This is again done by answering the menu of option on the AVHRR Web page. Note that the COADS and AVHRR pages are not completely compatible in this respect. For example, the date formats of the two pages are different.
  4. Import the result of step 3 to the user's local data display system. This may also require several steps:
    1. The data must be down-loaded again.
    2. And again, once down-loaded, the data may have to be converted into a format that can be read by the data analysis program. Note that the set of available formats on the COADS page are distinct from the available options from the AVHRR archive.
  5. Think about the results.

Though the procedure is straightforward and the web servers designed to make sampling the datasets a simple task, upon close examination, the combination of the steps may create unforeseen difficulties. For example, a request to the COADS server will return either a spreadsheet suitable for use on a PC, a netCDF format file, or a file in one of a selection of simple ASCII formats. If the user is fortunate, the returned file will already be in a format compatible with the desired analysis package. But not all users will be so fortunate. Often this file must be converted to some other file format before it can be imported to the user's analysis program. This may or may not be a simple task.

Even a file format for which a user is properly equipped may be used in an unfamiliar manner. For example, the independent and dependent variables might be in a different order or an ASCII data file may use tabs instead of spaces.

Assuming the import of the COADS data has been accomplished and boundaries for the AVHRR search identified, the task of selecting from the second archive may begin. Unfortunately, the request to the AVHRR archive will return either a GIF picture, an HDF format file, or a raw (binary) data file. Again, importing this output into the user's analysis program may or may not be simple, but it will not be the same procedure as the one used for the first data request.

Other problems are also apparent. The COADS Climatology sampling program requests the user supply dates (month and day), whereas the AVHRR archive asks for the "Julian day" (an integer between 1 and 365 or 366). One server will accept "S" and "W" to indicate South latitudes and West longitudes, while the other requires that these be indicated with negative coordinate values. The sampling of the COADS dataset, while flexible, may not allow sampling in the manner the user needs. It cannot, for example, provide a section except along a line of constant latitude or longitude. If a user wanted to see a section along a NE-SW line, it would be a challenging and time-consuming task to assemble one from many small data requests.

Further, it might be desirable to use the results of sampling these two databases to construct a time series. This could conceivably mean repeating the entire procedure many times.


Tom Sgouros, August 25, 2004

Prev Up Next