/* Copyright 2009, UCAR/Unidata and OPeNDAP, Inc. See the COPYRIGHT file for more information. */
This code was produced by pulling it out of the DRNO code with which it had become entangled (bad design choice). As such, there are a number of netCDF style design elements still within the oc code.
nonterm: nonterm1 nonterm2 nonterm3 ;
The corresponding action calls an external procedure named for the left-hand side and taking the values of the right-side non-terminals as arguments.
{$$=nonterm(parsestate,$1,$2,$3);}
Note that this form of parsing action was requested by John Caron so that the same .y file could be used for both C and Java parsers. In line with this, all non-terminals are defined to return a type of "Object", which is "void*" for C parsers and "Object" for Java parsers. The cost is a lot of casting in the action procedures.
Note the extra "parsestate" argument. The parsers are constructed as reentrant and this argument contains the per-parser state information.
The bodies of the action procedures are defined in a separate file called "dapparselex.c". That file also contains the lexer required by the parser. Note that lex was not used because the lexer is so simple.
One of the issues that must be addressed by any bottom-up parser is handling the accumulation of sets of items (nodes, etc.).
The canonical way that this is handled in the oc parsers is to use the following form of production.
1 declarations:
2     /* empty */ {$$=declarations(parsestate,NULL,NULL);}
3     | declarations declaration {$$=declarations(parsestate,$1,$2);}
4     ;
The base case (line 2) action is called with NULL arguments to indicate the base case. The recursive case (line 3) is called with the values of the two right-side non-terminals.
The corresponding action code is defined as follows.
1 Object
2 declarations(DAPparsestate* state, Object decls, Object decl)
3 {
4     Oclist* alist = (Oclist*)decls;
5     if(alist == NULL) alist = oclistnew();
6     else oclistpush(alist,(ocelem)decl);
7     return alist;
8 }
The base case is handled in line 5. It creates and returns a Sequence instance; a Sequence is a dynamically extendible array of arbitrary items (see below). The recursive case is in line 6, where it is assumed that the Sequence argument is defined and there is a decl object that should be inserted into the sequence.
This pattern, in various forms, is ubiquitous in the parsers.
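The accumulation pattern can be tried outside the parser with a minimal sketch. The List type and the listnew, listpush, and accumulate names below are hypothetical stand-ins invented for illustration; they are not the real Oclist API, and the parsestate argument is omitted for brevity.

```c
#include <assert.h>
#include <stdlib.h>

typedef void* Object;

/* Hypothetical stand-in for the oc list type (NOT the real Oclist). */
typedef struct List {
    size_t length;   /* number of elements in use */
    size_t alloc;    /* number of allocated slots */
    void** content;  /* element storage */
} List;

/* Create an empty list. */
List* listnew(void)
{
    List* l = (List*)calloc(1, sizeof(List));
    assert(l != NULL);
    return l;
}

/* Append one element, growing the storage as needed. */
void listpush(List* l, void* elem)
{
    if(l->length == l->alloc) {
        size_t newalloc = (l->alloc == 0) ? 4 : 2 * l->alloc;
        void** newcontent = (void**)realloc(l->content, newalloc * sizeof(void*));
        assert(newcontent != NULL);
        l->content = newcontent;
        l->alloc = newalloc;
    }
    l->content[l->length++] = elem;
}

/* Same shape as the declarations() action procedure. */
Object accumulate(Object items, Object item)
{
    List* alist = (List*)items;
    if(alist == NULL) alist = listnew();   /* base case: empty production */
    else listpush(alist, item);            /* recursive case: append item */
    return alist;
}
```

Calling accumulate(NULL,NULL) once and then accumulate(list,item) for each item mirrors the sequence of reductions a bottom-up parser performs for a left-recursive rule, so items end up in source order.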
OCtype octype | - | Defines the kind of node. |
OCtype etype | - | Used for attribute nodes and primitive nodes to define the primitive type. |
char* name | - | From the DDS. |
char* fullname | - | Fully qualified name such as a.b.c. |
int active | - | True if this node participates in the datadds data packet; currently not used. |
OCnode* container | - | Parent node of this node. |
Diminfo dim | - | Extra information about dimension nodes. |
Arrayinfo array | - | Extra information about nodes that have rank > 0. |
Attinfo att | - | Extra information about attribute nodes. |
Sequence* subnodes | - | (Sequence |
Sequence* attributes | - | (Sequence |
void* public | - | Place for users to attach arbitrary info to the node instances. |
This particular structure is relatively similar to that of the Ocapi node, but with all the extra data storage information elided.
The connection is used for a variety of purposes and is as a rule the first argument of any of the API procedures. The basic connection API is as follows.
Operation | Arguments | Output(s) | Semantics |
---|---|---|---|
oc_open | N.A. | Connection | Return a reference to a new Connection. |
oc_close | 1. Connection | Errno | Close a connection and reclaim any associated resources. |
oc_fetchdds | 1. Connection 2. char* url | Errno | Fetch a DDS from the DAP server. The specific DDS is determined by the url, which should itself not end in ".dds". The returned DDS is parsed and the rootnode of the parse is stored in the Connection state. |
oc_fetchdas | 1. Connection 2. char* url | Errno | Fetch a DAS from the DAP server. The specific DAS is determined by the url, which should itself not end in ".das". The returned DAS is parsed and the rootnode of the parse is stored in the Connection state. |
oc_fetchdatadds | 1. Connection 2. char* url | Errno | Fetch a DATADDS from the DAP server. The specific DATADDS is determined by the url, which should itself not end in ".dods". It may include constraints, however. The returned DDS is parsed and the rootnode of the parse is stored in the Connection state. The associated data, referred to here as the "data packet", is also captured and stored in a temporary file with a random name. For security reasons, the file must not already exist, and only the creator has read/write permission to the file. |
oc_getdds | 1. Connection | Nodeid | Return the root node of the DDS. If oc_fetchdds has not been called or if the DDS is malformed, then the root will not exist and the value NULL will be returned. |
oc_getdas | 1. Connection | Nodeid | Return the root node of the DAS. If oc_fetchdas has not been called or if the DAS is malformed, then the root will not exist and NULL will be returned. |
oc_getdatadds | 1. Connection | Nodeid | Return the root node of the DATADDS. If oc_fetchdatadds has not been called or if the DATADDS is malformed, then the root will not exist and NULL will be returned. |
Structure {...} S[2][3] represents the 2 X 3 = 6 instances of the structure.
Sequence {...} S represents all the records for a given Sequence.
The mapping between nodes and contents is one-to-many. That is, there often will be multiple occurrences of a given node type in a DATADDS response. Consider the following example.
Dataset {
    Structure {
        int16 f11[2];
        float32 f12;
    } S1;
    Structure {
        int16 f21;
        float32 f22[2];
    } S2[3];
} D1;
If we have a data response with this DDS, then the following instances will exist.
Class | Count | Instances |
---|---|---|
D1 | 1 | D1 |
S1 | 1 | D1.S1 |
f11 | 2 | D1.S1.f11[0] D1.S1.f11[1] |
f12 | 1 | D1.S1.f12 |
S2 | 3 | D1.S2[0] D1.S2[1] D1.S2[2] |
f21 | 3 | D1.S2[0].f21 D1.S2[1].f21 D1.S2[2].f21 |
f22 | 6 | D1.S2[0].f22[0] D1.S2[0].f22[1] D1.S2[1].f22[0] D1.S2[1].f22[1] D1.S2[2].f22[0] D1.S2[2].f22[1] |
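The counts in this table follow a simple rule for fixed-size arrays: the number of instances of a node is the product of its own element count and the element counts of all of its containers. The sketch below illustrates that rule with a hypothetical, simplified Node structure and an instancecount function invented for this example; neither is part of oc. It does not cover Sequences, whose record counts are not known from the DDS alone.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified node: only what is needed for counting. */
typedef struct Node {
    const char* name;
    size_t nelems;                 /* product of this node's own dimensions; 1 if scalar */
    const struct Node* container;  /* parent node, or NULL for the dataset */
} Node;

/* Number of instances = product of nelems along the path to the root. */
size_t instancecount(const Node* node)
{
    size_t count = 1;
    for(; node != NULL; node = node->container)
        count *= node->nelems;
    return count;
}
```

For the example DDS, f22 has 2 elements and sits inside S2, which has 3, giving 2 X 3 = 6 instances, which matches the last row of the table.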
The basic API is as follows.
Operation | Arguments | Output(s) | Semantics |
---|---|---|---|
oc_newcontent | N.A. | Content | Return a reference to an empty Content object. |
oc_freecontent | 1. Connection 2. Content | Errno | Destroy a reference to a Content object and release any associated resources. |
oc_clonecontent | 1. Connection 2. Content | Errno | Create a new Content object with the same values as the input Content object. |
oc_rootcontent | 1. Connection 2. Content | Errno | Given a Content object, modify the content object to refer to the whole dataset; this corresponds to all of the data that was returned in response to a DATADDS request. |
oc_dimcontent | 1. Connection 2. Content 3. Content 4. size_t i | Errno | Given a reference to an existing content (arg 2) that is in Dimmode, modify the given content object (arg 3) to refer to the ith instance of the dimension. See the section on handling multi-dimensional arrays to see how a multi-dimensional object is reduced to a single integer index. |
oc_recordcontent | 1. Connection 2. Content 3. Content 4. size_t i | Errno | Given a reference to an existing content (arg 2) that is in Recordmode, modify the given content object (arg 3) to refer to the ith record of the sequence instance. |
oc_fieldcontent | 1. Connection 2. Content 3. Content 4. size_t i | Errno | Given a reference to an existing (parent) content (arg 2) that is in Fieldmode, modify the given content object (arg 3) to refer to the ith field of the parent content. |
oc_getcontent | 1. Connection 2. Content 3. void* memory 4. size_t memsize 5. size_t start 6. size_t count | Errno | Given a reference to an existing, defined, dimensioned content object, extract some subset of the data and store it in the space defined by the memory argument. It is assumed that the current content reference is in Dimmode, which means that it was reached using oc_fieldcontent(). The subset of count items beginning at start is extracted into the memory argument. This routine will also work for scalars, but the count must be one and the start must be zero. |
oc_recordcount | 1. Connection 2. Content | size_t | Given a reference to an existing content (arg 2) that is in Recordmode, return the number of records associated with this content. Note that this can be an expensive operation because some part of the data must be processed to count the number of records. |
oc_dimcount | 1. Connection 2. Content | size_t | Given a reference to an existing content (arg 2) that is in Dimmode, return the number of actual elements in the xdr packet. Because of projections, this may differ from the count determined by combining multi-dimensional arrays. |
oc_fieldcount | 1. Connection 2. Content | size_t | Given a reference to an existing content (arg 2) that is in Fieldmode, return the number of fields. |
Over time, the list of API procedures is likely to grow, so the above may be somewhat out-of-date. The file "occontent.h" should contain the definitive set of procedures.
Of course, it is possible to define a number of useful procedures on top of these basic operations. For example, it might be useful to define a variant of oc_dimcontent that takes multiple dimension indices and returns the associated content. In effect, it would perform the multi-dimensional conversion algorithm for the user.
One other note about Content objects. The reason that there are explicit create and destroy operations is to allow/force the user to control the number of created Content objects and to reuse previously created Content objects. If the API created a new object for every call to, say, oc_dimcontent, then there would be an explosion of Content objects equal to the size of the dimension. There would be no way to reclaim them either because it is impossible to know which are still actively in use.
In order to experiment with this issue, the API
extern int oc_compile(struct OCconnection*);
has been added. It does a one-time conversion of the xdr data to an in-memory structure. The OCcontent API operations (oc_dimcontent, etc.) will use the memory version if it is available.
One good thing about Ocapi was it provided a mechanism for returning detailed error information strings. In order to keep something like that, oc has a log mechanism (oclog.[ch]). It can be used to dump extra error or warning info and it can be used to dump debug info (see the DEBUG macros in ocdebug.h).
Operation | Arguments | Output(s) | Semantics |
---|---|---|---|
dapurlparse | 1. const char* url 2. DAPURL* | int error | Parses an oc url string into its component parts. It returns 0 if it fails, 1 otherwise. The component parts are as follows. |
dapurlclear | 1. DAPURL* | void | Reclaim all the allocated space in a dapurl. The DAPURL instance itself is NOT reclaimed. |
dapurllookup | 1. DAPURL* 2. const char* name | const char* | Search the client parameters for name. If not found return NULL, otherwise return the associated value. |
dapurlreplace | 1. DAPURL* 2. const char* name 3. const char* value | const char* | Replace, insert, or delete a specified client parameter. Return 0 if the parameter does not already exist, and return 1 otherwise. If the value is NULL, then delete the parameter; otherwise, if the name is found, then replace the value, else insert the new name/value pair. |
The canonical code for non-destructive walking of a Sequence is as follows.
for(i=0;i<sqLength(seq);i++) {
    T* element = (T*)sqGet(seq,i);
    ...
}
Bytebuffer provides two ways to access its internal buffer of characters.
One is "bbContents()", which returns a direct pointer to the buffer,
and the other is "bbDup()", which returns a malloc'd, null-terminated
string containing the contents.
Multi-Dimensional Array Handling
Within a data packet, the DAP protocol "linearizes" multi-dimensional
arrays into a single dimension. The rule for converting a multi-dimensional
array to a single dimension is as follows.
Suppose we have the DDS field
Int F[2][5][3];
There are obviously a total of 2 X 5 X 3 = 30 integers in F.
Thus, these three dimensions will be reduced to a single dimension of size 30.
A particular point in the three dimensions, say [x][y][z], is reduced to
a number in the range 0..29 by computing
((x*5)+y)*3+z
The corresponding general C code is as follows.
size_t
dimmap(int rank, size_t* indices, size_t* sizes)
{
    int i;
    size_t count = 0;
    /* Fold in one dimension at a time, most significant first. */
    for(i=0;i<rank;i++) {
        count = (count * sizes[i]) + indices[i];
    }
    return count;
}
In this code, the indices variable corresponds to x, y, and z.
The sizes variable corresponds to 2, 5, and 3.
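It can also be useful to go the other way, from a linear index back to the per-dimension indices, for example when checking a computed index by hand. The following is a minimal sketch of that inverse under the same row-major rule as ((x*5)+y)*3+z. The name dimunmap and its signature are hypothetical; this helper is not part of oc.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not part of oc): recover the per-dimension
   indices from a linearized index. The sizes array has the same
   meaning as in dimmap; indices receives the result. */
void
dimunmap(int rank, size_t index, const size_t* sizes, size_t* indices)
{
    int i;
    assert(rank > 0);
    /* Peel off dimensions from the last (fastest varying) to the first. */
    for(i=rank-1;i>=0;i--) {
        indices[i] = index % sizes[i];
        index /= sizes[i];
    }
}
```

For F[2][5][3], the linear index 17 maps back to [1][0][2], since ((1*5)+0)*3+2 = 17.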
Change Log