2. The SYMBOLICDATA Perl tools

The design of the SYMBOLICDATA tools has to take several circumstances into account. First, the operations they have to perform differ widely in nature and requirements: they range from the insertion and validation of a single record, through the initiation, control, and evaluation of benchmark computations on selected records, to the transformation of parts of the data base, or all of it, into other representations such as HTML or SQL. Second, the tools have to be as simple and as flexible to use as possible. Third, the tools need to be extensible at different levels.

With these circumstances in mind, the SYMBOLICDATA tools are designed to provide

a programming environment for the independent and rapid development of new components and specialized applications, which on the one hand allows maximum code reuse and a similar look and feel across components, and on the other hand maximum flexibility and component independence;
a well-documented, flexible, and intuitive standard interface program that can initiate and control most of the implemented operations in a standardized and extensible way.

The SYMBOLICDATA Perl tools are the main vehicle for operations on the data base. They are implemented as a hierarchy of Perl modules which we divide into four categories:

Basic modules: They implement primitive operations, like I/O and tag/value access of sd-records.
Action modules: They implement the generic part of actions like validate, insert, compute, transform, etc. to be performed on the data base.
Table modules: They implement those parts of actions that are specific to a given table, e.g., how to validate a bibliography entry.
The symbolicdata program: It provides a standard interface that handles command-line parsing, the initialization of global variables and required modules, and the execution of the actions specified on the command line.

To give the reader a feeling for how these modules cooperate, we describe the main steps executed by the symbolicdata program. Its synopsis is

symbolicdata [-req file] actions [options] [args]

On start-up, symbolicdata loads all the basic modules, parses the command-line arguments up to the mandatory action argument(s), and loads the global action hash, which specifies, in a well-defined format, all known (or "registered") actions and their properties, e.g., the Perl modules required for the action, a description of the action, etc. The action hash can be extended dynamically at run-time using the first (optional) -req file argument, where file is the name of a Perl module that is loaded before the actions are parsed. Next, for each action, the modules listed in the respective action hash entry are loaded.
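The following minimal sketch illustrates how such an action registry could work. It is not the actual SYMBOLICDATA format: the names %ACTIONS, register_action, and load_action_modules, as well as the entry fields modules and description, are assumptions made for illustration, and two core Perl modules stand in for real action modules.

```perl
use strict;
use warnings;

# Hypothetical registry: action name => properties.
# The field names are assumptions, not the SYMBOLICDATA format.
our %ACTIONS = (
    validate => {
        modules     => ['File::Spec'],    # stand-in for a real action module
        description => 'check records for consistency',
    },
);

# A module loaded via -req could call this to register further actions.
sub register_action {
    my ($name, %props) = @_;
    die "action '$name' already registered\n" if exists $ACTIONS{$name};
    $ACTIONS{$name} = { modules => [], description => '', %props };
    return $ACTIONS{$name};
}

# Load every module listed in the entries of the requested actions.
sub load_action_modules {
    my @names = @_;
    my @loaded;
    for my $name (@names) {
        my $entry = $ACTIONS{$name} or die "unknown action '$name'\n";
        for my $module (@{ $entry->{modules} }) {
            (my $file = "$module.pm") =~ s{::}{/}g;   # Foo::Bar -> Foo/Bar.pm
            require $file;
            push @loaded, $module;
        }
    }
    return @loaded;
}

# Extend the registry at run-time, then load what two actions need.
register_action('transform',
    modules     => ['File::Basename'],    # stand-in for a real action module
    description => 'convert records into another representation');
my @loaded = load_action_modules(qw(validate transform));
print "loaded: @loaded\n";
```

Keeping the module list inside each registry entry means that only the modules needed by the requested actions are loaded.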

Then, symbolicdata initializes the global command-line hash, which stores the recognized command-line options, their properties (such as the syntax of the argument and documentation), and their (default) values. Each loaded module, including the basic modules, may add general or action-specific entries to this global command-line hash. This way, the list of recognized command-line options is built up dynamically at run-time; hence it can be extended independently by other modules and is kept as small as possible. Values for command-line options can also be given in so-called init-files, which allow convenient editing and storing of these values.

After the modules are loaded and the command-line hash is set up, all remaining command-line arguments are parsed, and their values are stored in the appropriate slots of the command-line hash.
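A command-line hash of this kind might be sketched as follows. The option names, the fields takes_arg, doc, and value, and the functions add_option and parse_options are hypothetical; the real tools define their own specification format, and init-files are not modelled here.

```perl
use strict;
use warnings;

# Hypothetical option table: option name => { takes_arg, doc, value }.
my %OPTIONS;

# Each loaded module may contribute entries through a call like this.
sub add_option {
    my ($name, %spec) = @_;
    $OPTIONS{$name} = { takes_arg => 0, doc => '', value => undef, %spec };
    return;
}

# One general entry and one action-specific entry (both invented).
add_option('verbose', doc => 'print progress messages', value => 0);
add_option('table', takes_arg => 1, doc => 'table to operate on',
           value => 'SomeTable');

# Consume leading "-name [value]" items, store the values in the table,
# and return the remaining arguments.
sub parse_options {
    my @args = @_;
    while (@args && $args[0] =~ /^-(\w+)$/) {
        my $name = $1;
        my $opt  = $OPTIONS{$name} or die "unknown option -$name\n";
        shift @args;
        if ($opt->{takes_arg}) {
            die "option -$name needs a value\n" unless @args;
            $opt->{value} = shift @args;
        }
        else {
            $opt->{value} = 1;
        }
    }
    return @args;
}

my @rest = parse_options(qw(-verbose -table OtherTable Some_Key));
print "table=$OPTIONS{table}{value} rest=@rest\n";
```

Because the table is filled only by the modules that were actually loaded, an option belonging to an unloaded action is rejected as unknown.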

Finally, symbolicdata calls the specified action(s) in the order in which they are listed on the command line: the first action receives the remaining command-line arguments as input, and each subsequent action receives the output of the preceding action as input, unless, of course, an error occurred.
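This chaining of actions can be pictured with a short sketch. The action names select and count, the table %ACTION_CODE, and the function run_actions are invented for illustration and do not correspond to actual SYMBOLICDATA actions.

```perl
use strict;
use warnings;

# Hypothetical actions: each takes a list as input and returns a list.
my %ACTION_CODE = (
    select => sub { grep { /^A/ } @_ },   # keep the items starting with "A"
    count  => sub { scalar @_ },          # number of items received
);

# Run the actions in order, feeding each one the output of its
# predecessor; the first one receives the remaining arguments.
sub run_actions {
    my ($actions, @input) = @_;
    my @data = @input;
    for my $name (@$actions) {
        my $code = $ACTION_CODE{$name} or die "unknown action '$name'\n";
        # eval traps an error so that the following actions are skipped.
        my $ok = eval { @data = $code->(@data); 1 };
        die "action '$name' failed: $@" unless $ok;
    }
    return @data;
}

my @result = run_actions([qw(select count)], qw(Alpha Beta Avocado));
print "@result\n";    # prints 2
```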

The Perl tools use a hierarchy of hashes as the internal representation of the data base: the entire data base is a hash of Type/table pairs, a table is a hash of Key/record pairs, etc. Furthermore, these hashes are implemented as so-called tied hashes, i.e., the basic hash operations like creation, value access, iteration, and destruction are overloaded. This overloading enables transparent data manipulation on both the internal sd-record hashes and the external (persistent) sd-files. It also enables automatic loading, caching, and storing of sd-records; read-only access to sd-records; automatic or explicit conversion of tag values into strings/lists/hashes, etc.
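To illustrate the tied-hash mechanism, here is a minimal sketch of a table that loads records on first access, caches them, and can be made read-only. The package name LazyTable and its parameters are assumptions; a callback stands in for reading an sd-file, and only the methods needed for the example are implemented.

```perl
use strict;
use warnings;

package LazyTable;    # hypothetical name, not a SYMBOLICDATA module

# TIEHASH receives the known keys and a loader callback that stands in
# for reading a record from an sd-file.
sub TIEHASH {
    my ($class, %args) = @_;
    return bless {
        loader   => $args{loader},
        keys     => [ @{ $args{keys} } ],
        cache    => {},
        readonly => $args{readonly} ? 1 : 0,
        loads    => 0,
        pos      => 0,
    }, $class;
}

# FETCH loads a record on first access and serves it from the cache later.
sub FETCH {
    my ($self, $key) = @_;
    unless (exists $self->{cache}{$key}) {
        return undef unless grep { $_ eq $key } @{ $self->{keys} };
        $self->{cache}{$key} = $self->{loader}->($key);
        $self->{loads}++;
    }
    return $self->{cache}{$key};
}

# STORE refuses to write when the table was tied read-only.
sub STORE {
    my ($self, $key, $value) = @_;
    die "table is read-only\n" if $self->{readonly};
    push @{ $self->{keys} }, $key
        unless grep { $_ eq $key } @{ $self->{keys} };
    $self->{cache}{$key} = $value;
    return;
}

sub EXISTS {
    my ($self, $key) = @_;
    return scalar grep { $_ eq $key } @{ $self->{keys} };
}

# FIRSTKEY and NEXTKEY drive iteration with keys() and each().
sub FIRSTKEY { my ($self) = @_; $self->{pos} = 0; return $self->{keys}[0] }
sub NEXTKEY  { my ($self) = @_; return $self->{keys}[ ++$self->{pos} ] }

package main;

tie my %table, 'LazyTable',
    keys     => [qw(rec1 rec2)],
    loader   => sub { return { Key => $_[0] } },
    readonly => 1;

my $first = $table{rec1}{Key};        # triggers the loader
my $again = $table{rec1}{Key};        # served from the cache
my $loads = tied(%table)->{loads};    # number of loader calls so far
my @keys  = keys %table;
print "first=$first loads=$loads keys=@keys\n";
```

Code that uses %table sees an ordinary hash; loading, caching, and write protection happen inside the tie class.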

To increase the usability of the implemented tools, it is necessary to provide adequate and up-to-date documentation of their various features. In our experience, this is best achieved by keeping the documentation and the source code closely together. Therefore, each module, action, and command-line option specification also has to provide well-defined hashes or hash entries that describe and illustrate the provided feature(s). This way, extensive documentation in various formats, e.g., a short ASCII description of relevant command-line options, or a detailed HTML table of all actions and their respective command-line options together with relevant examples, can be generated directly from the source code.
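As a rough sketch of this idea, a short ASCII description can be produced from the same hash that specifies the options. The table %OPTION_DOC, its fields arg and doc, and the function usage_text are hypothetical and only illustrate the principle.

```perl
use strict;
use warnings;

# Hypothetical option specifications that carry their own documentation.
my %OPTION_DOC = (
    table   => { arg => 'name', doc => 'table to operate on' },
    verbose => { arg => '',     doc => 'print progress messages' },
);

# Build a short ASCII description, one line per option, sorted by name.
sub usage_text {
    my ($options) = @_;
    my @lines;
    for my $name (sort keys %$options) {
        my $spec = $options->{$name};
        my $flag = $spec->{arg} ? "-$name <$spec->{arg}>" : "-$name";
        push @lines, sprintf('  %-16s %s', $flag, $spec->{doc});
    }
    return join("\n", @lines) . "\n";
}

print usage_text(\%OPTION_DOC);
```

An HTML table could be generated from the same data by replacing only the formatting step.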
