WebMumps Interpreter User's Guide
Kevin C. O'Kane, Ph.D.
Computer Science Department
University of Northern Iowa
Cedar Falls, IA 50614
okane@cs.uni.edu
http://www.cs.uni.edu/~okane
November 4, 1996

This document describes the implementation of a C-Based WebMumps subset interpreter for use with Windows 95, Windows NT, Linux, OS/2 and Sun Solaris. The material herein is preliminary in nature and subject to change. Use of the software described here is at the user's risk. Please send error reports to the e-mail address above.

Copyright (C) 1996 by Kevin C. O'Kane, all rights reserved.


Introduction

The purpose of this document is to present an overview of the C-Based WebMumps implementation of the MUMPS language. Documentation concerning the MUMPS (Massachusetts General Hospital Utility Multi-Programming System) language is available from the MUMPS Users' Group:

http://www.mcenter.com

Usage of the C-Based WebMumps interpreter is entirely at the user's risk. The C-Based WebMumps software package described herein is not warranted in any manner whatsoever. The licensor disclaims responsibility for any and all damages, either direct or consequential, which may arise through use of the interpreter. The Linux and Solaris versions were compiled with the GNU C++ compiler. The other versions were compiled with the Watcom 10.6 C++ compiler.

Installing the System

OS/2, Win 95 and Win NT

If you are going to use WebMumps in standalone mode (i.e., without a web server), copy the executable mumps.exe to a directory that is in your execution path, for example, \windows. Now create a directory named \globals. If you do not create this directory, the system will hang.

Linux, Solaris

The global array files are created in /tmp.

All Systems

The global array files created are called global.dat and data.dat. To recreate the file system, delete these files. WebMumps will recreate them.

If you are using WebMumps in conjunction with a web server, you need to copy the executable module to the directory from which the server executes cgi-bin programs. In some systems, you may need to rename the module to something like mumps.cgi.

In each directory from which you expect to execute WebMumps, install a program named init.mps, which WebMumps will execute when you invoke it without specifying a particular routine to execute. If you do not have an init.mps file and you invoke WebMumps without giving the name of a file to execute, you will get a warning message, but WebMumps will otherwise start correctly in interactive mode.

In standalone mode, the init.mps file will execute and control will be returned to your terminal when it has finished. The interpreter is then ready to accept direct mode input, as denoted by the ">" prompt. You may exit with the "ze" command. In interactive mode, any C-Based WebMumps command may be entered. Interactive mode input lines may contain multiple commands.

The distributed system was compiled with the Watcom 10.6 C compiler. Options were set for Pentium execution.

Global Arrays

The global arrays are automatically initialized when you start the interpreter for the first time or when you use the interpreter for the first time after you have deleted the global array file system. The global array file system is normally stored in the directory /globals or /tmp in two files named global.dat and data.dat.

Global array references may contain either string or numeric subscripts. Only printable ASCII characters are permitted as values for global array subscripts. No subscript should consist of a negative number. The global array files are /globals/global.dat and /globals/data.dat and they will grow in size as elements are added. These files may be copied as backups.

Use with Web Server

To use the WebMumps interpreter with a web server, install the code in the server's cgi-bin (or other appropriate directory). From HTML documents, you will reference the interpreter with lines of the form:

<A HREF="/cgi-bin/mumps.exe?prog=^pgmname.mps&var1=11111&var2=123"> test </A>

Here, the name of the interpreter is taken to be mumps.exe but in some systems it may need to be different. The name of the WebMumps program to execute is given in the "prog=" field and the starting values of variables are given next. Note the "^" character. Upon initialization, the variables appearing in the HREF (var1 and var2 in the above) will exist in the WebMumps symbol table and have the values provided by the browser. These can be obtained, for example, from FORMS input.
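
As a rough illustration of what the interpreter receives, the query string in the HREF above can be decomposed as follows. This is a Python sketch, not WebMumps code. The function name initial_symbols is hypothetical, and how WebMumps itself decodes the request is not specified in this guide.

```python
from urllib.parse import parse_qsl

def initial_symbols(query):
    """Split a CGI query string into (program name, starting variables)."""
    # Keep fields with empty values, since a form may submit blank inputs.
    fields = dict(parse_qsl(query, keep_blank_values=True))
    # The "prog=" field names the routine; the remaining fields are the
    # variables that would appear in the symbol table (hypothetical model).
    program = fields.pop("prog", None)
    return program, fields

program, symbols = initial_symbols("prog=^pgmname.mps&var1=11111&var2=123")
print(program)   # ^pgmname.mps
print(symbols)   # {'var1': '11111', 'var2': '123'}
```

Every value arrives as a string, which matches the single data type described later in this guide.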

Be certain to Halt or ZE at the end of a program in order to terminate the WebMumps interpreter. If you do not correctly terminate the interpreter, the web server will hang. The interpreter will ensure that only one copy is running at any given time, as it opens the global array files for exclusive access. Thus, WebMumps programs should be short, transaction-oriented jobs that do not delay the system. This, however, is the typical case for web server environments. Note: since only one copy of the interpreter is ever running, all other file accesses are also exclusive.

Uncontrolled Termination

If you halt the interpreter with a control-C or other external Kill command, the global array data base may be corrupted. Backup copies of critical data should be maintained. A dump function exists that copies the global arrays to an ASCII text file which can be used to reload the data base.

Added Program Mode Commands (Z-Commands)

In addition to the regular C-Based WebMumps commands which may be used in program mode, the interpreter supports the following additional commands: ZE, ZP, and Z.... These commands may be executed from program or direct mode. Normally, they may not be post-conditionalized.

Z Functions

Program Formats

C-Based WebMumps programs may be created with any standard Unix system editor. They are ordinary ASCII files with the following conventions:

Implementation Notes


Language Specification

Introduction

The purpose of this section is to provide you with an introduction to the MUMPS language in general and the C-Based WebMumps interpreter in particular. The MUMPS language originated in the mid-1960s at the Massachusetts General Hospital. The acronym stands for "Massachusetts General Hospital Utility Multi-Programming System". It is a language which is similar in some respects to BASIC but it contains many additional features not found in BASIC or, for that matter, in most other languages. WebMumps is an interpretive language. In fact, parts of the language specification require that it can never fully become a "compiled" language such as FORTRAN, COBOL or PL/I. As an interpreter, WebMumps has many advantages over traditional "compile, load and go" languages. Among the features which make it attractive for both bio-medical and general scientific applications are:

Hierarchical data base facility. WebMumps data sets are not only organized along traditional sequential and direct access methods, but also as hierarchical trees whose data nodes are addressed as path descriptions in a manner which is easy for a programmer to master in a relatively short time.

Flexible and powerful string manipulation facilities. WebMumps built-in string manipulation operators and functions give programmers efficient means to carry out complex string manipulation and pattern matching operations.

Transportability to widely different systems. MUMPS presently runs under a large number of operating systems on many machine architectures. These systems range in size from small home micro-computers to the largest central time sharing systems. Through the efforts of the MUMPS Development Committee over the years, a well organized language definition has been written and formally published. This standard provides for a far tighter specification for system performance and linguistic definition than is normally the case. As a result, programs written under a MUMPS system can be moved with relatively little effort from one system to another.

Full numeric data handling facilities. WebMumps provides, in addition to string handling facilities, a full range of fixed and floating point computational facilities. Unlike the case with many interpreters, the C-Based WebMumps system and several other MUMPS systems permit the user to link special functions and subroutines directly to the main interpreter. This permits execution of high speed compiled code subroutines while retaining the many advantages of an interpretive system's control and debugging aids.

Data Types and Values

Basically, WebMumps has only one data type: string, although it does allow integer and floating point computations as well as logical expressions. A string variable is restricted to 255 characters or fewer in length (20 characters or fewer if it is being used as a number). Note that this is a restriction of this implementation of WebMumps; see above for a detailed list of such restrictions. The values in a string may be any ASCII code from 1 to 128 (decimal) inclusive, with the exception that an ASCII 1 may not be used in a string index to a global array. WebMumps does not permit usage of the ASCII zero (NUL) character. Ordinarily, strings will contain ASCII printable characters. String constants are enclosed in double quote marks ("). A double quote mark can be included in the usual manner with two adjacent quote marks (""). A constant containing only numerics (and, optionally, plus, minus and decimal point) need not be enclosed by quotes (although doing so has no effect). The following are examples of valid C-based WebMumps character string constants:

"THE SEAS DIVIDE AND MANY A TIDE"
"123.45"
"BRIDGET O'SHAUNESSEY? YOU'RE NOT MAKING THAT UP?"
"""THE TIME HAS COME,"" THE WALRUS SAID"

When a string is being used as a number (e.g., in addition), the numeric portion must be 20 characters or less in length. Numeric constants are restricted to integer or decimal values (positive or negative). "E-type" notation is not permitted. If a string begins with a number but ends with non-numeric characters, only the numeric leading portion will participate in operations requiring numeric operands (e.g., add, subtract, etc.); the trailing non-numeric portion is lost. On the other hand, if a string begins with non-numeric characters, its value will be interpreted as 0. The following are examples:

1+2 will be evaluated as 3.
"ABC"+2 will be evaluated as 2.
"1AB"+2 will be evaluated as 3.
"AA1"+2 will be evaluated as 2.
"1"+"2" will be evaluated as 3.

Although "string" is the basic data type, C-based WebMumps converts strings internally to floating point values for calculations. Consequently, numbers are of approximately 7 digit precision. A number may range in magnitude from 10**-19 to 10**19.
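
To see what roughly 7 digits of precision means in practice, the following Python sketch rounds values to IEEE single precision. That the interpreter stores numbers in this particular format is an assumption made for illustration only; this guide states the precision, not the representation. The helper name to_single is hypothetical.

```python
import struct

def to_single(x):
    """Round a Python float to the nearest IEEE single-precision value.

    Assumption: single precision stands in for the interpreter's internal
    format, which this guide does not actually specify.
    """
    return struct.unpack("f", struct.pack("f", x))[0]

# Nine significant digits do not survive; only about seven are kept.
print(to_single(123456789.0))   # 123456792.0
# This seven-digit integer is represented exactly.
print(to_single(1234567.0))     # 1234567.0
```

Programs that compare or accumulate long numeric values should allow for this loss of low-order digits.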

Logical values in C-based WebMumps are special cases of the numerics. A numeric value of zero is interpreted as false while a non-zero value is interpreted as true. Logical operators yield either zero or one and their results can be treated like any other numeric. Similarly, the numeric result of any numeric operator can be used as a logical operand. The results of string operators are interpreted either as zero (leading characters non-numeric) or some value (leading characters numeric). Strings and the results of string operations can therefore participate as the operands of logical operators.
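
The numeric and logical interpretation of strings described in this section can be sketched in Python. This is an illustrative model, not the interpreter's actual code. The helper names numeric_value and truth_value are hypothetical, and the sketch ignores the 20-character limit and the limited precision mentioned in this guide.

```python
import re

# Optional sign, then digits with an optional decimal part.
# "E-type" notation is deliberately not recognized.
_LEADING_NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)")

def numeric_value(s):
    """Numeric interpretation of a string: its leading numeric part, else 0."""
    match = _LEADING_NUMBER.match(str(s))
    return float(match.group(0)) if match else 0.0

def truth_value(s):
    """Logical interpretation: zero is false, any non-zero value is true."""
    return numeric_value(s) != 0

print(numeric_value("1AB") + 2)        # 3.0
print(numeric_value("AA1") + 2)        # 2.0
print(truth_value("123 ELM STREET"))   # True
print(truth_value("ABC"))              # False
```

A string with no leading number is therefore both numerically zero and logically false.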

Variables

Variables are named in C-based WebMumps in much the same manner as they are named in other languages. A C-based WebMumps variable name must begin with a letter (A through Z) or percent sign (%) and may be followed by either letters or numbers. In general, variable names should be nine or fewer characters in length (the maximum variable name length is 255 characters). Unlike in some languages, C-based WebMumps variables are not automatically data typed by their first letter. C-based WebMumps has, in effect, only one data type, so a variable of any name may hold any value. All C-based WebMumps variables are varying length strings (length may range from 0 to 255 characters).

In C-based WebMumps there are no data declaration statements. Variables come into existence through assignment statements (SET) or the "READ" command and pass from existence through the "KILL" command. In C-based WebMumps there are two kinds of arrays: internal arrays and global arrays. The following pertains to internal arrays: arrays are not dimensioned. A name used as an array variable may also, at the same time, be used as a scalar. Array values are created by assignment or appearance in a "READ" statement. If you create an element of an array, let us say element 10, it does not mean that C-based WebMumps has created any other elements: that is, it does not imply that there exist elements 1 through 9. You may specifically create these or not. Array indices may be positive or negative numbers or character strings. Arrays in C-based WebMumps may have multiple dimensions. The following are some examples of arrays:

SET A(1,2,3)="ARRAY"
READ TEST(22)
WRITE TEST(22)
SET I=10 SET A(I)=10
SET A("TEST")=100
SET I="TESTING" SET A(I)=1001
SET A("MUMPS","USERS'","GROUP")="MUG"
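
The sparse behavior of internal arrays described above can be modeled in Python with a dictionary keyed by subscript tuples. This is only a sketch of the semantics, not of how the interpreter stores arrays. The class name LocalArray and its methods are hypothetical.

```python
class LocalArray:
    """A sparse, string-keyed model of one internal array name."""

    def __init__(self):
        self._nodes = {}

    @staticmethod
    def _key(subscripts):
        # Assumed here: subscripts are compared as strings, so 10 and "10"
        # name the same element.
        return tuple(str(s) for s in subscripts)

    def set(self, value, *subscripts):
        self._nodes[self._key(subscripts)] = str(value)

    def get(self, *subscripts):
        return self._nodes[self._key(subscripts)]

    def defined(self, *subscripts):
        return self._key(subscripts) in self._nodes

a = LocalArray()
a.set(10, 10)               # SET A(10)=10
a.set(100, "TEST")          # SET A("TEST")=100
a.set("ARRAY", 1, 2, 3)     # SET A(1,2,3)="ARRAY"
a.set("SCALAR")             # the same name used as a scalar
print(a.defined(10))        # True
print(a.defined(9))         # False
print(a.get(1, 2, 3))       # ARRAY
```

Creating element 10 leaves elements 1 through 9 undefined, and the unsubscripted name coexists with the subscripted elements.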

Global arrays are unique to C-based WebMumps. As a programmer, you will work with them as though they were arrays. The system, however, interprets them as tree path descriptions for the system's external data files. A global array is distinguished by beginning with the circumflex character (^). The remainder of the specification is the same as for an internal array. Global arrays are not dimensioned and they may appear anywhere an ordinary variable may appear (except in certain forms of the "KILL" command). A typical global array specification consists of the array name followed by some number of indices (indices may be constants, variables [including internal or global arrays] or expressions of string, numeric or mixed type). For example:

SET ^A(1,43,5,99)="TEST"
SET ^SHIP("1ST FLEET","BOSTON","FLAG")="CONSTITUTION"
SET ^CAPTAIN(^SHIP("1ST FLEET","BOSTON","FLAG"))="JONES"
SET ^HOME(^CAPTAIN(^SHIP("1ST FLEET","BOSTON","FLAG")))=
... "PORTSMOUTH"
WRITE ^SHIP("1ST FLEET","BOSTON","FLAG")
... CONSTITUTION
WRITE ^CAPTAIN("CONSTITUTION")
... JONES
WRITE ^HOME("JONES")
... PORTSMOUTH
WRITE ^HOME(^CAPTAIN("CONSTITUTION"))
... PORTSMOUTH

The system files are viewed as trees. Each global array name ("A", "SHIP", "CAPTAIN", and "HOME" in the above) is the root of a tree. The indices are thought of as path descriptions to leaves. For example, out of the root "A" there may be many branches; the first example above specifies taking the branch labeled "1" (note: this does not mean the "first" branch out of the node - it means the branch with label "1"). At the second level the specification says to take the branch labeled "43" (note: this does not imply that branches 1 through 42 necessarily exist). The path description is followed (or, possibly, created if the global array specification appears on the left hand side of an assignment statement or in a "READ" statement) to a final node. The value at the node is either retrieved or a new value is stored, depending upon the context in which the global array specification was used. The indices of global arrays may be numeric or character strings. The second sequence of examples above illustrates this usage.

Both numeric and character string indices may be mixed in the same path description.

A value may be stored at any position in the tree. For example:

SET ^A(1,43,5)=22
SET ^A(1,43)="TEST MIDDLE LEVEL"
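
The tree view of global arrays, including values stored at interior nodes, can be sketched in Python as follows. This models only the path-following behavior described above; it says nothing about the layout of global.dat and data.dat. The names Node and GlobalStore are hypothetical, and returning None for a missing node is a choice made for this sketch, not documented interpreter behavior.

```python
class Node:
    def __init__(self):
        self.value = None      # None means no value is stored at this node
        self.children = {}     # branch label -> Node

class GlobalStore:
    """Each global array name is the root of its own tree."""

    def __init__(self):
        self.roots = {}

    def set(self, value, name, *path):
        # Follow the path from the root, creating branches as needed.
        node = self.roots.setdefault(name, Node())
        for label in path:
            node = node.children.setdefault(str(label), Node())
        node.value = str(value)

    def get(self, name, *path):
        # Follow the path without creating anything.
        node = self.roots.get(name)
        for label in path:
            if node is None:
                break
            node = node.children.get(str(label))
        return None if node is None else node.value

g = GlobalStore()
g.set(22, "A", 1, 43, 5)                  # SET ^A(1,43,5)=22
g.set("TEST MIDDLE LEVEL", "A", 1, 43)    # SET ^A(1,43)="TEST MIDDLE LEVEL"
print(g.get("A", 1, 43, 5))   # 22
print(g.get("A", 1, 43))      # TEST MIDDLE LEVEL
print(g.get("A", 1))          # None
print(g.get("A", 1, 42))      # None
```

The branch labeled "1" exists only as a step on the path, so it has no value of its own, and no branch "42" was ever created.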

Operators

Arithmetic unary operators: + -

The arithmetic unary operators are: + and -. The plus operator (+) has no effect other than to force the expression to its right to be interpreted as numeric. The minus operator forces numeric interpretation and negates the result. For example:

SET I="123 ELM STREET"
WRITE +I yields 123
WRITE -I yields -123

Commands

Each statement in C-based WebMumps begins with a unique command word. Most of the time, to save space (remember: C-based WebMumps is interpreted), the command word is abbreviated to a single character. The single character abbreviations are unique for all commands except those which begin with the letter "Z". For commands not beginning with the letter "Z", C-based WebMumps does not check the spelling of the command word if more than one character of the spelling is given. The first letter is used to determine the command. Thus "WRITE", "W", and "WRIGHT" all have the same meaning.

When you run C-based WebMumps, you are initially in direct mode. That is, if you type a command, the interpreter executes it immediately. You can tell that you're in direct mode by the ">" character which the interpreter places at the left-hand side of the screen. In direct mode you may enter a line which contains multiple commands.

The syntax of the command portion of a line of C-based WebMumps code consists (in the general case - there are exceptions) of the command word or letter followed (optionally) by a post-conditional, followed by exactly one blank followed by the arguments to the command. Most commands can have multiple arguments. Multiple arguments are delimited by commas. If a line is to have more than one command, the first command is delimited by exactly one blank and the next command word or letter follows immediately. Blanks are very significant in C-based WebMumps.

As noted above, most commands may be "post-conditionalized". A post-conditional is a logical expression which is used to determine if the command (and all its arguments) should be executed. It is like a small "IF" statement. Some commands, such as "DO" and "GOTO", may not only be post-conditionalized at the command level, but also at the argument level: that is, a separate post-conditional may be specified for each argument. A post-conditional appears as a colon followed by an expression. If the expression evaluates to 0 (false), the command (or argument) is not executed. If the expression evaluates non-zero, the command or argument is executed.

The following are examples of the above:

an ordinary assignment statement:

SET I=10*5

same as above with command word abbreviation:

S I=10*5

an assignment statement with multiple arguments:

S I=10*5,J=5,K=I+J (K will equal 55)

an assignment statement post-conditionalized:

S:I=10 J=0 (set J to zero if I equals 10)

a multiple command line:

S I=10*5 S J=5 S K=I+J (same as the multiple-argument example above)
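
The general command layout (command word, optional post-conditional, exactly one blank, then arguments) can be sketched as a small Python splitter. This is an illustration only, not the interpreter's parser. The function name split_command is hypothetical, it assumes the post-conditional contains no blanks, and it treats command words as case-insensitive, which is an assumption.

```python
def split_command(text):
    """Split one command into (command key, post-conditional, arguments)."""
    # The command word and optional post-conditional end at the first blank.
    head, _, arguments = text.partition(" ")
    # A post-conditional, if present, follows a colon.
    word, _, postcond = head.partition(":")
    word = word.upper()  # assumption: command words are case-insensitive
    # Only the first letter matters, except for commands beginning with "Z".
    key = word if word.startswith("Z") else word[:1]
    return key, (postcond or None), arguments

print(split_command("SET I=10*5"))    # ('S', None, 'I=10*5')
print(split_command("S:I=10 J=0"))    # ('S', 'I=10', 'J=0')
print(split_command("WRIGHT X"))      # ('W', None, 'X')
```

A fuller model would then evaluate the post-conditional and skip the command when the result is zero.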

Table of Commands

Functions

Built-in Variables