Grammar

Now we'll look at a grammar for XL. Note that we are building just a recognizer at this stage, and the grammar may (and will!) change.

Let's start at the top, shall we?

Program Specification

XL has one program per file, so this is really our starting rule.

program
  : PROGRAM IDENT EQUALS
        subprogramBody
    DOT
    "@" // end-of-file
  ;

Pretty straightforward. Note that so far, we don't care that this IDENT is a declaration (it names the program). We will later, though...


Subprogram Bodies

A subprogram body is a little (or big) block of code that makes up the "what I do" part of a program, procedure, or function. It's not bad to define, but it will get a bit trickier later. Well, not too much so.

subprogramBody
  : (basicDecl)*
    (procedureDecl)*
    BEGIN
        (statement)*
    END IDENT
  ;

So what is it? Basically, define your local variables, types and constants (basicDecls). There can be zero or more of these, hence the use of the ()* closure. Then, define any nested procedures or functions. Again, zero or more of these. Finally, we get to the definition of what the current program/procedure/function does. This starts with a BEGIN, has zero or more statements in it, and is ended by END IDENT. Note that the XL spec states that the identifier that ends a subroutine must match the one that begins it. Right now, we have no way of checking that, as the name of the subroutine is outside the scope of this rule. We'll handle this later, though, and in a pretty neat way I must say. Yacc can't hold a candle to it, you'll see!

Note that the XL spec said nothing about "procedures must be declared after vars, consts and types." This was one of the many things that the language designer told us during midnight interrogation... Similar to Pascal's definition order (CONST TYPE VAR FUNCTION/PROCEDURE) but not quite that rigid.
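To make the shape of this rule concrete, here is a rough Python sketch of a hand-written recognizer for program and subprogramBody. It is only an illustration under some loud assumptions: basicDecl, procedureDecl and statement are each collapsed into a single stand-in token (BASIC_DECL, PROCEDURE_DECL and STATEMENT are hypothetical names, not XL tokens), tokens are (kind, text) pairs, and recognize_program is a name I made up. It is not what the parser generator produces.

```python
def recognize_program(tokens):
    """Recognize PROGRAM IDENT EQUALS subprogramBody DOT <end of file>.

    tokens is a list of (kind, text) pairs. Returns the pair
    (declared_name, closing_name) so the caller can see both identifiers.
    """
    pos = 0

    def peek():
        # Token kind at the current position, or "EOF" past the end.
        return tokens[pos][0] if pos < len(tokens) else "EOF"

    def match(kind):
        nonlocal pos
        if peek() != kind:
            raise SyntaxError(f"expected {kind}, found {peek()} at token {pos}")
        text = tokens[pos][1]
        pos += 1
        return text

    def subprogram_body():
        while peek() == "BASIC_DECL":       # (basicDecl)*  (stand-in token)
            match("BASIC_DECL")
        while peek() == "PROCEDURE_DECL":   # (procedureDecl)*  (stand-in token)
            match("PROCEDURE_DECL")
        match("BEGIN")
        while peek() == "STATEMENT":        # (statement)*  (stand-in token)
            match("STATEMENT")
        match("END")
        return match("IDENT")               # any IDENT is accepted here

    match("PROGRAM")
    declared = match("IDENT")
    match("EQUALS")
    closing = subprogram_body()
    match("DOT")
    if peek() != "EOF":
        raise SyntaxError("expected end of file")
    return declared, closing
```

Two things fall out of the sketch. Each ()* closure becomes a loop that runs while the next token can start the thing being repeated, so a declaration that shows up after a procedure is rejected simply because the first loop has already finished. And the closing IDENT is matched without ever being compared to the opening one, which is exactly the gap described above.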


Basic Declarations

XL has three main declarations: variables, constants, and types:

basicDecl
  : varDecl
  | constDecl
  | typeDecl
  ;

Just so this section isn't so short, I'll define varDecl and constDecl here.

A variable declaration looks like:

varDecl
  : VAR identList COLON typeName
    {BECOMES constantValue}
    SEMI
  ;

Unlike Pascal, each declaration must start with VAR; there is no "VAR section" that starts with the keyword VAR. The varDecl rule states that you can define any number of idents at once, and you can optionally initialize the variable(s); the braces around BECOMES constantValue mark that part as optional.
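Here is a small Python sketch of what recognizing a varDecl amounts to, mainly to show how the optional initializer turns into a single lookahead test. The token names come from the grammar; the alternatives for typeName and constantValue are taken from their rules in this section. The Recognizer class and the recognizes_var_decl helper are hypothetical names of my own, and tokens are assumed to arrive as a plain list of token-type strings.

```python
class Recognizer:
    """Tiny recognizer over a list of token-type names (plain strings)."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        # Current token type, or "EOF" once the input is used up.
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "EOF"

    def match(self, *kinds):
        # Consume one token if its type is among `kinds`, otherwise fail.
        if self.peek() not in kinds:
            raise SyntaxError(
                f"expected {' or '.join(kinds)}, found {self.peek()} at token {self.pos}"
            )
        self.pos += 1

    def ident_list(self):
        # identList : IDENT (COMMA IDENT)*
        self.match("IDENT")
        while self.peek() == "COMMA":
            self.match("COMMA")
            self.match("IDENT")

    def var_decl(self):
        # varDecl : VAR identList COLON typeName {BECOMES constantValue} SEMI
        self.match("VAR")
        self.ident_list()
        self.match("COLON")
        self.match("IDENT", "INTEGER", "BOOLEAN")              # typeName
        if self.peek() == "BECOMES":                            # optional part
            self.match("BECOMES")
            self.match("INTLIT", "STRING_LITERAL", "IDENT")    # constantValue
        self.match("SEMI")


def recognizes_var_decl(tokens):
    """Return True if the whole token list is exactly one varDecl."""
    r = Recognizer(tokens)
    try:
        r.var_decl()
    except SyntaxError:
        return False
    return r.pos == len(r.tokens)
```

For example, recognizes_var_decl(["VAR", "IDENT", "COLON", "INTEGER", "SEMI"]) succeeds with no initializer at all, while a BECOMES with nothing after it fails at the constantValue step.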

A constant declaration is similar to a variable declaration, except that you use the keyword CONST and must assign a value:

constDecl
  : CONST identList COLON typeName
    BECOMES constantValue SEMI
  ;

The rules above refer to two other rules, identList and constantValue. These are:

identList
  : IDENT (COMMA IDENT)*
  ;

which says "one or more IDENTs separated by COMMAs", and

constantValue
  : INTLIT
  | STRING_LITERAL
  | IDENT
  ;

which is pretty self-explanatory. One thing to note, though: there is nothing right now that prevents us from using a variable IDENT instead of a constant IDENT. That's handled later...
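Going back to identList for a moment: the IDENT (COMMA IDENT)* shape maps very naturally onto a loop. The Python function below is a minimal sketch of that idea, assuming tokens are a list of token-type strings; match_ident_list and its inner expect helper are illustrative names, not part of any tool.

```python
def match_ident_list(tokens, pos=0):
    """Recognize IDENT (COMMA IDENT)* starting at tokens[pos].

    Returns the position just past the list, or raises SyntaxError.
    """
    def expect(kind, at):
        if at >= len(tokens) or tokens[at] != kind:
            found = tokens[at] if at < len(tokens) else "end of input"
            raise SyntaxError(f"expected {kind} at token {at}, found {found}")
        return at + 1

    pos = expect("IDENT", pos)                            # mandatory first IDENT
    while pos < len(tokens) and tokens[pos] == "COMMA":   # the ( ... )* closure
        pos = expect("COMMA", pos)
        pos = expect("IDENT", pos)                        # a COMMA must be followed by an IDENT
    return pos
```

The loop stops at the first token that is not a COMMA and leaves it for the caller, which is why a varDecl can carry on with COLON right after the list.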


Type Declarations

XL has three kinds of user-defined types: arrays, records and enumeration types. In my project, I only implemented arrays and records, so for now, I'll skip enumeration types. I may add them at another time.

A type declaration is either an array declaration or a record declaration:

typeDecl
  : TYPE IDENT EQUALS
    ( arrayDecl
    | recordDecl
    )
    SEMI
  ;

I got a bit fancier here by using a subrule to say "array or record" instead of defining a new rule for it. This is also a bit more efficient than the extra function call created by another rule there. Basic rule of thumb -- if a subrule is clear, and not deeply nested, feel free to use it. However, watch out for using several nested subrules, as the meaning can get hidden quickly.
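To picture the difference between an inline subrule and a separate rule, here is a hand-written Python sketch with both variants side by side. It assumes tokens are a list of token-type strings and follows the arrayDecl, recordDecl and typeName rules from this section; the class and method names are my own, and this is an illustration, not the code the parser generator emits.

```python
class TypeDeclRecognizer:
    """Recognizer over a list of token-type names (plain strings)."""

    def __init__(self, tokens):
        self.tokens = list(tokens)
        self.pos = 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else "EOF"

    def match(self, *kinds):
        # Consume one token if its type is among `kinds`, otherwise fail.
        if self.peek() not in kinds:
            raise SyntaxError(f"expected {' or '.join(kinds)}, found {self.peek()}")
        self.pos += 1

    def type_name(self):
        self.match("IDENT", "INTEGER", "BOOLEAN")

    def array_decl(self):
        self.match("ARRAY")
        self.match("LBRACKET")
        self.match("INTLIT", "IDENT")        # integerConstant
        self.match("DOTDOT")
        self.match("INTLIT", "IDENT")        # integerConstant
        self.match("RBRACKET")
        self.match("OF")
        self.type_name()

    def record_decl(self):
        self.match("RECORD")
        while True:                          # ( ... )+ : at least one field group
            self.match("IDENT")              # identList
            while self.peek() == "COMMA":
                self.match("COMMA")
                self.match("IDENT")
            self.match("COLON")
            self.type_name()
            self.match("SEMI")
            if self.peek() != "IDENT":
                break
        self.match("END")
        self.match("RECORD")

    def type_decl_inline(self):
        # Variant 1: the "array or record" choice written inline, like a subrule.
        self.match("TYPE")
        self.match("IDENT")
        self.match("EQUALS")
        if self.peek() == "ARRAY":
            self.array_decl()
        else:
            self.record_decl()
        self.match("SEMI")

    def type_body(self):
        # Variant 2 helper: the same choice pulled out into its own rule.
        if self.peek() == "ARRAY":
            self.array_decl()
        else:
            self.record_decl()

    def type_decl_split(self):
        self.match("TYPE")
        self.match("IDENT")
        self.match("EQUALS")
        self.type_body()                     # one extra call compared to variant 1
        self.match("SEMI")


def accepts(tokens, rule):
    """Run the named rule method and report whether it consumed all tokens."""
    r = TypeDeclRecognizer(tokens)
    try:
        getattr(r, rule)()
    except SyntaxError:
        return False
    return r.pos == len(r.tokens)
```

Both variants accept exactly the same inputs; the only difference in this sketch is the extra call to type_body, so the choice between them is mostly about readability.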

Arrays are defined as ARRAY [x..y] OF type. Only one-dimensional, pretty simple:

arrayDecl
  : ARRAY LBRACKET integerConstant
      DOTDOT integerConstant RBRACKET
      OF typeName
  ;

integerConstant
  : INTLIT
  | IDENT // again, a constant...
  ;

Records are defined as RECORD x,y,z:typename; END RECORD. Again, simple until we have to know what it means:

recordDecl
  : RECORD (identList COLON typeName SEMI)+ END RECORD
  ;

So what is this typeName I keep bringing up? Well, it's either one of the predefined types, Integer or Boolean, or it's a user-defined type (which means it's an IDENT), so:

typeName
  : IDENT
  | INTEGER
  | BOOLEAN
  ;

Enough about types. Now, on to...


Procedure Declarations

A procedure in XL is similar to a program, so basically, it's a small heading followed by a subprogramBody:

procedureDecl
  : PROCEDURE IDENT {formalParameters} EQUALS
        subprogramBody
    SEMI
  ;

At the time I originally did this project, I only did procedures, not functions. Perhaps I'll add them later... (You may think "boy he left a lot out," but Dr. Moore made a large subset of the project required, and I did a few extra point things, but not the whole thing. I was working full-time you know...)

Notice that the formalParameters are optional... Let's define what they look like:

formalParameters
  : LPAREN parameterSpec (SEMI parameterSpec)* RPAREN
  ;

Again we see the familiar x (separator x)* notation, this time with SEMI as the separator instead of COMMA -- the ()* closure is very handy and efficient for matching lists of things...

parameterSpec
  : {VAR} identList COLON typeName
  ;

You'll notice that this is quite a bit like a variable declaration, except that VAR is optional, and there's no semicolon after it. We'll handle it a bit differently as well, once we add action code.
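As a rough illustration of how the two list levels nest (parameterSpecs separated by SEMI, and inside each one an identList separated by COMMA), here is a Python sketch of a recognizer for formalParameters. It assumes the tokens arrive as a list of token-type strings; match_formal_parameters is a hypothetical helper name, and the sketch only recognizes, with no action code.

```python
def match_formal_parameters(tokens):
    """Return True if tokens are exactly
    LPAREN parameterSpec (SEMI parameterSpec)* RPAREN.
    """
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else "EOF"

    def match(*kinds):
        nonlocal pos
        if peek() not in kinds:
            raise SyntaxError(f"expected {' or '.join(kinds)}, found {peek()}")
        pos += 1

    def parameter_spec():
        # parameterSpec : {VAR} identList COLON typeName
        if peek() == "VAR":                   # optional VAR
            match("VAR")
        match("IDENT")                        # identList: IDENT (COMMA IDENT)*
        while peek() == "COMMA":
            match("COMMA")
            match("IDENT")
        match("COLON")
        match("IDENT", "INTEGER", "BOOLEAN")  # typeName

    try:
        match("LPAREN")
        parameter_spec()                      # at least one parameterSpec
        while peek() == "SEMI":               # (SEMI parameterSpec)*
            match("SEMI")
            parameter_spec()
        match("RPAREN")
    except SyntaxError:
        return False
    return pos == len(tokens)
```

Because SEMI is a separator here rather than a terminator, a SEMI right before RPAREN is rejected, and so is an empty pair of parentheses; a procedure with no parameters leaves out formalParameters altogether.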

Next, we'll look at statements...