Misty Programming Language:

The Preprocessor

Misty programs enjoy a pass through a preprocessor before being compiled. The preprocessor provides support for literate programming: code fragments, macros, and include files.

Directives

misty-program => (comment | preprocessor-outer-conditional | preprocessor-content)*

preprocessor-content
=> token
=> preprocessor-comment
=> preprocessor-include
=> preprocessor-macro
=> preprocessor-fragment
=> preprocessor-token
=> preprocessor-close

Preprocessor directives start with an @ (at sign). They can appear anywhere except inside of text constants.

Conditional

preprocessor-outer-conditional => '@if' preprocessor-condition '@then' misty-program ('@elif' preprocessor-condition '@then' misty-program)* ['@else' misty-program] '@fi'

preprocessor-condition => name 'is' ['not'] (text | 'defined')

Conditionals can be used to adapt a program to various environments or configurations. Regions of the program will be included or excluded as determined by the existence or value of simple macros. Simple macros can be provided by the local environment or by a build system.
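As an illustration of how a condition of the form name 'is' ['not'] (text | 'defined') might be resolved, here is a minimal Python sketch. It is not the Misty implementation. The dictionary of simple macros, the DEFINED sentinel, and the function names eval_condition and select_region are all hypothetical. It also assumes that comparing an undefined macro against a text yields false, which the description above does not state.

```python
# Sentinel standing for the keyword `defined`, so that it cannot be
# confused with the text "defined". (Hypothetical representation.)
DEFINED = object()

def eval_condition(macros, name, negated, operand):
    """Evaluate `name is [not] (text | defined)`.

    macros is assumed to map simple macro names to their text values.
    """
    if operand is DEFINED:
        result = name in macros
    else:
        # Assumption: an undefined macro never equals any text.
        result = macros.get(name) == operand
    return result != negated

def select_region(macros, branches, else_region=None):
    """Pick the region of the first branch whose condition holds.

    branches is a list of ((name, negated, operand), region) pairs in
    source order: the @if branch followed by any @elif branches.
    """
    for (name, negated, operand), region in branches:
        if eval_condition(macros, name, negated, operand):
            return region
    return else_region if else_region is not None else []
```

With macros = {"target": "linux"}, a branch guarded by target is "linux" is selected, and a branch guarded by debug is not defined is also true, since debug is absent.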

Block directives

A block directive starts a block, which ends at the next block directive, at the @program directive, or at the end of the input. A block cannot contain another block directive.

A block (even a comment block) contains a sequence of tokens. Therefore, the block must contain a sequence of names, texts, operators, numbers, and comments. Badly formed tokens will cause a syntax error.

@comment

preprocessor-comment => preprocessor-comment-tag token*

preprocessor-comment-tag => '@comment' | '@book' | '@volume' | '@chapter' | '@article' | '@section' | '@subsection' | '@note' | '@specimen' | '@doc'

The @comment directive causes all of the following tokens to be ignored until the next block directive or block terminator. There are other directives that act the same as @comment (@chapter, @section, @subsection, etc.) in the preprocessor, but which produce documentation in the literate processor.

@macro

preprocessor-macro => '@macro' name [ '('[name (',' name)*]')'] ':' token*

The @macro directive defines a macro. There are two kinds of macros. Simple macros have no parameters. Regular macros have a list of zero or more parameters wrapped in parens. A macro binds a sequence of tokens to a name. The sequence of tokens ends with the next preprocessor directive.

(Unlike the C #define directive, the sequence of tokens is not limited to a single line. Also, there is a single name space for all macros; there is not a separate namespace for simple macros vs regular macros.)

The sequence of tokens is not expanded at definition time.

When the macro name is encountered later, it will be replaced with its sequence of tokens.

@fragment

preprocessor-fragment => '@fragment' text ':' token*

The @fragment directive causes all of the following tokens to be associated with the text. Fragments differ from macros in some important ways:

Fragments can be inserted by use of the @ text operator. Fragment insertion can take place before fragments are defined. This allows for out-of-order exposition. Every fragment must be inserted exactly once.

Token directives

preprocessor-token => ('@name' | '@number' | '@text') '(' (name | text)* ')'

Token directives are used to construct tokens. The parameters are simple macros, text literals, or number literals. They are all concatenated together to form a single token.

@text

The @text directive makes a text token. If it has no parameters, it is the empty text. Do not confuse this with @ text.

@name

The @name directive makes a name token. It must conform to the rules for names. If it has no parameters, it is an error.

@number

The @number directive makes a number token. It must conform to the rules for numbers. If it has no parameters, it is an error.
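The concatenation performed by the token directives can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the Misty implementation. The function make_token, the (tag, value) representation of parameters, and the dictionary of simple macros are hypothetical. The two patterns are stand-ins only, since the actual rules for names and numbers are defined elsewhere in the language.

```python
import re

# Stand-in lexical rules. The real Misty rules for names and numbers
# may differ; these exist only to make the sketch checkable.
NAME_PATTERN = r"[A-Za-z][A-Za-z0-9_]*"
NUMBER_PATTERN = r"-?[0-9]+(\.[0-9]+)?"

def make_token(kind, parts, macros):
    """Build one token from the parameters of @text, @name, or @number.

    kind   : "text", "name", or "number"
    parts  : list of ("name", macro_name) or ("text", literal) pairs
    macros : assumed to map simple macro names to their text values
    """
    if kind not in ("text", "name", "number"):
        raise ValueError("unknown token directive: @" + kind)
    pieces = []
    for tag, value in parts:
        if tag == "name":
            pieces.append(macros[value])  # KeyError if the macro is undefined
        else:
            pieces.append(value)
    joined = "".join(pieces)
    if kind == "text":
        return ("text", joined)  # no parameters gives the empty text
    if not parts:
        raise ValueError("@" + kind + " requires at least one parameter")
    pattern = NAME_PATTERN if kind == "name" else NUMBER_PATTERN
    if re.fullmatch(pattern, joined) is None:
        raise ValueError("not a well formed " + kind + ": " + joined)
    return (kind, joined)
```

For example, with a simple macro prefix bound to get, the parameters prefix and "_value" given to @name would produce the single name token get_value.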

Block closing directive

There are three ways of closing a block directive. One is to start another block directive, since a block cannot contain another block. The second is to run off the end of the input. The third is to use the @program directive. The tokens following the @program tag are interpreted as a Misty program.

@program

preprocessor-close => '@program'

The @program directive closes a block directive.

Inclusion

@include

preprocessor-include => '@include' (text | name)

The @include directive replaces itself with the tokens associated with the text. The text has previously been registered, possibly with a file system. The file will be included only once. Attempts to include the file again in the same instance of the preprocessor will be ignored.
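The include-once behavior can be sketched as a small registry that remembers which keys it has already served. This is a minimal Python sketch, not the Misty implementation. The class Includer, its registry mapping, and the tokenize callback are hypothetical names introduced for illustration.

```python
class Includer:
    """Serve the tokens of each registered text at most once."""

    def __init__(self, registry, tokenize):
        self.registry = registry  # assumed: maps an include key to source text
        self.tokenize = tokenize  # assumed: turns source text into a token list
        self.seen = set()

    def include(self, key):
        # A repeated include in the same preprocessor instance is ignored,
        # so it contributes no tokens.
        if key in self.seen:
            return []
        self.seen.add(key)
        return self.tokenize(self.registry[key])
```

A second call to include with the same key returns an empty list, which matches the rule that repeated inclusion is ignored rather than treated as an error.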

Fragment Expansion

Every fragment must be expanded exactly once.

@ text

The sequence @ followed by a text is replaced with the fragment associated with the text. A fragment can be expanded before it is defined or fully appended.

Macro Expansion

There are two forms of macro expansion: simple and regular.

Unlike fragments, a macro can be expanded any number of times, including not at all.

Simple

A simple macro expansion is indicated simply by the name of a simple macro.

Regular

preprocessor-parameter => (token_except_,_(_)_{_}_[_]_ | '(' preprocessor-parameter-filler ')' | '{' preprocessor-parameter-filler '}' | '[' preprocessor-parameter-filler ']')*

preprocessor-parameter-filler => (token_except_(_)_{_}_[_]_ | '(' preprocessor-parameter-filler ')' | '{' preprocessor-parameter-filler '}' | '[' preprocessor-parameter-filler ']')*

A regular macro expansion is indicated by a name followed by zero or more arguments in parens. The number of arguments must exactly match the number of parameters in the macro definition.

The arguments are separated by commas. Within macro arguments, the tokens ( ) [ ] { } must balance. Commas within balanced parens, brackets, or braces are not used to separate the arguments.

A macro must have exactly the right number of arguments.

Macros cannot be called recursively.
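The argument rules above can be sketched as a single pass that tracks bracket nesting and splits only at top-level commas. This is a minimal Python sketch, not the Misty implementation. Tokens are assumed to be plain strings, and split_arguments and bind_arguments are hypothetical names. The sketch also assumes that an empty pair of parens means zero arguments rather than one empty argument.

```python
OPENERS = {"(": ")", "[": "]", "{": "}"}
CLOSERS = set(OPENERS.values())

def split_arguments(tokens):
    """Split the tokens between a call's outer parens into arguments."""
    arguments = []
    current = []
    expected = []  # stack of closers we are waiting for
    for token in tokens:
        if token in OPENERS:
            expected.append(OPENERS[token])
            current.append(token)
        elif token in CLOSERS:
            if not expected or expected.pop() != token:
                raise SyntaxError("unbalanced " + token)
            current.append(token)
        elif token == "," and not expected:
            # A comma at nesting depth zero ends the current argument.
            arguments.append(current)
            current = []
        else:
            current.append(token)
    if expected:
        raise SyntaxError("missing " + expected[-1])
    if tokens:
        arguments.append(current)
    return arguments

def bind_arguments(parameters, arguments):
    """Pair parameters with arguments; the counts must match exactly."""
    if len(arguments) != len(parameters):
        raise ValueError("wrong number of macro arguments")
    return dict(zip(parameters, arguments))
```

Given the tokens f ( x , y ) , z, the comma inside the parens is kept as part of the first argument, so the result is two arguments, not three.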

How it works

First the source text is tokenized with text.tokens(), producing an array of tokens. The array.macro() method then produces a new array of tokens. It works in two phases.

Phase 1

In phase 1, # comments are deleted, @if regions are resolved, includes are inserted, and macros and fragments are defined.

If an @include directive is found, then obtain the text, tokenize it, and insert the tokens at this point.

If a comment-type directive is found, then delete all tokens until finding another block directive or a block closing directive.

If an @macro or @fragment directive is found, then accumulate tokens under that name (for a macro) or text (for a fragment) until finding another block directive or block closing directive. No expansion occurs.

If a block closing directive is found, delete it.
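The block handling in phase 1 can be sketched as a loop that redirects tokens to a current destination. This is a simplified Python sketch, not the Misty implementation. It treats tokens as plain strings, handles only simple macros, and leaves out @if and @include. The function name phase1 is hypothetical. It also assumes that a repeated @fragment with the same text appends to the fragment, and that a repeated @macro replaces the earlier definition. Neither choice is confirmed by the description above.

```python
COMMENT_TAGS = {
    "@comment", "@book", "@volume", "@chapter", "@article",
    "@section", "@subsection", "@note", "@specimen", "@doc",
}

def phase1(tokens):
    """Return (program_tokens, macros, fragments) for a token list."""
    program, macros, fragments = [], {}, {}
    sink = program  # where ordinary tokens go; None means discard them
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token in COMMENT_TAGS:
            sink = None  # comment-type block: drop its tokens
        elif token in ("@macro", "@fragment"):
            # Expect: directive, then a name or text, then ':'.
            if tokens[i + 2 : i + 3] != [":"]:
                raise SyntaxError("expected ':' after " + token)
            key = tokens[i + 1]
            if token == "@macro":
                macros[key] = []  # assumption: redefinition replaces
                sink = macros[key]
            else:
                sink = fragments.setdefault(key, [])  # assumption: appends
            i += 2  # skip the key and the ':'
        elif token == "@program":
            sink = program  # the closing directive itself is deleted
        elif sink is not None:
            sink.append(token)
        i += 1
    return program, macros, fragments
```

Because each block directive simply replaces the current destination, a block can never contain another block, which matches the rule stated for block directives.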

Phase 2

In phase 2, macros and fragments are expanded.

If an @ followed by a text is found, record the use and replace the @ and the text with the tokens of the fragment. Phase 2 continues with the first token of the fragment.

If a @name, @text, or @number expression is found, the expression is replaced by the new token.

If a name is found that matches the name of a macro, expand it.

An error is raised if the final array is empty, if a fragment was never used, or if a fragment was expanded more than once.
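The expansion and the final checks of phase 2 can be sketched as follows. This is a simplified Python sketch, not the Misty implementation. It treats tokens as plain strings, assumes @ arrives as its own token ahead of the fragment text, expands only simple macros, and omits the token directives. The function name phase2 is hypothetical.

```python
def phase2(tokens, macros, fragments):
    """Expand fragments and simple macros, then apply the final checks."""
    uses = {text: 0 for text in fragments}
    output = []

    def expand(sequence, active):
        i = 0
        while i < len(sequence):
            token = sequence[i]
            if token == "@" and i + 1 < len(sequence):
                text = sequence[i + 1]
                if text not in fragments:
                    raise ValueError("undefined fragment " + text)
                uses[text] += 1
                if uses[text] > 1:
                    raise ValueError("fragment expanded twice: " + text)
                # Continue with the first token of the fragment.
                expand(fragments[text], active)
                i += 2
            elif token in macros:
                if token in active:
                    raise ValueError("recursive macro: " + token)
                expand(macros[token], active | {token})
                i += 1
            else:
                output.append(token)
                i += 1

    expand(tokens, frozenset())
    unused = [text for text, count in uses.items() if count == 0]
    if unused:
        raise ValueError("fragment never used: " + ", ".join(unused))
    if not output:
        raise ValueError("the final array is empty")
    return output
```

Counting each use at the moment of expansion also stops a fragment that inserts itself, because the second expansion is rejected before the expansion can loop.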