This document describes the Misty parse tree. This is not a part of the Misty System Standard. A conforming implementation does not have to conform to this. This is descriptive of one particular implementation.
The tree can be fully realized as a JSON object.
The module tokenize.mst takes a source text and produces an array of tokens. The tokens are made up of these fields:
"name", "text", "number", "comment", "newline", "space", and all of the intrinsics and punctuators:
. , / | ƒ ¶ ! - + * @ ( ) [ ] { } = < > ≠ ≤ ≥ ÷ ~ ≈ /\ \/
The characters making up the token.
At this stage, kind: "number" tokens are represented as text to avoid the danger of early loss of precision.
The kind: "text" tokens have already had escape sequences decoded and outer quotes removed.
A contiguous run of spaces is made into a single kind: "space" token with a text field containing the entire run of spaces.
The character position in the source text where the token begins, starting at 0.
The line number of the token, starting at 0.
The column number of the start of the token, starting at 0.
For kind: "text" tokens, one of: " (double quote) or « (left chevron).
The kind and text fields will be meaningful to the parser that consumes the token list. The other fields are intended for error reporting.
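As a rough illustration of the token list described above, the following Python sketch models tokens as plain dictionaries. Only the kind and text fields are taken from this description. The helper names, the decision to drop space, newline, and comment tokens, and the use of Decimal are assumptions made for the example; they are not part of tokenize.mst.

```python
from decimal import Decimal

# Hypothetical token list for the source text "x  0.1".
# Position, line, and column fields are omitted here.
tokens = [
    {"kind": "name", "text": "x"},
    {"kind": "space", "text": "  "},   # one token for the whole run of spaces
    {"kind": "number", "text": "0.1"}, # still text, so no precision is lost yet
]

# Assumption: a consumer may want to skip layout tokens.
LAYOUT_KINDS = {"space", "newline", "comment"}


def significant(token_list):
    """Return the tokens that are not spaces, newlines, or comments."""
    return [t for t in token_list if t["kind"] not in LAYOUT_KINDS]


def number_value(token):
    """Convert a number token to a value only when it is finally needed."""
    if token["kind"] != "number":
        raise ValueError("not a number token")
    return Decimal(token["text"])
```

Keeping the number as text until this late step means the choice of numeric representation stays with the consumer rather than the tokenizer.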
The module parse.mst turns a list of tokens into an abstract parse tree.
The root of the tree is a record. Its fields are
"misty"
The name of the file.
An array of functions. For each ƒ there is a record in this array, plus function 0, the program or module body.
An array of log names.
An array of pattern definitions.
An array of scopes. This array is parallel with the functions array. It contains records containing the names that are used in each function. The names can be the function name and parameters, and additional names created with def, use, var. The scope also contains names that are found in outer functions and intrinsics.
A scope is used to determine what a name in a function refers to.
A record where the field names are intrinsics that are used. This information might be used by a linker.
An array of modules that are mentioned in use statements. This information might be used by a linker.
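Because the tree can be realized as JSON and the scopes array is parallel with the functions array, a consumer can pair the two by index. The sketch below is a minimal example under assumptions: the field names functions and scopes follow the wording of this description, the kind field on tree records is assumed, and the sample tree is invented and heavily abbreviated.

```python
import json

# Hypothetical, abbreviated root record. Function 0 is the program body.
root = json.loads("""
{
    "functions": [{"kind": "program"}, {"kind": "ƒ"}],
    "scopes": [{"greet": {}}, {"greet": {}, "name": {}}]
}
""")


def names_by_function(tree):
    """Map each function number to the sorted names in its scope record."""
    functions = tree["functions"]
    scopes = tree["scopes"]
    if len(functions) != len(scopes):
        raise ValueError("functions and scopes must be parallel arrays")
    return {nr: sorted(scope) for nr, scope in enumerate(scopes)}
```

With the sample above, function 1 sees both its own name and the name greet that it finds in the outer function.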
"program"
0
An array of statements.
A module is like an actor except that it does not have access to @ (the source of an actor's capabilities) and it has a return statement that the actor does not have.
"module"
0
An array of statement tokens.
Names are used to represent functions, variables, and constants.
"name"
The name, as a simple text.
More information about the name token can be found in the scope.
"number"
A text.
"text"
A text.
"array"
An array (possibly empty) of expression tokens.
"record"
An array (possibly empty) of pair records. A pair record is
left
A simple text.
right
An expression token.
@
This is the actor's source of capability. It can be used to call @ functions.
"@"
@ address
This is the actor's private address.
"@ address"
| * / ÷ + - ~ ≈ = <> < <= > >= /\ \/
The left operand token.
The right operand token.
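To show how the left and right operand tokens nest, here is a small Python sketch that evaluates a tree containing only number tokens and a few arithmetic operators. The field names kind and text on tree tokens are assumptions for the example (left and right follow the wording used in this description), and the evaluator is illustrative only; it is not how any Misty implementation is known to work.

```python
from decimal import Decimal

# Only a few of the binary operators are handled in this sketch.
OPERATIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
}


def evaluate(token):
    """Recursively evaluate a number token or a binary operator token."""
    kind = token["kind"]
    if kind == "number":
        # Number tokens carry text, so conversion happens here, at the end.
        return Decimal(token["text"])
    if kind not in OPERATIONS:
        raise ValueError("unsupported token kind: " + kind)
    return OPERATIONS[kind](evaluate(token["left"]), evaluate(token["right"]))


# Hypothetical tree for 1 + 2 * 3, with * nested as the right operand of +.
tree = {
    "kind": "+",
    "left": {"kind": "number", "text": "1"},
    "right": {
        "kind": "*",
        "left": {"kind": "number", "text": "2"},
        "right": {"kind": "number", "text": "3"},
    },
}
```

Precedence is already encoded in the shape of the tree, so the walker does not need to know anything about operator priority.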
. operator
"."
The left operand token.
A simple text.
"["
The left operand token.
The right operand token. If right is missing, then [] represents the push/pop operator.
"("
The left operand token, which should resolve to a function.
An array of expression tokens, an argument list.
then operator
This is the ternary operator.
"then"
A condition token.
The expression token if true.
The expression token if false.
"break"
If the statement is labelled, this is a simple text.
"call"
A ( expression token.
"def"
A name token.
An expression token.
"do"
An array of statement tokens.
If the do statement is labelled, this is a simple text.
"disrupt"
"if"
A conditional expression token.
An array of statement tokens.
An array of if statement tokens (without list or else) that are from else if.
An array of statement tokens (optional).
"return"
An optional expression token.
"send"
The address expression token.
The message expression token.
The optional reply function expression token.
"set"
The destination expression token.
The value expression token.
"use"
A name token.
An optional name or text token.
"var"
A name token.
An optional expression token. The default is null.
The ƒ operator does two things: It makes a function definition that it appends to the functions array, and it makes a reference to that function definition.
"function"
The location of the function definition in the functions array.
"ƒ"
If the function is named, this is a simple text.
The location of the function definition in the functions array.
The number of the outer function that made this one.
A list of name tokens, the parameters. Each has a parameter_nr field, between 0 and 3. If a parameter has a default value, it will be in the name token's expression field.
If the function is an expression function, the expression token is here.
If the function is a statement function, then this is an array of statement tokens.
An array of statement tokens, the function's disruption handler. (Optional.)
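The two effects of the ƒ operator (appending a definition and producing a reference to it) can be sketched as follows. This is a guess at the shape of the bookkeeping, not the code in parse.mst: the helper name make_function is hypothetical, and the use of function_nr as the reference's field name is an assumption based on how the scopes array is said to be indexed.

```python
def make_function(tree, definition):
    """Append a definition to the tree and return a reference token to it."""
    function_nr = len(tree["functions"])  # location of the new definition
    tree["functions"].append(definition)
    tree["scopes"].append({})             # keep the scopes array parallel
    return {"kind": "function", "function_nr": function_nr}


# Hypothetical tree that so far holds only function 0, the program body.
tree = {"functions": [{"kind": "program"}], "scopes": [{}]}
reference = make_function(tree, {"kind": "ƒ"})
```

The reference token is what appears in the expression where ƒ was written, while the full definition lives only in the functions array.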
The root token contains an array of scopes. The scopes array is indexed by function_nr, like the functions array. Each function definition has a scope record containing all of the names used in that function.
A scope record contains name fields, where the key is a variable name, and the field is a record containing:
The name of the variable.
The distance to the outer scope.
The maker of the name:
function
parameter
def
use
var
intrinsic
Only names that were made by the var statement can have their values replaced by the set statement.
The number of the function that declared the name.
If the name is used by an inner function, then this is true.
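The rule that only var names may be targets of set can be checked against a scope record. In the sketch below the field name make, which holds the maker of the name, is hypothetical, because this description does not give the actual field names. The sample scope is also invented.

```python
# Hypothetical scope record for one function. The "make" field name is assumed.
scope = {
    "count": {"make": "var"},
    "limit": {"make": "def"},
    "length": {"make": "intrinsic"},
}


def can_set(scope_record, name):
    """Return True only if the name exists and was made by a var statement."""
    entry = scope_record.get(name)
    return entry is not None and entry["make"] == "var"
```

A checker built this way would reject a set statement whose destination is a def, use, parameter, function, or intrinsic name, as well as one that names something absent from the scope.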
The intrinsic record contains a field for each intrinsic used. This might be used by a linker.
The ¶ operator does two things: It makes a pattern definition that it appends to the patterns array, and it makes a reference to that pattern definition.
"pattern"
The location of the pattern definition in the patterns array.
Coming soon.