The Parse Tree

This document describes the Misty parse tree. It is not part of the Misty System Standard, and a conforming implementation does not have to follow it. It describes one particular implementation.

The tree can be fully realized as a JSON object.

Tokens

The module tokenize.mst takes a source text and produces an array of tokens. The tokens are made up of these fields:

kind

"name", "text", "number", "comment", "newline", "space", and all of the intrinsics and punctuators:

.  ,  /  |  ƒ  ¶  !  -  +  *  @  (  )  [  ]  {  }  =  <  >  ≠  ≤  ≥  ÷  ~  ≈  /\  \/

text

The characters making up the token.

At this stage, kind: "number" tokens are represented as text to avoid the danger of early loss of precision.

The kind: "text" tokens have already had escape sequences decoded and outer quotes removed.

A contiguous run of spaces is made into a single kind: "space" token with a text field containing the entire run of spaces.

at

The character position in the source text where the token begins, starting at 0.

line_nr

The line number of the token, starting at 0.

column_nr

The column number of the start of the token, starting at 0.

quote

For kind: "text" tokens, one of:

"  double quote
«  left chevron

The kind and text fields are meaningful to the parser that consumes the token list. The other fields are intended for error reporting.
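As an illustration of the token fields above, here is a minimal Python sketch that builds token records and derives line_nr and column_nr from at. It is not tokenize.mst. The helper names locate and make_token, the sample source texts, and the assumption that lines are separated by "\n" alone are all hypothetical.

```python
def locate(source, at):
    """Return (line_nr, column_nr), both starting at 0, for character position at."""
    line_nr = source.count("\n", 0, at)
    # rfind returns -1 when there is no earlier newline, so line_start becomes 0.
    line_start = source.rfind("\n", 0, at) + 1
    return line_nr, at - line_start


def make_token(source, kind, text, at, quote=None):
    """Build a token record with the fields described above (hypothetical helper)."""
    line_nr, column_nr = locate(source, at)
    token = {
        "kind": kind,
        "text": text,
        "at": at,
        "line_nr": line_nr,
        "column_nr": column_nr,
    }
    if quote is not None:
        token["quote"] = quote  # only kind: "text" tokens carry a quote field
    return token


source = "alpha\nbeta gamma"
name_token = make_token(source, "name", "gamma", source.index("gamma"))

# A text token: the outer quotes are already removed from the text field.
text_token = make_token('"hi"', "text", "hi", 0, quote='"')
```

Here name_token begins at character 11, which is line 1, column 5, because all three positions start at 0.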

Tree

The module parse.mst turns a list of tokens into an abstract parse tree.

Root

The root of the tree is a record. Its fields are

kind

"misty"

name

The name of the file.

functions

An array of functions. For each ƒ there is a record in this array, plus function 0, the program or module body.

logs

An array of log names.

patterns

An array of pattern definitions.

scopes

An array of scopes, parallel to the functions array. Each scope is a record of the names that are used in the corresponding function: the function name, its parameters, and any additional names created with def, use, or var. The scope also contains names that are found in outer functions and intrinsics.

A scope is used to determine what a name in a function refers to.

intrinsics

A record where the field names are intrinsics that are used. This information might be used by a linker.

uses

An array of modules that are mentioned in use statements. This information might be used by a linker.
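Because the tree can be realized as a JSON object, a root record can be sketched as a plain Python dictionary and round-tripped through JSON. This is only an illustration: the file name is hypothetical, and placing the program record at functions[0] is an assumption based on the description of function 0 above.

```python
import json

# A hand-built root record for an empty program (illustrative, not parser output).
root = {
    "kind": "misty",
    "name": "example.mst",  # hypothetical file name
    "functions": [
        # Assumption: function 0 is the program (or module) body.
        {"kind": "program", "function_nr": 0, "statements": []},
    ],
    "logs": [],
    "patterns": [],
    "scopes": [{}],  # parallel with functions: one scope record per function
    "intrinsics": {},
    "uses": [],
}

encoded = json.dumps(root, ensure_ascii=False)
decoded = json.loads(encoded)
```

The functions and scopes arrays have the same length, so a function_nr can index either one.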

Program

kind

"program"

function_nr

0

statements

An array of statements.

Module

A module is like an actor except that it does not have access to @at (the source of an actor's capabilities) and it has a return statement that the actor does not have.

kind

"module"

function_nr

0

statements

An array of statement tokens.

Names

Names are used to represent functions, variables, and constants.

kind

"name"

name

The name, as a simple text.

More information about the name token can be found in the scope.

Literals

Numbers

kind

"number"

value

A text.

Texts

kind

"text"

value

A text.

Array

kind

"array"

list

An array (possibly empty) of expression tokens.

Record

kind

"record"

list

An array (possibly empty) of pair records. A pair record is

left

A simple text.

right

An expression token.
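A record literal node can be sketched in Python as follows. The helper make_record_node and the field names x and label are hypothetical. Note that the number value is a text, as described under Numbers.

```python
def make_record_node(fields):
    """Build a "record" node from (field name, expression node) pairs."""
    return {
        "kind": "record",
        "list": [{"left": name, "right": value} for name, value in fields],
    }


node = make_record_node([
    ("x", {"kind": "number", "value": "1"}),
    ("label", {"kind": "text", "value": "origin"}),
])

# An empty record literal has an empty list.
empty = make_record_node([])
```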

`@`

This is the actor's source of capability. It can be used to call @ functions.

kind

"@"

`@` address

This is the actor's private address.

kind

"@ address"

Operators

Infix operators

kind

|  *  /  ÷  +  -  ~  ≈  =  <>  <  <=  >  >=  /\  \/

left

The left operand token.

right

The right operand token.
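The left and right fields make an infix node a binary tree, which a recursive walk can turn back into fully parenthesized text. The sketch below is illustrative Python, not part of parse.mst. The function render is hypothetical, and the sample tree is built by hand, so it says nothing about how Misty's precedence rules would nest a given source text.

```python
# The infix kinds listed above.
INFIX_KINDS = {
    "|", "*", "/", "÷", "+", "-", "~", "≈",
    "=", "<>", "<", "<=", ">", ">=", "/\\", "\\/",
}


def render(node):
    """Render a name, number, or infix node as fully parenthesized text."""
    kind = node["kind"]
    if kind == "name":
        return node["name"]
    if kind == "number":
        return node["value"]  # number values are kept as text
    if kind in INFIX_KINDS:
        return "(" + render(node["left"]) + " " + kind + " " + render(node["right"]) + ")"
    raise ValueError("unhandled kind: " + kind)


tree = {
    "kind": "+",
    "left": {"kind": "name", "name": "a"},
    "right": {
        "kind": "*",
        "left": {"kind": "name", "name": "b"},
        "right": {"kind": "number", "value": "2"},
    },
}
rendered = render(tree)  # "(a + (b * 2))"
```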

The `.` operator

kind

"."

left

The left operand token.

right

A simple text.

Subscript

kind

"["

left

The left operand token.

right

The right operand token. If right is missing, then [] represents the push/pop operator.

The ( operator

kind

"("

left

The left operand token, which should resolve to a function.

list

An array of expression tokens, an argument list.
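The ".", "[", and "(" nodes reuse the left, right, and list fields, so one generic walk can visit all of them. The Python sketch below collects the names referenced in an expression. The function names_in and the names out, show, items, and index are hypothetical.

```python
def names_in(node):
    """Return the names referenced in an expression node, in left-to-right order."""
    if not isinstance(node, dict):
        return []  # for example, the simple text on the right of "."
    if node.get("kind") == "name":
        return [node["name"]]
    found = []
    for field in ("left", "right"):
        if field in node:  # right is absent for the push/pop form of "["
            found.extend(names_in(node[field]))
    for item in node.get("list", []):
        found.extend(names_in(item))
    return found


# A hand-built tree for a call through "." with one subscripted argument.
call = {
    "kind": "(",
    "left": {"kind": ".", "left": {"kind": "name", "name": "out"}, "right": "show"},
    "list": [
        {
            "kind": "[",
            "left": {"kind": "name", "name": "items"},
            "right": {"kind": "name", "name": "index"},
        },
    ],
}
names = names_in(call)  # ["out", "items", "index"]
```

The text "show" is not reported, because the right side of "." is a simple text rather than a name token.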

The `then` operator

This is the ternary operator.

kind

"then"

expression

A condition token.

then

The expression token if true.

else

The expression token if false.

Statements

The break statement

kind

"break"

name

If the statement is labelled, this is a simple text.

The call statement

kind

"call"

left

A ( expression token.

The def statement

kind

"def"

left

A name token.

right

An expression token.

The do statement

kind

"do"

statements

An array of statement tokens.

name

If the do statement is labelled, this is a simple text.

The disrupt statement

kind

"disrupt"

The if statement

kind

"if"

expression

A conditional expression token.

then

An array of statement tokens.

list

An array of if statement tokens (without list or else) that are from else if.

else

An array of statement tokens (optional).
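Since the else if clauses are stored in list rather than nested in else, the whole chain can be read off in one pass. The Python sketch below is illustrative. The function branches and the condition names a and b are hypothetical, and it assumes that list and else are simply absent when the statement has no else if or else.

```python
def branches(if_node):
    """Return ([(condition, statements), ...], else_statements or None)."""
    result = [(if_node["expression"], if_node["then"])]
    for clause in if_node.get("list", []):
        # Each else if clause is an if node without list or else.
        result.append((clause["expression"], clause["then"]))
    return result, if_node.get("else")


if_node = {
    "kind": "if",
    "expression": {"kind": "name", "name": "a"},
    "then": [{"kind": "break"}],
    "list": [
        {
            "kind": "if",
            "expression": {"kind": "name", "name": "b"},
            "then": [{"kind": "return"}],
        },
    ],
    "else": [{"kind": "disrupt"}],
}
chain, otherwise = branches(if_node)
conditions = [condition["name"] for condition, _ in chain]  # ["a", "b"]
```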

The return statement

kind

"return"

expression

An optional expression token.

The send statement

kind

"send"

left

The address expression token.

right

The message expression token.

expression

The optional reply function expression token.

The set statement

kind

"set"

left

The destination expression token.

right

The value expression token.

The use statement

kind

"use"

left

A name token.

right

An optional name or text token.

The var statement

kind

"var"

left

A name token.

right

An optional expression token. The default is null.

Function

The ƒ operator does two things: It makes a function definition that it appends to the functions array, and it makes a reference to that function definition.

Function reference

kind

"function"

function_nr

The location of the function definition in the functions array.

Function definition

kind

"ƒ"

name

If the function is named, this is a simple text.

function_nr

The location of the function definition in the functions array.

outer

The number of the outer function that made this one.

list

A list of name tokens, the parameters. Each has a parameter_nr field, between 0 and 3. If a parameter has a default value, it will be in the name token's expression field.

expression

If the function is an expression function, the expression token is here.

statements

If the function is a statement function, then this is an array of statement tokens.

disruption

An array of statement tokens, the function's disruption handler. (Optional.)
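A function reference is resolved by indexing the functions array with function_nr, and the outer field links each definition to the function that made it. The Python sketch below is illustrative. The helpers definition_of and outer_chain and the names in the sample root are hypothetical, and it assumes that every outer chain ends at function 0.

```python
def definition_of(root, reference):
    """Look up the function definition that a "function" reference points to."""
    return root["functions"][reference["function_nr"]]


def outer_chain(root, function_nr):
    """Return the function numbers from function_nr outward to function 0."""
    chain = [function_nr]
    while function_nr != 0:
        function_nr = root["functions"][function_nr]["outer"]
        if function_nr in chain:
            raise ValueError("outer fields form a cycle")
        chain.append(function_nr)
    return chain


root = {
    "kind": "misty",
    "functions": [
        {"kind": "program", "function_nr": 0, "statements": []},
        {"kind": "ƒ", "name": "make", "function_nr": 1, "outer": 0,
         "list": [], "statements": []},
        {"kind": "ƒ", "function_nr": 2, "outer": 1,
         "list": [{"kind": "name", "name": "x", "parameter_nr": 0}],
         "expression": {"kind": "name", "name": "x"}},
    ],
}
reference = {"kind": "function", "function_nr": 2}
definition = definition_of(root, reference)
chain = outer_chain(root, reference["function_nr"])  # [2, 1, 0]
```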

Scope

The root token contains an array of scopes. The scopes array is indexed by function_nr, like the functions array. Each function definition has a scope record containing all of the names used in that function.

A scope record contains name fields, where the key is a variable name, and the field is a record containing:

name

The name of the variable.

level

The distance to the outer scope.

make

The maker of the name:

 function  parameter  def  use  var  intrinsic

Only names that were made by the var statement can have their values replaced by the set statement.

function_nr

The number of the function that declared the name.

closure

If the name is used by an inner function, then this is true.
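A scope lookup can be sketched in Python as follows, including the rule that only names made by var can be the target of set. The helper can_set, the names count and limit, and the level values shown are hypothetical. In particular, the exact meaning of level is assumed here.

```python
def can_set(root, function_nr, name):
    """Return True if name in the given function's scope was made by var."""
    entry = root["scopes"][function_nr].get(name)
    return entry is not None and entry["make"] == "var"


root = {
    "kind": "misty",
    "scopes": [
        {
            # Scope of function 0. The level values are an assumed encoding.
            "count": {"name": "count", "level": 0, "make": "var",
                      "function_nr": 0, "closure": False},
            "limit": {"name": "limit", "level": 0, "make": "def",
                      "function_nr": 0, "closure": False},
        },
    ],
}

count_ok = can_set(root, 0, "count")      # made by var
limit_ok = can_set(root, 0, "limit")      # made by def
missing_ok = can_set(root, 0, "missing")  # not in the scope at all
```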

Intrinsic

The intrinsic record contains a field for each intrinsic used. This might be used by a linker.

Pattern

The ¶ operator does two things: It makes a pattern definition that it appends to the patterns array, and it makes a reference to that pattern definition.

Pattern reference

kind

"pattern"

pattern_nr

The location of the pattern definition in the patterns array.

Pattern definition

Coming soon.
