Lexer Submodule

The Lexer submodule is concerned with transcribing a given stream of characters into a sequence of domain-specific lexical units called "tokens".

Basic methodology:

  1. Wrap a plain IO object into a Lexer.CharStream.

  2. Call Lexer.next_token to collect another Lexer.Token from the character stream.

  3. Go to step 2 unless the end of file is reached.

For convenience, the above process is simplified by the type Lexer.TokenStream, which supports eof, read, and Lexer.peek.
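
A minimal sketch of both approaches is shown below. The parent package name in the first comment and the exact read signature are assumptions; everything else follows the API documented on this page.

    # Sketch only: assumes the `Lexer` submodule has been brought into scope,
    # e.g. via `using SmartGameFormat: Lexer` (parent package name assumed).
    io = IOBuffer("(;FF[4]C[hello world])")

    # Steps 1-3: wrap the IO object and collect tokens until the EOFError
    # documented for `next_token` signals the end of file.
    cs = Lexer.CharStream(io)
    tokens = Lexer.Token[]
    while true
        try
            push!(tokens, Lexer.next_token(cs))
        catch e
            e isa EOFError || rethrow()
            break
        end
    end

    # Convenience variant based on a TokenStream.
    ts = Lexer.TokenStream(Lexer.CharStream(IOBuffer("(;FF[4])")))
    while !eof(ts)
        token = read(ts, Lexer.Token)   # read signature assumed to mirror peek
        # ... process `token` ...
    end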


Types

CharStream(io::IO)

Stateful decorator around io that keeps track of some context information and allows the use of peek (i.e. looking at the next character without consuming it).
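
For example (a sketch, assuming Lexer is in scope):

    # Wrap any IO object, here an in-memory buffer.
    cs = Lexer.CharStream(IOBuffer("(;FF[4])"))
    Lexer.peek(cs, Char)   # returns '(' without consuming it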

TokenStream(cs::CharStream)

Stateful decorator around cs to allow the use of peek (i.e. looking at the next Token without consuming it).

It uses the function next_token to create a new Token from the current position of cs onwards.
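
For example (a sketch, assuming Lexer is in scope):

    cs = Lexer.CharStream(IOBuffer("(;FF[4])"))
    ts = Lexer.TokenStream(cs)
    Lexer.peek(ts, Lexer.Token)   # first Token, produced internally via next_token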

Token(name::Char, [value::String])

An SGF-specific lexical token. It can be one of the following (a short construction sketch follows the list):

  • Token('\0'): Empty token to denote trailing whitespace.

  • Token(';'): Separator for nodes.

  • Token('(') and Token(')'): Delimiter for game trees.

  • Token('[') and Token(']'): Delimiter for property values.

  • Token('I', "AB1"): Identifier for properties. In general these are made up of one or more uppercase letters. However, with the exception of the first position, digits are also allowed to occur. This is done in order to support older FF versions.

  • Token('S', "abc 23(\)"): Any property value between '[' and ']'. This includes moves, numbers, simple text, and text.
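
The following sketch constructs one token of each kind by hand, roughly mirroring what the lexer produces for a document such as "(;B[ab])" (illustrative only, assuming Lexer is in scope):

    Lexer.Token('(')        # opens a game tree
    Lexer.Token(';')        # separates nodes
    Lexer.Token('I', "B")   # property identifier
    Lexer.Token('[')        # opens a property value
    Lexer.Token('S', "ab")  # property value (move, number, simple text, text)
    Lexer.Token(']')        # closes a property value
    Lexer.Token(')')        # closes a game tree
    Lexer.Token('\0')       # empty token denoting trailing whitespace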


Functions

peek(cs::CharStream, ::Type{Char}) -> Char

Return the next Char in cs without consuming it, which means that the next time peek or read is called, the same Char will be returned.
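
For illustration (a sketch; the read method on a CharStream is assumed to take the same type argument as peek):

    cs = Lexer.CharStream(IOBuffer("(;FF[4])"))
    Lexer.peek(cs, Char) == Lexer.peek(cs, Char)  # true: peek does not consume
    read(cs, Char)                                # now '(' is consumed
    Lexer.peek(cs, Char)                          # ';' is next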

peek(ts::TokenStream, ::Type{Token}) -> Token

Return the next Token in ts without consuming it, which means that the next time peek or read is called, the same Token will be returned.
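
For illustration (a sketch; the read method is assumed to mirror the peek signature):

    ts = Lexer.TokenStream(Lexer.CharStream(IOBuffer("(;FF[4])")))
    t1 = Lexer.peek(ts, Lexer.Token)   # look ahead without consuming
    t2 = read(ts, Lexer.Token)         # consumes; yields that same Token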

next_token(cs::CharStream) -> Token

Reads and returns the next Token from the given character stream cs. If no more tokens are available, an EOFError will be thrown.

Note that the lexer should support FF[1]-FF[4] versions. In case any unambiguously illegal character sequence is encountered, the function will throw a LexicalError.
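
A defensive sketch that distinguishes the two documented failure modes (the input is arbitrary and only for illustration; assumes Lexer is in scope):

    cs = Lexer.CharStream(IOBuffer("(;FF[4])"))
    token = try
        Lexer.next_token(cs)
    catch e
        if e isa Lexer.LexicalError
            @warn "not a valid SGF character sequence: $(sprint(showerror, e))"
            rethrow()
        elseif e isa EOFError
            nothing   # the stream is exhausted
        else
            rethrow()
        end
    end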


Exceptions

LexicalError(msg)

The string or stream passed to Lexer.next_token was not a valid sequence of characters according to the Smart Game Format specification.
