Anglr Compiler


Introduction

Anglr compiler is a tool which translates syntax and lexical rules specified in an Angler file into the set of source files written in selected programming language. It is used in two ways:

  • manually from the command prompt
  • automatically as an extension of Microsoft Visual Studio.

Currently it has only the following parameters:

  • -d - switch to activate debug messages
  • -t - switch which enables generation of source code
  • -l - switch which generates code which will detect run-time loops in generated code
  • -gen <lang> - parameter with wich we can specify the programming language of generated code. Currently only cs is valid selection for <lang>, meaning C# source code
  • <src> - name of Anglr source file, containing the syntax rules which will be translated into the set of source files of selected programming language.
All other parameters concerning the compilation process are specified in Anglr source files.

Example:

anglr -t -l -gen cs math.anglr > math-anglr.txt
                        
With this command, we translate the syntax rules in file math.anglr into a set of C# source files. Output messages are stored into the file math-anglr.txt

Generated source code

An Anglr compiler generates source code in the selected programming language using the instructions provided in the Anglr file, which makes it possible to create a text analsysis tool for text files whose content is built according to the syntax rules found in the said Anglr file. The most important product of the compiler is the source code generated in the selected programming language. The source code generated serves the following purposes:

  • source code lexical analysis for text analyses tool
  • source code syntax analyzis for text analyses tool
  • manipulation of syntax tree produced by text analyses tool

The structure of generated source code should be described in this way:

  • for every syntax rule found in any parser part of Anglr file there is a file in selected programming language containig definition of a class associated with this syntax rule
  • for every declaration part of Anglr file there is a file in selected programming language containing definition of a class associated with this part
  • for every lexical part of Anglr file there is a source file in selected programming language containing the implementation of class representing this lexical part. This source file contains also classes implementating all scanner parts of Anglr file referenced by the lexical part.
  • for every parser part of Anglr file there is a source file in a selected programming language containg the implementation of class representing this parser part.
  • implementation of class which builds syntax tree when parsing text files.
  • implementation of class which provides tree traversal methodes for syntax tree generated by parsers

What's the purpose of the generated code? With this code, we build an application that analyzes a text file whose syntax meets the rules listed in the Anglr file. All these applications are built in a similar way:

  • first, the text syntax analysis is performed. This part of the application uses classes associated with parser parts of Anglr file.
  • A lexical analyzer is intensively used during the syntax analysis of the text. Classes associated with lexer parts of Anglr files are used to implement lexical analyzers. At this point, it is worth pointing out that the compiler itself takes care of all this.
  • A syntax tree is also built during the analysis of the text. This is also done automatically, which is provided by the Anglr compiler.
  • When the syntax analysis is complete, the most important part begins: semantic analysis of the source file, in the broadest sense of that word. Semantic analysis is essentially a transformation of source text into a new form. Each syntax tree node represents part of the source text. The leaves of this tree represent terminal symbols together with their character representation, and the inner nodes represent non-terminal symbols. The inner nodes that are closer to the top of the tree represent a larger portion of the source file than those located lower. The root node represents the whole file. In this phase we use syntax tree traversal methods of special class generated by Anglr compiler. Syntax tree can be traversed as many times as needed. Any iteration can perform a different task. At every step we take while traversing through a syntax tree, we can do part of semantic analysis or transformation of the source text. As noted above, we can do as many tree traversals as needed, making as many text trasnformations as we need to do. If we perform mutually independent transformations, we can make all of them in a single traversal.

Class associated with syntax rule

Every syntax rule is associated with exactly one class implemented in selected programming language. Instances of this class represent nodes of syntax tree generated by syntax analyzer. All classes are composed in a similar way, as follows:

  • their names are a modified versions of the syntax rule names to which they belong.
  • they are subclasses of the class SyntaxTreeBase
  • enumeration production_kind enumerates all productions of class associated with particular syntax rule
  • There is one constructor for each production of syntax rule.
  • Every class has explicitely defined copy constructor
  • then the Clone() method is specified. It is wrapper for copy constructor.
  • The Dispose method is used to release a sub-tree that is rooted in node associated with this syntax rule.
  • for every production of syntax tree there is a change() method which completely rebuilds the sub-tree of node
  • the checkInclusion() method is used to verify if a node is located in a sub-tree.
  • Emit() and EmitProductionTree() methods are used to display the sub-tree of a given node
  • InvokeTraverse() method is used to traverse the sub-tree of a given node
  • reparent() method is used to move the sub-tree to another node
  • init() method initializes all properties of the node
  • every class has static counter which counts all nodes created by this class.
  • public properties of this class represent the branches emanating from the node

Let's take a look at the well-known example of syntax rule and at the class generated by Anglr compiler that belongs to it so we can get a sense of what we're talking about:

additive-expression
	: multiplicative-expression
	| additive-expression '+' multiplicative-expression
	| additive-expression '-' multiplicative-expression
	;
						

Class associated with above syntax rule is presented in text below:

namespace Math.Parser
{
	//
	// class associated with syntax rule additive-expression
	//

	public class	additive_expression : SyntaxTreeBase
	{
		public enum production_kind	// enumerated production(s) of syntax rule additive-expression
		{
			g_additive_expression_1 = 1,	// multiplicative-expression
			g_additive_expression_2 = 2,	// additive-expression add multiplicative-expression
			g_additive_expression_3 = 3,	// additive-expression sub multiplicative-expression
		};

		// Constructor declaration(s) associated with production(s) of syntax rule additive-expression

		//
		// Constructor associated with the following production(s)
		// additive-expression: multiplicative-expression
		//

		public additive_expression (multiplicative_expression p_multiplicative_expression) : base ((uint) SyntaxTreeWalker.ProductionID._additive_expression_ID, (uint) production_kind.g_additive_expression_1)
		{
			++g_counter;
			_init ();
			m_multiplicative_expression = p_multiplicative_expression;
		}

		//
		// Constructor associated with the following production(s)
		// additive-expression: additive-expression add multiplicative-expression
		// additive-expression: additive-expression sub multiplicative-expression
		//

		public additive_expression (additive_expression p_additive_expression, SyntaxTreeToken p_token, multiplicative_expression p_multiplicative_expression, int kind) : base ((uint) SyntaxTreeWalker.ProductionID._additive_expression_ID, (uint) kind)
		{
			++g_counter;
			_init ();
			switch ((production_kind) this.kind)
			{
			case production_kind.g_additive_expression_2:
				m_additive_expression = p_additive_expression;
				m_add = p_token;
				m_multiplicative_expression = p_multiplicative_expression;
				break;
			case production_kind.g_additive_expression_3:
				m_additive_expression = p_additive_expression;
				m_sub = p_token;
				m_multiplicative_expression = p_multiplicative_expression;
				break;
			default:
				{
					string[] args = new string[] { "additive_expression" };
					throw new SyntaxTreeError (SyntaxTreeError.InvalidKindError, args);
				}
			}
		}

		// Copy constructor

		public additive_expression (additive_expression p_additive_expression) : base (p_additive_expression.id, p_additive_expression.kind)
		{
			++g_counter;
			if ((m_multiplicative_expression = (p_additive_expression.m_multiplicative_expression != null) ? new multiplicative_expression (p_additive_expression.m_multiplicative_expression) : null) != null) m_multiplicative_expression.parent = this;
			if ((m_additive_expression = (p_additive_expression.m_additive_expression != null) ? new additive_expression (p_additive_expression.m_additive_expression) : null) != null) m_additive_expression.parent = this;
			if ((m_add = (p_additive_expression.m_add != null) ? new SyntaxTreeToken (p_additive_expression.m_add) : null) != null) m_add.parent = this;
			if ((m_sub = (p_additive_expression.m_sub != null) ? new SyntaxTreeToken (p_additive_expression.m_sub) : null) != null) m_sub.parent = this;
		}

		// Clone node - wrapper of copy constructor

		public override SyntaxTreeBase Clone () { return (SyntaxTreeBase) new additive_expression (this); }

		// Destructor

		public new void Dispose ()
		{
			base.Dispose ();
			--g_counter;
			if (m_multiplicative_expression != null)
			{
				m_multiplicative_expression.Dispose ();
				m_multiplicative_expression =null;
			}

			if (m_additive_expression != null)
			{
				m_additive_expression.Dispose ();
				m_additive_expression =null;
			}

			if (m_add != null)
			{
				m_add.Dispose ();
				m_add =null;
			}

			if (m_sub != null)
			{
				m_sub.Dispose ();
				m_sub =null;
			}

			_init ();
		}

		// Content changing function(s) associated with production(s) of syntax rule additive-expression

		//
		// Content changing function associated with following production(s)
		// additive-expression: multiplicative-expression
		//

		public void change(multiplicative_expression p_multiplicative_expression)
		{
			_init ();
			m_kind = (uint) production_kind.g_additive_expression_1;
			m_multiplicative_expression = p_multiplicative_expression;
			reparent (parent);
		}

		//
		// Content changing function associated with following production(s)
		// additive-expression: additive-expression add multiplicative-expression
		// additive-expression: additive-expression sub multiplicative-expression
		//

		public void change(additive_expression p_additive_expression, SyntaxTreeToken p_token, multiplicative_expression p_multiplicative_expression, int kind)
		{
			_init ();
			m_kind = (uint) kind;
			switch ((production_kind) this.kind)
			{
			case production_kind.g_additive_expression_2:
				m_additive_expression = p_additive_expression;
				m_add = p_token;
				m_multiplicative_expression = p_multiplicative_expression;
				break;
			case production_kind.g_additive_expression_3:
				m_additive_expression = p_additive_expression;
				m_sub = p_token;
				m_multiplicative_expression = p_multiplicative_expression;
				break;
			default:
				{
					string[] args = new string[] { "additive_expression" };
					throw new SyntaxTreeError (SyntaxTreeError.InvalidKindError, args);
				}
			}
			reparent (parent);
		}

		//
		// check if node 'element' is included within syntax tree associated with syntax rule additive-expression
		//

		public new bool checkInclusion (SyntaxTreeBase element)
		{
			return
				(element == this) ||
				((m_multiplicative_expression != null) && m_multiplicative_expression.checkInclusion (element)) ||
				((m_additive_expression != null) && m_additive_expression.checkInclusion (element)) ||
				((m_add != null) && m_add.checkInclusion (element)) ||
				((m_sub != null) && m_sub.checkInclusion (element)) ||
				false;
		}

		//
		// emit production tree node associated with any production of syntax rule additive-expression
		//

		public override string Emit (int depth)
		{
			string s = "";
			if (depth != 0)
			switch ((production_kind) kind)
			{
				case production_kind.g_additive_expression_1:
					// emit syntax tree node associated with production
					// additive-expression: multiplicative-expression
					s += m_multiplicative_expression.Emit (depth - 1);
				break;
				case production_kind.g_additive_expression_2:
					// emit syntax tree node associated with production
					// additive-expression: additive-expression add multiplicative-expression
					s += m_additive_expression.Emit (depth - 1);
					s += " ";
					s += m_add.Emit(depth - 1);
					s += " ";
					s += m_multiplicative_expression.Emit (depth - 1);
				break;
				case production_kind.g_additive_expression_3:
					// emit syntax tree node associated with production
					// additive-expression: additive-expression sub multiplicative-expression
					s += m_additive_expression.Emit (depth - 1);
					s += " ";
					s += m_sub.Emit(depth - 1);
					s += " ";
					s += m_multiplicative_expression.Emit (depth - 1);
				break;
			}
			return s;
		}

		//
		// emit production tree node associated with any production of syntax rule additive-expression
		//

		public override string EmitProductionTree (int depth)
		{
			string s = "additive_expression (";
			if (depth != 0)
			switch ((production_kind) kind)
			{
				case production_kind.g_additive_expression_1:
					// emit syntax tree node associated with production
					// additive-expression: multiplicative-expression
					s += m_multiplicative_expression.EmitProductionTree (depth - 1);
				break;
				case production_kind.g_additive_expression_2:
					// emit syntax tree node associated with production
					// additive-expression: additive-expression add multiplicative-expression
					s += m_additive_expression.EmitProductionTree (depth - 1);
					s += ' ';
					s += "add";
					s += ' ';
					s += m_multiplicative_expression.EmitProductionTree (depth - 1);
				break;
				case production_kind.g_additive_expression_3:
					// emit syntax tree node associated with production
					// additive-expression: additive-expression sub multiplicative-expression
					s += m_additive_expression.EmitProductionTree (depth - 1);
					s += ' ';
					s += "sub";
					s += ' ';
					s += m_multiplicative_expression.EmitProductionTree (depth - 1);
				break;
			}
			s += ")";
			return s;
		}

		//
		// traverse sub-tree rooted in this node using selected syntax tree walker
		//

		public override void InvokeTraverse (SyntaxTreeWalker syntaxTreeWalker)
		{
			syntaxTreeWalker.Traverse (this);
		}

		// Change parent node - move sub-tree

		public override void reparent (SyntaxTreeBase parent)
		{
			this.parent = parent;
			if (m_multiplicative_expression != null)
				m_multiplicative_expression.reparent (this);

			if (m_additive_expression != null)
				m_additive_expression.reparent (this);

			if (m_add != null)
				m_add.reparent (this);

			if (m_sub != null)
				m_sub.reparent (this);

		}

		//
		// initialize object associated with syntax rule additive-expression
		//

		public void _init ()
		{
			m_multiplicative_expression = null;
			m_additive_expression = null;
			m_add = null;
			m_sub = null;
		}

		// counter of all nodes associated with syntax rule additive-expression
		public static int g_counter;

		// objects associated with terminal and non-terminal symbols within production(s) of syntax rule additive-expression
		public multiplicative_expression m_multiplicative_expression { get; private set; }
		public additive_expression m_additive_expression { get; private set; }
		public SyntaxTreeToken m_add { get; private set; }
		public SyntaxTreeToken m_sub { get; private set; }

	};

	public partial class SyntaxTreeWalker
	{

		// delegate function (callback) prototype associated with syntax rule additive-expression
		public delegate bool additive_expression_Callback (SyntaxTreeCallbackReason reason, additive_expression.production_kind kind, additive_expression p_additive_expression);

		// event associated with syntax rule additive-expression
		public event additive_expression_Callback additive_expression_Event;

		// event trigger associated with syntax rule additive-expression
		public bool Raise_additive_expression_Event (SyntaxTreeCallbackReason reason, additive_expression.production_kind kind, additive_expression p_additive_expression)
		{
			bool? status = additive_expression_Event?.Invoke (reason, kind, p_additive_expression);
			return (status == null) || status.Value;
		}
		//
		// traverse syntax tree node associated with any production of syntax rule additive-expression
		//

		public void Traverse (additive_expression p_additive_expression)
		{
			if (p_additive_expression.isLocked())
				return;
			p_additive_expression.dolock();
			additive_expression.production_kind kind = (additive_expression.production_kind) p_additive_expression.kind;
			p_additive_expression.turn_reset ();
			if (Raise_additive_expression_Event (SyntaxTreeCallbackReason.TraversalPrologueCallbackReason, kind, p_additive_expression))
			switch (kind)
			{
				case additive_expression.production_kind.g_additive_expression_1:
					// traverse syntax tree node associated with production
					// additive-expression: multiplicative-expression
					if (Raise_additive_expression_Event (SyntaxTreeCallbackReason.TraversalMidTermCallbackReason, kind, p_additive_expression))
						Traverse (p_additive_expression.m_multiplicative_expression);
					p_additive_expression.turn_inc ();
					Raise_additive_expression_Event (SyntaxTreeCallbackReason.TraversalMidTermCallbackReason, kind, p_additive_expression);
				break;
				case additive_expression.production_kind.g_additive_expression_2:
					// traverse syntax tree node associated with production
					// additive-expression: additive-expression add multiplicative-expression
					if (Raise_additive_expression_Event (SyntaxTreeCallbackReason.TraversalMidTermCallbackReason, kind, p_additive_expression))
						Traverse (p_additive_expression.m_additive_expression);
					p_additive_expression.turn_inc ();
					if (Raise_additive_expression_Event (SyntaxTreeCallbackReason.TraversalMidTermCallbackReason, kind, p_additive_expression))
						Traverse (p_additive_expression.m_multiplicative_expression);
					p_additive_expression.turn_inc ();
					Raise_additive_expression_Event (SyntaxTreeCallbackReason.TraversalMidTermCallbackReason, kind, p_additive_expression);
				break;
				case additive_expression.production_kind.g_additive_expression_3:
					// traverse syntax tree node associated with production
					// additive-expression: additive-expression sub multiplicative-expression
					if (Raise_additive_expression_Event (SyntaxTreeCallbackReason.TraversalMidTermCallbackReason, kind, p_additive_expression))
						Traverse (p_additive_expression.m_additive_expression);
					p_additive_expression.turn_inc ();
					if (Raise_additive_expression_Event (SyntaxTreeCallbackReason.TraversalMidTermCallbackReason, kind, p_additive_expression))
						Traverse (p_additive_expression.m_multiplicative_expression);
					p_additive_expression.turn_inc ();
					Raise_additive_expression_Event (SyntaxTreeCallbackReason.TraversalMidTermCallbackReason, kind, p_additive_expression);
				break;
			}
			Raise_additive_expression_Event (SyntaxTreeCallbackReason.TraversalEpilogueCallbackReason, kind, p_additive_expression);
			p_additive_expression.unlock();
		}

		//
		// traverse syntax tree node associated with any production of syntax rule additive-expression
		//

		public void TraverseCommon (additive_expression p_additive_expression)
		{
			additive_expression.production_kind kind = (additive_expression.production_kind) p_additive_expression.kind;
			if (Raise_Common_Event (SyntaxTreeCallbackReason.TraversalPrologueCallbackReason, (int) kind, p_additive_expression))
			switch (kind)
			{
				case additive_expression.production_kind.g_additive_expression_1:
					// traverse syntax tree node associated with production
					// additive-expression: multiplicative-expression
					TraverseCommon (p_additive_expression.m_multiplicative_expression);
				break;
				case additive_expression.production_kind.g_additive_expression_2:
					// traverse syntax tree node associated with production
					// additive-expression: additive-expression add multiplicative-expression
					TraverseCommon (p_additive_expression.m_additive_expression);
					TraverseCommon (p_additive_expression.m_add);
					TraverseCommon (p_additive_expression.m_multiplicative_expression);
				break;
				case additive_expression.production_kind.g_additive_expression_3:
					// traverse syntax tree node associated with production
					// additive-expression: additive-expression sub multiplicative-expression
					TraverseCommon (p_additive_expression.m_additive_expression);
					TraverseCommon (p_additive_expression.m_sub);
					TraverseCommon (p_additive_expression.m_multiplicative_expression);
				break;
			}
			Raise_Common_Event (SyntaxTreeCallbackReason.TraversalEpilogueCallbackReason, (int) kind, p_additive_expression);
		}
	};

	public partial class SyntaxTreeBuilder : SyntaxTreeWalker
	{

		//
		// create syntax tree node associated with production
		// additive-expression: multiplicative-expression
		//

		public additive_expression additive_expression_1 (multiplicative_expression p_multiplicative_expression)
		{
			additive_expression p_additive_expression_ref = new additive_expression(p_multiplicative_expression);
			p_multiplicative_expression.parent = p_additive_expression_ref;
			Raise_additive_expression_Event (SyntaxTreeCallbackReason.BuilderCallbackReason, additive_expression.production_kind.g_additive_expression_1, p_additive_expression_ref);
			return p_additive_expression_ref;
		}

		//
		// create syntax tree node associated with production
		// additive-expression: additive-expression add multiplicative-expression
		//

		public additive_expression additive_expression_2 (additive_expression p_additive_expression, SyntaxTreeToken p_token, multiplicative_expression p_multiplicative_expression)
		{
			additive_expression p_additive_expression_ref = new additive_expression(p_additive_expression, p_token, p_multiplicative_expression, 2);
			p_additive_expression.parent = p_additive_expression_ref;
			p_token.parent = p_additive_expression_ref;
			p_multiplicative_expression.parent = p_additive_expression_ref;
			Raise_additive_expression_Event (SyntaxTreeCallbackReason.BuilderCallbackReason, additive_expression.production_kind.g_additive_expression_2, p_additive_expression_ref);
			return p_additive_expression_ref;
		}

		//
		// create syntax tree node associated with production
		// additive-expression: additive-expression sub multiplicative-expression
		//

		public additive_expression additive_expression_3 (additive_expression p_additive_expression, SyntaxTreeToken p_token, multiplicative_expression p_multiplicative_expression)
		{
			additive_expression p_additive_expression_ref = new additive_expression(p_additive_expression, p_token, p_multiplicative_expression, 3);
			p_additive_expression.parent = p_additive_expression_ref;
			p_token.parent = p_additive_expression_ref;
			p_multiplicative_expression.parent = p_additive_expression_ref;
			Raise_additive_expression_Event (SyntaxTreeCallbackReason.BuilderCallbackReason, additive_expression.production_kind.g_additive_expression_3, p_additive_expression_ref);
			return p_additive_expression_ref;
		}
	};
}
						

Class name

Name of the class associated with syntax rule is similar to the name of syntax rule. The compiler is trying to make a class name that is as similar as possible to the syntax rule name. If the syntax rule name contains characters that should not be contained within variable names in the selected programing language, it replaces them with an underscore. If syntax rule names does not contain such characters then their associated class names are equal to their names.

For example, let's take a look at a well-known example of a syntax rule:

additive-expression
    : multiplicative-expression
    | additive-expression '+' multiplicative-expression
    | additive-expression '-' multiplicative-expression
    ;
                        
Since the name of syntax rule additive-expression contains character '-' that should not be used to compose variable names in most programming languages, Anglr compiler will turn it into character '_' when it composes the associated class name. The name of associated class would then be additive_expression.

Now let's imagine that the syntax rule name would be equal to <additive expression>. In this case the associated class name would be equal to _additive_expression_, since characters '<', '>' and ' ' should not be used to compose variable names in most programming languages.

Super class SyntaxTreeBase
Enumeration of productions
Constructors

Productions of some syntax rule having the same length and whose non-terminal symbols at the same position have the same names shall have the same constructors. In example above this is the case with the following productions:

    additive-expression '+' multiplicative-expression
    additive-expression '-' multiplicative-expression
                        

Copy constructor
Clone() method
Dispose() method
change() methods
checkInclusion () method
Emit () method
EmitProductionTree () method
InvokeTraverse () method
reparent () method
_init () method
Node counter
Public properties

Class associated with declaration part of Anglr file

Class associated with scanner part of Anglr file

Class associated with lexer part of Anglr file

Class associated with parser part of Anglr file

Class associated with syntax tree builder

Class associated with syntax tree traversal