Currently I’m going through AST, but the documentation is scarce. AST is in-memory tree representation of code that can be modified and is used while creating syntactic macros. I was wondering what is the best way to check how syntax is transformed to AST when I came across Luis Diego Fallas’ post in which he shows a little application that parses Boo source file and displays AST in tree control. Unfortunately there are some issues with this application. Firstly, it contains a bug and doesn’t show full syntax tree. Secondly, it only displays node names without information what code the node represents. So I decided to modify his code and fix this. Here is a sample screenshot:
There is AST tree structure of code on the left and information about selected subtree on the right. It includes the code itself and serialized form of the subtree. Name of the file to parse is passed to the application as a parameter.
Visitor pattern is employed to visit all nodes in AST tree. Unfortunately deriving from Boo’s DepthFirstVisitor is a tedious task, because there are as many methods to override as there are AST node types. Luis proposed a solution basing on dynamically created type, that derives from DepthFirstVisitor. Method overrides are constructed in-memory from AST nodes and then the dynamic type is compiled. While being compact, this solution is not very readable, especially when you’re not familiar with AST. Fortunately there seems to be alternative way to do this, namely the mysterious [| <code here> |] syntax. I think it’s called quasiquotes. I couldn’t find any documentation for this, only one example in Boo package. From what I was able to figure out, this code (where m is instance of MethodInfo class):
// Node for TreeNode('') tnCreation = MethodInvocationExpression(ReferenceExpression("TreeNode")) tnCreation.Arguments.Add( StringLiteralExpression(m.GetParameters()[0].ParameterType.Name)) // Create node for 't = TreeNode('')' decl = DeclarationStatement( Declaration: Declaration(Name:"t", Type: SimpleTypeReference("System.Windows.Forms.TreeNode")), Initializer: tnCreation)
is equivalent to this code:
decl = [| t = TreeNode($(m.GetParameters()[0].ParameterType.Name)) |]
So there is a huge improvement. Code between [| and |] is being transformed to AST and assigned to a variable. The $() construct is similar to string interpolation construct. It allows you to insert a string from variable as if it was part of the code (here it is a parameter of the TreeNode constructor, but it also can be a method name for example). When using this syntax, you have to be careful about the indentation. It seems that statements can be in the same line as [|, but when declaring members, they have to be in new line and indented. Otherwise you get compilation errors.
Luis has overridden “Enter*” and “Leave*” (for example EnterBlockExpression) methods of DepthFirstVisitor, which is a problem because “Enter*” and ”Leave*” methods exist only for nodes that can contain subnodes. So for example there is no Enter method for IntegerLiteralExpression node. In my code I override the “On*” methods.
Here is the full code:
import System import System.Reflection import System.IO import Boo.Lang.Compiler import Boo.Lang.Compiler.Ast import Boo.Lang.Compiler.Ast.Visitors import Boo.Lang.Parser import System.Windows.Forms import System.Xml.Serialization import System.Drawing // Get the type instance for DepthFirstVisitor a = Assembly.Load("Boo.Lang.Compiler") visitorType = a.GetType("Boo.Lang.Compiler.Ast.DepthFirstVisitor") // Create a class definition for a custom visitor myVisitor = ClassDefinition(Name: "DynamicVisitor") myVisitor.BaseTypes.Add(SimpleTypeReference("Boo.Lang.Compiler.Ast.DepthFirstVisitor")) // Add a new field for our node stack stackField = [| public stack as System.Collections.Stack |] myVisitor.Members.Add(stackField) // Override On* methods for m as MethodInfo in [m for m in visitorType.GetMethods() if m.Name.StartsWith("On")]: method = [| def $(m.Name)(node as $(m.GetParameters()[0].ParameterType.FullName)): t = TreeNode($(m.GetParameters()[0].ParameterType.Name)) t.Tag = node; (stack.Peek() as TreeNode).Nodes.Add(t) stack.Push(t) // call method in superclass to visit subtrees super.$(m.Name)(node) stack.Pop() |] myVisitor.Members.Add(method) // Compile unit and module creation cu = CompileUnit() mod = Boo.Lang.Compiler.Ast.Module(Name:"Module") cu.Modules.Add(mod) mod.Imports.Add(Import(Namespace:"System")) mod.Imports.Add(Import(Namespace:"System.Windows.Forms")) mod.Imports.Add(Import(Namespace:"Boo.Lang.Compiler.Ast")) mod.Members.Add(myVisitor) // Initialize compiler preferences pipeline = Pipelines.CompileToMemory() ctxt = CompilerContext(cu) // Run the compiler pipeline.Run(ctxt) // Check the results if(ctxt.GeneratedAssembly == null): for e in ctxt.Errors: print e Console.ReadLine() return // Create an instance of the new visitor and initialize it visitorInstance as object = ctxt.GeneratedAssembly.CreateInstance("DynamicVisitor"); visitorType = ctxt.GeneratedAssembly.GetType("DynamicVisitor") tStack = System.Collections.Stack() tStack.Push(TreeNode("root")) visitorType.GetField("stack").SetValue(visitorInstance, tStack); // Parse a file an run the visitor compileUnit = BooParser.ParseFile(argv[0]) compileUnit.Accept(visitorInstance) //Initialize a form an show the results frm = Form(Text: "AST content", Width: 800, Height: 600) outerSplit = SplitContainer(Dock: DockStyle.Fill, Orientation: Orientation.Vertical) innerSplit = SplitContainer(Dock: DockStyle.Fill, Orientation: Orientation.Horizontal) definitionText = TextBox(Dock: DockStyle.Fill, Multiline: true, ScrollBars: ScrollBars.Both, Font: Font("Courier New", 8.0)) xmlText = TextBox(Dock: DockStyle.Fill, Multiline: true, ScrollBars: ScrollBars.Both, Font: Font("Courier New", 8.0)) tv = TreeView(Dock: DockStyle.Fill) tv.AfterSelect += def (sender as object, e as TreeViewEventArgs): if e.Node.Tag != null: writer = StringWriter() BooPrinterVisitor(writer).Visit(e.Node.Tag as Node) definitionText.Text = writer.ToString() serializer = XmlSerializer(e.Node.Tag.GetType()) writer = StringWriter() serializer.Serialize(writer, e.Node.Tag) xmlText.Text = writer.ToString() tv.Nodes.Add(tStack.Pop() as TreeNode) frm.Controls.Add(outerSplit) outerSplit.Panel1.Controls.Add(tv) outerSplit.Panel2.Controls.Add(innerSplit) innerSplit.Panel1.Controls.Add(definitionText) innerSplit.Panel2.Controls.Add(xmlText) Application.Run(frm)
Some highlights:
- There is a BooPrinterVisitor class in Boo.Lang.Compiler.Ast.Visitors namespace, that converts AST to source code. Unfortunately this class doesn’t support quasiquotes, so passing a file with quasiquotes will result in exception.
- The file passed to the application doesn’t necessarily have to be compilation error free. It can contain macros, that are not yet defined. This way you can write your desired syntax, and then use this application to see how it is parsed. This should make macro implementation easier.
