Created by: gwideman, May 8, 2012 5:47 pm
Revised by: gwideman, Oct 1, 2012 1:49 pm (21 revisions)

Overview

Some notes gathered during the process of installing and testing ANTLR 4, and reviewing the exercises in DAR2.

Preliminaries

  • Page numbers are relative to PDF of 2012-04-18.
  • Sample code for book: code_antlr2ed_20120430.zip
  • ANTLR 4 version: antlr4-review-1.jar of 2012-05-08
  • In the following, I isolated the code for each example into its own separate IntelliJ module, so that each could be examined separately, without the clutter of the many other source files.

Chapter 1

"Install Java"

DAR2-1.3 says "you've got to have Java installed" (1.6 or above). The narrative should elaborate:
  • The ANTLR tool itself, being written in Java, requires a Java runtime, and is run with a command like
    • java -jar /path/to/antlr-jar.jar
      • (Within that jar, evidently the antlr tool corresponds to the executable class.)
  • The ANTLR tool can generate parsers in a variety of languages. If you are creating a parser for a Java project, then for that project you'll need a Java SDK (JDK) installed, not just Java runtime. And between the various "editions" of Java, it's Java SE (Standard Edition) that we want.
  • When Java is installed (either the runtime or the JDK), the Sun installer inexplicably doesn't add it to the Windows PATH (there may be a similar issue on other platforms), so java cannot be launched unless your command specifies the full path to the java executable. That needs to be addressed.
Helpful docs?
So, here's the blow-by-blow for installing Java.
  • Step -1: DON'T Install Java from http://www.java.com/en/, "Free Java Download" button.
  • Step 0: If you did step -1 by mistake, uninstall what you just did, because that's just the JRE, and you probably want the full JDK.
  • Step 1, download "Java Platform (JDK) 7u4" (or whatever XuY is the latest) from http://www.oracle.com/technetwork/java/javase/downloads/index.html, and install.
    • I chose all default installation directories
      • Which are in C:\Program Files (x86), so of course include spaces. Also, the JDK uses a version-specific directory name, so the PATH entry is going to be version-specific and likely to break on update.
    • A JRE runtime got installed at the same time -- that probably duplicates what I installed and uninstalled in steps -1 and 0.
    • JavaFX gets installed at the same time.
  • Step 2: Fix PATH
    • I used Redfern Software's Path Editor to make this much easier
    • I added the following directory: C:\Program Files (x86)\Java\jdk1.7.0_04\bin
      • ...which seems prone to breaking when Java updates. Grrrrr.
  • Step 3: CLASSPATH
    • I noted that CLASSPATH = ".;C:\Program Files (x86)\QuickTime\QTSystem\QTJava.zip", so presumably the Java install didn't add anything exciting, and it was Apple's installer that set this up earlier.
  • Step 4: Test
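
For Step 4, one quick check is a tiny program that reports which Java is actually being launched and whether a compiler is reachable from it. This is only a sketch of my own, not something from DAR2; the class name JavaCheck is hypothetical. It relies on javax.tools.ToolProvider.getSystemJavaCompiler(), which returns null when no compiler is available (typically a JRE-only install).

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

class JavaCheck {
    /** Builds a short report about the Java installation running this code. */
    static String report() {
        // null means no compiler is reachable, which usually indicates a JRE-only install
        JavaCompiler compiler = ToolProvider.getSystemJavaCompiler();
        return "java.version = " + System.getProperty("java.version") + "\n"
             + "java.home    = " + System.getProperty("java.home") + "\n"
             + "compiler     = " + (compiler != null ? "available (JDK)" : "not found (JRE only?)");
    }

    public static void main(String[] args) {
        System.out.println(report());
    }
}
```

Being able to run javac JavaCheck.java without a full path already shows that the JDK's bin directory is on PATH; java JavaCheck then shows which runtime the PATH resolves to.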

Test execution of ANTLR4

  • Basic smoke test: java -jar F:\Proj_ANTLR\20120430\antlr4-review-1.jar
    • Error: "no main manifest attribute, in F:\Proj_ANTLR\20120430\antlr4-review-1.jar"
    • Hmmm. OK, needed to fix MANIFEST.MF to add:
      • Main-Class: org.antlr.v4.Tool
  • Now it runs and prints the help listing
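
The "no main manifest attribute" error means the jar's META-INF/MANIFEST.MF has no Main-Class entry, so java -jar doesn't know which class to start. A rough way to inspect that entry from Java, using only java.util.jar, is sketched below. The class name ManifestCheck and its helper methods are hypothetical; with no argument, the demo writes a throwaway jar just to show the round trip.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.jar.Attributes;
import java.util.jar.JarFile;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

class ManifestCheck {
    /** Returns the Main-Class declared in the jar's manifest, or null if there is none. */
    static String mainClassOf(File jar) {
        try (JarFile jf = new JarFile(jar)) {
            Manifest mf = jf.getManifest(); // null if the jar has no manifest at all
            return mf == null ? null : mf.getMainAttributes().getValue(Attributes.Name.MAIN_CLASS);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Writes a throwaway jar declaring mainClass, then reads the declaration back. */
    static String roundTrip(String mainClass) {
        try {
            File jar = File.createTempFile("manifest-demo", ".jar");
            jar.deleteOnExit();
            Manifest mf = new Manifest();
            // Manifest-Version must be present, or no main attributes are written
            mf.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
            if (mainClass != null) {
                mf.getMainAttributes().put(Attributes.Name.MAIN_CLASS, mainClass);
            }
            // The constructor writes META-INF/MANIFEST.MF; no other entries are needed here
            new JarOutputStream(new FileOutputStream(jar), mf).close();
            return mainClassOf(jar);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length > 0) {
            System.out.println("Main-Class: " + mainClassOf(new File(args[0])));
        } else {
            System.out.println("Main-Class: " + roundTrip("org.antlr.v4.Tool"));
        }
    }
}
```

Run it with the path to a jar as the only argument to see what, if anything, that jar declares as its Main-Class.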

More thoughts about ANTLR location and CLASSPATH

Looking ahead I see that we'll need the ANTLR jar in a "common" location, with a CLASSPATH pointing to it. So, for lack of a more obvious option, I set things up to parallel DAR2's unix instructions:
  • Created c:\usr\local\lib\
  • Placed antlr4-review-1.jar within
  • Created antlr.bat: java -jar C:\usr\local\lib\antlr4-review-1a.jar %*
  • Added to system environment variable CLASSPATH (note leading dot):
    • .;C:\usr\local\lib\antlr4-review-1a.jar;C:\Program Files (x86)\QuickTime\QTSystem\QTJava.zip;
  • Created gtest.bat: java org.antlr.v4.runtime.misc.TestRig %*
  • bat files in C:\_Commands dir (my convention), which is listed in PATH
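
Note the difference between the two bat files: with java -jar, the named jar is the only source of user classes and CLASSPATH is ignored, whereas gtest.bat names a class directly and so depends on CLASSPATH containing the ANTLR jar. To check what the JVM actually sees, something like the following sketch can help. The class name ClassPathCheck is hypothetical; it just splits the class path and reports which entries exist on disk.

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

class ClassPathCheck {
    /** Splits a class path string on the given separator, dropping empty entries. */
    static List<String> entries(String classPath, String separator) {
        List<String> result = new ArrayList<>();
        for (String e : classPath.split(Pattern.quote(separator))) {
            String trimmed = e.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // java.class.path reflects CLASSPATH unless -cp or -jar overrides it
        String cp = System.getProperty("java.class.path");
        for (String e : entries(cp, File.pathSeparator)) {
            System.out.println((new File(e).exists() ? "found    " : "MISSING  ") + e);
        }
    }
}
```

A MISSING line for the ANTLR jar would explain a "could not find or load main class" failure from gtest.bat.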

Exercise 1.5

  • cd to exercise dir Ex-01.5
  • antlr4 Hello.g4
    • produces expected files
  • javac *.java
    • produces *.class files
  • Test using: gtest Hello r -tokens
    • Waits for input. Entered sample input: hello parrt<Enter><Ctrl-Z>
    • Returns message that looks like:
      • 'ine 1:11 token recognition error at: '
    • but is actually
      • line 1:11 token recognition error at: '<CR>'
    • So, evidently this code doesn't discard CR.
    • Also printed the parse results, as expected
  • I revised Hello.g4 to add \r to the WS (whitespace) rule.
    • This fixed the error
  • -print test -- OK
  • -gui test -- OK
  • Compiling and running the test parser:
    • javac Hello*.java Test.java
    • java Test
    • hello world
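
About the carriage-return error above: on Windows, pressing Enter delivers \r\n, and the garbled-looking message is just the console obeying the \r (the cursor returns to column 0 and the closing quote overprints the "l" of "line"). The sketch below imitates the situation without ANTLR. It assumes, for illustration only, that the grammar accepts lowercase letters and that the original WS rule covered space, tab and newline; CrDemo and firstUnrecognized are hypothetical names.

```java
class CrDemo {
    /**
     * Returns the index of the first character that is neither a lowercase
     * letter nor one of the given whitespace characters, or -1 if there is none.
     */
    static int firstUnrecognized(String input, String whitespace) {
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            boolean isLetter = c >= 'a' && c <= 'z';
            if (!isLetter && whitespace.indexOf(c) < 0) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        String typed = "hello parrt\r\n"; // what a Windows console delivers for <Enter>
        System.out.println(firstUnrecognized(typed, " \t\n"));   // prints 11
        System.out.println(firstUnrecognized(typed, " \t\r\n")); // prints -1
    }
}
```

The index 11 lines up with the "line 1:11" position in the error message, and adding \r to the whitespace set makes the problem go away, as it did in the grammar.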

Set up IntelliJ IDEA IDE for use with ANTLR

It seemed like a good idea to work in an IDE for ease of browsing, obtaining class diagrams, running with a debugger, and so on. I decided to try IntelliJ IDEA v11.1.1 for this purpose. But there were some issues.
  • Project structure
    • IntelliJ has a particular Project > Module structure it wants to follow, with some configurability, but also some obstacles. It's pretty much necessary to take the DAR sample files and arrange them in modules corresponding to distinct executables.
    • There seemed to be no way to use IDEA to compile output .class files in the same directory as the source files. See this discussion: .class output in same dir as source? Bug?
      • Not fatal, just a bit tedious. Can put data files and so on in the module's output directory, and run there.
  • The IDE's console window doesn't pass Ctrl-Z through to the running program. That defeats the ANTLR samples that depend on input from the console and require keying in Ctrl-Z as End-Of-File.
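
One way around the Ctrl-Z problem is to avoid the console for input altogether: a file supplies end-of-file by itself, which is also why later sections of these notes pass an input file name to the test rig. The sketch below shows the idea for a hand-written launcher. ReadAll is a hypothetical name, and UncheckedIOException needs Java 8 or later.

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;

class ReadAll {
    /** Reads characters until the stream reports end of input. */
    static String readAll(InputStream in) {
        try {
            Reader r = new InputStreamReader(in, StandardCharsets.UTF_8);
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = r.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            return sb.toString();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // With a file argument, EOF comes from the file itself; no Ctrl-Z is needed.
        // Without one, the program falls back to the console and waits for EOF there.
        try (InputStream in = args.length > 0 ? new FileInputStream(args[0]) : System.in) {
            System.out.print(readAll(in));
        }
    }
}
```

In an IDE run configuration, the file name can go in the program arguments, so the console never has to deliver an end-of-file key.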

Other IntelliJ IDEA notes in passing

  • Dependencies on external libraries:
    • First, declare a particular external library at the Project level.
    • For each Module set a dependency on that Project-known library
      • Module settings > Modules > Select module > Dependency tab > [+] Plus button > Library

Chapter 2 Tour

Exercise 2.1

  • I skipped Expr.g4 and went directly to LibExpr.g4 (what is the significance of the "Lib" prefix -- seems a bit random. Hmmm, maybe it means "The version of Expr that uses lexer tokens from a Lib-rary", though the narrative doesn't describe CommonLexerRules.g4 as a library.)
  • Compiled and ran OK

Section 2.1 Handling Erroneous Input

  • My dir Ex-02.01
  • Example console session references ExprJoyRide, with no prior mention of instructions for compiling it. (But it would have got compiled by previous javac *.java steps.)
  • Turns out it depends on classes ExprLexer and ExprParser, which come from Expr.g4, so I shouldn't have skipped that exercise.
  • OK, I threw Expr.g4 into the module for this project, ANTLR-ized it, and recompiled.
  • java ExprJoyRide produces error as expected and as displayed in DAR's sample output.
  • DAR's narrative goes on to discuss the gui parse tree output, but doesn't show the commands necessary.
    • gtest LibExpr prog -gui ... and then enter some bogus input (as was tried with ExprJoyRide).
    • So ExprJoyRide is somewhat of a non-sequitur. I think its purpose is mostly just as a test program to call the parser not using the TestRig framework. But why doesn't it use LibExpr? I suppose no big deal.

Exercise 2.2 Calculator using a Visitor

  • My dir Ex-02.02
  • In following along step-by-step, the narrative on p30 says "let's see what ANTLR generates for us", which might prompt the reader to use the same antlr commands as previously. In fact, you have to use other command options, as exemplified two pages later in "And here is the build and test sequence..."
    • antlr4 -no-listener -visitor LabeledExpr.g4
  • I notice that Parser._BuildParseTrees defaults to true. So parser.setBuildParseTree(true); is unnecessary
  • p32: cat command is not relevant on Windows. I created an equivalent text file with an editor.
  • Runs and produces output as expected
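
For readers new to the pattern, here is a hand-written miniature of what a visitor-based calculator does. This is not ANTLR's generated code and uses none of its classes; every name here (VisitorSketch, IntNode, AddNode, MulNode, EvalVisitor) is hypothetical. The point it illustrates is that a visitor calls accept on the children itself and combines the returned values.

```java
class VisitorSketch {
    interface Visitor<T> {
        T visitInt(IntNode n);
        T visitAdd(AddNode n);
        T visitMul(MulNode n);
    }

    interface Node {
        <T> T accept(Visitor<T> v);
    }

    static final class IntNode implements Node {
        final int value;
        IntNode(int value) { this.value = value; }
        public <T> T accept(Visitor<T> v) { return v.visitInt(this); }
    }

    static final class AddNode implements Node {
        final Node left, right;
        AddNode(Node left, Node right) { this.left = left; this.right = right; }
        public <T> T accept(Visitor<T> v) { return v.visitAdd(this); }
    }

    static final class MulNode implements Node {
        final Node left, right;
        MulNode(Node left, Node right) { this.left = left; this.right = right; }
        public <T> T accept(Visitor<T> v) { return v.visitMul(this); }
    }

    /** Evaluates a tree by visiting children explicitly and combining their results. */
    static final class EvalVisitor implements Visitor<Integer> {
        public Integer visitInt(IntNode n) { return n.value; }
        public Integer visitAdd(AddNode n) { return n.left.accept(this) + n.right.accept(this); }
        public Integer visitMul(MulNode n) { return n.left.accept(this) * n.right.accept(this); }
    }

    static int eval(Node tree) {
        return tree.accept(new EvalVisitor());
    }

    public static void main(String[] args) {
        // Tree for 1 + 2 * 3
        Node tree = new AddNode(new IntNode(1), new MulNode(new IntNode(2), new IntNode(3)));
        System.out.println(eval(tree)); // prints 7
    }
}
```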

Exercise 2.3 Building a Translator, using a Listener

  • Begins page 33
  • Main class is ExtractInterfaceTool.java
  • My dir Ex-02.03
  • Complete build + test sequence is on page 35.
    • Copy Extract*.java and Java.g4 to project src dir
    • Follow page 35 with appropriate mods for IntelliJ IDEA IDE
    • Copy Demo.java to output dir
    • Runs as expected

Exercise 2.4 Actions in parse phase

  • Begins page 36
  • My dir Ex-02.04
  • No launcher class, tested from test rig
  • Copy
    • ActionExpr.g4
    • t.expr from Ex 2.2 --> out
  • Runs as expected

Exercise 2.4b Altering the Parse with Semantic Predicates

  • Begins page 38
  • My dir Ex-02.04b
  • Tested from test rig
  • Copy
    • Data.g4
    • t.data --> out
  • Runs as expected

Ex 2.5a Island grammars

  • Begins page 39
  • My dir Ex-02.05a
  • Tested from test rig
  • Copy
    • XML.g4
    • t.xml --> out
  • Commands on p41
    • Runs almost as expected
    • Slight difference -- first line of output in book is omitted in output I see. My @0 is book's @1

Ex 2.5b Rewrite the input stream

  • Begins p41
  • My dir Ex-02.05b
  • Main is InsertSerialID.java
  • Copy
    • Insert*.java
    • Java.g4
    • Demo.java --> out
  • Commands on p42
  • Creates output like p43, but not including the "public static final..." line, so the demo is broken
    • InsertSerialID.java does not match pdf, specifically uses CommonTokenStream instead of TokenRewriteStream
    • Additionally, InsertSerialIDListener doesn't match pdf, uses TokenStreamRewriter instead of TokenRewriteStream, etc.
    • So basically major mismatch between book and demo code.
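
Whichever class name is current, the underlying idea of this demo is to leave the token stream untouched, record insertions against token positions, and apply them only when the text is rendered. The sketch below is a hand-written analogue of that idea, not ANTLR's TokenStreamRewriter or TokenRewriteStream API; RewriterSketch and its methods are hypothetical, and the inserted field text is only an illustrative guess at the kind of line the demo adds.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class RewriterSketch {
    private final List<String> tokens;
    private final Map<Integer, StringBuilder> insertsAfter = new HashMap<>();

    RewriterSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    /** Records text to emit after the token at index; the token list itself is not modified. */
    void insertAfter(int index, String text) {
        StringBuilder sb = insertsAfter.get(index);
        if (sb == null) {
            sb = new StringBuilder();
            insertsAfter.put(index, sb);
        }
        sb.append(text);
    }

    /** Renders the original tokens with all recorded insertions applied. */
    String getText() {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < tokens.size(); i++) {
            out.append(tokens.get(i));
            StringBuilder ins = insertsAfter.get(i);
            if (ins != null) {
                out.append(ins);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        RewriterSketch rw = new RewriterSketch(
                Arrays.asList("class", " ", "Demo", " ", "{", "\n", "}"));
        // Illustrative only: insert a field right after the opening brace (token 4)
        rw.insertAfter(4, "\n\tpublic static final long serialVersionUID = 1L;");
        System.out.println(rw.getText());
    }
}
```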

Chapter 3 Designing Grammars

There are demo grammar files and corresponding input samples in the examples dir that's shared with Ch 4. No complete modules in this chapter, though the grammars could be tested with the test rig.

Chapter 4 Exploring some real grammars

  • 4.1 CSV
  • 4.2 JSON
  • 4.3 DOT (Graphviz)
  • 4.4 Cymbol
  • 4.5 R

Section 4.1 Parsing CSV

  • Begins p72
  • My dir Ex-04.01
  • Tested using rig
  • Copy
    • CSV.g4
    • CSV-input --> out
  • Commands on p73
    • -tokens option produces output as expected
    • -print option produces correct output info, but badly formatted as there are no linebreaks (contrary to what's printed in the PDF).
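
To make single-line tree output readable, a small post-processing step can put each parenthesized subtree on its own line. The sketch below assumes the tree text is LISP-style, with parentheses used only for nesting (a literal parenthesis inside a token would confuse it). TreeIndenter is a hypothetical name, and the sample string in main is made up for illustration rather than copied from the test rig.

```java
class TreeIndenter {
    /** Puts each parenthesized subtree on its own line, indented two spaces per nesting level. */
    static String indent(String tree) {
        StringBuilder out = new StringBuilder();
        int depth = 0;
        for (int i = 0; i < tree.length(); i++) {
            char c = tree.charAt(i);
            if (c == '(') {
                // Drop the spaces that separated this subtree from the preceding text
                while (out.length() > 0 && out.charAt(out.length() - 1) == ' ') {
                    out.setLength(out.length() - 1);
                }
                if (out.length() > 0) {
                    out.append('\n');
                    for (int d = 0; d < depth; d++) {
                        out.append("  ");
                    }
                }
                out.append('(');
                depth++;
            } else if (c == ')') {
                out.append(')');
                depth--;
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(indent("(file (hdr (row (field a) , (field b))) (row (field 1) , (field 2)))"));
    }
}
```

For example, indent("(a (b c) (d e))") yields three lines: "(a", then "  (b c)", then "  (d e))".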

Section 4.2 JSON

  • Begins p74
  • My dir Ex-04.02
  • Test using rig
  • Copy
    • JSON.g4
    • JSON-input --> out
  • Commands
    • Missing command for running antlr
    • Testing commands on p80 (two sets, for -tokens and -print)
    • Input requires Ctrl-Z (so not feasible in IntelliJ)
    • Output matches that shown in pdf

Section 4.3 DOT (Graphviz)

Installing Graphviz on Windows

  • Download link: http://www.graphviz.org/Download_windows.php
    • I downloaded "current stable release" graphviz-2.28.0.msi
    • Hmmm, 58 Meg MSI -- what all is this going to do? Blurb says it "just unzips the executables and adds to the PATH."
  • Docs at: http://www.graphviz.org/Documentation.php, but many also included with the installer and accessible from the Programs menu
  • Launched the msi file
    • Disk cost 270 Meg!
    • Accepted defaults
    • After install, PATH has additional entry: C:\Program Files (x86)\Graphviz 2.28\bin
  • Reality check
    • In dotguide.pdf there are sample dot files, and corresponding command lines.
    • The example dot commands produce ps (postscript) output, which requires unnecessary hassle to view on Windows. So instead I used dot commands that produce pdf outputs:
      • dot -Tpdf xxx.dot -o xxx.pdf
    • Seems to work as expected

Exercise

  • Begins p81
  • My dir Ex-04.03
  • Test using rig
  • Copy
    • DOT.g4
    • DOT-input --> out
  • Commands
    • None given in pdf
    • Generate parser etc: antlr4 DOT.g4
    • Test: gtest DOT graph -tokens DOT-input
    • Show tree (as on pdf p84 fig 16): gtest DOT graph -gui DOT-input

Section 4.4 Cymbol

  • Begins p86
  • My dir Ex-04.04
  • Test using rig
  • Copy
    • Cymbol.g4
    • Cymbol-input --> out
  • Commands
    • No commands given in pdf
    • Generate parser etc: antlr4 Cymbol.g4
    • Show tree (as on p87 Fig17): gtest Cymbol file -gui Cymbol-input
      • GUI tree does not match that in the pdf:
        • There is an interposed formalParameter node, and the nodes representing x-1, and those above it, are different.

Section 4.5 R

  • Begins p89
  • My dir Ex-04.05
  • Test using TestR.java, or rig?
  • Copy
    • TestR.java -- except this file seems to be a red herring, should be removed?
    • R.g4
    • R-input --> out
  • Commands
    • No commands provided
    • Generate parser etc: antlr4 R.g4
    • Show tree as on p95, Fig 19: gtest R prog -gui R-input
      • GUI tree matches what is shown in the pdf

Chapter 5

Continued on next page.