# Matching arithmetic expression

An arithmetic grammar for +-x/ and parenthesis:

``````grammar Expr;

/** The start rule; begin parsing here. */
prog:   stat+ ;         // a program contains multiple statement
//a statement can either be expr \n or assignment \n or \n
stat:   expr NEWLINE
|   ID '=' expr NEWLINE
|   NEWLINE
;

expr:   expr ('*'|'/') expr
|   expr ('+'|'-') expr
|   INT
|   ID
|   '(' expr ')'
;

ID  :   [a-zA-Z]+ ;      // match identifiers <label id="code.tour.expr.3"/>
INT :   [0-9]+ ;         // match integers
NEWLINE:'\r'? '\n' ;     // return newlines to parser (is end-statement signal)
WS  :   [ \t]+ -> skip ; // toss out whitespace

``````

Some highlights

1. Grammar consists of rules for synatctic structure and rules for vocabulary symbols(tokens).
2. program consisits of multiple statements
3. Seperate alternatives of a rule with |
4. Group symbols with parenthesis into subrules(‘*’ | ‘/’)
5. We can do left recursion(invoke itself at the start of alternative rules.

To test, we run

``````**\$ antlr4 Expr.g4
\$ javac Expr*.java
\$ grun Expr prog -gui t.expr # launches org.antlr.v4.runtime.misc.TestRig**
``````

We will see the parse tree for input in t.expr and **invoke the parser with rule prog **.

Calling a rule method is like invoking that rule. We can call any parser rule method.

# Building a calculator

We need to perform some actions using parse-tree visitor.

To do so wee need to add

1.label the alternative rules so we can have different visitor per rule so we can have different event for different input.

``````expr: expr op=('*'|'/') expr # MulDiv
| expr op=('+'|'-') expr # AddSub
| INT # int
| ID # id
| '(' expr ')' # parens
;
``````
1. We need to define token names for the operator literals so we can reference token names as Java constants in the visitor
``````MUL:'*';
DIV:'/';
``````

This time, we will use the visitor pattern and ANTLR will generate a method for each labeled alternative name

``````\$ antlr4 -no-listener -visitor LabeledExpr.g4
``````

It will also also generate default visitor implementation and we can subclass it. We will need to override methods associated with statement and expression alternatives.

Some interesting override is:

``````    @Override
public Integer visitMulDiv(LabeledExprParser.MulDivContext ctx) {
int left = visit(ctx.expr(0));  // get value of left subexpression
int right = visit(ctx.expr(1)); // get value of right subexpression
if ( ctx.op.getType() == LabeledExprParser.MUL ) return left * right;
return left / right; // must be DIV
}

``````

Here, when we visit multi or divide, we will first infer the left and right expression. Then we will use ctx.op.getType to determine the type of the op. Then do operation accordingly.

# Translation of java Source code

``````import java.util.List;
import java.util.Map;
public class Demo {
void f(int x, String y) { }
int[ ] g(/*no args*/) { return null; }
List<Map<String, Integer>>[] h() { return null; }
}
``````

We want to translate the source code into interface(just like header definition) while preserving whitespace and common.

``````<br />interface IDemo {
void f(int x, String y);
int[ ] g(/*no args*/);
List<Map<String, Integer>>[] h();
}

``````

We are given a Java Grammar and make a parse tree. We will derive interface from class name and grab signature(return type, name and arg list).

WE want to know how a class is decleared and how the method is defined.

``````classDeclaration
:   'class' Identifier typeParameters? ('extends' type)?
('implements' typeList)?
classBody
;
``````
``````methodDeclaration
:   type Identifier formalParameters ('[' ']')* methodDeclarationRest
|   'void' Identifier formalParameters methodDeclarationRest
;
``````

We will use antlr4 to get the visitors

``````\$ antlr4 Java.g4
``````

We do a

``````ls Java*.java
``````

and got

`````` JavaBaseListener.java JavaListener.java
JavaLexer.java JavaParser.java
``````

We will then extract an extended class from JavaBaseListener, (Java here means a listener for the Java grammar). We need to have 3 strategies:

1. print out interface header at the start of a class
2. print a terminating } at the end of class definition
3. When we see each method definition, print out signature.
``````    @Override
public void enterClassDeclaration(JavaParser.ClassDeclarationContext ctx){
System.out.println("interface I"+ctx.Identifier()+" {");
}

@Override
public void exitClassDeclaration(JavaParser.ClassDeclarationContext ctx) {
System.out.println("}");
}

@Override
public void enterMethodDeclaration(
JavaParser.MethodDeclarationContext ctx
)
{
// need parser to get tokens
TokenStream tokens = parser.getTokenStream();
String type = "void";
if ( ctx.type()!=null ) {
type = tokens.getText(ctx.type());
}
String args = tokens.getText(ctx.formalParameters());
System.out.println("\t"+type+" "+ctx.Identifier()+args+";");
}
``````

We also need to write a main to run the parser and walk the tree with listener.

``````public class ExtractInterfaceTool {
public static void main(String[] args) throws Exception {
String inputFile = null;
if ( args.length>0 ) inputFile = args[0];
InputStream is = System.in;
if ( inputFile!=null ) {
is = new FileInputStream(inputFile);
}
ANTLRInputStream input = new ANTLRInputStream(is);

JavaLexer lexer = new JavaLexer(input);
CommonTokenStream tokens = new CommonTokenStream(lexer);
JavaParser parser = new JavaParser(tokens);
ParseTree tree = parser.compilationUnit(); // parse

ParseTreeWalker walker = new ParseTreeWalker(); // create standard walker
ExtractInterfaceListener extractor = new ExtractInterfaceListener(parser);
walker.walk(extractor, tree); // initiate walk of tree with listener
}
``````

# embedding actions in a grammar

For more control, we can embed actions in grammar. These actions will be copied into the recursive-descent parser. actions are code snippets surrounded by curly braces.

Disclaimer :code snippets are from the ANTLR reference. If you found this note to be infrindging upon its copy right, please contact me and I will take it down.