Writing a Parser in Java: Setting Variables

During the course of this series we have written a tokenizer, designed a grammar, implemented the grammar in a recursive descent parser and built an expression tree that corresponds to the mathematical expression and  that is able to calculate a value from it. As I announced in the last post we are almost done, except for one little feature. We want to be able to assign values to the variables in the expression. This will be the topic of the current and final post in this tutorial.

So let’s think about what exactly we want to to. Once the expression is parsed and we have an internal representation in the form of an ExpressionNode, we want to iterate through the whole tree and find VariableExpressionNode objects. If the nodes found in this way contain a variable with the name that we want to set, then we will assign the value to it.

The most straightforward way of doing this would be to add a new method to the ExpressionNode interface that each concrete implementation will have to define. This method might have the following signature.

public void setVariable(String name, double value);

The implementation in most of the concrete node types would simply pass the information on to the child nodes by calling their setVariable() method. Only the VariableExpressionNode would compare the name passed in the argument with its own name and set its value accordingly. This is indeed not a bad programming style but it has one disadvantage. Let’s think a bit into the future. There might be other operations that we want to perform on the expression tree. Maybe we want to print out the contents in a particular format or we want to find sub-expressions that match certain criteria. In each case we would have to iterate through the tree and in each case we would have to add a new method to the ExpressionNode interface and all its implementations. As your software grows you might have a multitude of different methods for different tasks defined in this way. Not only is this difficult to maintain because the code for a specific task is always spread over many classes but it also makes is necessary to keep modifying existing code for every new operation on the tree.

The Visitor Design Pattern

Ideally we would like to separate all the code for a specific task into a single class with the ability of adding new functionality without modifying existing code. This can be achieved with the visitor design pattern. In this software design pattern, we create an interface that acts as a visitor to the class hierarchy. It is useful when the class hierarchy can be expected to be relatively stable while the number of operations on the classes can increase over time.  The visitor will have one method for each class that is part of the hierarchy.

public interface ExpressionNodeVisitor
{
  public void visit(VariableExpressionNode node);
  public void visit(ConstantExpressionNode node);
  public void visit(AdditionExpressionNode node);
  public void visit(MultiplicationExpressionNode node);
  public void visit(ExponentiationExpressionNode node);
  public void visit(FunctionExpressionNode node);
}

Next we add a method called accept() the ExpressionNode interface which  takes an ExpressionNodeVisitor.

public void accept(ExpressionNodeVisitor visitor);

Now that we have defined the method in the interface we have to define it in every class that implements the interface. For the VariableExpressionNode and the ConstantExpressionNode the implementation is a one-liner

public class VariableExpressionNode
    implements ExpressionNode {

  ...

  public void accept(ExpressionNodeVisitor visitor) {
    visitor.visit(this);
  }
}

public class ConstantExpressionNode
    implements ExpressionNode {

  ...

  public void accept(ExpressionNodeVisitor visitor) {
    visitor.visit(this);
  }
}

When passing an ExpressionNodeVisitor to the accept() method, the method will call the appropriate visit method of the visitor class. This is achieved by a combination of overriding and overloading of methods. Don’t be fooled. Even though both of the implementations look the same, they call different visit() methods in the ExpressionNodeVisitor because the this member has a different type in each case.

In order for the visitor to be visited by all the nodes in the tree, the accept() methods of the other class implementations will also have to call the accept() methods of their children.

public class AdditionExpressionNode 
    extends SequenceExpressionNode {

  ...

  public void accept(ExpressionNodeVisitor visitor) {
    visitor.visit(this);
    for (Term t: terms)
      t.expression.accept(visitor);
  }
}

public class MultiplicationExpressionNode 
    extends SequenceExpressionNode {

  ...

  public void accept(ExpressionNodeVisitor visitor) {
    visitor.visit(this);
    for (Term t: terms)
      t.expression.accept(visitor);
  }
}

The implementations again look identical but have slightly different effects because of the different types of this.

Finally the FunctionExpressionNode and the ExponentiationExpressionNode follow a similar pattern

public class FunctionExpressionNode
    implements ExpressionNode {

  ...

  public void accept(ExpressionNodeVisitor visitor) {
    visitor.visit(this);
    argument.accept(visitor);
  }
}

public class ExponentiationExpressionNode
    implements ExpressionNode {

  ...

  public void accept(ExpressionNodeVisitor visitor) {
    visitor.visit(this);
    base.accept(visitor);
    exponent.accept(visitor);
  }
}

With these few lines of code we have added a very powerful mechanism to or class design. We can now freely create new functionality by implementing the ExpressionNodeVisitor interface.

Setting Variables

Let’s get back to the task at hand. We would like to set the value of variables with a given name. To do this we create a new class, SetVariable, that implements ExpressionNodeVisitor.

public class SetVariable implements ExpressionNodeVisitor
{
  private String name;
  private double value;

  public SetVariable(String name, double value) {
    super();
    this.name = name;
    this.value = value;
  }

  public void visit(VariableExpressionNode node) {
    if (node.getName().equals(name))
      node.setValue(value);
  }

  public void visit(ConstantExpressionNode node) {}
  public void visit(AdditionExpressionNode node) {}
  public void visit(MultiplicationExpressionNode node) {}
  public void visit(ExponentiationExpressionNode node) {}
  public void visit(FunctionExpressionNode node) {}

}

The class contains the name and the value of the variable to be set and these are initialised in the standard way in the constructor. Because we are implementing the ExpressionNodeVisitor we need to define all the corresponding methods. However, as you can see most of these methods are empty and do nothing because we are really only interested in the VariableExpressionNode elements in the tree. Only two lines of code are needed to compare and set the variable. All the iteration through the tree is handled by the accept() methods of the individual classes and in the visitor we can concentrate on writing the code that actually does something.

Using the visitor is also a one-liner. Let’s say we have an ExpressionNode called expression, then the following code will set the value of pi.

expression.accept(new SetVariable("pi", Math.PI));

To test this, let’s update the main method from the last post.

Parser parser = new Parser();
try {
  ExpressionNode expression = parser.parse("sin(pi/2)");
  expression.accept(new SetVariable("pi", Math.PI));
  System.out.println("The value of the expression is "
    + expression.getValue());
}
catch (ParserException e) {
  System.out.println(e.getMessage());
}
catch (EvaluationException e) {
  System.out.println(e.getMessage());
}

As expected, the program prints out

The value of the expression is 1.0

Final Words

This is concludes our series of the Java expression parser. You can download the complete parser here. It is released under the MIT license, so feel free to use it in your own projects but please keep the copyright notice.

The code reference can be found here.

We have used the parser in our Android app DiffIt! which can calculate derivatives of mathematical expressions.

12 Comments

  1. Gerlof Fokkema

    I’ve found a bug in your code.

    When entering an expression of the following form:
    2(2*2 or 2)2*2
    This is obviously an invalid expression, but the Parser does not throw an exception. Instead it ignores everything after the opening bracket.

    This bug is also present in your Android application DiffIt.

    Reply
    1. Mikail

      Thanks you for pointing that out. The bug has been fixed in CogPar and I will push the fix into DiffIt this week. Essentially the parser has to check that there are no more tokens in the lookahead when parsing has finished.

      Reply
  2. Luke

    Nice tutorial!

    I have one question.
    “sin(x)^2″ gives the same result as “sin(x^2)”.
    How to fix it?

    Reply
    1. Mikail

      Thanks for reporting this inconsistency. There is an ambiguity when allowing expressions like “sin x^2″ without any brackets. I wanted it to be interpreted as “sin(x^2)” but, given the limitations of the LL(1) grammar, this also resulted in “sin(x)^2″ being interpreted that way. I have now modified the grammar and the code so that it will always interpret it as “(sin(x))^2″. I have inculded further details in the update of the post on the grammar.

      Reply
  3. Christos Kalonakis

    Thats a great tool indeed! Infact I will be using it in my fractal software so the user can give its own formula. I extended the code using my Complex numbers library and added about 20 functions, but I had to limit the variable names to only z and c. It will be included in version 1.0.4.8

    Reply
    1. Christos Kalonakis
  4. Capacitor

    Hi. Great work! I wast studying your code very carefully and now I’d like to expand it by adding ability of solving differential with one variable (1st degree) . What would you recommend for a start? Do you think it is possible? Do you think that I would have to create new class or extened Parser class or maybe adding new DifferentialExpressionNode. Please, give me some hints for a start.

    P.S. I’m from Poland and I don’t know if I had translate differential equation correctly.
    eg. input:
    [2x^2]’ = 4x^1
    [sin(x)]’ = cos(x)
    [tg(x)]’ = 1/cos^2(x)

    I want to implement only basic formulas

    Reply
    1. Mikail

      Hi Capacitor,

      It is definitely possible. Just take a look at my Android App DiffIt. It has been written using the parser presented here.

      Essentially, calculating the derivative of an expression works by applying a set of transformations to the expression tree. For every basic function and operator you need to define what the derivative should be. An then you need to define transformations that correspond to product rule, quotient rule and the chain rule. Finally, if you want the result to look nice, you need to create a method for simplifying the expression tree, e.g. collect multiple numeric terms or factors into a single number, factorising common factors, etc.

      All in all it is quite a tedious undertaking.

      Reply
  5. Capacitor

    Once again. I’d like to
    1. Input a String with variable, which I’d set with SetVariable class
    2. Transform input expression to desire form (after differential) – this is the part of the code I’d don’t know where to put to get possibility of exploit your project
    3. Get the expression in String after transformation.
    4. Execute getValue() method on my new String to get actual value of transformed expression with my variable i used in section 1.

    Reply
  6. Nick Anderson

    Hi! Thanks so much for your tutorials! I’m trying to extend this parser to handle first variables, and then derivatives. I know it can be done using this parser because of your wonderful app, so I’m trying to do so to learn more about parsing. However, I’m having a hard time with the first step: introducing variables. I don’t mean variables like pi or e, but uninitialized variables. So, for example, inputting “5 + 2*x” would return “5 + 2*x”, since x does not have a value. Can you please give me a hint, or a start on how to go about this?

    Regards,
    Nick

    Reply
    1. Mikail

      Hi Nick,

      If you want to handle uninitialised variables I would suggest adding a method

      boolean canEvaluate()

      to the ExpressionNode interface. The ConstantExpressionNode would simply return true, however the VariableExpressionNode would return true only if the value has been set and false otherwise. All the other types of expression nodes return a value depending on whether all their children (in the expression tree) return true.

      You can then traverse the expression tree and replace any node whose canEvaluate method returns true by a ConstantExpressionNode with the value returned by the getValue method.

      Reply
  7. Frederik Heick

    What about the issue with double negatives, aka “4–8″, this wont parse, but “4-(-8)” will.
    I can think of some poor ways to solve this (remove all spaces and replace “–” with “-“), but I would like to figure a more correct way to solve this.

    Reply

Leave a Comment

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>