WordHoard - Hibernate Changes

Installing the Database

Table of Contents

Building the Source Code

Hibernate Changes

Regular expression support in queries
Long "in" lists in queries
Rebuilding the Hibernate jar file

Regular expression support in queries

Some WordHoard query functions let you use regular expressions, which MySQL supports but Hibernate does not. We added regular expression support to Hibernate queries by changing Hibernate's query grammar to recognize the MySQL "rlike" operator and by modifying some Hibernate source files to match. The steps below apply to Hibernate v3.1.1 through v3.2.0 ga. The steps for later releases of Hibernate are likely, though not guaranteed, to be similar.

Note: these changes have already been applied to the hibernate.jar file we distribute with WordHoard. You only need to make these changes if you choose to use a different version of Hibernate than the one we supply.
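As a rough illustration of what the patched grammar allows, the sketch below builds HQL strings that use "rlike" and "not rlike". The entity name Word, the property name spelling, and the parameter name pattern are hypothetical and chosen only for this example; they are not taken from the WordHoard object model.

```java
public class RlikeQueryExample {
    // Hypothetical entity and property names, for illustration only.
    static final String ENTITY = "Word";
    static final String PROPERTY = "spelling";

    /** Builds an HQL query string that filters with the added "rlike" operator. */
    static String buildQuery(boolean negated) {
        String operator = negated ? "not rlike" : "rlike";
        return "from " + ENTITY + " w where w." + PROPERTY
            + " " + operator + " :pattern";
    }

    public static void main(String[] args) {
        System.out.println(buildQuery(false));
        System.out.println(buildQuery(true));
    }
}
```

With a patched hibernate.jar, a string like this would typically be passed to Session.createQuery, with the regular expression bound to the named parameter. An unpatched Hibernate would be expected to reject the query at parse time, because "rlike" is not part of its standard grammar.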

The Hibernate query grammar files are located in the grammar subdirectory of the Hibernate release. The files are named hql-sql.g, hql.g, and sql-gen.g. All three files need modification to support regular expressions. The basic idea is to copy the existing specifications for the "LIKE" operator and adapt the copies for the "RLIKE" operator. In addition, two source code files, HqlParser.java and SqlASTFactory.java, also require modification.

Changes to hql-sql.g

Add the following lines to the comparisonExpr section in hql-sql.g:

	| #(RLIKE expr expr ( #(ESCAPE expr) )? )
	| #(NOT_RLIKE expr expr ( #(ESCAPE expr) )? )

Changes to hql.g

Add the following line to the "tokens" section in hql.g:

	RLIKE="rlike";

Add the following line to the synthetic token types list:

	NOT_RLIKE;

Change the comment line in the "// expressions" section:

	//                   LIKE, NOT LIKE, BETWEEN, NOT BETWEEN, IN, NOT IN

to read:

	//                   LIKE, NOT LIKE, BETWEEN, NOT BETWEEN, IN, NOT IN,
	//			RLIKE, NOT_RLIKE

Change the following line appearing in the negatedExpression! section:

	//## OP: EQ | LT | GT | LE | GE | NE | SQL_NE | LIKE;

to read:

	//## OP: EQ | LT | GT | LE | GE | NE | SQL_NE | LIKE | RLIKE;

Change the lines before relationalExpression which read:

	// level 4 - LT, GT, LE, GE, LIKE, NOT LIKE, BETWEEN, NOT BETWEEN
	// NOTE: The NOT prefix for LIKE and BETWEEN will be represented in the

to read:

	// level 4 - LT, GT, LE, GE, LIKE, NOT LIKE, BETWEEN, NOT BETWEEN, RLIKE, NOT RLIKE
	// NOTE: The NOT prefix for LIKE, RLIKE, and BETWEEN will be represented in the

Add the lines:

	| (r:RLIKE^ {
		#r.setType( (n == null) ? RLIKE : NOT_RLIKE);
		#r.setText( (n == null) ? "rlike" : "not rlike");
	}
	concatenation rlikeEscape)

before the line in the relationalExpression section which reads:

			| (MEMBER! OF! p:path! {

Add the following lines:

	rlikeEscape
		: (ESCAPE^ concatenation)?
		;

following the "likeEscape" block.

Changes to sql-gen.g

Add the following lines to the exoticComparisonExpression section of sql-gen.g:

	| #(RLIKE expr { out(" rlike "); } expr rlikeEscape )
	| #(NOT_RLIKE expr { out(" not rlike "); } expr rlikeEscape)

Add the following lines following the "likeEscape" section:

	rlikeEscape
		: ( #(ESCAPE { out(" escape "); } expr) )?
		;

Changes to HqlParser.java

Edit the file:

	src\org\hibernate\hql\ast\HqlParser.java

and add the following lines to the switch statement in the negateNode method:

	case RLIKE:
		x.setType( NOT_RLIKE );
		x.setText( "{not}" + x.getText() );
		return x;	// (NOT (RLIKE a b) ) => (NOT_RLIKE a b)
	case NOT_RLIKE:
		x.setType( RLIKE );
		x.setText( "{not}" + x.getText() );
		return x;	// (NOT (NOT_RLIKE a b) ) => (RLIKE a b)

Changes to SqlASTFactory.java

Edit the file:

	src\org\hibernate\hql\ast\SqlASTFactory.java

and add the following lines immediately after the "case NOT_LIKE:" label in the getASTNodeType method, so that the new token types fall through to the same handling as NOT_LIKE:

	case RLIKE:
	case NOT_RLIKE:

Long "in" lists in queries

In Hibernate 3.2.0 ga, a long "in" list in a query can cause a stack overflow error during the parsing stage. For example, an HQL query element like

	where x in (:x)

or a manually constructed

	where x in (1,2,3 .....)

can generate a stack overflow if the number of elements in the list exceeds a limit that depends on the amount of available stack space. For many JVMs, the limit is between 9,000 and 10,000 elements, assuming a relatively empty stack at the point of query execution. WordHoard occasionally uses lists several times this size.

The stack overflow occurs in the Hibernate class org.hibernate.hql.ast.util.NodeTraverser which uses a recursive algorithm to walk a parse tree. Long "in" lists generate a subtree of depth about equal to the number of elements in the list. A sufficiently long list results in a stack overflow when NodeTraverser's internal method visitDepthFirst calls itself too many times.
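To make the growth in stack depth concrete, here is a minimal sketch that instruments a recursive traversal of the same shape as the original visitDepthFirst. The Node class is a hypothetical stand-in for an ANTLR AST node, and the sketch assumes the list elements hang off their parent as a chain of siblings; the tree Hibernate actually builds may be shaped differently.

```java
public class RecursionDepthDemo {
    /** Hypothetical stand-in for an AST node: first child plus next sibling. */
    static final class Node {
        Node firstChild;
        Node nextSibling;
    }

    // Deepest recursion level at which a real (non-null) node was visited.
    static int maxDepth = 0;

    /** Same shape as the recursive visitDepthFirst, with a depth counter added. */
    static void visitDepthFirst(Node node, int depth) {
        if (node == null) {
            return;
        }
        maxDepth = Math.max(maxDepth, depth);
        visitDepthFirst(node.firstChild, depth + 1);
        visitDepthFirst(node.nextSibling, depth + 1);
    }

    /** Builds a root whose children form a sibling chain of the given length. */
    static Node buildInList(int elements) {
        Node root = new Node();
        Node previous = null;
        for (int i = 0; i < elements; i++) {
            Node element = new Node();
            if (previous == null) {
                root.firstChild = element;
            } else {
                previous.nextSibling = element;
            }
            previous = element;
        }
        return root;
    }

    /** Returns the recursion depth needed to traverse a list of the given size. */
    static int depthFor(int elements) {
        maxDepth = 0;
        visitDepthFirst(buildInList(elements).firstChild, 1);
        return maxDepth;
    }

    public static void main(String[] args) {
        System.out.println(depthFor(10));
        System.out.println(depthFor(1000));
    }
}
```

Because each sibling is reached through one more recursive call, the recursion depth in this sketch equals the number of list elements, which is why a sufficiently long list exhausts the stack.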

The solution is to replace the recursive tree walking strategy with an iterative one that does not use up stack space. Our suggested replacement code follows. This has fixed the problem for WordHoard.

package org.hibernate.hql.ast.util;

import antlr.collections.AST;
import java.util.Map;
import java.util.HashMap;

/**
 * A visitor for traversing an AST tree.
 *
 * @author Steve Ebersole
 * @author Philip R. "Pib" Burns.   Replaced recursion in tree traversal
 *  with iteration.
 */

public class NodeTraverser {
    public static interface VisitationStrategy {
        public void visit(AST node);
    }

    private final VisitationStrategy strategy;

    public NodeTraverser(VisitationStrategy strategy) {
        this.strategy = strategy;
    }

    /** Traverse the AST tree depth first.
     *
     *  <p>
     *  Note that the AST passed in is not visited itself.  Visitation
     *  starts with its children.
     *  </p>
     *
     *  <p>
     *  This method originally called a recursive method visitDepthFirst
     *  which performed a recursive traversal of the tree rooted at the
     *  node specified by the ast parameter.  The original code looked
     *  like this:
     *  </p>
     *  <pre>
     *  private void visitDepthFirst(AST ast) {
     *      if ( ast == null ) {
     *          return;
     *      }
     *      strategy.visit( ast );
     *      visitDepthFirst( ast.getFirstChild() );
     *      visitDepthFirst( ast.getNextSibling() );
     *  }
     *  </pre>
     *
     *  <p>
     *  The current code for traverseDepthFirst uses iteration to
     *  walk the tree.  This corrects stack overflow problems for
     *  constructs such as "x in (:x)" where ":x" specifies a large number
     *  of items.
     *  </p>
     *
     *  @param ast Root node of subtree to traverse.
     */

    public void traverseDepthFirst( AST ast )
    {
                                //  Root AST node cannot be null or
                                //  traversal of its subtree is impossible.
        if ( ast == null )
        {
            throw new IllegalArgumentException(
                "node to traverse cannot be null!" );
        }
                                //  Map to hold parents of each
                                //  AST node.  Unfortunately the AST
                                //  interface does not provide a method
                                //  for finding the parent of a node, so
                                //  we use the Map to save them.

        Map parentNodes = new HashMap();

                                //  Start tree traversal with first child
                                //  of the specified root AST node.

        AST currentNode = ast.getFirstChild();

                                //  Remember parent of first child.

        parentNodes.put( currentNode , ast );

                                //  Iterate through nodes, simulating
                                //  recursive tree traversal.  Each node
                                //  is visited as soon as it is reached,
                                //  and the parent map is used to climb
                                //  back up when a node has neither a
                                //  child nor a sibling.

        while ( currentNode != null )
        {
                                //  Visit the current node.

            strategy.visit( currentNode );

                                //  Move down to current node's first child
                                //  if it exists.

            AST childNode   = currentNode.getFirstChild();

                                //  If the child is not null, make it
                                //  the current node.

            if ( childNode != null )
            {
                                //  Remember parent of the child.

                parentNodes.put( childNode , currentNode );

                                //  Make child the current node.

                currentNode = childNode;

                continue;
            }

            while ( currentNode != null )
            {
                                //  Move to next sibling if any.

                AST siblingNode = currentNode.getNextSibling();

                if ( siblingNode != null )
                {
                                //  Get current node's parent.
                                //  This is also the parent of the
                                //  sibling node.

                    AST parentNode  = (AST)parentNodes.get( currentNode );

                                //  Remember parent of sibling.

                    parentNodes.put( siblingNode , parentNode );

                                //  Make sibling the current node.

                    currentNode     = siblingNode;

                    break;
                }
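                                //  Move up to parent if no sibling.
                                //  If parent is root node, we're done.
                                //  Compare by identity so that a
                                //  descendant which merely compares
                                //  equal to the root is not mistaken
                                //  for it.

                currentNode = (AST)parentNodes.get( currentNode );

                if ( currentNode == ast )
                {
                    currentNode = null;
                }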
            }
        }
    }
}
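For comparison, the same depth-first order can also be produced with an explicit stack in place of the parent map. The following is a minimal sketch of that alternative, not the code WordHoard uses; the Node class and its label field are hypothetical stand-ins for ANTLR's AST interface so that the example runs on its own.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

public class StackTraversalSketch {
    /** Hypothetical stand-in for an AST node: first child plus next sibling. */
    static final class Node {
        final String label;
        Node firstChild;
        Node nextSibling;

        Node(String label) {
            this.label = label;
        }
    }

    /**
     * Returns the labels of all descendants of root in depth-first order.
     * As in traverseDepthFirst, the root itself is not visited.
     */
    static List<String> traverse(Node root) {
        if (root == null) {
            throw new IllegalArgumentException("node to traverse cannot be null!");
        }
        List<String> visited = new ArrayList<>();
        Deque<Node> pending = new ArrayDeque<>();
        if (root.firstChild != null) {
            pending.push(root.firstChild);
        }
        while (!pending.isEmpty()) {
            Node node = pending.pop();
            visited.add(node.label);
            // Push the sibling first so that the child, pushed last,
            // is popped and visited before it.
            if (node.nextSibling != null) {
                pending.push(node.nextSibling);
            }
            if (node.firstChild != null) {
                pending.push(node.firstChild);
            }
        }
        return visited;
    }

    public static void main(String[] args) {
        Node root = new Node("root");
        Node a = new Node("a");
        Node b = new Node("b");
        root.firstChild = a;
        a.nextSibling = b;
        a.firstChild = new Node("a1");
        a.firstChild.nextSibling = new Node("a2");
        System.out.println(traverse(root));
    }
}
```

In this sketch the pending stack lives on the heap, so a long chain of siblings costs heap memory rather than call-stack frames. The parent-map version above achieves the same effect by recording each node's parent as it descends.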

Rebuilding the Hibernate jar file

Once you have made the above changes, rebuild the Hibernate jar file by running the "jar" target of the Ant build file located in the root directory of the Hibernate release. On Windows you can type

	build.bat jar

at a command prompt. On most Unix systems you can type

	./build jar

at a command prompt, assuming you have Ant properly installed. This regenerates the query parser source files from the modified grammar files, recompiles all of Hibernate, and places the updated hibernate3.jar file in a version-specific directory. For the current release, Hibernate 3.2, the generated jars are placed in the "build" subdirectory of the main Hibernate release directory.
