Package org.apache.lucene.queryparser.flexible.standard
The old Lucene query parser used a single class to perform all parsing operations. In the new query parser structure, parsing is divided into three steps: parsing (syntax), processing (semantics) and building.
The flexible query parser is a modular, extensible framework for implementing Lucene query parsers. In the flexible query parser model, query parsing takes three steps: syntax parsing, processing (query semantics) and building (conversion to a Lucene Query).
The flexible query parser module provides not just the framework but also the StandardQueryParser, the default implementation of a fully fledged query parser. It supports most of the classic query parser's syntax and adds support for interval functions, a min-should-match operator on Boolean groups, and many hooks for customizing how the parser behaves at runtime.
The flexible query parser is divided into two packages:
- org.apache.lucene.queryparser.flexible.core: contains the query parser API classes, which should be extended by custom query parser implementations.
- org.apache.lucene.queryparser.flexible.standard: contains an example Lucene query parser implementation built on top of the flexible query parser API.
Features
- full support for Boolean expressions, including groups
- syntax parsers - support for arbitrary syntax parsers that can be converted into QueryNode trees.
- query node processors - optimize, validate, rewrite the QueryNode trees
- processor pipelines - select your favorite query processors and build a pipeline to implement the features you need.
- query configuration handlers
- query builders - convert QueryNode trees into Lucene Query instances.
Design
The flexible query parser was designed to have a very generic architecture, so that it can be easily used for different products with varying query syntax needs.
The query parser has three layers and its core is what we call the query node tree. It is a tree of objects that represent the syntax of the original query, for example, for 'a AND b' the tree could look like this:
    AND
   /   \
  A     B
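To make the tree shape concrete, here is a minimal sketch of how such a tree could be modeled. `SketchNode` and `NodeTreeDemo` are hypothetical stand-ins invented for this illustration; they are not Lucene's `QueryNode` API, which is considerably richer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in for a query node; NOT Lucene's QueryNode interface.
class SketchNode {
    final String label;               // operator name or term text
    final List<SketchNode> children;  // empty for leaf (term) nodes

    SketchNode(String label, SketchNode... children) {
        this.label = label;
        this.children = new ArrayList<>(Arrays.asList(children));
    }

    // Render the tree in prefix form, e.g. AND(a, b).
    String toPrefix() {
        if (children.isEmpty()) {
            return label;
        }
        StringBuilder sb = new StringBuilder(label).append('(');
        for (int i = 0; i < children.size(); i++) {
            if (i > 0) {
                sb.append(", ");
            }
            sb.append(children.get(i).toPrefix());
        }
        return sb.append(')').toString();
    }
}

class NodeTreeDemo {
    // Builds the tree for the query text 'a AND b' by hand.
    static SketchNode example() {
        return new SketchNode("AND", new SketchNode("a"), new SketchNode("b"));
    }

    public static void main(String[] args) {
        System.out.println(example().toPrefix()); // prints AND(a, b)
    }
}
```

The operator sits at the root and the terms are its leaves, so later stages can walk or rewrite the structure without re-reading the query text.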
The three flexible query parser layers are:
- SyntaxParser - This layer is the text parsing layer which simply transforms the query text string into a QueryNode tree. Every text parser must implement the interface SyntaxParser. The default implementation is StandardSyntaxParser.
- QueryNodeProcessor - The query node processor does most of the work: it contains a chain of query node processors. Each processor can walk the tree and modify nodes or even the tree's structure. This allows for query optimization before the node tree is converted to an actual query.
- QueryBuilder - The third layer is a configurable map of builders, which map query nodes to their adapters that convert each node into a Query.
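The way the three layers hand work to each other can be sketched in plain Java. Everything below (`MiniNode`, `MiniPipeline`, the `lowercase` and `dedupe` processors, and the string used in place of a Lucene `Query`) is a hypothetical, simplified model written for this illustration; it does not use or mirror the actual Lucene classes.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.Set;
import java.util.function.Function;
import java.util.function.UnaryOperator;

// Hypothetical node type: kind is "AND" or "TERM"; text is set only for terms.
class MiniNode {
    final String kind;
    final String text;
    final List<MiniNode> children;

    MiniNode(String kind, String text, List<MiniNode> children) {
        this.kind = kind;
        this.text = text;
        this.children = children;
    }

    static MiniNode term(String text) {
        return new MiniNode("TERM", text, new ArrayList<>());
    }

    static MiniNode and(List<MiniNode> children) {
        return new MiniNode("AND", null, children);
    }
}

class MiniPipeline {
    // Layer 1 (syntax): turn "x AND y" text into a node tree.
    static MiniNode parse(String queryText) {
        List<MiniNode> terms = new ArrayList<>();
        for (String part : queryText.trim().split("\\s+AND\\s+")) {
            terms.add(MiniNode.term(part));
        }
        return terms.size() == 1 ? terms.get(0) : MiniNode.and(terms);
    }

    // Layer 2 (processing): a processor that modifies nodes.
    static MiniNode lowercase(MiniNode node) {
        if (node.kind.equals("TERM")) {
            return MiniNode.term(node.text.toLowerCase(Locale.ROOT));
        }
        List<MiniNode> kids = new ArrayList<>();
        for (MiniNode child : node.children) {
            kids.add(lowercase(child));
        }
        return new MiniNode(node.kind, null, kids);
    }

    // Layer 2 (processing): a processor that changes the tree's structure
    // by dropping repeated terms under an AND node.
    static MiniNode dedupe(MiniNode node) {
        if (!node.kind.equals("AND")) {
            return node;
        }
        Set<String> seen = new HashSet<>();
        List<MiniNode> kids = new ArrayList<>();
        for (MiniNode child : node.children) {
            if (child.kind.equals("TERM") && !seen.add(child.text)) {
                continue; // duplicate term, skip it
            }
            kids.add(child);
        }
        return kids.size() == 1 ? kids.get(0) : MiniNode.and(kids);
    }

    // Layer 3 (building): a map from node kind to the builder for that kind.
    // A plain string stands in for a Lucene Query here.
    static final Map<String, Function<MiniNode, String>> BUILDERS = new HashMap<>();
    static {
        BUILDERS.put("TERM", n -> "field:" + n.text);
        BUILDERS.put("AND", n -> {
            List<String> parts = new ArrayList<>();
            for (MiniNode child : n.children) {
                parts.add("+" + build(child));
            }
            return String.join(" ", parts);
        });
    }

    static String build(MiniNode node) {
        return BUILDERS.get(node.kind).apply(node);
    }

    // Run all three layers in order.
    static String run(String queryText) {
        MiniNode tree = parse(queryText);
        List<UnaryOperator<MiniNode>> chain = new ArrayList<>();
        chain.add(MiniPipeline::lowercase);
        chain.add(MiniPipeline::dedupe);
        for (UnaryOperator<MiniNode> processor : chain) {
            tree = processor.apply(tree);
        }
        return build(tree);
    }

    public static void main(String[] args) {
        System.out.println(run("A AND b AND a")); // prints +field:a +field:b
    }
}
```

Keeping the processors in a list and the builders in a map is what makes each layer replaceable on its own: a different syntax only needs a new `parse`, and a new node kind only needs one more map entry.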
Class descriptions
- Configuration options common across queryparser implementations.
- This class defines utility methods to (help) parse query strings into Query objects.
- The StandardQueryParser is a pre-assembled query parser that supports most features of the classic Lucene query parser, allows dynamic configuration of some of its features (like multi-field expansion or wildcard query restrictions) and adds support for new query types and expressions.