Skip to main content

Full-Text Search and Query Acceleration Operators

This document introduces the query operators supported by Apache Doris inverted index, in two categories:

  • Full-text search operators: For fuzzy matching scenarios on text fields, such as keywords, phrases, prefixes, and regex.
  • Inverted index query acceleration: For precise filtering scenarios on structured fields, such as equality, range, set, and array.

Full-Text Search Operators

The following table lists all full-text search operators and their typical use cases:

OperatorTypical ScenarioMatching Rule
MATCH_ANYKeyword "OR" searchMatches if any keyword is hit
MATCH_ALLKeyword "AND" searchMust hit all keywords
MATCH_PHRASEStrict phrase searchTerms are adjacent and in the same order
MATCH_PHRASE (with slop)Fault-tolerant phrase searchAllows gaps between terms
MATCH_PHRASE (strict order)Phrase search with fixed term orderTerm order is fixed within the gap range
MATCH_PHRASE_PREFIXInput suggestion / prefix completionLast term matches by prefix
MATCH_REGEXPTerm-level regex matchingApplies regex to tokenized results
MATCH_PHRASE_EDGEMulti-edge fuzzy matchingFirst-term suffix + middle exact + last-term prefix

Keyword Search: MATCH_ANY / MATCH_ALL

Suitable for the scenario "given several keywords, find documents that contain these words."

  • MATCH_ANY: Matches rows that contain any of the keywords.

    SELECT * FROM table_name WHERE content MATCH_ANY 'keyword1 keyword2';
  • MATCH_ALL: Matches rows that contain all of the keywords simultaneously.

    SELECT * FROM table_name WHERE content MATCH_ALL 'keyword1 keyword2';

Phrase Search: MATCH_PHRASE Family

Suitable for precise phrase matching scenarios where "keywords need to be adjacent or maintain term order."

Strict Phrase Matching

Requires terms to be adjacent and in the same order. To enable index acceleration, set "support_phrase" = "true" in the index properties.

SELECT * FROM table_name WHERE content MATCH_PHRASE 'keyword1 keyword2';

Phrase Matching with Slop

Allows up to slop words between keywords, and term order can vary.

-- Allow up to 3 words between keyword1 and keyword2
SELECT * FROM table_name WHERE content MATCH_PHRASE 'keyword1 keyword2 ~3';

Strict-Order Phrase Matching

Combines + with slop to require a fixed term order.

SELECT * FROM table_name WHERE content MATCH_PHRASE 'keyword1 keyword2 ~3+';

Prefix and Edge Matching

Suitable for scenarios such as "input suggestion" and "prefix/suffix fuzzy matching."

MATCH_PHRASE_PREFIX

Phrase matching where the last word matches by prefix. When only a single word is given, this degenerates into a prefix match for that word.

-- The last word matches by prefix
SELECT * FROM table_name WHERE content MATCH_PHRASE_PREFIX 'keyword1 key';

-- A single word degenerates into a prefix match
SELECT * FROM table_name WHERE content MATCH_PHRASE_PREFIX 'keyword1';

MATCH_PHRASE_EDGE

Edge phrase matching, with the following matching rules:

  • The first word matches by suffix
  • Middle words match by exact match
  • The last word matches by prefix
  • Terms must be adjacent
SELECT * FROM table_name WHERE content MATCH_PHRASE_EDGE 'search engine optim';

Regex Matching: MATCH_REGEXP

Performs regex matching on tokenized terms.

SELECT * FROM table_name WHERE content MATCH_REGEXP '^key_word.*';

Specifying an Analyzer with USING ANALYZER

When multiple inverted indexes using different analyzers are created on a single column, you can use the USING ANALYZER clause to specify which analyzer to use at query time.

Syntax

SELECT * FROM table_name WHERE column MATCH 'keywords' USING ANALYZER analyzer_name;

Supported Operators

All MATCH operators support the USING ANALYZER clause:

  • MATCH / MATCH_ANY
  • MATCH_ALL
  • MATCH_PHRASE
  • MATCH_PHRASE_PREFIX
  • MATCH_PHRASE_EDGE
  • MATCH_REGEXP

Built-in Analyzers

NameDescription
noneExact match, no tokenization
standardStandard tokenization
chineseChinese tokenization

Usage Examples

-- Use the standard analyzer (tokenizes the text)
SELECT * FROM articles WHERE content MATCH 'hello world' USING ANALYZER std_analyzer;

-- Use the keyword analyzer (exact match, no tokenization)
SELECT * FROM articles WHERE content MATCH 'hello world' USING ANALYZER kw_analyzer;

-- Combined with MATCH_PHRASE
SELECT * FROM articles WHERE content MATCH_PHRASE 'hello world' USING ANALYZER std_analyzer;

-- Use built-in analyzers
SELECT * FROM articles WHERE content MATCH 'hello' USING ANALYZER standard;
SELECT * FROM articles WHERE content MATCH 'hello' USING ANALYZER none;

Notes

  • If the index for the specified analyzer has not been built, the query automatically falls back to the non-index path (results are correct, but performance is slower).
  • If no analyzer is specified, the system uses any available index.

Inverted Index Query Acceleration

In addition to full-text search, inverted index can also accelerate precise filtering on structured fields. The supported operators and functions are as follows:

CategoryOperator / Function
Equality and set=, !=, IN, NOT IN
Range>, >=, <, <=, BETWEEN
Null checkIS NULL, IS NOT NULL
Arrayarray_contains, array_overlaps

Usage examples:

-- Range query
SELECT * FROM t WHERE price >= 100 AND price < 200;

-- Set query
SELECT * FROM t WHERE tags IN ('a', 'b', 'c');

-- Array query
SELECT * FROM t WHERE array_contains(attributes, 'color');