
MySQL ngram Fulltext Parser

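The MySQL ngram full-text parser is a built-in parser that lets you run n-gram based full-text searches. An n-gram is a contiguous sequence of n items taken from a text; MySQL's ngram parser works at the character level, so with n = 2 the string "abcd" is tokenized into "ab", "bc", and "cd". Because it indexes these character sequences rather than whitespace-delimited words, it can match text that the standard parser cannot split into words.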

Here’s how you can use the ngram full-text parser for full-text searches in MySQL:

  1. Create a Full-Text Index with ngram Parser: Create a full-text index on the column you want to search and name the ngram parser in the index definition. You can do this when creating the table or by altering an existing table.
   CREATE TABLE articles (
       id INT AUTO_INCREMENT PRIMARY KEY,
       title VARCHAR(255),
       content TEXT,
       FULLTEXT (content) WITH PARSER ngram
   ) ENGINE=InnoDB CHARACTER SET utf8mb4;

In the above example, we’ve specified the ngram parser for the content column.
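For a table that already exists, the same index can be added afterwards. The following is a minimal sketch, assuming articles was created without a full-text index on content; the index name ft_content_ngram and the schema name test are placeholders chosen for this example, not names from the original setup.

```sql
-- Assumes `articles` exists WITHOUT a full-text index on `content`.
-- ft_content_ngram is a placeholder index name.
ALTER TABLE articles
    ADD FULLTEXT INDEX ft_content_ngram (content) WITH PARSER ngram;

-- Equivalent alternative (use one statement or the other, not both):
-- CREATE FULLTEXT INDEX ft_content_ngram ON articles (content) WITH PARSER ngram;

-- Optional: inspect the tokens produced for newly inserted rows.
-- Assumes the table lives in a schema named `test` and that the session
-- has the privileges needed to set a global variable.
SET GLOBAL innodb_ft_aux_table = 'test/articles';
SELECT *
FROM INFORMATION_SCHEMA.INNODB_FT_INDEX_CACHE
ORDER BY doc_id, position;
```

The tokens you see depend on the configured n-gram length and on the active stopword list, so the output will differ between servers.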

  2. Perform a Full-Text Search: To search an ngram index, use the MATCH ... AGAINST clause in your SQL query, exactly as you would with the standard parser. The search string is itself broken into n-grams before it is compared with the index.
   SELECT * FROM articles
   WHERE MATCH(content) AGAINST('search_term' IN NATURAL LANGUAGE MODE);

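Replace 'search_term' with the text you want to search for. In natural language mode the text is converted to a union of its n-grams, so a row matches if it contains any of them. For example, with 2-character tokens the term "abc" becomes "ab bc", which matches a document containing only "ab" as well as one containing "abc".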
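If you need the n-grams to match as a contiguous sequence rather than individually, boolean mode is the usual alternative: there the search term is converted to an n-gram phrase search. A minimal sketch against the same articles table, keeping 'search_term' as the placeholder used above:

```sql
-- Boolean mode: the term is treated as an n-gram phrase, so the
-- n-grams must appear together and in order, not just individually.
SELECT id, title
FROM articles
WHERE MATCH(content) AGAINST('search_term' IN BOOLEAN MODE);
```

With the "abc" example and 2-character tokens, the boolean-mode search becomes the phrase "ab bc" and matches only the document that contains "abc".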

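  3. Sorting by Relevance: You can sort the results by relevance by repeating the MATCH ... AGAINST expression in the ORDER BY clause (or selecting it as a column and ordering by that), just as with the standard full-text search.
  4. Word Length Limit and Stopwords: The ngram parser ignores the word-length settings used by the standard parser (innodb_ft_min_token_size, innodb_ft_max_token_size, ft_min_word_len, and ft_max_word_len); token length is governed by ngram_token_size alone. Stopwords still apply, but differently: rather than dropping tokens that equal a stopword, the ngram parser drops tokens that contain one, and stopwords longer than ngram_token_size are ignored. The default list holds English stopwords, so for other languages you will probably want to supply your own stopword table.
  5. Customizing n-gram Length: By default, the ngram parser uses bigrams (2-grams). You can change this with the ngram_token_size system variable, which accepts values from 1 to 10. It can only be set at server startup, either as ngram_token_size=2 under [mysqld] in your MySQL configuration file or as the --ngram_token_size=2 startup option. After changing it, rebuild any existing ngram full-text indexes. A smaller token size produces a smaller index and faster searches; as a rule of thumb, set it to the length of the largest token you expect to search for.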
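The relevance sorting and token-size points in the list above can be sketched as follows. This assumes the articles table from step 1 and keeps 'search_term' as a placeholder; the column alias score is a name chosen for this example.

```sql
-- Return the relevance value as a column and sort on it, best match first.
SELECT id,
       title,
       MATCH(content) AGAINST('search_term' IN NATURAL LANGUAGE MODE) AS score
FROM articles
WHERE MATCH(content) AGAINST('search_term' IN NATURAL LANGUAGE MODE)
ORDER BY score DESC;

-- Check which n-gram length the server is currently using.
-- The ngram parser is governed by ngram_token_size.
SHOW VARIABLES LIKE 'ngram_token_size';
```

Repeating the same MATCH ... AGAINST expression in the select list and the WHERE clause is the conventional way to expose the score.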

Keep in mind that the ngram full-text parser is particularly useful for languages with no clear word boundaries, such as Chinese or Japanese. It can also be helpful in certain search scenarios where standard full-text search might not provide the desired results.
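As an illustration of the CJK case, here is a short sketch using Chinese sample text. It assumes the articles table from step 1 is stored in utf8mb4 and that the server uses 2-character tokens; the rows are illustrative sample data only.

```sql
-- Sample rows (English glosses in the comments are approximate):
--   '数据库管理'      Database management
--   '数据库应用开发'  Database application development
INSERT INTO articles (title, content) VALUES
    ('数据库管理', '在本教程中我将向你展示如何管理数据库'),
    ('数据库应用开发', '学习开发数据库应用程序');

-- With 2-character tokens, the search text '数据库' is split into
-- the bigrams '数据' and '据库' before being matched against the index.
SELECT id, title
FROM articles
WHERE MATCH(content) AGAINST('数据库' IN NATURAL LANGUAGE MODE);
```

The standard parser would treat each unbroken run of Chinese characters as a single word, which is why a search for a term inside that run tends to come back empty without the ngram parser.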

As with any full-text search method, the choice of the parser and search approach should depend on your specific use case and the characteristics of your text data.
