5.7.1. Boolean mode

5.7.1.1. Summary

Mroonga can perform boolean full text search when you add the IN BOOLEAN MODE modifier to MATCH AGAINST:

SELECT ... WHERE MATCH(column) AGAINST ('...' IN BOOLEAN MODE);

Normally, IN BOOLEAN MODE is more suitable than the default IN NATURAL LANGUAGE MODE, because its query syntax is similar to that of Web search engines, which most people are familiar with.

In boolean full text search queries, you can use the qualifiers that MySQL supports and Mroonga-specific pragmas.

These qualifiers and pragmas can change the relative rank of search results.

If a search string uses neither a qualifier nor a pragma, the search results that contain the search string are rated higher.

5.7.1.2. Usage

Here are the schema and data used in the examples:

CREATE TABLE books (
  `id` INTEGER AUTO_INCREMENT,
  `title` text,
  PRIMARY KEY(`id`),
  FULLTEXT INDEX title_index (title)
) ENGINE=Mroonga DEFAULT CHARSET=utf8mb4;

INSERT INTO books (title) VALUES ('Professional MySQL');
INSERT INTO books (title) VALUES ('MySQL for Professional');
INSERT INTO books (title) VALUES ('Mroonga = MySQL + Groonga');

5.7.1.3. Qualifier

Here are the supported qualifiers.

5.7.1.3.1. KEYWORD1 KEYWORD2

No operator between keywords, as in KEYWORD1 KEYWORD2, indicates that at least one of the keywords must be present in each row that is returned.

The query Mroonga for means that Mroonga or for must be present:

SELECT title
  FROM books
 WHERE MATCH(title) AGAINST('Mroonga for' IN BOOLEAN MODE);
-- +---------------------------+
-- | title                     |
-- +---------------------------+
-- | Mroonga = MySQL + Groonga |
-- | MySQL for Professional    |
-- +---------------------------+

5.7.1.3.2. KEYWORD1 OR KEYWORD2

OR (must be uppercase) indicates that the left hand side keyword or the right hand side keyword must be present in each row that is returned.

The query Mroonga OR for means that Mroonga or for must be present:

SELECT title
  FROM books
 WHERE MATCH(title) AGAINST('Mroonga OR for' IN BOOLEAN MODE);
-- +---------------------------+
-- | title                     |
-- +---------------------------+
-- | Mroonga = MySQL + Groonga |
-- | MySQL for Professional    |
-- +---------------------------+

OR is the default operator. You can omit it. Both Mroonga OR for and Mroonga for return the same result.
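To see how a query affects ranking, MATCH ... AGAINST can also be put in the select list, as the W pragma example later in this section does. The following is a minimal sketch against the books table above. The score alias and the ORDER BY clause are choices made for illustration, and the actual score values depend on your data and index:

```sql
-- Show the relevance score of each row for an implicit OR query.
-- Rows that match neither keyword are expected to get score 0.
SELECT title,
       MATCH(title) AGAINST('Mroonga for' IN BOOLEAN MODE) AS score
  FROM books
 ORDER BY score DESC;
```

Because no WHERE clause is used, every row is returned, which makes it easier to compare matching and non-matching rows side by side.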

5.7.1.3.3. +KEYWORD

A leading plus sign indicates that this word must be present in each row that is returned.

The query +MySQL +Groonga means that both MySQL and Groonga must be present:

SELECT title
  FROM books
 WHERE MATCH(title) AGAINST('+MySQL +Groonga' IN BOOLEAN MODE);
-- +---------------------------+
-- | title                     |
-- +---------------------------+
-- | Mroonga = MySQL + Groonga |
-- +---------------------------+

5.7.1.3.4. -KEYWORD

A leading minus sign indicates that this word must not be present in any of the rows that are returned.

The query +MySQL -Mroonga means that MySQL must be present but Mroonga must not be present:

SELECT title
  FROM books
 WHERE MATCH(title) AGAINST('+MySQL -Mroonga' IN BOOLEAN MODE);
-- +------------------------+
-- | title                  |
-- +------------------------+
-- | Professional MySQL     |
-- | MySQL for Professional |
-- +------------------------+

5.7.1.3.5. PREFIX*

A trailing asterisk indicates a prefix search: the keyword matches any word that starts with the given prefix.

The query +M* means that a word starting with M (MySQL or Mroonga in this case) must be present:

SELECT title
  FROM books
 WHERE MATCH(title) AGAINST('+M*' IN BOOLEAN MODE);
-- +---------------------------+
-- | title                     |
-- +---------------------------+
-- | Mroonga = MySQL + Groonga |
-- | Professional MySQL        |
-- | MySQL for Professional    |
-- +---------------------------+

Note

To be precise, a "word" here may not be what you think of as a word. In this context a "word" is a "token", and a token may not be a word. For example, the tokens in "It's" are "It", "'" and "s".

You can check tokens with mroonga_command() and the tokenize command:

SELECT mroonga_command('tokenize TokenBigram "It''s" NormalizerMySQLGeneralCI');
-- +--------------------------------------------------------------------------+
-- | mroonga_command('tokenize TokenBigram "It''s" NormalizerMySQLGeneralCI') |
-- +--------------------------------------------------------------------------+
-- | [                                                                        |
-- |   {                                                                      |
-- |     "value":"IT",                                                        |
-- |     "position":0,                                                        |
-- |     "force_prefix":false                                                 |
-- |   },                                                                     |
-- |   {                                                                      |
-- |     "value":"'",                                                         |
-- |     "position":1,                                                        |
-- |     "force_prefix":false                                                 |
-- |   },                                                                     |
-- |   {                                                                      |
-- |     "value":"S",                                                         |
-- |     "position":2,                                                        |
-- |     "force_prefix":false                                                 |
-- |   }                                                                      |
-- | ]                                                                        |
-- +--------------------------------------------------------------------------+

The JSON value in the above result has been formatted by hand.
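The same check can be applied to the sample data. The following sketch tokenizes one of the titles in the books table. It assumes the same tokenizer and normalizer arguments as the example above, which may differ from the ones your index actually uses:

```sql
-- Tokenize one of the titles stored in the books table.
-- TokenBigram and NormalizerMySQLGeneralCI are copied from the example
-- above; replace them with the tokenizer and normalizer of your index.
SELECT mroonga_command('tokenize TokenBigram "Professional MySQL" NormalizerMySQLGeneralCI');
```

Comparing the returned tokens with your keywords can help explain why a prefix or phrase query does or doesn't match.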

5.7.1.3.6. "PHRASE"

Quoting a phrase with double quotes (") indicates that the phrase must be present in each row that is returned.

The query +"Professional MySQL" means that the phrase Professional MySQL must be present. The query doesn't match MySQL for Professional. MySQL for Professional includes both the MySQL and Professional words but doesn't include the Professional MySQL phrase:

SELECT title
  FROM books
 WHERE MATCH(title) AGAINST('+"Professional MySQL"' IN BOOLEAN MODE);
-- +--------------------+
-- | title              |
-- +--------------------+
-- | Professional MySQL |
-- +--------------------+

5.7.1.3.7. (SUBEXPRESSION...)

Parentheses group expressions.

The query +(Groonga OR Mroonga) +MySQL means the following:

  • Groonga or Mroonga must be present.
  • MySQL must be present.

Here is the result of running this query:

SELECT title
  FROM books
 WHERE MATCH(title) AGAINST('+(Groonga OR Mroonga) +MySQL' IN BOOLEAN MODE);
-- +---------------------------+
-- | title                     |
-- +---------------------------+
-- | Mroonga = MySQL + Groonga |
-- +---------------------------+
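Grouping can also be combined with the other qualifiers described above. The following is a sketch, not output taken from a real server: it requires Professional or Groonga and excludes rows that contain for. With the sample data it would be expected to drop MySQL for Professional, but confirm the result on your own server:

```sql
-- Require one of the grouped keywords and exclude rows that contain "for".
SELECT title
  FROM books
 WHERE MATCH(title) AGAINST('+(Professional OR Groonga) -for' IN BOOLEAN MODE);
```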

5.7.1.4. Pragma

A pragma is metadata for a query. You can change how a query is parsed by specifying a pragma.

Pragmas are embedded at the head of a query to specify how it is executed.

A pragma must be at the very beginning of a query. Don't put a space before it. A pragma starts with *:

SELECT MATCH AGAINST('*PRAGMA ...' IN BOOLEAN MODE);

You can specify multiple pragmas:

SELECT MATCH AGAINST('*PRAGMA1PRAGMA2 ...' IN BOOLEAN MODE);

Here is the list of available pragmas.

5.7.1.4.1. D pragma

The D pragma specifies the default operator. It's used when an individual operator is omitted.

Here is the D pragma syntax. You can choose one of OR, + or - as ${OPERATOR}:

*D${OPERATOR}

5.7.1.4.1.1. DOR

DOR means that "or" is used as the default operator.

This is the default.

Here is an example that uses DOR. '*DOR for Mroonga' IN BOOLEAN MODE returns records that include for or Mroonga:

SELECT title
 FROM books
WHERE MATCH (title) AGAINST('*DOR for Mroonga' IN BOOLEAN MODE);
-- +---------------------------+
-- | title                     |
-- +---------------------------+
-- | MySQL for Professional    |
-- | Mroonga = MySQL + Groonga |
-- +---------------------------+

5.7.1.4.1.2. D+

D+ means that "and" is used as the default operator. This is similar to how queries work in Web search engines.

Here is an example that uses D+. '*D+ MySQL Mroonga' IN BOOLEAN MODE returns records that include both MySQL and Mroonga:

SELECT title
 FROM books
WHERE MATCH (title) AGAINST('*D+ MySQL Mroonga' IN BOOLEAN MODE);
-- +---------------------------+
-- | title                     |
-- +---------------------------+
-- | Mroonga = MySQL + Groonga |
-- +---------------------------+
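Because the default operator only applies to keywords that have no operator of their own, an explicit qualifier can still be mixed in. The following is a sketch of that combination. It is expected to require MySQL (through the default +) and to exclude Mroonga (through the explicit -), but the output is not shown here because it has not been taken from a real server:

```sql
-- "MySQL" has no operator, so the default operator (+) applies to it.
-- "Mroonga" keeps its explicit minus operator.
SELECT title
  FROM books
 WHERE MATCH (title) AGAINST('*D+ MySQL -Mroonga' IN BOOLEAN MODE);
```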

5.7.1.4.1.3. D-

D- means that "not" is used as the default operator.

Here is an example that uses D-. '*D- MySQL Mroonga' IN BOOLEAN MODE returns records that include MySQL but don't include Mroonga:

SELECT title
 FROM books
WHERE MATCH (title) AGAINST('*D- MySQL Mroonga' IN BOOLEAN MODE);
-- +------------------------+
-- | title                  |
-- +------------------------+
-- | Professional MySQL     |
-- | MySQL for Professional |
-- +------------------------+

5.7.1.4.2. W pragma

The W pragma specifies the target sections and their weights for a multiple column index.

You can specify a different weight for each section. The default weight is 1, which means no weighting.

Here is the W pragma syntax. ${SECTION} is a number that starts from 1, not from 0. ${WEIGHT} can be omitted:

*W[${SECTION1}[:${WEIGHT1}]][,${SECTION2}[:${WEIGHT2}]][,...]

Here are schema and data to show examples. You need to create a multiple column index to use W pragma:

CREATE TABLE memos (
  `id` INTEGER AUTO_INCREMENT,
  `title` text,
  `content` text,
  PRIMARY KEY(`id`),
  FULLTEXT INDEX text_index (title, content)
) ENGINE=Mroonga DEFAULT CHARSET=utf8mb4;

INSERT INTO memos (title, content) VALUES (
  'MySQL', 'MySQL is a RDBMS.'
);
INSERT INTO memos (title, content) VALUES (
  'Groonga', 'Groonga is a full text search engine.'
);
INSERT INTO memos (title, content) VALUES (
  'Mroonga', 'Mroonga is a storage engine for MySQL based on Groonga.'
);

Here is an example that shows weights. The title column has weight 10 and the content column has weight 1. It means that a keyword in the title column is 10 times as important as a keyword in the content column:

SELECT title,
       content,
       MATCH (title, content) AGAINST('*W1:10,2:1 +Groonga' IN BOOLEAN MODE) AS score
  FROM memos;
-- +---------+---------------------------------------------------------+-------+
-- | title   | content                                                 | score |
-- +---------+---------------------------------------------------------+-------+
-- | MySQL   | MySQL is a RDBMS.                                       |     0 |
-- | Groonga | Groonga is a full text search engine.                   |    11 |
-- | Mroonga | Mroonga is a storage engine for MySQL based on Groonga. |     1 |
-- +---------+---------------------------------------------------------+-------+

The score of the first record is 0, because Groonga appears in neither the title column nor the content column.

The score of the second record is 11, because Groonga appears in both the title column and the content column. Groonga in the title column has score 10 and Groonga in the content column has score 1. 11 is the sum of them.

The score of the third record is 1, because Groonga appears only in the content column. Groonga in the content column has score 1, so the score of the record is 1.
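The syntax above also allows the weight to be omitted and only some sections to be listed. The following sketches illustrate those forms against the memos table. They are assumptions derived from the syntax description rather than output from a real server: in particular, that listing only section 1 limits the search to the title column, and that a D pragma can be written directly before a W pragma.

```sql
-- Omit the weights: sections 1 (title) and 2 (content) both use the
-- default weight 1.
SELECT title,
       MATCH (title, content) AGAINST('*W1,2 +Groonga' IN BOOLEAN MODE) AS score
  FROM memos;

-- List only section 1. This is assumed to target the title column only.
SELECT title,
       MATCH (title, content) AGAINST('*W1 +Groonga' IN BOOLEAN MODE) AS score
  FROM memos;

-- Combine two pragmas. The order (D pragma, then W pragma) is an assumption.
SELECT title,
       MATCH (title, content) AGAINST('*D+W1:10,2:1 MySQL Groonga' IN BOOLEAN MODE) AS score
  FROM memos;
```

Comparing the score column of these queries with the weighted example is a quick way to check how your server interprets each form.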

5.7.1.4.3. S pragma

The S pragma specifies the syntax of the query.

Here is the S pragma syntax:

*S${SYNTAX}

Here is the list of available syntaxes:

5.7.1.4.3.1. *SS

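You can use the script syntax with the *SS pragma. The script syntax lets you use all of Groonga's search features.

Here are the schema and data for an example that shows how to use the script syntax: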

CREATE TABLE comments (
  `content` text,
  FULLTEXT INDEX content_index (content)
) ENGINE=Mroonga DEFAULT CHARSET=utf8mb4;

INSERT INTO comments VALUES (
  'A student started to use Mroonga storage engine. It is very fast!'
);
INSERT INTO comments VALUES (
  'Another student also started to use Mroonga storage engine. It is very fast!'
);

Here is an example that uses near search with the script syntax:

SELECT content,
       MATCH (content) AGAINST('*SS content *N "student fast"' IN BOOLEAN MODE) AS score
  FROM comments;
-- +------------------------------------------------------------------------------+-------+
-- | content                                                                      | score |
-- +------------------------------------------------------------------------------+-------+
-- | A student started to use Mroonga storage engine. It is very fast!            |     1 |
-- | Another student also started to use Mroonga storage engine. It is very fast! |     0 |
-- +------------------------------------------------------------------------------+-------+

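Near search matches only when there are 10 or fewer words between the specified words (student and fast in this case). So student started ...(8 words)... very fast is matched, but student also started ...(8 words)... very fast isn't matched.

You can also use other advanced features.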
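As one more sketch of the script syntax, the following combines two full text conditions with the Groonga match operator @ and the logical operator &&. Both operators come from Groonga's script syntax rather than from this section, so treat the exact expression as an assumption to verify on your own server:

```sql
-- Match comments whose content contains both "Mroonga" and "Another".
-- The expression after *SS is written in Groonga's script syntax.
SELECT content,
       MATCH (content) AGAINST('*SS content @ "Mroonga" && content @ "Another"' IN BOOLEAN MODE) AS score
  FROM comments;
```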