mroonga - An open-source storage engine for fast fulltext search with MySQL.

3.3.2. Wrapper mode

Here we explain how to use wrapper mode of mroonga

3.3.2.1. How to use wrapper mode

In wrapper mode, mroonga works in wrapping an existing storage engine. To specify the wrapped storage engine, we use SQL comment like COMMENT = 'engine "innodb"' for now.

Note

For now, a primary key is mandatory in wrapper mode. That is not the case with storage mode.

Note

Wrapper mode supports the followings, that are not supported in storage mode for now.

  • null value
  • transaction (if storage engine supports. Note that rollback causes mismatch of indexes, it may affects search results, so recreate index of mroonga in such a case.)

3.3.2.3. How to get search score

We often want to display more relevant results first in full text search. We use search score in such case.

We can get search score by MySQL's standard way [1], i.e. we use MATCH...AGAINST in one of columns in SELECT or ORDER BY.

Let's try.

mysql> INSERT INTO diaries (content) VALUES ("It's fine today. It'll be fine tomorrow as well.");
Query OK, 1 row affected (0.00 sec)

mysql> INSERT INTO diaries (content) VALUES ("It's fine today. But it'll rain tomorrow.");
Query OK, 1 row affected (0.00 sec)

mysql> SELECT *, MATCH (content) AGAINST ("fine") FROM diaries WHERE MATCH (content) AGAINST ("fine") ORDER BY MATCH (content) AGAINST ("fine") DESC;
+----+--------------------------------------------------------------+------------------------------------+
| id | content                                                      | MATCH (content) AGAINST ("fine") |
+----+--------------------------------------------------------------+------------------------------------+
|  3 | It's fine today. It'll be fine tomorrow as well. |                                  2 |
|  1 | It'll be fine tomorrow.                      |                                  1 |
|  4 | It's fine today. But it'll rain tomorrow.    |                                  1 |
+----+--------------------------------------------------------------+------------------------------------+
3 rows in set (0.00 sec)

The result having the search word fine more, i.e. id = 3 message having the higher search score, is displayed first. And you also get search score by using MATCH AGAINST in SELECT phrase.

You can use AS to change the attribute name.

mysql> SELECT *, MATCH (content) AGAINST ("fine") AS score FROM diaries WHERE MATCH (content) AGAINST ("fine") ORDER BY MATCH (content) AGAINST ("fine") DESC;
+----+--------------------------------------------------------------+-------+
| id | content                                                      | score |
+----+--------------------------------------------------------------+-------+
|  3 | It's fine today. It'll be fine tomorrow as well. |     2 |
|  1 | It'll be fine tomorrow.                      |     1 |
|  4 | It's fine today. But it'll rain tomorrow.    |     1 |
+----+--------------------------------------------------------------+-------+
3 rows in set (0.00 sec)

3.3.2.5. How to get snippet (Keyword in context)

Note

This feature is in experimental stage. So, the required arguments or value is changed without notice in the future.

There is a case that you want to extract keyword and surrounding text as a search results.

Snippet means 'keyword and surrounding text'. It is called 'Keyword in context'.

mroonga_snippet function provides the way to get snippet from search results.

Here is the syntax of mroonga_snippet function:

SELECT mroonga_snippet(document, max_length, max_count, encoding,
  skip_leading_spaces, html_escape, prefix, suffix,
  word1, word1_prefix, word1_suffix,
  word2, word2_prefix, word2_suffix, ...);

Here is the detail of mroonga_snippet arguments.

document
The column name or string value is required.
max_length
The max length of snippet (bytes) is required.
max_count
The max elements of snippets (N word) is required.
encoding
The encoding of document is required. You can specify the value of encoding such as 'ascii_general_ci', 'cp932_japanese_ci', 'eucjpms_japanese_ci', 'utf8_japanese_ci' and so on.
skip_leading_spaces
Specify whether skip leading spaces or not. Specify the value 1 for skipping leading spaces, 0 for not.
html_escape
HTML escape is enabled or not. Specify the value 1 for enabling HTML escape, 0 for not.
prefix
The start text of snippet.
suffix
The end text of snippet.
wordN
Specify any word.
wordN_prefix
It is the start text of wordN.
wordN_suffix
It is the end text of wordN.

mroonga_snippet function is included in mroonga as a User-Defined Function (UDF), but if you have not yet register it in MySQL by CREATE FUNCTION, you need to invoke the following SQL for defining a function.

mysql> CREATE FUNCTION mroonga_snippet RETURNS STRING SONAME 'ha_mroonga.so';

mroonga_snippet function is useful for searching the text which contains keyword and associated one by using MATCH .. AGAINST syntax.

Imagine searching the document which contains 'fulltext' as a keyword. Assume that some keyword such as 'MySQL' and 'search' are associated with 'fulltext'.

mroonga_snippet function meets above.

Here is the schema definition for execution examples:

CREATE TABLE `snippet_test` (
  `id` int(11) NOT NULL,
  `text` text,
  PRIMARY KEY (`id`),
  FULLTEXT KEY `text` (`text`)
) ENGINE=mroonga COMMENT 'engine = "innodb"' DEFAULT CHARSET=utf8

Here is the sample data for execution examples:

insert into snippet_test (id, text) values (1, 'An open-source fulltext search engine and column store.');
insert into snippet_test (id, text) values (2, 'An open-source storage engine for fast fulltext search with MySQL.');
insert into snippet_test (id, text) values (3, 'Tritonn is a patched version of MySQL that supports better fulltext search function with Senna.');

Here is the results of execution examples:

mysql> select * from snippet_test;
+----+-------------------------------------------------------------------------------------------------+
| id | text                                                                                            |
+----+-------------------------------------------------------------------------------------------------+
|  1 | An open-source fulltext search engine and column store.                                         |
|  2 | An open-source storage engine for fast fulltext search with MySQL.                              |
|  3 | Tritonn is a patched version of MySQL that supports better fulltext search function with Senna. |
+----+-------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)

mysql> select id, text, mroonga_snippet(text, 8, 2, 'ascii_general_ci', 1, 1, '...', '...<br>', 'fulltext', '<span class="w1">', '</span>', 'MySQL', '<span class="w2">', '</span>', 'search', '<span calss="w3">', '</span>') from snippet_test where match(text) against ('fulltext');
+----+-------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
| id | text                                                                                            | mroonga_snippet(text, 8, 2, 'ascii_general_ci', 1, 1, '...', '...<br>', 'fulltext', '<span class="w1">', '</span>', 'MySQL', '<span class="w2">', '</span>', 'search', '<span calss="w3">', '</span>') |
+----+-------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
|  1 | An open-source fulltext search engine and column store.                                         | ...<span class="w1">fulltext</span>...<br>... <span calss="w3">search</span> ...<br>                                                                                                                   |
|  2 | An open-source storage engine for fast fulltext search with MySQL.                              | ...<span class="w1">fulltext</span>...<br>... <span calss="w3">search</span> ...<br>                                                                                                                   |
|  3 | Tritonn is a patched version of MySQL that supports better fulltext search function with Senna. | ...f <span class="w2">MySQL</span> ...<br>...<span class="w1">fulltext</span>...<br>                                                                                                                   |
+----+-------------------------------------------------------------------------------------------------+--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+
3 rows in set (0.00 sec)

The keyword 'fulltext' and associated keyword 'MySQL' and 'search' has been extracted.

3.3.2.6. Logging

Mroonga outputs the logs by default.

Log files are located in MySQL's data directory with the filename groonga.log.

Here is the example of the log.

2010-10-07 17:32:39.209379|n|b1858f80|mroonga 1.10 started.
2010-10-07 17:32:44.934048|d|46953940|hash get not found (key=test)
2010-10-07 17:32:44.936113|d|46953940|hash put (key=test)

The default log level is NOTICE, i.e. we have important information only and we don't have debug information etc.).

You can get the log level by mroonga_log_level system variable, that is a global variable. You can also modify it dynamically by using SET phrase.

mysql> SHOW VARIABLES LIKE 'mroonga_log_level';
+-------------------+--------+
| Variable_name     | Value  |
+-------------------+--------+
| mroonga_log_level | NOTICE |
+-------------------+--------+
1 row in set (0.00 sec)

mysql> SET GLOBAL mroonga_log_level=DUMP;
Query OK, 0 rows affected (0.00 sec)

mysql> SHOW VARIABLES LIKE 'mroonga_log_level';
+-------------------+-------+
| Variable_name     | Value |
+-------------------+-------+
| mroonga_log_level | DUMP  |
+-------------------+-------+
1 row in set (0.00 sec)

Available log levels are the followings.

  • NONE
  • EMERG
  • ALERT
  • CRIT
  • ERROR
  • WARNING
  • NOTICE
  • INFO
  • DEBUG
  • DUMP

See mroonga_log_level about details.

You can reopen the log file by FLUSH LOGS. If you want to rotate the log file without stopping MySQL server, you can do in the following procedure.

  1. change the file name of groonga.log (by using OS's mv command etc.).
  2. invoke "FLUSH LOGS" in MySQL server (by mysql command or mysqladmin command).