Note
This command is an experimental feature.
New in version 5.0.5.
logical_select is a sharding version of select. logical_select searches records from multiple tables and outputs them.
You need to plugin_register sharding plugin because logical_select is included in sharding plugin.
logical_select has many parameters.
The required parameters are logical_table and shard_key. Other parameters are optional:
logical_select logical_table
shard_key
[min=null]
[min_border="include"]
[max=null]
[max_border="include"]
[filter=null]
[sortby=null]
[output_columns="_id, _key, *"]
[offset=0]
[limit=10]
[drilldown=null]
[drilldown_sortby=null]
[drilldown_output_columns="_key, _nsubrecs"]
[drilldown_offset=0]
[drilldown_limit=10]
[drilldown_calc_types=NONE]
[drilldown_calc_target=null]
logical_select has the following named parameters for advanced drilldown:
- drilldown[${LABEL}].keys=null
- drilldown[${LABEL}].sortby=null
- drilldown[${LABEL}].output_columns="_key, _nsubrecs"
- drilldown[${LABEL}].offset=0
- drilldown[${LABEL}].limit=10
- drilldown[${LABEL}].calc_types=NONE
- drilldown[${LABEL}].calc_target=null
You can use one or more alphabets, digits, _ and . for ${LABEL}. For example, parent.sub1 is a valid ${LABEL}.
Parameters that have the same ${LABEL} are grouped.
For example, the following parameters specify one drilldown:
- --drilldown[label].keys column
- --drilldown[label].sortby -_nsubrecs
The following parameters specify two drilldowns:
- --drilldown[label1].keys column1
- --drilldown[label1].sortby -_nsubrecs
- --drilldown[label2].keys column2
- --drilldown[label2].sortby _key
Most of logical_select features can be used like corresponding select features. For example, parameter name is same, output format is same and so on.
But there are some differences from select:
- logical_table and shard_key parameters are required instead of table parameter.
- sortby isn't supported when multiple shards are used. (Only one shard is used, they are supported.)
- _value.${KEY_NAME} in drilldown[${LABEL}].sortby doesn't work with multiple shards. It works with one shard. _key in drilldown[${LABEL}].sortby work with multiple shards.
- match_columns and query aren't supported yet.
- cache isn't supported yet.
- match_escalation_threshold isn't supported yet.
- query_flags isn't supported yet.
- query_expander isn't supported yet.
- adjuster isn't supported yet.
Let's learn about logical_select usage with examples. This section shows many popular usages.
You need to register sharding plugin because logical_select is included in sharding plugin.
Execution example:
plugin_register sharding
# [[0, 1337566253.89858, 0.000355720520019531], true]
Here are a schema definition and sample data to show usage.
Execution example:
table_create Entries_20150708 TABLE_HASH_KEY ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150708 created_at COLUMN_SCALAR Time
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150708 content COLUMN_SCALAR Text
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150708 n_likes COLUMN_SCALAR UInt32
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150708 tag COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Entries_20150709 TABLE_HASH_KEY ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150709 created_at COLUMN_SCALAR Time
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150709 content COLUMN_SCALAR Text
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150709 n_likes COLUMN_SCALAR UInt32
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Entries_20150709 tag COLUMN_SCALAR ShortText
# [[0, 1337566253.89858, 0.000355720520019531], true]
table_create Terms TABLE_PAT_KEY ShortText \
--default_tokenizer TokenBigram \
--normalizer NormalizerAuto
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms entries_key_index_20150708 \
COLUMN_INDEX|WITH_POSITION Entries_20150708 _key
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms entries_content_index_20150708 \
COLUMN_INDEX|WITH_POSITION Entries_20150708 content
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms entries_key_index_20150709 \
COLUMN_INDEX|WITH_POSITION Entries_20150709 _key
# [[0, 1337566253.89858, 0.000355720520019531], true]
column_create Terms entries_content_index_20150709 \
COLUMN_INDEX|WITH_POSITION Entries_20150709 content
# [[0, 1337566253.89858, 0.000355720520019531], true]
load --table Entries_20150708
[
{"_key": "The first post!",
"created_at": "2015/07/08 00:00:00",
"content": "Welcome! This is my first post!",
"n_likes": 5,
"tag": "Hello"},
{"_key": "Groonga",
"created_at": "2015/07/08 01:00:00",
"content": "I started to use Groonga. It's very fast!",
"n_likes": 10,
"tag": "Groonga"},
{"_key": "Mroonga",
"created_at": "2015/07/08 02:00:00",
"content": "I also started to use Mroonga. It's also very fast! Really fast!",
"n_likes": 15,
"tag": "Groonga"}
]
# [[0, 1337566253.89858, 0.000355720520019531], 3]
load --table Entries_20150709
[
{"_key": "Good-bye Senna",
"created_at": "2015/07/09 00:00:00",
"content": "I migrated all Senna system!",
"n_likes": 3,
"tag": "Senna"},
{"_key": "Good-bye Tritonn",
"created_at": "2015/07/09 01:00:00",
"content": "I also migrated all Tritonn system!",
"n_likes": 3,
"tag": "Senna"}
]
# [[0, 1337566253.89858, 0.000355720520019531], 2]
There are two tables, Entries_20150708 and Entries_20150709, for blog entries.
Note
You need to use ${LOGICAL_TABLE_NAME}_${YYYYMMDD} naming rule for table names. In this example, LOGICAL_TABLE_NAME is Entries and YYYYMMDD is 20150708 or 20150709.
An entry has title, created time, content, the number of likes for the entry and tag. Title is key of Entries_YYYYMMDD. Created time is value of Entries_YYYYMMDD.created_at column. Content is value of Entries_YYYYMMDD.content column. The number of likes is value of Entries_YYYYMMDD.n_likes column. Tag is value of Entries_YYYYMMDD.tag column.
Entries_YYYYMMDD._key column and Entries_YYYYMMDD.content column are indexed using TokenBigram tokenizer. So both Entries_YYYYMMDD._key and Entries_YYYYMMDD.content are fulltext search ready.
OK. The schema and data for examples are ready.
TODO
This section describes parameters of logical_select.
There are required parameters, logical_table and shard_key.
Specifies logical table name. It means table name without _YYYYMMDD postfix. If you use actual table such as Entries_20150708, Entries_20150709 and so on, logical table name is Entries.
You can show 10 records by specifying logical_table and shard_key parameters. They are required parameters.
Execution example:
logical_select --logical_table Entries --shard_key created_at
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 5
# ],
# [
# [
# "_id",
# "UInt32"
# ],
# [
# "_key",
# "ShortText"
# ],
# [
# "content",
# "Text"
# ],
# [
# "created_at",
# "Time"
# ],
# [
# "n_likes",
# "UInt32"
# ],
# [
# "tag",
# "ShortText"
# ]
# ],
# [
# 1,
# "The first post!",
# "Welcome! This is my first post!",
# 1436281200.0,
# 5,
# "Hello"
# ],
# [
# 2,
# "Groonga",
# "I started to use Groonga. It's very fast!",
# 1436284800.0,
# 10,
# "Groonga"
# ],
# [
# 3,
# "Mroonga",
# "I also started to use Mroonga. It's also very fast! Really fast!",
# 1436288400.0,
# 15,
# "Groonga"
# ],
# [
# 1,
# "Good-bye Senna",
# "I migrated all Senna system!",
# 1436367600.0,
# 3,
# "Senna"
# ],
# [
# 2,
# "Good-bye Tritonn",
# "I also migrated all Tritonn system!",
# 1436371200.0,
# 3,
# "Senna"
# ]
# ]
# ]
# ]
If nonexistent table is specified, an error is returned.
Execution example:
logical_select --logical_table Nonexistent --shard_key created_at
# [
# [
# -22,
# 1337566253.89858,
# 0.000355720520019531,
# "[logical_select] no shard exists: logical_table: <Nonexistent>: shard_key: <created_at>",
# [
# [
# "Groonga::Context.set_groonga_error",
# "lib/mrb/scripts/context.rb",
# 27
# ]
# ]
# ]
# ]
Specifies column name which is treated as shared key. Shard key is a column that stores data that is used for distributing records to suitable shards.
Shard key must be Time type for now.
See logical_table how to specify shard_key.
There are optional parameters.
Specifies the minimum value of shard_key column. If shard doesn't have any matched records, the shard isn't searched.
For example, min is "2015/07/09 00:00:00", Entry_20150708 isn't searched. Because Entry_20150708 has only records for "2015/07/08".
The following example only uses Entry_20150709 table. Entry_20150708 isn't used.
Execution example:
logical_select \
--logical_table Entries \
--shard_key created_at \
--min "2015/07/09 00:00:00"
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 2
# ],
# [
# [
# "_id",
# "UInt32"
# ],
# [
# "_key",
# "ShortText"
# ],
# [
# "content",
# "Text"
# ],
# [
# "created_at",
# "Time"
# ],
# [
# "n_likes",
# "UInt32"
# ],
# [
# "tag",
# "ShortText"
# ]
# ],
# [
# 1,
# "Good-bye Senna",
# "I migrated all Senna system!",
# 1436367600.0,
# 3,
# "Senna"
# ],
# [
# 2,
# "Good-bye Tritonn",
# "I also migrated all Tritonn system!",
# 1436371200.0,
# 3,
# "Senna"
# ]
# ]
# ]
# ]
Specifies whether the minimum value is include or not. Here is available values.
Value | Description |
---|---|
include | Includes min value. This is the default. |
exclude | Doesn't include min value. |
Here is an example for exclude. The result doesn't include the "Good-bye Senna" record because its created_at value is "2015/07/09 00:00:00".
Execution example:
logical_select \
--logical_table Entries \
--shard_key created_at \
--min "2015/07/09 00:00:00" \
--min_border "exclude"
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 1
# ],
# [
# [
# "_id",
# "UInt32"
# ],
# [
# "_key",
# "ShortText"
# ],
# [
# "content",
# "Text"
# ],
# [
# "created_at",
# "Time"
# ],
# [
# "n_likes",
# "UInt32"
# ],
# [
# "tag",
# "ShortText"
# ]
# ],
# [
# 2,
# "Good-bye Tritonn",
# "I also migrated all Tritonn system!",
# 1436371200.0,
# 3,
# "Senna"
# ]
# ]
# ]
# ]
Specifies the maximum value of shard_key column. If shard doesn't have any matched records, the shard isn't searched.
For example, max is "2015/07/08 23:59:59", Entry_20150709 isn't searched. Because Entry_20150709 has only records for ""2015/07/09".
The following example only uses Entry_20150708 table. Entry_20150709 isn't used.
Execution example:
logical_select \
--logical_table Entries \
--shard_key created_at \
--max "2015/07/08 23:59:59"
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 3
# ],
# [
# [
# "_id",
# "UInt32"
# ],
# [
# "_key",
# "ShortText"
# ],
# [
# "content",
# "Text"
# ],
# [
# "created_at",
# "Time"
# ],
# [
# "n_likes",
# "UInt32"
# ],
# [
# "tag",
# "ShortText"
# ]
# ],
# [
# 1,
# "The first post!",
# "Welcome! This is my first post!",
# 1436281200.0,
# 5,
# "Hello"
# ],
# [
# 2,
# "Groonga",
# "I started to use Groonga. It's very fast!",
# 1436284800.0,
# 10,
# "Groonga"
# ],
# [
# 3,
# "Mroonga",
# "I also started to use Mroonga. It's also very fast! Really fast!",
# 1436288400.0,
# 15,
# "Groonga"
# ]
# ]
# ]
# ]
Specifies whether the maximum value is include or not. Here is available values.
Value | Description |
---|---|
include | Includes max value. This is the default. |
exclude | Doesn't include max value. |
Here is an example for exclude. The result doesn't include the "Good-bye Senna" record because its created_at value is "2015/07/09 00:00:00".
Execution example:
logical_select \
--logical_table Entries \
--shard_key created_at \
--max "2015/07/09 00:00:00" \
--max_border "exclude"
# [
# [
# 0,
# 1337566253.89858,
# 0.000355720520019531
# ],
# [
# [
# [
# 3
# ],
# [
# [
# "_id",
# "UInt32"
# ],
# [
# "_key",
# "ShortText"
# ],
# [
# "content",
# "Text"
# ],
# [
# "created_at",
# "Time"
# ],
# [
# "n_likes",
# "UInt32"
# ],
# [
# "tag",
# "ShortText"
# ]
# ],
# [
# 1,
# "The first post!",
# "Welcome! This is my first post!",
# 1436281200.0,
# 5,
# "Hello"
# ],
# [
# 2,
# "Groonga",
# "I started to use Groonga. It's very fast!",
# 1436284800.0,
# 10,
# "Groonga"
# ],
# [
# 3,
# "Mroonga",
# "I also started to use Mroonga. It's also very fast! Really fast!",
# 1436288400.0,
# 15,
# "Groonga"
# ]
# ]
# ]
# ]
The return value format of logical_select is compatible with select. See Return value for details.