You can use groonga either as a C library or as an executable file. This tutorial explains how to use groonga as an executable file. With the executable, you can create and operate databases, start a server, connect to a server, and so on.
You can create a new database with the following command.
Form
groonga -n DB_PATH_NAME
The '-n' option tells groonga to create a database. DB_PATH_NAME specifies the full path of the new database.
After creating the database with this command, groonga starts in interactive mode and accepts commands from standard input. Press Ctrl-d to leave this mode.
Execution example:
% groonga -n /tmp/tutorial.db
> ctrl-d
%
Form
groonga DB_PATH_NAME [COMMAND]
DB_PATH_NAME specifies the full path of an existing database. If COMMAND is specified, groonga executes it and returns its result.
With no COMMAND, groonga starts in interactive mode. In this mode, groonga repeatedly reads a command from standard input and evaluates it. This tutorial mainly uses interactive mode.
For example, let's run the status command. This command returns the status of the running groonga process.
Execution example:
> table_create --name Type --flags TABLE_HASH_KEY --key_type ShortText
[[0,1317212791.02322,0.03942904],true]
> column_create --table Type --name number --type Int32
[[0,1317212791.26314,0.124383285],true]
> column_create --table Type --name float --type Float
[[0,1317212791.58803,0.027924039],true]
> column_create --table Type --name string --type ShortText
[[0,1317212791.81654,0.040399047],true]
> column_create --table Type --name time --type Time
[[0,1317212792.05751,0.027354067],true]
> load --table Type
> [{"_key":"sample","number":12345,"float":42.195,"string":"GROONGA","time":1234567890.12}]
[[0,1317212792.28516,0.200775839],1]
> select --table Type
[[0,1317212792.68655,0.000199477],[[[1],[["_id","UInt32"],["_key","ShortText"],["time","Time"],["string","ShortText"],["number","Int32"],["float","Float"]],[1,"sample",1234567890.12,"GROONGA",12345,42.195]]]]
As shown above, the results of executed commands are generally returned as JSON. The result is a JSON array: the first element holds information such as the error code and the execution time, and the second element holds the result of the executed command.
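To make this two-element structure concrete, here is a minimal Python sketch that splits one of the responses shown above into its parts. The variable names (return_code, started_at, elapsed) are our own labels for illustration and do not appear in groonga's output.

```python
import json

# Response text copied from an execution example above.
raw = '[[0,1317212791.02322,0.03942904],true]'

# The outer array has two elements: a header and the command's result.
header, body = json.loads(raw)

# The header holds the return code (0 means success), the time the
# command started (seconds since the epoch), and the elapsed time in seconds.
return_code, started_at, elapsed = header

print(return_code)
print(elapsed)
print(body)
```

A return code other than 0 indicates an error, so a client would normally check it before reading the second element.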
You can operate a database with various commands, through either the groonga executable or a groonga server. Commands can be written in the following forms:
Form1: COMMAND ARGUMENT1 ARGUMENT2 ..
Form2: COMMAND --ARGUMENT1 VALUE1 --ARGUMENT2 VALUE2 ..
You can mix these forms in a single command.
In Form2, if a value contains spaces or any of the symbols "'()/, enclose the value in single quotes or double quotes.
For details, see the "command" paragraph in "groonga executable file".
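The quoting rule for Form2 can be sketched as a small Python helper that assembles a command line. The helper names (quote_value, build_command) are hypothetical and not part of groonga. The sketch also assumes that a backslash escapes a double quote inside a double-quoted value, which is how the '_key' search example in this tutorial is written.

```python
# Characters that, according to the rule above, require the value to be quoted.
SPECIAL = set(" \"'()/")


def quote_value(value):
    """Return the value as-is, or wrapped in double quotes when needed."""
    text = str(value)
    if text and not any(ch in SPECIAL for ch in text):
        return text
    # Assumption: backslash-escape backslashes and double quotes.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def build_command(name, **arguments):
    """Build a Form2 command line: COMMAND --ARGUMENT1 VALUE1 ..."""
    parts = [name]
    for key, value in arguments.items():
        parts.append("--" + key)
        parts.append(quote_value(value))
    return " ".join(parts)


command = build_command("select", table="Site", query='_key:"http://example.org/"')
print(command)
```

The table name needs no quoting, while the query value contains both a double quote and a slash, so it is enclosed in double quotes.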
- status
- Show the status of the groonga process.
- table_list
- Show the list of tables defined in a database.
- column_list
- Show the list of columns defined in a table.
- table_create
- Add a table to a database.
- column_create
- Add a column to a table.
- select
- Search for and show records in a table.
- load
- Insert records into a table.
table_create creates a table.
In groonga, a table generally needs a master key (primary key) when it is created. For the master key, you must specify its type and how it is stored.
We will explain the types later in this tutorial. For now, think of a type as the kind of data a value represents. How the master key is stored determines the speed of searches by master key and whether begins-with-match (prefix) search is available. This is also explained later in this tutorial.
For example, let's create the 'Site' table. Its master key is of type ShortText, and the key is stored as a hash (HASH).
Execution example:
> column_create --table Site --name link --type Site
[[0,1317212792.88872,0.060705006],true]
> load --table Site
> [{"_key":"http://example.org/","link":"http://example.net/"}]
[[0,1317212793.14984,0.200481934],1]
> select --table Site --output_columns _key,title,link._key,link.title --query title:@this
[[0,1317212793.55084,0.000485897],[[[1],[["_key","ShortText"],["title","ShortText"],["link._key","ShortText"],["link.title","ShortText"]],["http://example.org/","This is test record 1!","http://example.net/","test record 2."]]]]
select shows the contents of a table.
Execution example:
> column_create --table Site --name links --flags COLUMN_VECTOR --type Site
[[0,1317212793.75262,0.049658904],true]
> load --table Site
> [{"_key":"http://example.org/","links":["http://example.net/","http://example.org/","http://example.com/"]}]
[[0,1317212794.00274,0.200473621],1]
> select --table Site --output_columns _key,title,links._key,links.title --query title:@this
[[0,1317212794.40349,0.000384272],[[[1],[["_key","ShortText"],["title","ShortText"],["links._key","ShortText"],["links.title","ShortText"]],["http://example.org/","This is test record 1!",["http://example.net/","http://example.org/","http://example.com/"],["test record 2.","This is test record 1!","test test record three."]]]]]
When given only a table name, the 'select' command shows up to 10 records of that table. [0] shows the number of records found. ["_id","UInt32"] means that there is a column named "_id" whose values are of type UInt32. ["_key","ShortText"] means that there is a column named "_key" whose values are of type ShortText.
The 'table_create' command creates a table that initially has two columns, '_id' and '_key'. '_id' holds an ID number that groonga assigns automatically. '_key' holds the master key. You cannot change the name of this column.
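The layout of a 'select' result (record count, then column definitions, then the records themselves) can be unpacked with a short Python sketch. It uses the response to 'select --table Site --query _id:1' from this tutorial. The variable names are our own and are only for illustration.

```python
import json

# A 'select' response containing one record, copied from this tutorial.
raw = (
    '[[0,1317212716.69871,0.000308514],'
    '[[[1],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]],'
    '[1,"http://example.org/","This is test record 1!"]]]]'
)

header, body = json.loads(raw)

# The first result set holds [count], the column definitions, then the records.
result_set = body[0]
n_hits = result_set[0][0]
column_names = [name for name, _type in result_set[1]]

# Pair each record's values with the column names.
records = [dict(zip(column_names, row)) for row in result_set[2:]]

print(n_hits)
print(column_names)
print(records[0]["title"])
```

Reading the column names from the response, rather than hard-coding them, keeps the sketch working when 'output_columns' changes the set of columns.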
The column_create command creates a column.
Let's add a column named 'title' that stores values of type ShortText.
Execution example:
> column_create --table Site --name title --flags COLUMN_SCALAR --type ShortText
[[0,1317212712.91734,0.077833747],true]
> select --table Site
[[0,1317212713.19572,0.000121119],[[[0],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]]]]]
COLUMN_SCALAR means that this is an ordinary column, which stores a single value per record.
This tutorial now explains full-text search over data stored in a groonga table.
Full-text search needs a terminology table, that is, a table whose master key values are the words that appear in the text. Let's create a 'Terms' table whose master key is of type ShortText.
Execution example:
> table_create --name Terms --flags TABLE_PAT_KEY|KEY_NORMALIZE --key_type ShortText --default_tokenizer TokenBigram
[[0,1317212713.39679,0.092312046],true]
Many parameters are specified in this execution example. You don't have to understand all of them. A brief explanation follows, but you can skip it.
In this example, 'TABLE_PAT_KEY|KEY_NORMALIZE' stores the master key in a patricia trie and registers each term after normalizing it. The 'default_tokenizer' parameter specifies how the target text is tokenized. Here we specify 'TokenBigram', which selects the method generally called N-gram.
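As a rough illustration of the N-gram idea, the Python sketch below splits a string into overlapping pieces of two characters (bigrams). This is only the general technique. It is not a reimplementation of 'TokenBigram', whose handling of alphabetic text, symbols, and normalization may differ. The function name char_ngrams is hypothetical.

```python
def char_ngrams(text, n=2):
    """Return every run of n consecutive characters in text."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


print(char_ngrams("groonga"))
```

Because every pair of adjacent characters becomes a term, a search string can be matched without knowing where word boundaries are.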
We will run full-text searches on the 'title' column of the 'Site' table. To do so, we create an index column in the terminology table.
Execution example:
> column_create --table Terms --name blog_title --flags COLUMN_INDEX|WITH_POSITION --type Site --source title
[[0,1317212713.68994,0.19739078],true]
This command creates an index column named 'blog_title' in the 'Terms' table. The '--type' option specifies the table to be indexed, and the '--source' option specifies the column to be indexed. In this example, 'COLUMN_INDEX|WITH_POSITION' in the '--flags' option specifies that this is an index column that also stores the position of each term. For ordinary full-text search, you should specify 'COLUMN_INDEX|WITH_POSITION'. This tutorial does not cover why term positions are stored.
load is used to load data into a groonga database. This command stores JSON-formatted data in a table.
Execution example:
> load --table Site
> [
> {"_key":"http://example.org/","title":"This is test record 1!"},
> {"_key":"http://example.net/","title":"test record 2."},
> {"_key":"http://example.com/","title":"test test record three."},
> {"_key":"http://example.net/afr","title":"test record four."},
> {"_key":"http://example.org/aba","title":"test test test record five."},
> {"_key":"http://example.com/rab","title":"test test test test record six."},
> {"_key":"http://example.net/atv","title":"test test test record seven."},
> {"_key":"http://example.org/gat","title":"test test record eight."},
> {"_key":"http://example.com/vdw","title":"test test record nine."},
> ]
[[0,1317212714.08816,2.203527402],9]
Let's use the 'select' command to make sure that the table contains the data.
Execution example:
> select --table Site
[[0,1317212716.49285,0.000270908],[[[9],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]],[1,"http://example.org/","This is test record 1!"],[2,"http://example.net/","test record 2."],[3,"http://example.com/","test test record three."],[4,"http://example.net/afr","test record four."],[5,"http://example.org/aba","test test test record five."],[6,"http://example.com/rab","test test test test record six."],[7,"http://example.net/atv","test test test record seven."],[8,"http://example.org/gat","test test record eight."],[9,"http://example.com/vdw","test test record nine."]]]]
The values of the '_id' and '_key' columns are unique within a groonga table, so let's search for records using these columns.
You can search for records by using the 'select' command with the 'query' parameter.
Execution example:
> select --table Site --query _id:1
[[0,1317212716.69871,0.000308514],[[[1],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]],[1,"http://example.org/","This is test record 1!"]]]]
Specifying '_id:1' for the 'query' parameter searches for records whose '_id' column value is 1.
Next, let's search for records by the '_key' column.
Execution example:
> select --table Site --query "_key:\"http://example.org/\""
[[0,1317212716.9005,0.000478343],[[[1],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]],[1,"http://example.org/","This is test record 1!"]]]]
Specifying '_key:"http://example.org/"' for the 'query' parameter searches for records whose '_key' column value is "http://example.org/".
With the 'query' parameter, you can also run a full-text search that uses the index.
Execution example:
> select --table Site --query title:@this
[[0,1317212717.10303,0.000581287],[[[1],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]],[1,"http://example.org/","This is test record 1!"]]]]
This command shows the result of a full-text search for the string 'this' in the 'title' column.
Specifying 'title:@this' for the 'query' parameter searches for records whose 'title' column contains the string 'this'.
The 'select' command has a 'match_columns' parameter.
If this parameter is specified and the 'query' parameter does not name a column, the search is performed on the columns given in 'match_columns'. [1]_
If you specify 'title' for 'match_columns' and 'this' for 'query', you get the same result as the query above.
Execution example:
> select --table Site --match_columns title --query this
[[0,1317212717.30596,0.000716439],[[[1],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]],[1,"http://example.org/","This is test record 1!"]]]]
The 'output_columns' parameter of the 'select' command specifies the columns shown in the search result.
To specify more than one column, separate the column names with commas (,).
Execution example:
> select --table Site --output_columns _key,title,_score --query title:@test
[[0,1317212717.50916,0.00060758],[[[9],[["_key","ShortText"],["title","ShortText"],["_score","Int32"]],["http://example.org/","This is test record 1!",1],["http://example.net/","test record 2.",1],["http://example.com/","test test record three.",2],["http://example.net/afr","test record four.",1],["http://example.org/aba","test test test record five.",3],["http://example.com/rab","test test test test record six.",4],["http://example.net/atv","test test test record seven.",3],["http://example.org/gat","test test record eight.",2],["http://example.com/vdw","test test record nine.",2]]]]
The result includes a "_score" column that groonga adds. The better a record's text matches the full-text search condition, the higher this value.
The 'select' command can return only a specified range of the result by using the 'offset' and 'limit' parameters. These parameters are useful when you want to show a large search result one page at a time.
The 'offset' parameter specifies where the returned range starts. To have 'select' return records from the first one, specify 0.
The 'limit' parameter specifies the maximum number of records to return.
Execution example:
> select --table Site --offset 0 --limit 3
[[0,1317212717.71574,0.000238544],[[[9],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]],[1,"http://example.org/","This is test record 1!"],[2,"http://example.net/","test record 2."],[3,"http://example.com/","test test record three."]]]]
> select --table Site --offset 3 --limit 3
[[0,1317212717.91925,0.00023617],[[[9],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]],[4,"http://example.net/afr","test record four."],[5,"http://example.org/aba","test test test record five."],[6,"http://example.com/rab","test test test test record six."]]]]
> select --table Site --offset 7 --limit 3
[[0,1317212718.12219,0.00019999],[[[9],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]],[8,"http://example.org/gat","test test record eight."],[9,"http://example.com/vdw","test test record nine."]]]]
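The effect of 'offset' and 'limit' in the examples above can be modelled with list slicing in Python. This sketch only mimics the observed behaviour on the nine '_id' values and says nothing about how groonga implements paging. The helper names (select_page, page_offsets) are hypothetical.

```python
def select_page(records, offset, limit):
    """Return at most limit records, starting at position offset."""
    return records[offset:offset + limit]


def page_offsets(total, limit):
    """Return the offset of each page when showing limit records per page."""
    return list(range(0, total, limit))


ids = list(range(1, 10))  # the nine _id values in the 'Site' table

print(select_page(ids, 0, 3))
print(select_page(ids, 3, 3))
print(select_page(ids, 7, 3))  # fewer than 3 records remain
print(page_offsets(len(ids), 3))
```

As in the last execution example, a range that runs past the end of the result simply returns the records that remain.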
If you specify the 'sortby' parameter, the 'select' command sorts the search result.
When 'sortby' is given a column name, the result is sorted in ascending order of that column's values. To sort in descending order, put a hyphen (-) before the column name.
Execution example:
> select --table Site --sortby -_id
[[0,1317212718.32565,0.000385755],[[[9],[["_id","UInt32"],["_key","ShortText"],["title","ShortText"]],[9,"http://example.com/vdw","test test record nine."],[8,"http://example.org/gat","test test record eight."],[7,"http://example.net/atv","test test test record seven."],[6,"http://example.com/rab","test test test test record six."],[5,"http://example.org/aba","test test test record five."],[4,"http://example.net/afr","test record four."],[3,"http://example.com/","test test record three."],[2,"http://example.net/","test record 2."],[1,"http://example.org/","This is test record 1!"]]]]
You can also sort by the '_score' column introduced in the "Specify output column" paragraph.
Execution example:
> select --table Site --query title:@test --output_columns _id,_score,title --sortby _score
[[0,1317212718.5331,0.000667311],[[[9],[["_id","UInt32"],["_score","Int32"],["title","ShortText"]],[1,1,"This is test record 1!"],[2,1,"test record 2."],[4,1,"test record four."],[3,2,"test test record three."],[9,2,"test test record nine."],[8,2,"test test record eight."],[7,3,"test test test record seven."],[5,3,"test test test record five."],[6,4,"test test test test record six."]]]]
To specify more than one column, separate the column names with commas (,). In that case, records that have the same value in the first column are sorted by the value of the second column.
Execution example:
> select --table Site --query title:@test --output_columns _id,_score,title --sortby _score,_id
[[0,1317212718.73819,0.00069225],[[[9],[["_id","UInt32"],["_score","Int32"],["title","ShortText"]],[1,1,"This is test record 1!"],[2,1,"test record 2."],[4,1,"test record four."],[3,2,"test test record three."],[8,2,"test test record eight."],[9,2,"test test record nine."],[5,3,"test test test record five."],[7,3,"test test test record seven."],[6,4,"test test test test record six."]]]]
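Sorting by two columns behaves like sorting by a pair of values: the second column only matters when the first one ties. The Python sketch below reproduces the order of the last execution example from the '_id' and '_score' values shown in this tutorial. It illustrates the ordering rule and is not groonga's implementation.

```python
# (_id, _score) pairs taken from the 'title:@test' results in this tutorial.
rows = [
    {"_id": 1, "_score": 1}, {"_id": 2, "_score": 1}, {"_id": 3, "_score": 2},
    {"_id": 4, "_score": 1}, {"_id": 5, "_score": 3}, {"_id": 6, "_score": 4},
    {"_id": 7, "_score": 3}, {"_id": 8, "_score": 2}, {"_id": 9, "_score": 2},
]

# Equivalent of '--sortby _score,_id': compare _score first, then _id.
by_score_then_id = sorted(rows, key=lambda row: (row["_score"], row["_id"]))
sorted_ids = [row["_id"] for row in by_score_then_id]

# Equivalent of '--sortby -_id': descending order of _id.
descending_ids = [row["_id"] for row in sorted(rows, key=lambda row: row["_id"], reverse=True)]

print(sorted_ids)
print(descending_ids)
```

Without the second column, the order among records with the same '_score' is not determined by the sort key, which is why the two '_score' examples above list some records in a different order.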
footnote
[1] In the current version of groonga, the 'match_columns' parameter can be used only when a full-text search index exists. It cannot be used to search ordinary columns.