How to configure the synonyms_path in Elasticsearch


I'm pretty new to Elasticsearch and I want to use synonyms, so I added these lines to the configuration file:

    index:
        analysis:
            analyzer:
                synonym:
                    type: custom
                    tokenizer: whitespace
                    filter: [synonym]
            filter:
                synonym:
                    type: synonym
                    synonyms_path: synonyms.txt

Then I created an index named test:

    "mappings": {
      "test": {
        "properties": {
          "text_1": {
            "type": "string",
            "analyzer": "synonym"
          },
          "text_2": {
            "search_analyzer": "standard",
            "index_analyzer": "synonym",
            "type": "string"
          },
          "text_3": {
            "type": "string",
            "analyzer": "synonym"
          }
        }
      }
    }

And I inserted a document of type test with this data:

    {
      "text_3": "foo dog cat",
      "text_2": "foo dog cat",
      "text_1": "foo dog cat"
    }

synonyms.txt contains "foo,bar,baz". When I search for foo, it returns what I expect, but when I search for baz or bar, it returns 0 results:

    {
      "query": {
        "query_string": {
          "query": "bar",
          "fields": ["text_1"],
          "use_dis_max": true,
          "boost": 1.0
        }
      }
    }

Result:

    {
      "took": 1,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 0,
        "max_score": null,
        "hits": []
      }
    }

I don't know whether your problem comes from a bad synonym definition for "bar". Since you said you are pretty new, I'm going to give an example similar to yours that works. I want to show how Elasticsearch deals with synonyms at search time and at index time. Hope it helps.

First, create the synonym file:

    foo => foo bar, baz
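To make the two rule styles concrete (the comma-separated list "foo,bar,baz" from the question and the explicit mapping used here), the following is a minimal Python sketch of how such lines can be read. It is a toy model, not Elasticsearch's parser: `parse_rule` is a hypothetical helper, and it assumes the default expansion behavior for comma-separated lists, where every term in the list maps to every term in the list.

```python
def parse_rule(line):
    """Parse one synonym rule into {input_phrase: [replacement_phrases]}.

    Each phrase is a tuple of whitespace-separated tokens, so a
    multi-word replacement such as "foo bar" becomes ("foo", "bar").
    """
    def phrases(part):
        # Split on commas, then split each phrase into its tokens.
        return [tuple(p.split()) for p in part.split(",") if p.strip()]

    if "=>" in line:
        # Explicit mapping: left-hand phrases are replaced by the right-hand ones.
        left, right = line.split("=>", 1)
        return {src: phrases(right) for src in phrases(left)}

    # Equivalent synonyms (assumed expand behavior): each term maps to the whole group.
    group = phrases(line)
    return {src: list(group) for src in group}


explicit = parse_rule("foo => foo bar, baz")
equivalent = parse_rule("foo,bar,baz")
print(explicit)
print(equivalent)
```

Read this way, the explicit rule only rewrites foo (into the two-token phrase "foo bar" and the single token baz), while the comma-separated form treats foo, bar and baz as interchangeable.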

Now create the index with the particular settings you are trying to test:

    curl -XPUT 'http://localhost:9200/test/' -d '{
      "settings": {
        "index": {
          "analysis": {
            "analyzer": {
              "synonym": {
                "tokenizer": "whitespace",
                "filter": ["synonym"]
              }
            },
            "filter": {
              "synonym": {
                "type": "synonym",
                "synonyms_path": "synonyms.txt"
              }
            }
          }
        }
      },
      "mappings": {
        "test": {
          "properties": {
            "text_1": {
              "type": "string",
              "analyzer": "synonym"
            },
            "text_2": {
              "search_analyzer": "standard",
              "index_analyzer": "standard",
              "type": "string"
            },
            "text_3": {
              "type": "string",
              "search_analyzer": "synonym",
              "index_analyzer": "standard"
            }
          }
        }
      }
    }'

Note that synonyms.txt must be in the same directory as the configuration file, since the path is relative to the config dir.

Now index a document:

    curl -XPUT 'http://localhost:9200/test/test/1' -d '{
      "text_3": "baz dog cat",
      "text_2": "foo dog cat",
      "text_1": "foo dog cat"
    }'

Now the searches.

Searching in field text_1:

    curl -XGET 'http://localhost:9200/test/_search?q=text_1:baz'

Result:

    {
      "took": 3,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.15342641,
        "hits": [
          {
            "_index": "test",
            "_type": "test",
            "_id": "1",
            "_score": 0.15342641,
            "_source": {
              "text_3": "baz dog cat",
              "text_2": "foo dog cat",
              "text_1": "foo dog cat"
            }
          }
        ]
      }
    }

You get the document because baz is a synonym of foo, and at index time foo is expanded with its synonyms.

Searching in field text_2:

    curl -XGET 'http://localhost:9200/test/_search?q=text_2:baz'

Result:

    {
      "took": 2,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 0,
        "max_score": null,
        "hits": []
      }
    }

I don't get any hits because I didn't expand synonyms while indexing (standard analyzer). And since I'm searching for baz, and baz is not in the text, I don't get any result.

Searching in field text_3:

    curl -XGET 'http://localhost:9200/test/_search?q=text_3:foo'

Result:

    {
      "took": 3,
      "timed_out": false,
      "_shards": {
        "total": 5,
        "successful": 5,
        "failed": 0
      },
      "hits": {
        "total": 1,
        "max_score": 0.15342641,
        "hits": [
          {
            "_index": "test",
            "_type": "test",
            "_id": "1",
            "_score": 0.15342641,
            "_source": {
              "text_3": "baz dog cat",
              "text_2": "foo dog cat",
              "text_1": "foo dog cat"
            }
          }
        ]
      }
    }

Note: text_3 is "baz dog cat".

text_3 was indexed without expanding synonyms. Since I'm searching for foo, which has "baz" as one of its synonyms, I get the result.
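The three outcomes above can be reproduced with a small Python sketch. This is only a toy model under stated assumptions, not how Elasticsearch works internally: `standard`, `synonym`, `FIELDS` and `matches` are hypothetical names, token positions are ignored, and a query is treated as matching when any of its analyzed terms appears among the indexed terms.

```python
# Toy version of the rule "foo => foo bar, baz": the token foo is replaced
# by the tokens foo, bar and baz (positions are ignored in this model).
SYNONYMS = {"foo": ["foo", "bar", "baz"]}


def standard(text):
    """Stand-in for the standard analyzer: lowercase and split on whitespace."""
    return set(text.lower().split())


def synonym(text):
    """Stand-in for the synonym analyzer: whitespace split plus rule expansion."""
    terms = set()
    for token in text.split():
        terms.update(SYNONYMS.get(token, [token]))
    return terms


# (index-time analyzer, search-time analyzer) per field, as in the mapping above.
FIELDS = {
    "text_1": (synonym, synonym),
    "text_2": (standard, standard),
    "text_3": (standard, synonym),
}

DOC = {
    "text_1": "foo dog cat",
    "text_2": "foo dog cat",
    "text_3": "baz dog cat",
}


def matches(field, query):
    """Return True if any analyzed query term is among the indexed terms."""
    index_analyzer, search_analyzer = FIELDS[field]
    indexed_terms = index_analyzer(DOC[field])
    query_terms = search_analyzer(query)
    return bool(indexed_terms & query_terms)


for field, query in [("text_1", "baz"), ("text_2", "baz"), ("text_3", "foo")]:
    print(field, query, matches(field, query))
```

In this model text_1 matches because the expansion happened when the document was indexed, text_3 matches because the expansion happened when the query was analyzed, and text_2 never sees the synonym rule at all.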

If you want to debug, you can use the _analyze endpoint, for example:

    curl -XGET 'http://localhost:9200/test/_analyze?text=foo&analyzer=synonym&pretty=true'

Result:

    {
      "tokens": [
        {
          "token": "foo",
          "start_offset": 0,
          "end_offset": 3,
          "type": "synonym",
          "position": 1
        },
        {
          "token": "baz",
          "start_offset": 0,
          "end_offset": 3,
          "type": "synonym",
          "position": 1
        },
        {
          "token": "bar",
          "start_offset": 0,
          "end_offset": 3,
          "type": "synonym",
          "position": 2
        }
      ]
    }
