Elasticsearch - nGram filter: preserve/keep the original token
I am applying an ngram filter to a string field:
"custom_ngram": { "type": "ngram", "min_gram": 3, "max_gram": 10 }
But as a result I lose the tokens that are shorter or longer than the ngram range.
Original tokens such as "iq" or "a4", for example, can no longer be found.
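To illustrate the loss, running the analyzer over a short token through the _analyze API returns no terms for it at all, because a two-character token cannot produce any 3-gram. A minimal sketch, assuming an index named my_index that already defines the french analyzer below, and using the request-body form of the _analyze API (the exact syntax varies slightly between Elasticsearch versions):

POST my_index/_analyze
{
  "analyzer": "french",
  "text": "audi a4"
}

In the response, "audi" is expanded into its ngrams, but "a4" disappears from the output entirely.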
I am applying language-specific analysis before the ngram filter and want to avoid copying the whole field. I am looking for a way to expand the tokens into ngrams while keeping the original tokens as well.
Any ideas or ngram suggestions?
Here is an example of one of the analyzers that uses the custom_ngram filter:
"french": { "type":"custom", "tokenizer": "standard", "filter": [ "french_elision", "lowercase", "french_stop", "custom_ascii_folding", "french_stemmer", "custom_ngram" ] }
You have no option but to use multi-fields and index the field with a different analyzer that is able to keep the shorter terms as well. Something like this:
"text": { "type": "string", "analyzer": "french", "fields": { "standard_version": { "type": "string", "analyzer": "standard" } } }
and adjust your queries so they also touch the text.standard_version field.
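As a sketch of such a query adjustment (index and field names are only placeholders), a multi_match query can search both sub-fields at once, so a short term like "a4" matches via text.standard_version while longer terms still benefit from the ngram analysis on text:

POST my_index/_search
{
  "query": {
    "multi_match": {
      "query": "a4",
      "fields": [ "text", "text.standard_version" ]
    }
  }
}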