elasticsearch - nGram filter: preserve/keep original token


I am applying an nGram filter to a string field:

"custom_ngram": {     "type": "ngram",     "min_gram": 3,     "max_gram": 10 } 

But as a result I lose tokens that are shorter or longer than the nGram range.

original tokens "iq" or "a4" example can not found.

I apply language-specific analysis before the nGram filter, so I want to avoid copying the whole field. I am looking for a way to expand the tokens into nGrams while keeping the original tokens as well.

Any ideas or nGram suggestions?

Here is an example of one of the analyzers that uses the custom_ngram filter:

"french": {     "type":"custom",     "tokenizer": "standard",     "filter": [         "french_elision",         "lowercase",         "french_stop",         "custom_ascii_folding",         "french_stemmer",         "custom_ngram"     ] } 

You have no option but to use multi-fields, i.e. index the field with a second analyzer that is able to keep the shorter terms as well. Like this:

    "text": {       "type": "string",       "analyzer": "french",       "fields": {         "standard_version": {           "type": "string",           "analyzer": "standard"         }       }     } 

And adjust your queries so that they touch the text.standard_version field as well.
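
For example, a multi_match query is one way to search both sub-fields at once (a sketch only; the field names follow the mapping above):

POST /my_index/_search
{
  "query": {
    "multi_match": {
      "query": "iq",
      "fields": ["text", "text.standard_version"]
    }
  }
}

The nGram-analyzed text field still handles partial matches on longer terms, while text.standard_version keeps short tokens like "iq" or "a4" searchable.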

