Elasticsearch - nGram filter: preserve/keep the original token
I am applying an ngram filter to a string field:
"custom_ngram": { "type": "ngram", "min_gram": 3, "max_gram": 10 }
But as a result I lose the tokens that are shorter or longer than the ngram range.
Original tokens such as "iq" or "a4", for example, can no longer be found.
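To illustrate the loss, running the analyzer over a short token through the _analyze API returns no terms for it at all, because a two-character token cannot produce any 3-gram. A minimal sketch, assuming an index named my_index that already defines the french analyzer below, and using the request-body form of the _analyze API (the exact syntax varies slightly between Elasticsearch versions):

POST my_index/_analyze
{
  "analyzer": "french",
  "text": "audi a4"
}

In the response, "audi" is expanded into its ngrams, but "a4" disappears from the output entirely.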
I am applying language-specific analysis before the ngram filter and want to avoid copying the whole field. I am looking for a way to expand the tokens into ngrams while keeping the original tokens as well.
Any ideas or ngram suggestions?
Here is an example of one of the analyzers that uses the custom_ngram filter:
"french": { "type":"custom", "tokenizer": "standard", "filter": [ "french_elision", "lowercase", "french_stop", "custom_ascii_folding", "french_stemmer", "custom_ngram" ] }
You have no option but to use multi-fields and index the field with a different analyzer that is able to keep the shorter terms as well. Something like this:
"text": { "type": "string", "analyzer": "french", "fields": { "standard_version": { "type": "string", "analyzer": "standard" } } }
and adjust your queries so they also touch the text.standard_version field.
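As a sketch of such a query adjustment (index and field names are only placeholders), a multi_match query can search both sub-fields at once, so a short term like "a4" matches via text.standard_version while longer terms still benefit from the ngram analysis on text:

POST my_index/_search
{
  "query": {
    "multi_match": {
      "query": "a4",
      "fields": [ "text", "text.standard_version" ]
    }
  }
}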