### Apache Solr NGram/EdgeNGram ###
=== NGram ===
NGram is useful for auto-complete and substring matching: it splits each token into substrings (n-grams) whose lengths range from minGramSize to maxGramSize.
For example, for the word “paris” with minGramSize set to 2 and maxGramSize set to 3, we get:
paris => “pa”, “ar”, “ri”, “is”
=> “par”, “ari”, “ris”
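The expansion above can be reproduced with a few lines of Python. This is only an illustrative sketch of the idea, not Solr's implementation: the function name `ngrams` is hypothetical, and the order in which Solr actually emits the grams may differ from the order shown here.

```python
def ngrams(token, min_gram_size, max_gram_size):
    """Return every substring of `token` whose length lies between
    min_gram_size and max_gram_size, shortest grams first."""
    grams = []
    for size in range(min_gram_size, max_gram_size + 1):
        # Slide a window of `size` characters across the token.
        for start in range(len(token) - size + 1):
            grams.append(token[start:start + size])
    return grams


print(ngrams("paris", 2, 3))
# ['pa', 'ar', 'ri', 'is', 'par', 'ari', 'ris']
```

Gram sizes longer than the token simply produce nothing, so a large maxGramSize is harmless for short words.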
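For the NGram filter, minGramSize defaults to 1 and maxGramSize to 2 in the Solr versions that define defaults, so it is safest to set both explicitly.
Unlike EdgeNGram, the NGram filter has no side option, because it generates grams starting at every position in the token.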
<fieldType name="text_general_ngram" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer .../>
    <filter minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer .../>
  </analyzer>
</fieldType>
=== EdgeNGram ===
We can also use EdgeNGram, which creates n-grams only from the beginning edge of an input token. This makes it a good fit for prefix-based auto-complete.
Taking the word “paris” again, with minGramSize set to 2, maxGramSize set to 10 and side set to “front”, we get:
paris => “pa”, “par”, “pari”, “paris”
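By default, minGramSize is 1, maxGramSize is 1 and side is “front”.
You can also set side to “back” to anchor the n-grams at the end of the token instead, which for “paris” gives “is”, “ris”, “aris”, “paris”. Note that the “back” option is only available in older releases (before Lucene/Solr 4.4).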
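To make the front and back behaviour concrete, here is a small Python sketch. It only imitates the idea: the function `edge_ngrams` and its parameter names are hypothetical and are not part of Solr's API.

```python
def edge_ngrams(token, min_gram_size, max_gram_size, side="front"):
    """Return the n-grams anchored at one edge of `token`.

    side="front" takes prefixes; any other value takes suffixes.
    Assumes min_gram_size >= 1.
    """
    # A gram cannot be longer than the token itself.
    longest = min(max_gram_size, len(token))
    grams = []
    for size in range(min_gram_size, longest + 1):
        if side == "front":
            grams.append(token[:size])   # prefix of length `size`
        else:
            grams.append(token[-size:])  # suffix of length `size`
    return grams


print(edge_ngrams("paris", 2, 10))               # ['pa', 'par', 'pari', 'paris']
print(edge_ngrams("paris", 2, 10, side="back"))  # ['is', 'ris', 'aris', 'paris']
```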
<fieldType name="text_general_edge_ngram" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer .../>
    <filter minGramSize="2" maxGramSize="15" side="front"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer .../>
  </analyzer>
</fieldType>
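In the field type above, the n-gram filter appears only in the index analyzer, while the query analyzer leaves the user's input whole. The Python sketch below illustrates one plausible reason for that split, under simplifying assumptions: tokens are split on whitespace and lowercased (the real tokenizer is elided in the configuration), and a document matches when the query term equals one of the indexed grams. The helper names are hypothetical.

```python
def edge_ngrams(token, min_gram_size, max_gram_size):
    """Prefixes of `token` with lengths min_gram_size..max_gram_size."""
    longest = min(max_gram_size, len(token))
    return [token[:size] for size in range(min_gram_size, longest + 1)]


def index_terms(text, min_gram_size=2, max_gram_size=15):
    """Index-time analysis (assumed): lowercase, split on whitespace,
    then expand every token into its edge n-grams."""
    terms = set()
    for token in text.lower().split():
        terms.update(edge_ngrams(token, min_gram_size, max_gram_size))
    return terms


def matches(query, indexed_terms):
    """Query-time analysis (assumed): lowercase only, no n-gram expansion."""
    return query.lower() in indexed_terms


indexed = index_terms("Paris France")
print(matches("par", indexed))  # True: "par" is a stored prefix of "paris"
print(matches("Fra", indexed))  # True: "fra" is a stored prefix of "france"
print(matches("ari", indexed))  # False: "ari" is not a prefix of any token
print(matches("p", indexed))    # False: shorter than minGramSize
```

If the query side were also expanded into n-grams, a query such as “part” would produce the gram “pa” and match “paris”, which is usually not what an auto-complete box should do.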