⬇️ Main Note
https://docs.google.com/document/d/17KKtCXe_nuQsuk9E_6050v01H_IAHBfYQtG11G06jas/edit
⚠️ Warning ⚠️ Today's post contains a lot of images
(Example)
"Hello this is Monstershop. We are the best shop"
→ Every single word is tokenized, and each word becomes a token (see the _analyze sketch after this list).
By tokenizing, Elasticsearch can quickly look up words in the text.
→ Faster than MySQL (MySQL scans with a pattern like "title" LIKE '%Hello%').
→ Full-text search (efficient for searching whole sentences).
→ "Hello" is saved as a token, so the search only needs to look up that "Hello" token.
→ An analyzer defines how this searching works.
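For instance, running the sentence above through the built-in standard analyzer shows the tokens Elasticsearch actually stores (a minimal sketch using the _analyze API; no index is required):

POST _analyze
{
  "analyzer": "standard",
  "text": "Hello this is Monstershop. We are the best shop"
}

The response lists the tokens hello, this, is, monstershop, we, are, the, best, shop — lowercased and stripped of punctuation.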
Default Analyzers
< Character-Filter > → removes characters such as ["!", "~", "@", "#"]
< Tokenizer > → splits the text by spaces
< Token-Filter > → converts uppercase to lowercase (all three stages are combined in the sketch below)
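The three stages can be tried together through the _analyze API (a rough sketch; the pattern_replace character filter and the sample text are illustrative assumptions, while the whitespace tokenizer and lowercase filter are built in):

POST _analyze
{
  "char_filter": [
    {
      "type": "pattern_replace",
      "pattern": "[!~@#]",
      "replacement": ""
    }
  ],
  "tokenizer": "whitespace",
  "filter": ["lowercase"],
  "text": "Hello @Monstershop!!"
}

The character filter strips @ and !, the whitespace tokenizer splits the remaining text into Hello and Monstershop, and the lowercase token filter produces hello and monstershop.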
Settings: analyzer, tokenizer, and token-filter definitions
Mappings: set which analyzer the developer wants to use for each column (field)
⬇️ Match: returns results that literally match the user input (the search word the user types in).
⬇️ Prefix: works more like an autocomplete search (sketches of both queries follow below).
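A rough sketch of both query types against a hypothetical products index with a name field (the index and field names are assumptions, not from the note):

GET products/_search
{
  "query": {
    "match": { "name": "Hello" }
  }
}

GET products/_search
{
  "query": {
    "prefix": { "name": "hel" }
  }
}

The index template below puts the earlier pieces together: settings defines the analyzer, tokenizer, and token filter, and mappings assigns the analyzer to a column.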
{
  "template": "*",
  "settings": {
    "analysis": {
      "analyzer": {
        "tattoo_ngram_analyzer": {
          "type": "custom",
          "tokenizer": "tattoo_ngram_tokenizer",
          "filter": ["lowercase", "my_stop_filter"]
        }
      },
      "tokenizer": {
        "tattoo_ngram_tokenizer": {
          "type": "nGram",
          "min_gram": "1",
          "max_gram": "10"
        }
      },
      "filter": {
        "my_stop_filter": {
          "type": "stop",
          "stopwords": ["the", "in", "..."]
        }
      }
    },
    "max_ngram_diff": "20"
  },
  "mappings": {
    "properties": {
      "name": {
        "type": "text"
      },
      "description": {
        "type": "text",
        "analyzer": "tattoo_ngram_analyzer"
      },
      "price": {
        "type": "long"
      }
    }
  }
}
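With this template applied, even a partial word can hit the description field, because the ngram tokenizer indexed every 1- to 10-character fragment (a sketch; the tattoo index name is an assumption, since the template matches "*"):

GET tattoo/_search
{
  "query": {
    "match": { "description": "tat" }
  }
}

max_ngram_diff is raised because max_gram (10) minus min_gram (1) exceeds the default allowed difference of 1; note that newer Elasticsearch versions spell the tokenizer type ngram rather than nGram.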