Hello, we ran into a problem with a lowercase normalizer in an aggregation query. Our initial mapping looks like this:
"mappings": {
  "properties": {
    "keyword_value": {
      "type": "keyword",
      "normalizer": "lowercase_normalizer"
    }
  }
}
An aggregation query returns the aggregation results (sum, count, etc.) with keyword_value as the bucket key in lower case.
The issue is that we want to retrieve keyword_value in its original case.
If we run a basic search, we do get the keyword_value field back in its original case.
We have a couple of approaches in mind. One is to make an additional query to retrieve the original values (this could affect our performance). Another is to update the mapping with a new field without the normalizer and populate the new field's value with an additional query (not suitable for us, since we don't want to reindex the data).
So could you please suggest the best approach to retrieve keyword_value in its original case? Can we somehow ignore the lowercase normalizer at query time? And why does the aggregation return the key in lower case while a basic query returns the original?
asked Apr 2 at 5:48, edited Apr 2 at 6:54, by Ilya Basalyha
- how exactly is this java related? Seeing as you have: "normalizer": "lowercase_normalizer", are you really surprised you get lowercases? Maybe it's worth looking at other elasticsearch questions about the same topic, like: stackoverflow/questions/51664234/… – Stultuske Commented Apr 2 at 6:38
- Removed the java tag, thanks for pointing that out. Yes, I have little experience with Elasticsearch and am currently working with existing code. Thanks for sharing the topic, but as stated in the question, we already have a similar approach in mind and want to avoid updating the existing index mapping – Ilya Basalyha Commented Apr 2 at 7:01
1 Answer
update mapping with a new field without normalizer
This is the most efficient approach for your use case, for the following reasons:
- Easy to implement.
- No need to reindex into a new index.
- New documents will have both keyword_value and keyword_value.original.
- For existing documents, use an _update_by_query API call.
- Better search speed compared with the other solutions.
Here is how to do it:
PUT test_index_lowercase
{
  "mappings": {
    "properties": {
      "keyword_value": {
        "type": "keyword",
        "normalizer": "lowercase_normalizer"
      }
    }
  },
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase", "asciifolding"]
        }
      }
    }
  }
}
PUT test_index_lowercase/_doc/1
{
  "keyword_value": "MuSaB"
}
PUT test_index_lowercase/_doc/2
{
  "keyword_value": "musab"
}
GET test_index_lowercase/_search
{
  "size": 0,
  "aggs": {
    "NAME": {
      "terms": {
        "field": "keyword_value"
      }
    }
  }
}
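For the two documents above, this aggregation should return a single bucket whose key is the normalized term. In short, a terms aggregation reads the indexed (normalized) value of the keyword field, while a search hit returns _source, the JSON that was originally sent. The Python sketch below is only a toy model of that split, not Elasticsearch code: `normalize`, `index_doc`, and `terms_agg` are hypothetical helpers, and the normalizer is simplified to lowercasing (the asciifolding step is left out).

```python
from collections import Counter


def normalize(value: str) -> str:
    # Simplified stand-in for lowercase_normalizer (asciifolding omitted).
    return value.lower()


def index_doc(source: dict) -> dict:
    # Keep the untouched _source next to the normalized term used by aggregations.
    return {"_source": source, "term": normalize(source["keyword_value"])}


def terms_agg(docs: list) -> dict:
    # Bucket keys come from the normalized term, never from _source.
    return dict(Counter(doc["term"] for doc in docs))


docs = [
    index_doc({"keyword_value": "MuSaB"}),
    index_doc({"keyword_value": "musab"}),
]
buckets = terms_agg(docs)
hits = [doc["_source"]["keyword_value"] for doc in docs]

print(buckets)  # {'musab': 2}
print(hits)     # ['MuSaB', 'musab']
```

This is why the normalizer cannot simply be ignored at query time: the original spelling is not part of the indexed term for that field, so it has to come from _source or from a separate field without the normalizer.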
PUT test_index_lowercase/_mapping
{
  "properties": {
    "keyword_value": {
      "type": "keyword",
      "normalizer": "lowercase_normalizer",
      "fields": {
        "original": {
          "type": "keyword"
        }
      }
    }
  }
}
POST test_index_lowercase/_update_by_query?conflicts=proceed
GET test_index_lowercase/_search
{
  "size": 0,
  "aggs": {
    "1": {
      "terms": {
        "field": "keyword_value"
      }
    },
    "2": {
      "terms": {
        "field": "keyword_value.original"
      }
    }
  }
}
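With the two sample documents, aggregation "1" should still return one bucket (musab), while aggregation "2" groups by exact spelling and so should return MuSaB and musab as separate buckets. If the case-insensitive grouping has to be kept while still showing the original spellings, one option is to nest a terms aggregation on keyword_value.original inside the one on keyword_value. The sketch below builds such a request body in Python; the helper `build_agg_body` and the aggregation names `by_value` and `original_spellings` are made up for illustration, and the body has not been run against your cluster.

```python
import json


def build_agg_body(field: str = "keyword_value", size: int = 10) -> dict:
    # Outer buckets group case-insensitively via the normalized field;
    # inner buckets list the original spellings from the un-normalized sub-field.
    return {
        "size": 0,
        "aggs": {
            "by_value": {
                "terms": {"field": field, "size": size},
                "aggs": {
                    "original_spellings": {
                        "terms": {"field": f"{field}.original", "size": size}
                    }
                },
            }
        },
    }


body = build_agg_body()
print(json.dumps(body, indent=2))
```

The printed JSON can be sent as the body of GET test_index_lowercase/_search once the _update_by_query call has populated keyword_value.original for the existing documents.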