
mapping - How to return actual value (not lowercase) when performing aggregation in elasticsearch? - Stack Overflow


Hello, we ran into a problem with a lowercase normalizer in an aggregation query. Our initial mapping looks like this:

"mappings": {
  "properties": {
    "keyword_value": {
      "type": "keyword",
      "normalizer": "lowercase_normalizer"
    }
  }
}

An aggregation query returns the aggregation results (sum, count, etc.) together with keyword_value as the bucket key, but the key is in lower case.

The issue is that we want to retrieve keyword_value in its original case.

If we run a basic search, we do get the keyword_value field in its original case.

We have a couple of approaches in mind. One is to make an additional query to retrieve the original values (this could affect performance). Another is to update the mapping with a new field without the normalizer and populate that field with an additional query (not suitable for us, since we don't want to reindex the data).

Could you please suggest the best approach to retrieve keyword_value in its original case? Can we somehow bypass the lowercase normalizer at query time? And why does an aggregation return the key in lower case while a basic query returns the original value?

Asked Apr 2 at 5:48 by Ilya Basalyha (edited Apr 2 at 6:54)
  • How exactly is this Java related? Given that you have "normalizer": "lowercase_normalizer", are you really surprised that you get lowercase values? It may be worth looking at other Elasticsearch questions on the same topic, such as stackoverflow/questions/51664234/… – Stultuske Commented Apr 2 at 6:38
  • Removed the java tag, thanks for pointing that out. Yes, I have little experience with Elasticsearch and am currently working with existing code. Thanks for sharing the link, but as stated in the question we already have a similar approach in mind and want to avoid updating the existing index mapping – Ilya Basalyha Commented Apr 2 at 7:01

1 Answer


"update mapping with a new field without normalizer"

This is the most efficient option for your use case, for the following reasons:

  1. It is easy to implement.

  2. You don't need to reindex into a new index:

    1. New documents will have both keyword_value and the keyword_value.original sub-field.

    2. For existing documents, run an _update_by_query API call.

  3. It gives better search speed than the other solutions.

Here is how to do it:

# Create a test index whose keyword field uses the lowercase normalizer
PUT test_index_lowercase
{
  "mappings": {
    "properties": {
      "keyword_value": {
        "type": "keyword",
        "normalizer": "lowercase_normalizer"
      }
    }
  },
  "settings": {
    "analysis": {
      "normalizer": {
        "lowercase_normalizer": {
          "type": "custom",
          "filter": ["lowercase","asciifolding"]
        }
      }
    }
  }
}

# Index two documents whose values differ only in case
PUT test_index_lowercase/_doc/1
{
  "keyword_value": "MuSaB"
}

PUT test_index_lowercase/_doc/2
{
  "keyword_value": "musab"
}

# Aggregate on the normalized field: the bucket keys come back in lower case
GET test_index_lowercase/_search
{
  "size": 0,
  "aggs": {
    "NAME": {
      "terms": {
        "field": "keyword_value"
      }
    }
  }
}
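
This first aggregation also shows why the two kinds of query differ. A terms aggregation reads the indexed terms of the field, and for a keyword field with a normalizer those terms are stored already normalized. Search hits, by contrast, are returned from _source, which keeps the JSON exactly as it was sent. For the two test documents above, and assuming the index has been refreshed, the aggregation part of the response should look roughly like this (abridged, other response fields omitted):

```json
{
  "aggregations": {
    "NAME": {
      "buckets": [
        { "key": "musab", "doc_count": 2 }
      ]
    }
  }
}
```

Both "MuSaB" and "musab" end up in the same lowercase bucket, which is the behaviour described in the question.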

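# Add a sub-field "original" without a normalizer (multi-fields can be added to an existing mapping)
PUT test_index_lowercase/_mapping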
{
  "properties": {
    "keyword_value": {
      "type": "keyword",
      "normalizer": "lowercase_normalizer",
      "fields": {
        "original": {
          "type": "keyword"
        }
      }
    }
  }
}

# Re-process the existing documents in place so the new sub-field gets populated
POST test_index_lowercase/_update_by_query?conflicts=proceed
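
If the index is large or the call has to be re-run, a variant worth considering is to limit the update to documents that have keyword_value but do not yet have the sub-field, and to refresh afterwards so that the following search sees the result. This is a sketch and not part of the original answer; the exists-based query is an assumption about how you would want to select the remaining documents.

```json
POST test_index_lowercase/_update_by_query?conflicts=proceed&refresh=true
{
  "query": {
    "bool": {
      "filter": {
        "exists": { "field": "keyword_value" }
      },
      "must_not": {
        "exists": { "field": "keyword_value.original" }
      }
    }
  }
}
```

Documents processed by an earlier run no longer match the query, so repeating the call only touches what is left.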

# Aggregate on both fields: "1" returns lowercase keys, "2" returns the original case
GET test_index_lowercase/_search
{
  "size": 0,
  "aggs": {
    "1": {
      "terms": {
        "field": "keyword_value"
      }
    },
    "2": {
      "terms": {
        "field": "keyword_value.original"
      }
    }
  }
}
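
If changing the mapping really is off the table, one alternative is to keep aggregating on the normalized field and attach a top_hits sub-aggregation that pulls one original value per bucket from _source. This is a sketch and not part of the answer above; the aggregation names by_value and original_sample are made up for illustration. Its limitation is that values differing only in case (MuSaB and musab here) share one bucket, so only one spelling is returned, and fetching _source for every bucket is usually slower than aggregating on a dedicated field.

```json
GET test_index_lowercase/_search
{
  "size": 0,
  "aggs": {
    "by_value": {
      "terms": {
        "field": "keyword_value"
      },
      "aggs": {
        "original_sample": {
          "top_hits": {
            "size": 1,
            "_source": {
              "includes": ["keyword_value"]
            }
          }
        }
      }
    }
  }
}
```

In the response, the original spelling is found inside each bucket under original_sample.hits.hits[0]._source.keyword_value.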
