I am on Elasticsearch 7.17.22. I have two indexes: "my_legacy_index" and "my_new_index".
I have a Spark Scala job that inserts the same DataFrame into both indexes.
Here are the libraries used:
"org.elasticsearch" % "elasticsearch-spark-30_2.12" % "7.16.3",
"org.elasticsearch.client" % "elasticsearch-rest-client" % "7.16.3",
"org.elasticsearch.client" % "elasticsearch-rest-high-level-client" % "7.16.3",
When I search with the "routing" parameter, the two indexes behave differently. On the first one, the "routing" parameter is taken into account. On the second one, it seems to be ignored.
Here is an example:
# 0 results
# Does not return any results (this is the correct behavior).
# The correct routing is 50_15
GET my_legacy_index/_search?routing=50_0
{
  "query": {
    "ids": {
      "values": ["50-15-15-20250123152311-xxx"]
    }
  }
}
# 1 result
# Returns results when it shouldn't.
# The correct routing is 50_15
GET my_new_index/_search?routing=50_0
{
  "query": {
    "ids": {
      "values": ["50-15-15-20250123152311-xxx"]
    }
  }
}
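One check that may help narrow this down: as far as I understand, the "routing" parameter on a search only selects which shard(s) are queried and does not filter documents by their stored routing value, so if 50_0 and 50_15 resolve to the same shard in my_new_index, the document would still be found. The Python sketch below asks the _search_shards API which shards a routing value maps to. The base URL, the helper names, and the lack of authentication are assumptions for illustration only.

```python
import json
import urllib.parse
import urllib.request

ES_URL = "http://localhost:9200"  # assumption: replace with the real cluster address


def search_shards_url(index, routing, base_url=ES_URL):
    """Build the _search_shards URL for one index and one routing value."""
    query = urllib.parse.urlencode({"routing": routing})
    return f"{base_url}/{urllib.parse.quote(index)}/_search_shards?{query}"


def shard_ids(response):
    """Collect the shard numbers listed in a _search_shards response body.

    The "shards" field is a list of groups, one group per shard, and each
    group lists the copies (primary and replicas) of that shard.
    """
    return {copy["shard"] for group in response["shards"] for copy in group}


def fetch_shard_ids(index, routing, base_url=ES_URL):
    """Call _search_shards (no authentication handled) and return shard numbers."""
    url = search_shards_url(index, routing, base_url)
    with urllib.request.urlopen(url) as resp:
        return shard_ids(json.load(resp))
```

Calling fetch_shard_ids("my_new_index", "50_0") and fetch_shard_ids("my_new_index", "50_15"), then the same for my_legacy_index, would show whether the two routing values land on the same shard in one index but not in the other.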
Additional context:
- Both indexes are created in the same way.
- Both indexes contain exactly the same data.
- The same job inserts the documents into both indexes.
- The same template, matching "my_*", is used for both indexes.
- Both indexes have "_routing": { "required": true } in their mappings.
- The same ingest pipeline is used when inserting into both indexes.
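To double-check the "same settings" assumption, the output of GET my_legacy_index/_settings and GET my_new_index/_settings can be diffed mechanically instead of by eye. The sketch below is a minimal Python example that flattens two settings objects and reports the keys that differ. The helper names and the list of ignored keys are my own choices, and the sample values in the comments are made up. Settings such as index.number_of_shards, index.number_of_routing_shards and index.routing_partition_size seem the most relevant to routing, so a difference there would be worth a closer look.

```python
# Keys that always differ between two indexes and are not relevant here
# (assumption: adjust this set as needed).
IGNORED = {
    "index.uuid",
    "index.creation_date",
    "index.provided_name",
    "index.version.created",
}


def flatten(node, prefix=""):
    """Turn nested settings into a flat {"index.number_of_shards": "5"} dict."""
    flat = {}
    for key, value in node.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_settings(left, right, ignored=IGNORED):
    """Return {key: (left_value, right_value)} for every setting that differs."""
    a, b = flatten(left), flatten(right)
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if key not in ignored and a.get(key) != b.get(key)
    }


# Example with made-up values (not taken from my cluster):
# legacy = {"index": {"number_of_shards": "5", "uuid": "aaa"}}
# new = {"index": {"number_of_shards": "1", "uuid": "bbb"}}
# diff_settings(legacy, new) reports only index.number_of_shards.
```

Each argument is the "settings" object found under the index name in the _settings response. An empty result would suggest the difference lies elsewhere, for example in the mappings or in how the documents were indexed.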
Question: How can I debug this problem? I compared the two indexes and they appear to be identical.