I'm trying to use the ICU RuleBasedCollator in Python. In my code I specify a rule whereby "ä" should sort before "a" as a secondary (accent) difference:
from icu import RuleBasedCollator
l=["a","ä"]
rbc = RuleBasedCollator('\n&ä<<a')
sorted(l, key=rbc.getSortKey)
However, the output of sorted is:
['a', 'ä']
I expected ['ä', 'a']. What did I do wrong?
Many thanks
asked 2 days ago by korppu73

1 Answer
It appears that the difference between a and ä is considered primary. Using [before 1], you can achieve the expected result:
from icu import RuleBasedCollator
l=["a","ä"]
rbc = RuleBasedCollator('&[before 1]a < ä')
print(sorted(l, key=rbc.getSortKey))
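To illustrate what a primary versus a secondary difference means for ordering, here is a small toy model in plain Python. It is not the ICU algorithm and does not use PyICU: the weight tables and the make_sort_key helper are hypothetical, chosen only to show that primary weights are compared across the whole string before any secondary weight is looked at.

```python
# Toy model of multi-level collation; NOT the ICU algorithm or its real weights.
# Each character gets a hypothetical (primary, secondary) weight pair.
DEFAULT_WEIGHTS = {"a": (1, 0), "ä": (1, 1), "b": (2, 0), "z": (3, 0)}

# Hypothetical tailoring: give "ä" its own primary weight below that of "a".
PRIMARY_BEFORE_WEIGHTS = {**DEFAULT_WEIGHTS, "ä": (0, 0)}


def make_sort_key(weights):
    """Return a key function that compares all primaries first, then all secondaries."""
    def sort_key(text):
        primaries = tuple(weights[ch][0] for ch in text)
        secondaries = tuple(weights[ch][1] for ch in text)
        return (primaries, secondaries)
    return sort_key


words = ["az", "äb", "a", "ä"]

# "ä" differs from "a" only at the secondary level.
secondary_order = sorted(words, key=make_sort_key(DEFAULT_WEIGHTS))
# "ä" differs from "a" at the primary level and comes first.
primary_order = sorted(words, key=make_sort_key(PRIMARY_BEFORE_WEIGHTS))

print(secondary_order)  # ['a', 'ä', 'äb', 'az']
print(primary_order)    # ['ä', 'äb', 'a', 'az']
```

In this model, a primary tailoring moves every string starting with "ä" ahead of every string starting with "a", whereas a secondary difference only breaks ties between strings whose base letters are otherwise equal. If that distinction matters for your data, it is worth checking longer strings with the real collator as well.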
To read more