javascript - How can the Lodash orderBy function be made to work with accented characters? - Stack Overflow

Is there a way to make Lodash's orderBy function support accented characters?

Like á, é, ñ, etc. These are moved to the end of the array when the sort is performed.

Asked Jun 24, 2017 at 20:17 by Omar Cardona; edited Jun 24, 2017 at 23:18 by Peter Mortensen
2 Answers

The Problem

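It sounds like it doesn't use localeCompare, defaulting instead to the equivalent of using < or >, which compares by UTF-16 code unit numeric values, not locale-aware collation (ordering). Accented letters such as á, é, and ñ have higher code unit values than the unaccented letters a to z, which is why they end up at the end of the array.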
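To make the difference concrete, here is a minimal sketch that contrasts code-unit ordering with locale-aware ordering. It uses the native default sort (which, like < and >, compares UTF-16 code units) rather than Lodash itself, and the sample letters and the explicit "es" locale are assumptions chosen only for illustration:

```javascript
// Hypothetical sample data: single letters, so the result does not
// depend on locale-specific rules for whole words.
const letters = ["z", "ñ", "n", "á", "a"];

// No comparator: strings are ordered by UTF-16 code unit values,
// so á (U+00E1) and ñ (U+00F1) land after z (U+007A).
const byCodeUnit = letters.slice().sort();

// Locale-aware comparison, with the locale passed explicitly.
const byLocale = letters.slice().sort((a, b) => a.localeCompare(b, "es"));

console.log(byCodeUnit); // ["a", "n", "z", "á", "ñ"]
console.log(byLocale);   // ["a", "á", "n", "ñ", "z"]
```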

Controlling Comparison Method

You can convert to array (if needed) and then use the native sort with localeCompare. For instance, instead of:

const result = _.orderBy(theArray, ["value"]);

you can do:

const result = theArray.slice().sort((a, b) => a.value.localeCompare(b.value));

or to sort in-place:

theArray.sort((a, b) => a.value.localeCompare(b.value));

localeCompare uses the default collation order for the default locale. Using Intl.Collator, you can have more control over the collation (like case-insensitivity, the handling of accents, the relative position of upper- and lower-case characters, etc.). For instance, if you wanted the default collation for the default locale but with upper-case characters first:

const collator = new Intl.Collator(undefined, {caseFirst: "upper"});
const result = theArray.slice().sort((a, b) => collator.compare(a.value, b.value));

Live Example:

const theArray = [
    {value: "n"},
    {value: "N"},
    {value: "ñ"},
    {value: "á"},
    {value: "A"},
    {value: "a"},
];
const lodashResult = _.orderBy(theArray, ["value"]);
const localeCompareResult = theArray.slice().sort((a, b) => a.value.localeCompare(b.value));
const collator = new Intl.Collator(undefined, {caseFirst: "upper"});
const collatorResult = theArray.slice().sort((a, b) => collator.compare(a.value, b.value));
show("unsorted:", theArray);
show("lodashResult:", lodashResult);
show("localeCompareResult:", localeCompareResult);
show("collatorResult:", collatorResult);

function show(label, array) {
    console.log(label, "[");
    for (const element of array) {
        console.log(`    ${JSON.stringify(element)}`);
    }
    console.log("]");
}
.as-console-wrapper {
    max-height: 100% !important;
}
<script src="https://cdnjs.cloudflare.com/ajax/libs/lodash.js/4.17.21/lodash.min.js"></script>
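The accent and case handling mentioned above is controlled by the sensitivity option of Intl.Collator. The following is a small sketch of two of its settings; the explicit "en" locale and the variable names are assumptions made for the example, not part of the answer:

```javascript
// "base": letters that differ only in accent or case compare as equal.
const baseCollator = new Intl.Collator("en", {sensitivity: "base"});

// "accent": accents are significant, case is not.
const accentCollator = new Intl.Collator("en", {sensitivity: "accent"});

const baseAccentResult = baseCollator.compare("a", "á");     // 0
const baseCaseResult = baseCollator.compare("a", "A");       // 0
const accentAccentResult = accentCollator.compare("a", "á"); // negative
const accentCaseResult = accentCollator.compare("a", "A");   // 0

console.log(baseAccentResult, baseCaseResult, accentAccentResult, accentCaseResult);
```

With "base", accented and unaccented variants count as ties, so their relative order in a sorted array is simply their original order.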

Stable vs Unstable Sort

When I first wrote this answer, there was a slight difference between _.orderBy and the native sort: _.orderBy, like _.sortBy, always does a stable sort, whereas at the time of the original answer JavaScript's native sort was not guaranteed to be stable. Since then, though, the JavaScript specification has been modified to require a stable sort (ES2019). So both _.orderBy/_.sortBy and native sort are stable now.

If "stable" vs. "unstable" sort aren't familiar terms: A "stable" sort is one where two elements that are considered equivalent for sorting purposes are guaranteed to remain in the same position relative to each other; in an "unstable" sort, their positions relative to each other might be swapped (which is allowed because they're "equivalent" for sorting purposes). Consider this array:

const theArray = [
    {value: "x", id: 27},
    {value: "z", id: 14},
    {value: "x", id: 12},
];

If you do an unstable sort that sorts ascending on just value (disregarding id or any other properties the objects might have), there are two valid results:

// Valid result 1: id = 27 remained in front of id = 12
[
    {value: "x", id: 27},
    {value: "x", id: 12},
    {value: "z", id: 14},
]
// Valid result 2: id = 27 was moved after id = 12
[
    {value: "x", id: 12},
    {value: "x", id: 27},
    {value: "z", id: 14},
]

With a stable sort, though, only the first result is valid; the positions of equivalent elements relative to each other remain unchanged.

But again, that distinction no longer matters, since JavaScript's sort is stable now too.
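As a quick check, this sketch sorts the array from above on value only, using the native sort. It assumes an engine that implements the ES2019 stability requirement; the explicit "en" locale is likewise an assumption for the example:

```javascript
const stableInput = [
    {value: "x", id: 27},
    {value: "z", id: 14},
    {value: "x", id: 12},
];

// Sort on value only; the two "x" elements compare as equal.
const stableResult = stableInput
    .slice()
    .sort((a, b) => a.value.localeCompare(b.value, "en"));

// With a stable sort, id 27 stays ahead of id 12.
const stableIds = stableResult.map((element) => element.id);
console.log(stableIds); // [27, 12, 14]
```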

I've solved it by comparing a sanitized version of each element.

theArray.sort(function(a, b) {
    return a.toLowerCase().removeAccents().localeCompare(b.toLowerCase().removeAccents());
});

The removeAccents function:

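// Note: this extends String.prototype, which affects every string in the page.
String.prototype.removeAccents = function () {
    return this
        .replace(/[áàãâä]/gi, "a")
        .replace(/[éèëê]/gi, "e")
        .replace(/[íìïî]/gi, "i")
        .replace(/[óòöôõ]/gi, "o")
        .replace(/[úùüû]/gi, "u")
        .replace(/[ç]/gi, "c")
        .replace(/[ñ]/gi, "n")
        // Anything that is still not a plain letter or digit becomes a space.
        .replace(/[^a-zA-Z0-9]/g, " ");
};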
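A variation on the same idea, sketched here as an alternative rather than as what this answer does, avoids both the hand-written character lists and the change to String.prototype: decompose each string with String.prototype.normalize("NFD") and strip the combining marks. The helper name stripAccents and the sample names are hypothetical:

```javascript
// Hypothetical helper: splits accented letters into base letter plus
// combining mark (NFD), then removes marks in the U+0300 to U+036F block.
function stripAccents(text) {
    return text.normalize("NFD").replace(/[\u0300-\u036f]/g, "");
}

// Hypothetical sample data.
const names = ["Óscar", "ana", "Élodie", "Zoë", "Ángel", "bob"];

// Compare the sanitized, lower-cased form of each name.
const sortKey = (name) => stripAccents(name).toLowerCase();
const sortedNames = names
    .slice()
    .sort((a, b) => sortKey(a).localeCompare(sortKey(b), "en"));

console.log(sortedNames);
// ["ana", "Ángel", "bob", "Élodie", "Óscar", "Zoë"]
```

Because sortKey is an ordinary function, it could also be passed to Lodash as an iteratee, for example _.orderBy(names, [sortKey]), which keeps the original call style from the question. Note that this treats ñ as n, which loses the Spanish convention of sorting ñ after n.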