I am encountering a strange issue while comparing two strings. Here is my code:
console.log(x == y);
console.log("'" + x + "'=='" + y + "'");
console.log(typeof(x));
console.log(typeof(y));
In the console, I get:
false
'1Ä4±'=='1Ä4±'
string
string
I guess my strings contain strange characters, so how should I compare them?
I read "Javascript string comparison fails when comparing unicode characters", but in my case, x and y come from the same source and have the same encoding.
- Why do you reference y.replace instead of y? – ASGM Commented May 28, 2013 at 19:27
- Oops, sorry. Actually it's a copy/paste issue. At the beginning I was comparing x.replace(/\n/g,'') and y.replace(/\n/g,'') for my tests. I corrected it in the post. – little-dude Commented May 28, 2013 at 19:29
- Please demonstrate the issue in working code. – akonsu Commented May 28, 2013 at 19:30
- What do x.length and y.length say? – user1693593 Commented May 28, 2013 at 19:31
- In the Chrome developer tools console, '1Ä4±'=='1Ä4±' gives true and '1Ä4±'==='1Ä4±' gives true, so the values are identical. – Ted Commented May 28, 2013 at 19:32
4 Answers
The Ä in your strings can be represented either as a single Unicode character (Latin Capital Letter A With Diaeresis, U+00C4), or as a composite sequence consisting of Latin Capital Letter A (U+0041) followed by a Combining Diaeresis (U+0308) diacritic.
There also might be any number of Zero-Width Spaces (U+200B), as well as other "invisible" characters in your strings.
Therefore, both strings may render the same, but actually be different.
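One way to compare strings like these is to normalize both sides before comparing. The following is only a sketch: it assumes an ES2015+ environment where String.prototype.normalize is available, the helper name sameText is made up for illustration, and stripping zero-width characters is an assumption about what you want to ignore.

```javascript
// Hypothetical helper: compares two strings after Unicode normalization.
// Assumes String.prototype.normalize (ES2015+) is available.
function sameText(a, b) {
  // Zero-width space, non-joiner, joiner and BOM are assumed to be
  // unwanted here; adjust the set to your own data.
  const invisible = /[\u200B\u200C\u200D\uFEFF]/g;
  const clean = (s) => s.normalize("NFC").replace(invisible, "");
  return clean(a) === clean(b);
}

const precomposed = "1\u00C44\u00B1"; // Ä as a single code point
const decomposed = "1A\u03084\u00B1"; // A followed by a combining diaeresis

console.log(precomposed == decomposed);         // false
console.log(sameText(precomposed, decomposed)); // true
```

Whether invisible characters should be ignored depends on where the strings come from, so treat that part as optional.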
Try escaping your two strings to see which characters are in them. In this case (although Frédéric has covered the possible cases), since you're using PGP, you probably have a binary non-printable character present. Run
escape(x);
escape(y);
in your console and you will be able to spot the offending character.
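Since escape is a legacy function, a similar inspection can be done by listing the code points directly. This is a minimal sketch, assuming an ES2017+ environment (for padStart); the helper name codePoints is hypothetical.

```javascript
// Hypothetical helper: lists every code point of a string as U+XXXX.
function codePoints(s) {
  // Array.from iterates a string by code point, not by UTF-16 unit.
  return Array.from(s, (ch) =>
    "U+" + ch.codePointAt(0).toString(16).toUpperCase().padStart(4, "0")
  );
}

console.log(codePoints("1\u00C44\u00B1").join(" "));
// U+0031 U+00C4 U+0034 U+00B1
console.log(codePoints("1A\u03084\u00B1").join(" "));
// U+0031 U+0041 U+0308 U+0034 U+00B1
```

Calling codePoints(x) and codePoints(y) side by side should make any extra or differing character visible, including ones that do not render.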
By the way, try this code in JS (copy-paste):
console.log("A" == "А");
This prints "false".
Comparing strings means comparing character codes. In some fonts, different character codes have the same "picture", like "l" and "I" (the first is a lowercase L, the second is an uppercase i). In my example above, one A is Cyrillic and the other is Latin.
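The same look-alike case can be written with explicit escapes so the two letters cannot be confused in the source. This is a small sketch; the variable names are illustrative only.

```javascript
const latinA = "\u0041";    // LATIN CAPITAL LETTER A
const cyrillicA = "\u0410"; // CYRILLIC CAPITAL LETTER A

console.log(latinA === cyrillicA);     // false
console.log(latinA.codePointAt(0));    // 65
console.log(cyrillicA.codePointAt(0)); // 1040

// Normalization does not help here: these are distinct letters,
// not two encodings of the same one.
console.log(latinA.normalize("NFKC") === cyrillicA.normalize("NFKC")); // false
```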
If you are trying to do this in C#, it might have something to do with normalization (FormC vs FormD vs FormKC vs FormKD). Reference: http://sharepoint.asia/two-exactly-same-strings-fail-while-parison-in-c-net/