I have strings that contain thousand separators, but no string-to-number function parses them correctly (I'm using JavaScript). I'm thinking about "preparing" the string by stripping all thousand separators, leaving everything else untouched, and letting the Number/parseInt/parseFloat functions (I'm satisfied with their behaviour otherwise) decide the rest. But I have no idea which RegExp can do that!
Better ideas are welcome too!
UPDATE:
Sorry, the answers showed me how badly formulated the question is. What I'm trying to achieve is: 1) to strip thousand separators only, if there are any, but 2) to leave the original string otherwise undisturbed, so that I still get NaN for invalid numerals.
MORE UPDATE:
JavaScript is limited to the English locale for parsing, so let's assume the thousand separator is ',' for simplicity (naturally, it never matches the decimal separator in any locale, so changing to any other locale should not pose a problem).
Now, on parsing functions:
parseFloat('1023.95BARGAIN BYTES!') // the parseXXX functions just "give up" on invalid chars and return 1023.95
Number('1023.95BARGAIN BYTES!') // while the Number constructor behaves "strictly" and returns NaN
Sometimes I use the loose one, sometimes the strict one. I want to figure out the best approach for preparing the string for both functions.
On validity of numerals:
'1,023.99' is a perfectly well-formed English number, and stripping all commas will lead to the correct result.
'1,0,2,3.99' is broken; however, generic comma stripping will give '1023.99', which is unlikely to be a correct result.
7 Answers
Welp, I'll venture to throw my suggestion into the pot:
Note: Revised
stringWithNumbers = stringWithNumbers.replace(/(\d+),(?=\d{3}(\D|$))/g, "$1");
should turn
1,234,567.12
1,023.99
1,0,2,3.99
the dang thing costs $1,205!!
95,5,0,432
12345,0000
1,2345
into:
1234567.12
1023.99
1,0,2,3.99
the dang thing costs $1205!!
95,5,0432
12345,0000
1,2345
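As a rough sketch of how this replacement could feed the two parsers from the question: the helper name stripThousands is made up for illustration, and ',' is assumed to be the thousand separator.

```javascript
// Hypothetical helper: remove a comma only when it is followed by exactly
// three digits and then either a non-digit or the end of the string.
function stripThousands(text) {
  return text.replace(/(\d+),(?=\d{3}(\D|$))/g, "$1");
}

// Strict parsing: well-formed input converts, broken input stays NaN.
console.log(Number(stripThousands("1,234,567.12")));  // 1234567.12
console.log(Number(stripThousands("1,0,2,3.99")));    // NaN (commas were left in place)

// Loose parsing: trailing text is still tolerated by parseFloat.
console.log(parseFloat(stripThousands("1,023.95BARGAIN BYTES!")));  // 1023.95
```

Note that parseFloat on the untouched '1,0,2,3.99' still returns 1, because it stops at the first comma; only the strict Number path reports NaN for that input.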
I hope that's useful!
EDIT:
There is an additional alteration that may be necessary, but is not without side effects:
(\b\d{1,3}),(?=\d{3}(\D|$))
This changes the "one or more" quantifier (+) for the first set of digits into a "one to three" quantifier ({1,3}) and adds a "word-boundary" assertion before it. It will prevent replacements like 1234,123 ==> 1234123. However, it will also prevent a replacement that might be desired (if the number is preceded by a letter or underscore), such as A123,789 or _1,555 (which will remain unchanged).
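To see the trade-off side by side, here is a small sketch that runs both patterns over the same inputs. The names loose, bounded and strip are only illustrative.

```javascript
// The two patterns discussed above.
var loose = /(\d+),(?=\d{3}(\D|$))/g;          // any run of digits before the comma
var bounded = /(\b\d{1,3}),(?=\d{3}(\D|$))/g;  // one to three digits, starting at a word boundary

// Hypothetical helper: apply one of the patterns and keep the captured digits.
function strip(text, pattern) {
  return text.replace(pattern, "$1");
}

console.log(strip("1234,123", loose));     // "1234123"
console.log(strip("1234,123", bounded));   // "1234,123" (left alone)
console.log(strip("A123,789", loose));     // "A123789"
console.log(strip("A123,789", bounded));   // "A123,789" (no word boundary between A and 1)
console.log(strip("1,234,567", bounded));  // "1234567"
```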
A simple num.replace(/,/g, '') should be sufficient, I think.
It depends on what your thousand separator is.
myString = myString.replace(/[ ,]/g, "");
would remove spaces and commas.
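This should work for you:
var decimalCharacter = ".",
    // character class matching digits and the decimal character
    regex = new RegExp("[\\d" + decimalCharacter + "]+", "g"),
    num = "10,0000,000,000.999";
// keep only the digit/decimal runs, join them, and coerce the result to a number
// (match() returns null when the string contains no digits at all)
+num.match(regex).join("");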
To confirm that a numeral-string is well-formed, use:
/^(\d*|\d{1,3}(,\d{3})+)($|[^\d,])/.test(numeral_string)
which will return true if the numeral-string is either (1) just a sequence of zero or more digits, or (2) a sequence of digits with a comma before each set of three digits, or (3) either of the above followed by a character that is neither a digit nor a comma, and who knows what else. (Case #3 is for floats, as well as your "BARGAIN BYTES!" examples.)
Once you've confirmed that, use:
numeral_string.replace(/,/g, '')
which will return a copy of the numeral-string with all commas excised.
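Put together, the check-then-strip idea might look like the sketch below. The function name parseGrouped and the parse parameter are assumptions made for illustration. The check pattern excludes a comma from the trailing character class, so that a string such as '1,0,2,3.99' is rejected instead of slipping through the digits-only branch.

```javascript
// Hypothetical helper: returns a number for well-formed input, NaN otherwise.
// `parse` is whichever parser the caller prefers (Number or parseFloat).
function parseGrouped(text, parse) {
  // Either plain digits, or 1-3 digits followed by comma-separated groups of
  // three; the next character must be neither a digit nor a comma.
  var wellFormed = /^(\d*|\d{1,3}(,\d{3})+)($|[^\d,])/;
  if (!wellFormed.test(text)) {
    return NaN;
  }
  return parse(text.replace(/,/g, ""));
}

console.log(parseGrouped("1,023.99", Number));                    // 1023.99
console.log(parseGrouped("1,0,2,3.99", Number));                  // NaN
console.log(parseGrouped("1,023.95BARGAIN BYTES!", parseFloat));  // 1023.95
console.log(parseGrouped("1,023.95BARGAIN BYTES!", Number));      // NaN
```

Input that starts with a sign or any other non-digit passes the check trivially, so this sketch only covers unsigned numerals.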
You can use s.replaceAll("(\\W)(?=\\d{3})","");
This regex matches every non-word character that is followed by three digits.
Strings like 4.444.444.444,00 € will become 4444444444,00 €
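The call above is written in the Java style, where replaceAll takes the pattern as a string. A rough JavaScript equivalent, as a sketch (the function name stripBeforeThreeDigits is made up here), would use a regex literal with the g flag:

```javascript
// Hypothetical helper: drop any non-word character that precedes three digits.
function stripBeforeThreeDigits(s) {
  return s.replace(/\W(?=\d{3})/g, "");
}

console.log(stripBeforeThreeDigits("4.444.444.444,00 €"));  // "4444444444,00 €"
console.log(stripBeforeThreeDigits("-123"));                // "123" (the sign is lost)
```

Because \W matches any non-word character, this also removes a minus sign or currency symbol that sits directly in front of three digits, which may or may not be acceptable for your data.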
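I have used the following in a commercial setting, and it has worked often:
numberStr = numberStr.replace(/[. ,](?=\d\d\d(\D|$))/g,'');
In the above example, thousands can be marked with a period, a comma, or a space. The three digits are checked with a lookahead rather than consumed, so that adjacent groups such as 1,234,567 are all handled.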
In some cases (like a price of 1000,5 euros) the above doesn't work. If you need something more robust, this should work 100% of the time:
//convert a comma or space used as the cent placeholder to a decimal point
$priceStr = $priceStr.replace(/[, ](\d\d$)/,'.$1');
$priceStr = $priceStr.replace(/[, ](\d$)/,'.$1');
//capture cents
var $hasCentsRegex = /[.]\d\d?$/;
if($hasCentsRegex.test($priceStr)) {
var $matchArray = $priceStr.match(/(.*)([.]\d\d?$)/);
var $priceBeforeCents = $matchArray[1];
var $cents = $matchArray[2];
} else{
var $priceBeforeCents = $priceStr;
var $cents = "";
}
//remove decimals, commas and whitespace from the pre-cent portion
$priceBeforeCents = $priceBeforeCents.replace(/[.\s,]/g,'');
//re-create the price by adding back the cents
$priceStr = $priceBeforeCents + $cents;
12 345 678 ==> 12345678
or he said, 'I just won $25 000 000!!!' ==> he said, 'I just won $25000000!!!'
or there are 5,280 feet in a mile, right? ==> there are 5280 feet in a mile, right?
or whatever you can think of – Code Jockey, commented Nov 18, 2011 at 21:02