javascript - Regular expression to strip thousand separator from numeral string? - Stack Overflow

I have strings which contain thousand separators, but no string-to-number function in JavaScript will consume them correctly. I'm thinking about "preparing" the string by stripping all thousand separators, leaving everything else untouched, and letting the Number/parseInt/parseFloat functions (I'm satisfied with their behaviour otherwise) decide the rest. But I have no idea which RegExp can do that!

Better ideas are welcome too!


UPDATE:

Sorry, the answers showed me how badly formulated the question is. What I'm trying to achieve is: 1) to strip thousand separators, if there are any, but 2) to not disturb the original string much, so that I still get NaN for invalid numerals.

MORE UPDATE:

JavaScript is limited to the English locale for parsing, so let's assume the thousand separator is ',' for simplicity (naturally, it never matches the decimal separator in any locale, so changing to any other locale should not pose a problem).

Now, on parsing functions:

parseFloat('1023.95BARGAIN BYTES!')  // parseXXX functions just "give up" at the first invalid character and return 1023.95
Number('1023.95BARGAIN BYTES!')      // while the Number constructor behaves "strictly" and returns NaN

Sometimes I use the loose one, sometimes the strict one. I want to figure out the best approach for preparing the string for both functions.

On validity of numerals:

'1,023.99' is a perfectly well-formed English number, and stripping all commas leads to the correct result. '1,0,2,3.99' is broken; however, generic comma stripping gives '1023.99', which is unlikely to be the correct result.

asked Nov 18, 2011 at 20:27 by OnTheFly; edited Nov 18, 2011 at 22:49
  • Is your separator a comma, or a dot? That is, is twelve thousand represented as '12,000', or as '12.000'? (Or something else?) – ruakh Commented Nov 18, 2011 at 20:32
  • @ruakh, actually a space, but it is not relevant, because JavaScript is not locale-capable at all. – OnTheFly Commented Nov 18, 2011 at 20:45
  • It is relevant, because your regex will have to know which separator to remove! – ruakh Commented Nov 18, 2011 at 20:52
  • please give an example of the string before and after, for instance: 12 345 678 ==> 12345678 or he said, 'I just won $25 000 000!!!' ==> he said, 'I just won $25000000!!!' or there are 5,280 feet in a mile, right? ==> there are 5280 feet in a mile, right? or whatever you can think of – Code Jockey Commented Nov 18, 2011 at 21:02
  • @Code Jockey, I've updated the question, hope it's better – OnTheFly Commented Nov 18, 2011 at 21:33

7 Answers

welp, I'll venture to throw my suggestion into the pot:

Note: Revised

stringWithNumbers = stringWithNumbers.replace(/(\d+),(?=\d{3}(\D|$))/g, "$1");

should turn

1,234,567.12
1,023.99
1,0,2,3.99
the dang thing costs $1,205!!
95,5,0,432
12345,0000
1,2345

into:

1234567.12
1023.99
1,0,2,3.99
the dang thing costs $1205!!
95,5,0432
12345,0000
1,2345

I hope that's useful!

EDIT:

There is an additional alteration that may be necessary, but is not without side effects:

(\b\d{1,3}),(?=\d{3}(\D|$))

This changes the "one or more" quantifier (+) for the first set of digits into a "one to three" quantifier ({1,3}) and adds a "word-boundary" assertion before it. It will prevent replacements like 1234,123 ==> 1234123. However, it will also prevent a replacement that might be desired (if it is preceded by a letter or underscore), such as A123,789 or _1,555 (which will remain unchanged).
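As a quick check of the revised pattern, the sketch below runs it over a few of the sample strings and hands the results to Number. The helper name stripThousands is hypothetical and not part of the answer.

```javascript
// Hypothetical helper wrapping the revised pattern from this answer.
function stripThousands(s) {
  // Drop a comma only when exactly three digits follow it and those
  // digits are followed by a non-digit or the end of the string.
  return s.replace(/(\d+),(?=\d{3}(\D|$))/g, "$1");
}

const samples = ["1,234,567.12", "1,023.99", "1,0,2,3.99", "1,2345"];
for (const s of samples) {
  const cleaned = stripThousands(s);
  console.log(s, "=>", cleaned, "=>", Number(cleaned));
}
```

Malformed inputs keep their commas, so Number returns NaN for them; parseFloat would still stop at the first comma and return 1 for '1,0,2,3.99'.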

A simple num.replace(/,/g, '') should be sufficient I think.

Depends on what your thousand separator is

myString = myString.replace(/[ ,]/g, "");

would remove spaces and commas.

This should work for you

var decimalCharacter = ".",
    regex = new RegExp("[\\d" + decimalCharacter + "]+", "g"),
    num = "10,0000,000,000.999";
+num.match(regex).join("");

To confirm that a numeral-string is well-formed, use:

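/^(\d*|\d{1,3}(,\d{3})+)($|[^\d,])/.test(numeral_string)

which will return true if the numeral-string is either (1) just a sequence of zero or more digits, or (2) one to three digits followed by one or more groups of a comma and exactly three digits, or (3) either of the above followed by a character that is neither a digit nor a comma, and who knows what else. (Case #3 is for floats, as well as your "BARGAIN BYTES!" examples.) The comma has to be excluded from that trailing class; with a plain [^\d], the comma in '1,0,2,3.99' would itself count as the "non-digit" and the broken numeral would pass.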

Once you've confirmed that, use:

numeral_string.replace(/,/g, '')

which will return a copy of the numeral-string with all commas excised.
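The two steps can be combined into one function. This is a minimal sketch, not the answerer's code: the names WELL_FORMED and toNumberStrict are hypothetical, and the trailing class is written as [^\d,] so that a stray comma cannot satisfy the "non-digit" case.

```javascript
// Hypothetical helper: validate the grouping first, then strip commas.
// [^\d,] keeps a stray comma from passing as the trailing "non-digit".
const WELL_FORMED = /^(\d*|\d{1,3}(,\d{3})+)($|[^\d,])/;

function toNumberStrict(numeralString) {
  if (!WELL_FORMED.test(numeralString)) {
    return NaN; // broken grouping such as '1,0,2,3.99'
  }
  // Grouping looked fine, so removing the commas is safe here.
  return Number(numeralString.replace(/,/g, ""));
}

console.log(toNumberStrict("1,023.99"));
console.log(toNumberStrict("1,0,2,3.99"));
console.log(toNumberStrict("1023.95BARGAIN BYTES!"));
```

Swapping Number for parseFloat on the last line of the function gives the loose behaviour for inputs such as '1023.95BARGAIN BYTES!', while badly grouped numerals are still rejected.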

You can use s.replaceAll("(\\W)(?=\\d{3})","");

This regex matches every non-alphanumeric character (\W) that is followed by three digits.

Strings like 4.444.444.444,00 € will be 4444444444,00 €
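The call above is written in the style of Java's String.replaceAll, which takes the pattern as a string. In JavaScript the same idea would normally be a regex literal with the g flag. A minimal sketch, with a hypothetical helper name:

```javascript
// Hypothetical helper: remove any non-word character that is
// directly followed by three digits.
function stripBeforeThreeDigits(s) {
  return s.replace(/\W(?=\d{3})/g, "");
}

console.log(stripBeforeThreeDigits("4.444.444.444,00 €")); // 4444444444,00 €
```

Because \W also matches a decimal point, a value such as '0.125' would come out as '0125', so this only suits inputs where three digits never follow the decimal separator.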

I have used the following in a commercial setting, and it has worked often:

numberStr = numberStr.replace(/[. ,](\d\d\d\D|\d\d\d$)/g,'$1');

In the above example, thousands can be marked with a period, a comma, or a space.
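One caveat with the pattern above: it consumes the non-digit that follows the three digits. In a string with several groups, such as '1,234,567', the second separator is swallowed by the first match and survives the replacement. A lookahead inspects the following characters without consuming them. A minimal sketch (both function names are hypothetical):

```javascript
// Pattern from the answer: the trailing \D becomes part of the match.
function stripConsuming(s) {
  return s.replace(/[. ,](\d\d\d\D|\d\d\d$)/g, "$1");
}

// Lookahead variant: the three digits and what follows are only inspected.
function stripLookahead(s) {
  return s.replace(/[. ,](?=\d{3}(\D|$))/g, "");
}

console.log(stripConsuming("1,234,567")); // 1234,567
console.log(stripLookahead("1,234,567")); // 1234567
```

Both variants still treat a period as a thousands separator, so '1.234' becomes '1234'.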

In some cases (like a price of 1000,5 Euros) the above doesn't work. If you need something more robust, the following handles a trailing cents part separately and should be more reliable:

// Convert a comma or space used as the cents separator to a period
$priceStr = $priceStr.replace(/[, ](\d\d$)/, '.$1');
$priceStr = $priceStr.replace(/[, ](\d$)/, '.$1');
// Capture the cents, if present
var $hasCentsRegex = /[.]\d\d?$/;
var $priceBeforeCents, $cents;
if ($hasCentsRegex.test($priceStr)) {
    var $matchArray = $priceStr.match(/(.*)([.]\d\d?$)/);
    $priceBeforeCents = $matchArray[1];
    $cents = $matchArray[2];
} else {
    $priceBeforeCents = $priceStr;
    $cents = "";
}
// Remove periods, commas and whitespace from the pre-cents portion
$priceBeforeCents = $priceBeforeCents.replace(/[.\s,]/g, '');
// Re-create the price by adding back the cents
$priceStr = $priceBeforeCents + $cents;
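The steps above can be folded into a single function. This is a sketch under the same assumptions as the answer: at most two digits of cents, and a period, comma or space as the grouping character. The name normalizePrice is hypothetical.

```javascript
// Hypothetical wrapper around the steps above; not from the answer itself.
function normalizePrice(priceStr) {
  // Treat a trailing comma or space followed by one or two digits as the cents part.
  priceStr = priceStr.replace(/[, ](\d\d?)$/, ".$1");
  // Split off the cents, if any, so the period in front of them is kept.
  const match = priceStr.match(/^(.*)([.]\d\d?)$/);
  const whole = match ? match[1] : priceStr;
  const cents = match ? match[2] : "";
  // Anything left in the whole part is a grouping separator.
  return whole.replace(/[.\s,]/g, "") + cents;
}

console.log(normalizePrice("1000,5"));       // 1000.5
console.log(normalizePrice("1.234.567,89")); // 1234567.89
console.log(normalizePrice("1 000 000"));    // 1000000
```

Note that one or two digits after the last separator are always read as cents, so an input like '1,23' comes out as '1.23'.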