The OCC option symbol consists of 4 parts:
- Root symbol of the underlying stock or ETF, padded with spaces to 6 characters
- Expiration date, 6 digits in the format yymmdd
- Option type, either P or C, for put or call
- Strike price, as the price x 1000, front padded with 0s to 8 digits
As an example, SPX   141122P00019500 (the root SPX followed by three padding spaces) means a put on SPX, expiring on 11/22/2014, with a strike price of $19.50.
Is it possible to use regex to parse this out automatically? I'm using JavaScript.
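To make the four-part layout concrete, here is a minimal sketch that builds a symbol from its parts, which is the reverse of the parsing being asked about. The function name `formatOccSymbol` and its parameters are hypothetical, and it assumes the expiration is passed as a UTC `Date`.

```javascript
// Hypothetical helper: build an OCC-style symbol from its four parts.
function formatOccSymbol(root, expiration, type, strike) {
  // Root symbol, right-padded with spaces to 6 characters.
  const paddedRoot = root.padEnd(6, ' ');
  // Expiration as yymmdd, read from a UTC Date (an assumption of this sketch).
  const yy = String(expiration.getUTCFullYear() % 100).padStart(2, '0');
  const mm = String(expiration.getUTCMonth() + 1).padStart(2, '0');
  const dd = String(expiration.getUTCDate()).padStart(2, '0');
  // Strike price x 1000, left-padded with zeros to 8 digits.
  const strikeField = String(Math.round(strike * 1000)).padStart(8, '0');
  return paddedRoot + yy + mm + dd + type + strikeField;
}

const built = formatOccSymbol('SPX', new Date(Date.UTC(2014, 10, 22)), 'P', 19.5);
console.log(JSON.stringify(built)); // "SPX   141122P00019500"
```

The result is always 21 characters long (6 + 6 + 1 + 8), which is what makes fixed-position parsing possible.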
asked Jun 7, 2017 at 2:09 by Shamoon
- What is expected result? – guest271314 Commented Jun 7, 2017 at 2:15
- To parse out the string into the different parts – Shamoon Commented Jun 7, 2017 at 2:15
3 Answers
Here is the regex (I highly recommend http://regexr.com):
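([\w ]{6})((\d{2})(\d{2})(\d{2}))([PC])(\d{8})
Group 1: root symbol (space-padded to 6 characters)
Group 2: full expiration date (yymmdd)
Group 3: year
Group 4: month
Group 5: day
Group 6: put/call
Group 7: strike price (x 1000)
Your JS would look something like this:
var myString = "SPX   141122P00019500"; // root padded with spaces to 6 characters
var myRegexp = /^([\w ]{6})((\d{2})(\d{2})(\d{2}))([PC])(\d{8})$/;
var match = myRegexp.exec(myString); // null if the string does not match
var type = match[6] === "P" ? "put" : "call";
var strike = (parseInt(match[7], 10) / 1000).toFixed(2); // stored as price x 1000
console.log("a " + type + " on " + match[1].trim() + ", expiring on " + match[4] + "/" + match[5] + "/20" + match[3] + " with a strike price of $" + strike);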
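Counting nested groups by hand is error-prone, so one option is to name them instead. The following is a sketch only: the helper name `parseOccSymbol`, the returned field names, and the assumption that every year is 20xx are illustrative choices, not part of the OCC format.

```javascript
// Named groups keep each field's meaning next to its pattern.
const OCC_PATTERN = /^(?<root>[\w ]{6})(?<yy>\d{2})(?<mm>\d{2})(?<dd>\d{2})(?<type>[PC])(?<strike>\d{8})$/;

// Hypothetical helper: returns the parsed fields, or null if the input does not match.
function parseOccSymbol(symbol) {
  const match = OCC_PATTERN.exec(symbol);
  if (match === null) return null;
  const { root, yy, mm, dd, type, strike } = match.groups;
  return {
    root: root.trim(),                    // drop the space padding
    expiration: `20${yy}-${mm}-${dd}`,    // assumes a 20xx year
    type: type === 'P' ? 'put' : 'call',
    strike: Number(strike) / 1000,        // the field stores price x 1000
  };
}

console.log(parseOccSymbol('SPX   141122P00019500'));
// { root: 'SPX', expiration: '2014-11-22', type: 'put', strike: 19.5 }
```

Returning `null` on a failed match lets the caller decide how to handle malformed input instead of hitting a `TypeError` on `match[1]`.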
I don't think you even need a regex, assuming the OCC option string has a fixed format. Instead, you can just use substring() to extract the various components.
var occ = 'SPX   141122P00019500'; // root padded with spaces to 6 characters
var symbol = occ.substring(0, 6).trim(); // the root can be up to 6 characters long
var year = parseInt(occ.substring(6, 8), 10) + 2000;
var month = occ.substring(8, 10);
var day = occ.substring(10, 12);
var date = month + '/' + day + '/' + year;
var type = occ.substring(12, 13) === 'P' ? 'put' : 'call';
var price = parseFloat(occ.substring(13, 21)) / 1000.0; // stored as price x 1000
var output = 'a ' + type + ' on ' + symbol + ', expiring on ' + date +
    ', with a strike price of $' + price.toFixed(2) + '.';
console.log(output);
I would expect using substring to build your output string would generally perform better than using a regex.
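Fixed offsets from the start only work when the root really is padded to 6 characters. Since the date, type, and strike fields together always occupy the last 15 characters, a variation is to index from the end, which also tolerates input whose padding has been collapsed. This is a sketch under that assumption; `parseOccFromEnd` is a hypothetical name, and the small regex is used only to validate the tail.

```javascript
// Hypothetical helper: tolerate collapsed padding by reading the fixed-width tail.
function parseOccFromEnd(occ) {
  const tail = occ.slice(-15);               // yymmdd + type + 8-digit strike
  if (!/^\d{6}[PC]\d{8}$/.test(tail)) {
    throw new Error('Not an OCC option symbol: ' + occ);
  }
  return {
    symbol: occ.slice(0, -15).trim(),        // whatever precedes the tail is the root
    year: 2000 + parseInt(tail.substring(0, 2), 10), // assumes a 20xx year
    month: tail.substring(2, 4),
    day: tail.substring(4, 6),
    type: tail.charAt(6) === 'P' ? 'put' : 'call',
    price: parseInt(tail.substring(7), 10) / 1000,   // stored as price x 1000
  };
}

// Both the padded and the collapsed form give the same result.
console.log(parseOccFromEnd('SPX   141122P00019500').price.toFixed(2)); // 19.50
console.log(parseOccFromEnd('SPX 141122P00019500').symbol);             // SPX
```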
You can use the RegExp
/^[^\s]+(?=\s+|\d{6})|\d{6}(?=C|P)|(C|P)(?=0+)|(?!:\1)0+|\d+$/g
to match characters at the beginning of the string that are not space characters, or the date as the next six digits followed by C or P, or C or P followed by one or more 0 characters, or one or more 0 characters preceded by C or P, or one or more digits at the end of the string.
Use destructuring assignment to assign the parts of the matches array to separate variables.
let quote = "SPX   141122P00019500";
let re = /^[^\s]+(?=\s+|\d{6})|\d{6}(?=C|P)|(C|P)(?=0+)|(?!:\1)0+|\d+$/g;
// strike receives the leading zero padding ("000"); price receives the remaining digits ("19500", the price x 1000)
let [ticker, date, option, strike, price] = quote.match(re);
console.log({ticker, date, option, strike, price});
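The matched pieces are still strings, so a further step is needed to get usable values. Below is a minimal sketch of that conversion. The helper names `toExpirationDate` and `toStrikePrice` are hypothetical, and the code assumes 20xx years and treats the expiration as a UTC date.

```javascript
// Hypothetical helpers for turning matched text into typed values.
function toExpirationDate(yymmdd) {
  const year = 2000 + Number(yymmdd.slice(0, 2));   // assumes a 20xx year
  const month = Number(yymmdd.slice(2, 4));
  const day = Number(yymmdd.slice(4, 6));
  const date = new Date(Date.UTC(year, month - 1, day));
  // Date.UTC rolls invalid dates over (e.g. Feb 30 -> Mar 2), so verify the round trip.
  if (date.getUTCMonth() !== month - 1 || date.getUTCDate() !== day) {
    throw new RangeError('Invalid expiration date: ' + yymmdd);
  }
  return date;
}

function toStrikePrice(digits) {
  return Number(digits) / 1000;   // the symbol stores price x 1000
}

console.log(toExpirationDate('141122').toISOString().slice(0, 10)); // 2014-11-22
console.log(toStrikePrice('19500').toFixed(2));                      // 19.50
```

`toStrikePrice` accepts the digits with or without the leading zero padding, since `Number('00019500')` and `Number('19500')` both give 19500.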