I'm trying to come up with something along the lines of Google Calendar (or even some Gmail messages), where freeform text will be parsed and converted to specific dates/times.
Some examples (assume for simplicity that right now is January 01, 2013 at 1am):
"I should call Mom tomorrow to wish her a happy birthday" => "tomorrow" => "2013-01-02"
"The super bowl is on Feb 3rd at 6:30pm" => "Feb 3rd at 6:30pm" => "2013-02-03T18:30:00Z"
"Remind me to take out the trash on Friday" => "Friday" => "2013-01-04"
First of all: are there any existing open source libraries that do this (or part of this)? If not, what sort of approaches do you think I should take?
I am thinking of a few different possibilities:
- Lots of regular expressions, as many as I can come up with for each different use case
- Some sort of Bayesian Net that looks at n-grams and categorizes them into different scenarios like "relative date", "relative day of week", "specific date", "date and time", and then runs it through a rules engine (maybe more regex) to figure out the actual date.
- Sending it to a Google search and trying to extract meaningful information from the search results (this one is probably not realistic)
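To make the first two ideas concrete, here is a minimal sketch that pairs each category ("relative date", "relative day of week", "specific date") with one regular expression and a small resolver function. Everything in it (RULES, addDays, parseDates) is a hypothetical name made up for illustration, and it assumes UTC, the reference year, and that a bare weekday always means the next occurrence after today. It is not how any existing library works, and it will misfire on inputs like "market 5".

```javascript
// Hypothetical rule table: each rule has a category label, a regex that
// spots the expression, and a resolver that turns the match into a Date.
var WEEKDAYS = ["sunday", "monday", "tuesday", "wednesday", "thursday", "friday", "saturday"];
var MONTHS = ["jan", "feb", "mar", "apr", "may", "jun", "jul", "aug", "sep", "oct", "nov", "dec"];

// UTC midnight of the day that is `days` days away from `now`.
function addDays(now, days) {
  return new Date(Date.UTC(now.getUTCFullYear(), now.getUTCMonth(), now.getUTCDate() + days));
}

var RULES = [
  {
    category: "relative date",
    pattern: /\b(today|tomorrow|yesterday)\b/i,
    resolve: function (m, now) {
      var offsets = { today: 0, tomorrow: 1, yesterday: -1 };
      return addDays(now, offsets[m[1].toLowerCase()]);
    }
  },
  {
    category: "relative day of week",
    pattern: /\b(sunday|monday|tuesday|wednesday|thursday|friday|saturday)\b/i,
    resolve: function (m, now) {
      var target = WEEKDAYS.indexOf(m[1].toLowerCase());
      // Assumption: a bare weekday means the next one strictly after today.
      var delta = (target - now.getUTCDay() + 7) % 7 || 7;
      return addDays(now, delta);
    }
  },
  {
    category: "specific date",
    // Month name or abbreviation, day number, optional " at H:MMam/pm".
    pattern: /\b(jan|feb|mar|apr|may|jun|jul|aug|sep|oct|nov|dec)[a-z]*\.? (\d{1,2})(?:st|nd|rd|th)?(?: at (\d{1,2}):(\d{2}) ?(am|pm))?/i,
    resolve: function (m, now) {
      var month = MONTHS.indexOf(m[1].toLowerCase());
      var hour = m[3] ? Number(m[3]) % 12 : 0;
      if (m[5] && m[5].toLowerCase() === "pm") hour += 12;
      var minute = m[4] ? Number(m[4]) : 0;
      // Assumption: the date falls in the reference year and times are UTC.
      return new Date(Date.UTC(now.getUTCFullYear(), month, Number(m[2]), hour, minute));
    }
  }
];

// Run every rule once against the text and collect whatever matched.
function parseDates(text, now) {
  var results = [];
  RULES.forEach(function (rule) {
    var m = rule.pattern.exec(text);
    if (m) {
      results.push({ category: rule.category, text: m[0], date: rule.resolve(m, now) });
    }
  });
  return results;
}

// Reference time from the question: January 1, 2013 at 1am (taken as UTC).
var now = new Date(Date.UTC(2013, 0, 1, 1, 0));
console.log(parseDates("The super bowl is on Feb 3rd at 6:30pm", now));
```

A table like this is easy to extend one rule at a time, but it grows quickly once you need ranges, "next week", time zones, or other locales, which is the main argument for reaching for an existing library first.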
1 Answer
You can use this library: https://github.com/wanasit/chrono
Demo:
var inputs = [
  "I should call Mom tomorrow to with her a happy birthday",
  "The super bowl is on Feb 3rd at 6:30pm",
  "Remind me to take out the trash on Friday"
];
for (var i = 0; i < inputs.length; i++) {
  var input = inputs[i];
  // chrono.parse returns one result for each date expression found in the text
  var parsed = chrono.parse(input);
  // Print the matched substring together with the date it resolved to
  console.log(input + " parsed as: " + JSON.stringify(parsed.map(function(p) { return [p.text, p.startDate]; })));
}
Output:
I should call Mom tomorrow to with her a happy birthday parsed as: [["tomorrow","2012-12-31T06:30:00.000Z"]]
The super bowl is on Feb 3rd at 6:30pm parsed as: [["Feb 3rd at 6:30pm","2013-02-03T13:00:00.000Z"]]
Remind me to take out the trash on Friday parsed as: [["Friday","2013-01-04T06:30:00.000Z"]]
http://jsfiddle.net/TXX3Z/