I'm trying to match URLs with wildcards in them to actual URLs. For example:
http://*google.com/*
Needs to match
http://maps.google.com
And
http://www.google.com/maps
What would be the best way of going about this?
I've tried using a regular expression and that works fine when I manually program it but I'm not sure whether it's possible to dynamically generate regular expressions or if that would be the best practice in this situation.
/(http|https):\/\/.*\.?google\.com\/?.*/i
Thanks very much.
asked Jun 25, 2010 at 10:22 by Sam Bowler
- Watch out for the issue pointed out by @Sjoerd – Amarghosh, Jun 25, 2010 at 11:24
- What was your solution to this, @sam-bowler? – yellow-saint, Sep 13, 2015 at 14:57
2 Answers
Replace all occurrences of * in the pattern with [^ ]*, which matches a sequence of zero or more non-space characters.
Thus http://*google.com/* will become http://[^ ]*google.com/[^ ]*
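Here is code that builds the regular expression from the pattern:
// Note: other regex metacharacters in urlPattern (such as ".") are not escaped here.
var regex = new RegExp(urlPattern.replace(/\*/g, "[^ ]*"));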
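The replacement above leaves the "." in google.com unescaped, so it matches any character, and the converted pattern still requires a "/" after the host, so http://maps.google.com from the question would not match. A minimal sketch of one way to address both follows. The helper name wildcardToRegExp, the anchoring with ^ and $, and the decision to treat a trailing /* as optional are all assumptions made for illustration, not part of the answer.

```javascript
// Hypothetical helper (not from the answer): convert a wildcard URL pattern
// into an anchored, case-insensitive RegExp.
function wildcardToRegExp(pattern) {
  // Escape regex metacharacters in the literal parts, so "." only matches a dot.
  const escapeLiteral = (s) => s.replace(/[.+?^${}()|[\]\\]/g, '\\$&');

  // Assumption: a trailing "/*" should also match a URL with no path at all.
  let optionalTail = '';
  if (pattern.endsWith('/*')) {
    pattern = pattern.slice(0, -2);
    optionalTail = '(?:/[^ ]*)?';
  }

  // Each "*" becomes "[^ ]*" (zero or more non-space characters).
  const body = pattern.split('*').map(escapeLiteral).join('[^ ]*');
  return new RegExp('^' + body + optionalTail + '$', 'i');
}

const googlePattern = wildcardToRegExp('http://*google.com/*');
console.log(googlePattern.test('http://maps.google.com'));     // true
console.log(googlePattern.test('http://www.google.com/maps')); // true
console.log(googlePattern.test('http://www.googleXcom/maps')); // false
```

Because [^ ]* also matches slashes, a wildcard in the host position can still swallow part of a path (for example http://example.org/google.com would match), so this sketch is only suitable where that looseness is acceptable.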
If you want to see a well tested library for extracting parts of a URI, I would check out Google Closure Library's goog.uri.utils methods.
https://github.com/google/closure-library/blob/8e44fb343fff467938f9476ba7f727c6acac76d8/closure/goog/uri/utils.js#L187
Here's the regex that does the heavy lifting:
goog.uri.utils.splitRe_ = new RegExp(
    '^' +
    '(?:' +
    '([^:/?#.]+)' +                     // scheme - ignore special characters
                                        // used by other URL parts such as :,
                                        // ?, /, #, and .
    ':)?' +
    '(?://' +
    '(?:([^/?#]*)@)?' +                 // userInfo
    '([\\w\\d\\-\\u0100-\\uffff.%]*)' + // domain - restrict to letters,
                                        // digits, dashes, dots, percent
                                        // escapes, and unicode characters.
    '(?::([0-9]+))?' +                  // port
    ')?' +
    '([^?#]+)?' +                       // path
    '(?:\\?([^#]*))?' +                 // query
    '(?:#(.*))?' +                      // fragment
    '$');
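To show how the capture groups line up with the URL parts, here is a rough standalone sketch that copies the pattern above into plain JavaScript, without the Closure Library. The names SPLIT_RE and splitUrl are made up for this example; inside Closure the corresponding entry point is goog.uri.utils.split.

```javascript
// Standalone copy of the pattern quoted above; SPLIT_RE is a hypothetical name.
const SPLIT_RE = new RegExp(
  '^' +
  '(?:' +
  '([^:/?#.]+)' +                     // 1: scheme
  ':)?' +
  '(?://' +
  '(?:([^/?#]*)@)?' +                 // 2: userInfo
  '([\\w\\d\\-\\u0100-\\uffff.%]*)' + // 3: domain
  '(?::([0-9]+))?' +                  // 4: port
  ')?' +
  '([^?#]+)?' +                       // 5: path
  '(?:\\?([^#]*))?' +                 // 6: query
  '(?:#(.*))?' +                      // 7: fragment
  '$');

// Hypothetical helper: return the URL parts as a plain object.
// Parts that are absent from the URL come back as undefined.
function splitUrl(url) {
  const m = url.match(SPLIT_RE);
  if (!m) return null; // e.g. input containing a line break
  return {
    scheme: m[1],
    userInfo: m[2],
    domain: m[3],
    port: m[4],
    path: m[5],
    query: m[6],
    fragment: m[7],
  };
}

const parts = splitUrl('http://www.google.com/maps?q=1#top');
console.log(parts.domain); // "www.google.com"
console.log(parts.path);   // "/maps"
```

Splitting first makes it possible to compare a wildcard against the domain and the path separately, which avoids a host wildcard accidentally matching text in the path.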