regex - Create a permalink with JavaScript - Stack Overflow

I have a textbox where a user puts a string like this:

"hello world! I think that __i__ am awesome (yes I am!)"

I need to create a correct URL like this:

hello-world-i-think-that-i-am-awesome-yes-i-am

How can it be done using regular expressions?

Also, is it possible to do it with Greek (for example)?

"Γεια σου κόσμε"

turns to

geia-sou-kosme

In other programming languages (Python/Ruby) I am using a translation array. Should I do the same here?

Asked Mar 25, 2010 at 22:16 by Jon Romero; edited Mar 25, 2010 at 22:23 by Peter Mortensen.

4 Answers

Try this:

function doDashes(str) {
    var re = /[^a-z0-9]+/gi; // global, case-insensitive match of runs of non-letter/non-digit characters
    var re2 = /^-*|-*$/g;     // matches any leading/trailing dashes
    str = str.replace(re, '-');  // replace each run with a single dash
    return str.replace(re2, '').toLowerCase(); // strip the outer dashes and return the lowercased result
}
console.log(doDashes("hello world! I think that __i__ am awesome (yes I am!)"));
// => hello-world-i-think-that-i-am-awesome-yes-i-am

As for the Greek characters, I can't think of anything other than some sort of lookup table used by another regexp.

Edit: here's the one-liner version, with toLowerCase() added and the trailing-dash regexp fixed:

function doDashes2(str) {
    return str.replace(/[^a-z0-9]+/gi, '-').replace(/^-*|-*$/g, '').toLowerCase();
}

A simple regex for this job matches every run of characters that are not lowercase letters, so that each run can be replaced with a -. Convert the string to lowercase before applying it (and add 0-9 to the character class if digits should be kept). This alone is not foolproof, since a dash may be left at the start or the end.

[^a-z]+

So, after the replacement, you can trim the dashes (from the front and the back) using this regex:

^-+|-+$
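The steps described above (lowercase, replace, trim) can be sketched as one small function. This is only an illustration: the name slugify is hypothetical, and the two patterns are the ones from this answer with the g flag added so that every match is handled.

```javascript
// Hypothetical helper name; the patterns are the two regexes given above.
function slugify(text) {
    return text
        .toLowerCase()               // lowercase first, so [^a-z] is sufficient
        .replace(/[^a-z]+/g, '-')    // each run of non-letters becomes one dash
        .replace(/^-+|-+$/g, '');    // trim dashes from the front and the back
}

console.log(slugify("hello world! I think that __i__ am awesome (yes I am!)"));
// => hello-world-i-think-that-i-am-awesome-yes-i-am
```

Note that digits are treated as separators here; use [^a-z0-9]+ instead if they should survive.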

You'd have to create the Greek-to-Latin glyph translation yourself; regex can't help you there. Using a translation array is a good idea.

I can't really say for Greek characters, but for the first example, a simple:

/[^a-zA-Z]+/g

will do the trick when used as your pattern, replacing the matches with a "-". The g flag is needed so that every match is replaced, not just the first one.

As for the Greek characters, I'd suggest using an array with all the "character translations", and then adding its values to the regular expression.
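A minimal sketch of that idea follows, assuming an environment with String.prototype.normalize (ES2015 or later). The table is deliberately partial and only covers the letters in the example from the question; the names GREEK_TO_LATIN, GREEK_RE, transliterate and greekSlug are hypothetical, and the letter mapping is an illustrative assumption rather than a standard transliteration scheme.

```javascript
// Partial, illustrative table (an assumption): only the letters needed for
// the example in the question. A real table must cover the whole alphabet.
var GREEK_TO_LATIN = {
    'α': 'a', 'γ': 'g', 'ε': 'e', 'ι': 'i', 'κ': 'k',
    'μ': 'm', 'ο': 'o', 'σ': 's', 'ς': 's', 'υ': 'u'
};

// Character class built from the table's keys, so the regex and the table
// cannot drift apart.
var GREEK_RE = new RegExp('[' + Object.keys(GREEK_TO_LATIN).join('') + ']', 'g');

function transliterate(text) {
    return text
        .toLowerCase()
        .normalize('NFD')                    // split accented letters into letter + accent mark
        .replace(/[\u0300-\u036f]/g, '')     // drop the combining accent marks
        .replace(GREEK_RE, function (ch) {   // look each Greek letter up in the table
            return GREEK_TO_LATIN[ch];
        });
}

function greekSlug(text) {
    return transliterate(text)
        .replace(/[^a-z0-9]+/g, '-')         // runs of anything else become one dash
        .replace(/^-+|-+$/g, '');            // trim leading/trailing dashes
}

console.log(greekSlug("Γεια σου κόσμε"));
// => geia-sou-kosme
```

Stripping the accents before the lookup keeps the table smaller, since accented vowels do not need their own entries. Any character missing from the table is treated as a separator, so an incomplete table silently drops letters.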

To roughly build the URL, you would need something like this:

var textbox = "hello world! I think that __i__ am awesome (yes I am!)";
var url = textbox.toLowerCase().replace(/[^a-z\s]/g, '').replace(/\s+/g, ' ').trim().replace(/\s/g, '-');

It removes every character that is neither a letter nor whitespace, collapses repeated whitespace into a single space, trims the ends, and then replaces each remaining space with a dash. Each pattern needs the g flag; without it, only the first match is replaced.

You could use another regular expression to replace the Greek characters with their Latin equivalents.
