Check if two Strings share a common substring in JavaScript - Stack Overflow

Is there a fast way in JavaScript to find out whether two strings contain the same substring? For example, I have these two strings: "audi is a car" and "audiA8".

As you can see, the word "audi" is in both strings, but we cannot find it with a simple indexOf or RegExp because of the other characters in both strings.

asked Oct 22, 2012 at 7:13 by hoomb; edited Jun 22, 2021 at 18:06 by slebetman
  • if (string1 === string2) { /*identical*/ } - What are you really trying to ask: how to test whether a particular substring is in two different strings, or whether there exists some substring that appears in two different strings, or...? Could you please show an example input and the desired output? – nnnnnn Commented Oct 22, 2012 at 7:14
  • why do you need indexOf or RegExp? just compare the 2 with '==='. – AMember Commented Oct 22, 2012 at 7:16
  • If the two strings are abc and cde, should they be considered "identical" because of c? – duri Commented Oct 22, 2012 at 7:16
  • if you know what the substring is, you can perform indexOf on both the strings to check if the substring exists. – Sandeep G B Commented Oct 22, 2012 at 7:18
  • @nicmon, I've edited the question to remove a mention of "identical", as that's misleading. If I changed the meaning too much, please edit it to clarify. – Joachim Sauer Commented Oct 22, 2012 at 7:19

5 Answers

The standard tool for doing this sort of thing in Bioinformatics is the BLAST program. It is used to compare two fragments of molecules (like DNA or proteins) to find where they align with each other - basically where the two strings (sometimes multi GB in size) share common substrings.

The basic algorithm is simple: systematically break up one of the strings into pieces and compare each piece with the other string. A simple implementation would be something like:

// Note: not fully tested, there may be bugs:

function subCompare (needle, haystack, min_substring_length) {

    // Min substring length is optional, if not given or is 0 default to 1:
    min_substring_length = min_substring_length || 1;

    // Search possible substrings from largest to smallest:
    for (var i = needle.length; i >= min_substring_length; i--) {
        // Slide a window of length i across the needle
        // (j is declared with var so it does not leak as a global):
        for (var j = 0; j <= (needle.length - i); j++) {
            // substring(start, end) replaces the deprecated substr(start, length):
            var substring = needle.substring(j, j + i);
            var k = haystack.indexOf(substring);
            if (k !== -1) {
                // The first hit is the longest common substring:
                return {
                    found : true,
                    substring : substring,
                    needleIndex : j,
                    haystackIndex : k
                };
            }
        }
    }
    return {
        found : false
    };
}

You can modify this algorithm to do fancier searches, like ignoring case, fuzzy matching the substring, or looking for multiple substrings. This is just the basic idea.
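
As one possible take on the "ignoring case" variation, here is a minimal sketch. It swaps in the dynamic-programming longest-common-substring technique rather than the nested indexOf loop above. The name longestCommonSubstring and the ignoreCase option are made up for illustration, and the sketch assumes lower-casing does not change the length of the strings (true for ASCII text).

```javascript
// Hypothetical helper: dynamic-programming longest common substring.
// `options.ignoreCase` is an assumed option name, not part of the answer above.
function longestCommonSubstring(a, b, options) {
    var ignoreCase = Boolean(options && options.ignoreCase);
    // Assumption: toLowerCase() keeps the string length (holds for ASCII).
    var x = ignoreCase ? a.toLowerCase() : a;
    var y = ignoreCase ? b.toLowerCase() : b;

    // prev[j] = length of the common run ending at x[i - 2] and y[j - 1]
    var prev = new Array(y.length + 1).fill(0);
    var bestLength = 0;
    var bestEnd = 0; // end index (exclusive) of the best match in `a`

    for (var i = 1; i <= x.length; i++) {
        var curr = new Array(y.length + 1).fill(0);
        for (var j = 1; j <= y.length; j++) {
            if (x[i - 1] === y[j - 1]) {
                // Extend the run that ended one character earlier in both strings.
                curr[j] = prev[j - 1] + 1;
                if (curr[j] > bestLength) {
                    bestLength = curr[j];
                    bestEnd = i;
                }
            }
        }
        prev = curr;
    }
    // Return the match using the original casing of `a`.
    return a.slice(bestEnd - bestLength, bestEnd);
}

console.log(longestCommonSubstring("audi is a car", "audiA8")); // "audi"
```

This runs in time proportional to the product of the two string lengths, which may matter for very long inputs.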

Take a look at the similar text function implementation here. It returns the number of matching chars in both strings.

For your example it would be:

similar_text("audi is a car", "audiA8") // -> 4

which means that the strings have 4 matching characters, here the common substring "audi".
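
For readers without the linked implementation at hand, here is a rough sketch of how such a count can be computed. It assumes the function behaves like PHP's similar_text: find the longest common substring, then repeat on the pieces to its left and right. The name similarText is made up for illustration, and this is not the linked code.

```javascript
// Sketch of a similar_text-style count (assumed behaviour, see note above):
// take the longest common substring, then recurse on the left and right remainders.
function similarText(first, second) {
    if (first.length === 0 || second.length === 0) {
        return 0;
    }
    var max = 0;
    var pos1 = 0;
    var pos2 = 0;
    // Brute-force search for the longest run of equal characters.
    for (var i = 0; i < first.length; i++) {
        for (var j = 0; j < second.length; j++) {
            var k = 0;
            while (i + k < first.length && j + k < second.length &&
                   first[i + k] === second[j + k]) {
                k++;
            }
            if (k > max) {
                max = k;
                pos1 = i;
                pos2 = j;
            }
        }
    }
    if (max === 0) {
        return 0;
    }
    // Count the run itself plus whatever matches on either side of it.
    return max +
        similarText(first.slice(0, pos1), second.slice(0, pos2)) +
        similarText(first.slice(pos1 + max), second.slice(pos2 + max));
}

console.log(similarText("audi is a car", "audiA8")); // 4
```

Note that the result is a total of matching characters, so a value of 4 can also come from several shorter runs rather than one 4-character substring.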

I don't know of any simpler method, but this should work:

if (a.indexOf(substring) !== -1 && b.indexOf(substring) !== -1) { ... }

where a and b are your strings and substring is the candidate you already know.
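
A runnable version of that check, wrapped in a small helper, might look like this. The name bothContain is made up for illustration.

```javascript
// Hypothetical helper: true when `substring` occurs in both `a` and `b`.
function bothContain(a, b, substring) {
    return a.indexOf(substring) !== -1 && b.indexOf(substring) !== -1;
}

console.log(bothContain("audi is a car", "audiA8", "audi")); // true
console.log(bothContain("audi is a car", "audiA8", "car"));  // false
```

This only helps when the candidate substring is known in advance; it does not discover one.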

You can use the powerful algorithm of this library: https://github.com/kpdecker/jsdiff/blob/master/src/diff/base.js

like this:

const wordDiff = new Diff();
wordDiff.diff('audi is a car', 'audiA8', {});

and receive this result:

[
{
    "count": 4,
    "added": false,
    "removed": false,
    "value": "audi"
},
{
    "count": 9,
    "added": false,
    "removed": true,
    "value": " is a car"
},
{
    "count": 2,
    "added": true,
    "removed": false,
    "value": "A8"
}
]

The entries with "added": false and "removed": false are the common substrings.
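
To pull just the common parts out of a result shaped like the one above, a plain filter should be enough. In this sketch the changes array is copied from the output shown, so the library is not needed to run it; the name commonParts is made up for illustration.

```javascript
// Result array copied from the output above, so this runs without the library.
var changes = [
    { count: 4, added: false, removed: false, value: "audi" },
    { count: 9, added: false, removed: true, value: " is a car" },
    { count: 2, added: true, removed: false, value: "A8" }
];

// Hypothetical helper: keep entries that were neither added nor removed.
function commonParts(changeList) {
    return changeList
        .filter(function (change) {
            return !change.added && !change.removed;
        })
        .map(function (change) {
            return change.value;
        });
}

console.log(commonParts(changes)); // [ 'audi' ]
```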

You can do much more with this amazing library.

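var a = "audi is a car";
var b = "audiA8";

// Split the first string into words and count how many occur in the second.
var chunks = a.split(" ");
var commonsFound = 0;

for (var i = 0; i < chunks.length; i++) {
    // Note: an empty chunk (from consecutive spaces) always matches,
    // because indexOf("") returns 0.
    if (b.indexOf(chunks[i]) !== -1) commonsFound++;
}

alert(commonsFound + " common substrings found.");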
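
A variation on the same word-splitting idea that returns the matching words instead of a count, and skips empty chunks, could be sketched like this. The name commonWords is made up for illustration.

```javascript
// Hypothetical helper: whitespace-separated words of `a` that occur anywhere in `b`.
function commonWords(a, b) {
    return a.split(/\s+/).filter(function (word) {
        // Skip empty chunks; indexOf("") is always 0 and would count as a match.
        return word.length > 0 && b.indexOf(word) !== -1;
    });
}

console.log(commonWords("audi is a car", "audiA8")); // [ 'audi', 'a' ]
```

As the output shows, short words such as "a" match easily, so a minimum word length may be worth adding depending on the use case.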