javascript - Compare two sentences word by word and return the number of word matches with some conditions

Here is a piece of code to compare two sentences word by word and return the number of word matches with some conditions:

hint: the word in the first sentence :::: the word in the second sentence

1) protecting :::: i should result Not matched

2) protecting :::: protect should result matched

3) protect :::: protecting should result matched

4) him :::: i should result Not matched

5) i :::: i should result matched but only once not twice: (let me explain this)

We have this string as the first sentence:

 let speechResult = "they're were protecting him i knew that i was aware";

It has two i as you see but there is only one i in the second sentence here:

let expectSt = ['i was sent to earth to protect you'];

So we should count this match as one occurrence, not two. If there were two occurrences of i in the second sentence as well, we would count the i matches as two occurrences.

6) was :::: was should result matched

Here is my code so far:

// Sentences we should compare word by word
let speechResult = "they're were protecting him i knew that i was aware";
let expectSt = ['i was sent to earth to protect you'];
    
// Create arrays of words from above sentences
let speechResultWords = speechResult.split(/\s+/);
let expectStWords = expectSt[0].split(/\s+/);

// Here you are..    
//console.log(speechResultWords)
//console.log(expectStWords)
        
// Count Matches between two sentences
function includeWords(){
// Declare a variable to hold the count number of matches    
let countMatches = 0;    
for(let a = 0; a < speechResultWords.length; a++){
        
    for(let b = 0; b < expectStWords.length; b++){
          
        if(speechResultWords[a].includes(expectStWords[b])){
           console.log(speechResultWords[a] + ' includes in ' + expectStWords[b]);
           countMatches++
        }
              
    }  // End of inner for loop
    
} // End of outer for loop
    
return countMatches;
};
    
// Finally initiate the function to count the matches
let matches = includeWords();
console.log('Matched words: ' + matches);
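
To make the mismatch concrete, here is a small sketch of how the substring check behaves on the word pairs listed above. The `pairs` array and the `substringResults` name are only for illustration and are not part of my original code.

```javascript
// Each pair is [word from the first sentence, word from the second sentence].
const pairs = [
  ['protecting', 'i'],
  ['protecting', 'protect'],
  ['protect', 'protecting'],
  ['him', 'i'],
];

// String.prototype.includes is a substring test, not a whole-word test,
// so any word containing the letter "i" "includes" the word "i".
const substringResults = pairs.map(([first, second]) => first.includes(second));

pairs.forEach(([first, second], n) =>
  console.log(`${first} :::: ${second} -> ${substringResults[n]}`));
```

The wanted results for these four pairs are false, true, true, false, so three of the four come out wrong with a plain `includes` check.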


asked Jan 20, 2020 at 16:02 by Sara Ree, edited Jan 20, 2020 at 16:35

What is your problem? Please be more specific. – Kamil Naja Commented Jan 20, 2020 at 16:09
I mentioned the problems in the conditions, please run the code snippet. The code I wrote assumes the protecting and i matched and ... – Sara Ree Commented Jan 20, 2020 at 16:11
Why do protect and protecting match but i and him do not? Is it based on stemming? Or is it more simplistically based on a common starting sequence (e.g. super and show should match because of the s)? – grodzi Commented Jan 20, 2020 at 16:35
I mean matching between words: as i is a single word and him is another word, they should not be considered matched. super and show are different words, so there is no match. – Sara Ree Commented Jan 20, 2020 at 16:39
My question was almost rhetorical: why would protect and protecting be the same word? (You should emphasize that in your original post, since no answer so far has taken that constraint into account.) – grodzi Commented Jan 20, 2020 at 16:41

6 Answers

You could count the wanted words with a Map and iterate the given words by checking the word count.

function includeWords(wanted, seen) {
    var wantedMap = wanted.split(/\s+/).reduce((m, s) => m.set(s, (m.get(s) || 0) + 1), new Map),
        wantedArray = Array.from(wantedMap.keys()),
        count = 0;

    seen.split(/\s+/)
        .forEach(s => {
            var key = wantedArray.find(t => s === t || s.length > 3 && t.length > 3 && (s.startsWith(t) || t.startsWith(s)));
            if (!wantedMap.get(key)) return;
            console.log(s, key)
            ++count;
            wantedMap.set(key, wantedMap.get(key) - 1);
        });

    return count;
}

let matches = includeWords('i was sent to earth to protect you', 'they\'re were protecting him i knew that i was aware');

console.log('Matched words: ' + matches);

I think this should work:

let speechResult = "they're were protecting him i knew that i was aware";
let expectSt = ['i was sent to earth to protect you'];


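function includeWords(){
    let countMatches = 0;
    let ArrayFromStr = speechResult.split(" ");
    let Uniq = new Set(ArrayFromStr)
    // Spread the Set back into an array of unique words
    let NewArray = [...Uniq]
    let str2 = expectSt[0]

    // for...of iterates the words themselves (for...in would iterate the indices)
    for (const word of NewArray){
        // Note: this is a substring check against the whole second sentence
        if (str2.includes(word)){
            countMatches += 1
        }
    }

    return countMatches;
};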


let matches = includeWords();

I take speechResult, split it into an array, remove duplicates with a Set, convert the Set back to an array, and then check whether the expectSt string contains each word of the NewArray array.

Iterate over the words of the first sentence, replace each matched word of the second sentence with an empty string so that it cannot be matched again, and store the matches in an array.

let speechResult = "they're were protecting him i knew that i was aware";
let expectSt = ['i was sent to earth to protect you'];

// Create arrays of words from above sentences
let speechResultWords = speechResult.split(/\s+/);
let expectStWords = expectSt[0].split(/\s+/);

const matches = [];

speechResultWords.forEach(str => {
  for(let i=0; i<expectStWords.length; i++) {
    const innerStr = expectStWords[i];
    if(innerStr && (str.startsWith(innerStr) || innerStr.startsWith(str)) && (str.includes(innerStr) || innerStr.includes(str))) {
        if(str.length >= innerStr.length) {
          matches.push(innerStr);
          expectStWords[i] = '';
        } else {
          matches.push(str);
        }
        break;
    }
  }
});

console.log(matches.length);

With stemming, you treat words that share the same stem as the same word.

e.g.

for verbs: protect, protected, protecting, ...
but also plurals: ball, balls

What you may want to do is:

stem the words: use some stemmer (each has its pros & cons), e.g. PorterStemmer, which seems to have a JS implementation
count the occurrences in that "stemmed space", which is trivial

NB: splitting on '\s' may not be enough; think about commas and, more generally, punctuation. Should you need more, the keyword for this is tokenization.
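
To make the tokenization point concrete, here is a minimal sketch comparing a whitespace split with a crude regex tokenizer. The helper name `tokenizeWords` is hypothetical, and the assumption that lowercase ASCII letters, digits and apostrophes are the only word characters is mine; real text needs a proper tokenizer.

```javascript
// Hypothetical helper: lowercases the input and keeps runs of
// ASCII letters, digits and apostrophes; everything else is a separator.
function tokenizeWords(sentence) {
  return sentence.toLowerCase().match(/[a-z0-9']+/g) || [];
}

// Whitespace splitting leaves punctuation glued to the words ("sent,", "Earth!").
console.log('I was sent, to Earth!'.split(/\s+/));

// The regex tokenizer drops the punctuation and normalizes case.
console.log(tokenizeWords('I was sent, to Earth!'));
```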

Below is an example using PorterStemmer with some poor homemade tokenization:

const examples = [
 ['protecting','i'],
 ['protecting','protect'],
 ['protect','protecting'],
 ['him','i'],
 ['i','i'],
 ['they\'re were protecting him i knew that i was aware','i was sent to earth to protect you'],
 ['i i', 'i i i i i']
]
function tokenize(s) {
  // this is not good, get yourself a good tokenizer:
  // strips everything except letters, digits and apostrophes, then drops empty tokens
  return s.split(/\s+/).map(x=>x.replace(/[^a-zA-Z0-9']/g,'')).filter(Boolean)
}

function countWords(a, b){
  const sa = tokenize(a).map(t => stemmer(t))
  const sb = tokenize(b).map(t => stemmer(t))
  const m = sa.reduce((m, w) => (m[w] = (m[w] || 0) + 1, m), {})
  return sb.reduce((count, w) => {
    if (m[w]) {
      m[w]--
      return count + 1
    }
    return count
  }, 0)
}
examples.forEach(([a,b], i) => console.log(`ex ${i+1}: ${countWords(a,b)}`))

<script src="https://cdn.jsdelivr.net/gh/kristopolous/Porter-Stemmer/PorterStemmer1980.js"></script>
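
If loading the external script is not an option, the same counting idea can be sketched without it. The `naiveStem` function below is a hypothetical stand-in that only strips a few common English suffixes; it is not the Porter algorithm and will mis-stem many words, so treat it purely as an illustration of counting in the "stemmed space". `countStemMatches` is likewise a name made up for this sketch.

```javascript
// Hypothetical, deliberately naive stemmer: strips "ing", "ed" or "s"
// from words longer than four characters. NOT the Porter algorithm.
function naiveStem(word) {
  return word.length > 4 ? word.replace(/(ing|ed|s)$/, '') : word;
}

function countStemMatches(first, second) {
  const stems = s => s.toLowerCase().split(/\s+/).filter(Boolean).map(naiveStem);

  // Count how often each stem occurs in the second sentence.
  const available = new Map();
  for (const stem of stems(second)) {
    available.set(stem, (available.get(stem) || 0) + 1);
  }

  // Each word of the first sentence consumes one occurrence, if any is left.
  let count = 0;
  for (const stem of stems(first)) {
    const left = available.get(stem) || 0;
    if (left > 0) {
      available.set(stem, left - 1);
      count++;
    }
  }
  return count;
}

// Matches protecting/protect, one i, and was.
console.log(countStemMatches(
  "they're were protecting him i knew that i was aware",
  'i was sent to earth to protect you'
)); // logs 3
```

Because occurrences are consumed from the map, the second i in the first sentence finds nothing left to match, which is the "only once, not twice" condition from the question.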

I think it will provide a primitive solution by comparing the sentences' tokens. But here are two pitfalls that I can see:

You should compare the tokens of both sentences in your main if clause, combined with an OR operator.
You can add the matched occurrences to a Set to avoid counting repetitions.
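
One possible reading of these two suggestions, as a rough sketch: the function name `countMatchesOnce` is made up here, and using `startsWith` instead of `includes` is my assumption so that him does not match i. The Set stores the indices of already matched words of the second sentence, so each of them can be counted only once.

```javascript
function countMatchesOnce(first, second) {
  const firstWords = first.split(/\s+/).filter(Boolean);
  const secondWords = second.split(/\s+/).filter(Boolean);

  // Indices of second-sentence words that have already been matched.
  const used = new Set();

  for (const a of firstWords) {
    // Compare in both directions, joined by OR, skipping used words.
    const hit = secondWords.findIndex((b, idx) =>
      !used.has(idx) && (a.startsWith(b) || b.startsWith(a)));
    if (hit !== -1) used.add(hit);
  }
  return used.size;
}

console.log(countMatchesOnce(
  "they're were protecting him i knew that i was aware",
  'i was sent to earth to protect you'
)); // logs 3
```

Note that a plain prefix test still treats short words as matches for longer ones (for example a and aware), so a minimum length check may be needed for real input.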

You can use the function below to get the count of all matched words between two sentences / sets of strings.

function matchWords(str1, str2){
    let countMatches = 0;    
    let strArray = str1.split(" ");
    let uniqueArray = [...new Set(strArray)];
    uniqueArray.forEach( word => {
        if (str2.includes(word)){
            countMatches += 1
        }
    })
    return countMatches;
};
console.log("Count:", matchWords("Test Match Words".toLowerCase(),"Result Match Words".toLowerCase()));

Above code is tested and working.
