最新消息:雨落星辰是一个专注网站SEO优化、网站SEO诊断、搜索引擎研究、网络营销推广、网站策划运营及站长类的自媒体原创博客

javascript - Algorithm for data filter - Stack Overflow

programmeradmin2浏览0评论

Can you suggest me an algorithm for filtering out data.

I am using javascript and trying to write out a filter function which filters an array of data.I have an array of data and an array of filters, so in order to apply each filter on every data, I have written 2 for loops

foreach(data)
{
  foreach(filter)
  {
   check data with filter
  }
}

this is not the proper code, but in short that what my function does, the problem is this takes a huge amount of time, can someone suggest a better method.

I am using the Mootools library and the array of data is JSON array

Details of data and Filter

Data is JSON array of lets say user, so it will be

data = [{"name" : "first", "email" :  "first@first", "age" : "20"}.
        {"name" : "second", "email" :  "second@second", "age" : "21"}
        {"name" : "third", "email" :  "third@third", "age" : "22"}]

Array of filters is basically self define class for different fields of data

alFilter[0] = filterName;
alFilter[1] = filterEmail;
alFilter[2] = filterAge;

So when I enter the first for loop, I get a single JSON opbject (first row) in the above case. When I enter the second for loop (filters loop) I have a filter class which extracts the exact field on which the current filter would work and check the filter with the appropriate field of the data.

So in my example

foreach(data)
{
 foreach(filter)
{
  //loop one - filter name
 // loop two - filter email
 // loop three - filter age
}
}

when the second loop ends i set a flag denoting if the data has been filtered or not and depending on it the data is displayed.

Can you suggest me an algorithm for filtering out data.

I am using javascript and trying to write out a filter function which filters an array of data.I have an array of data and an array of filters, so in order to apply each filter on every data, I have written 2 for loops

foreach(data)
{
  foreach(filter)
  {
   check data with filter
  }
}

this is not the proper code, but in short that what my function does, the problem is this takes a huge amount of time, can someone suggest a better method.

I am using the Mootools library and the array of data is JSON array

Details of data and Filter

Data is JSON array of lets say user, so it will be

data = [{"name" : "first", "email" :  "first@first", "age" : "20"}.
        {"name" : "second", "email" :  "second@second", "age" : "21"}
        {"name" : "third", "email" :  "third@third", "age" : "22"}]

Array of filters is basically self define class for different fields of data

alFilter[0] = filterName;
alFilter[1] = filterEmail;
alFilter[2] = filterAge;

So when I enter the first for loop, I get a single JSON opbject (first row) in the above case. When I enter the second for loop (filters loop) I have a filter class which extracts the exact field on which the current filter would work and check the filter with the appropriate field of the data.

So in my example

foreach(data)
{
 foreach(filter)
{
  //loop one - filter name
 // loop two - filter email
 // loop three - filter age
}
}

when the second loop ends i set a flag denoting if the data has been filtered or not and depending on it the data is displayed.

Share Improve this question edited Nov 3, 2015 at 23:27 Brian Tompsett - 汤莱恩 5,89372 gold badges61 silver badges133 bronze badges asked Apr 15, 2009 at 21:09 ZajeZaje 2,3095 gold badges27 silver badges39 bronze badges
Add a ment  | 

5 Answers 5

Reset to default 4

You're going to have to give us some more detail about the exact structure of your data and filters to really be able to help you out. Are the filters being used to select a subset of data, or to modify the data? What are the filters doing?

That said, there are a few general suggestions:

  1. Do less work. Is there some way you can limit the amount of data you're working on? Some pre-filter that can run quickly and cut it down before you do your main loop?
  2. Break out of the inner loop as soon as possible. If one of the filters rejects a datum, then break out of the inner loop and move on to the next datum. If this is possible, then you should also try to make the most selective filters e first. (This is assuming that your filters are being used to reject items out of the list, rather than modify them)
  3. Check for redundancy in the putation the filters perform. If each of them performs some plicated calculations that share some subroutines, then perhaps memoization or dynamic programming may be used to avoid redundant putation.

Really, it all boils down to the first point, do less work, at all three levels of your code. Can you do less work by limiting the items in the outer loop? Do less work by stopping after a particular filter and doing the most selective filters first? Do less work by not doing any redundant putation inside of each filter?

That's pretty much how you should do it. The trick is to optimize that "check data with filter"-part. You need to traverse all your data and check against all your filters - you'll not going to get any faster than that.

Avoid string parisons, use data models as native as possible, try to reduce the data set on each pass with filter, etc.

Without further knowledge, it's hard to optimize this for you.

You should sort the application of your filters, so that two things are optimized: expensive checks should e last, and checks that eliminate a lot of data should e first. Then, you should make sure that checking is cut short as soon as an "out" result occurs.

If your filters are looking for specific values, a range, or start of a text then jOrder (http://github./danstocker/jorder) will fit your problem.

All you need to do is create a jOrder table like this:

var table = jOrder(data)
    .index('name', ['name'], { grouped: true, ordered: true })
    .index('email', ['email'])
    .index('age', ['age'], { grouped: true, ordered: true, type: jOrder.number });

And then call table.where() to filter the table.

When you're looking for exact matches:

filtered = table.where([{name: 'first'}, {name: 'second'}]);

When you're looking for a certain range of one field:

filtered = table.where([{age: {lower: 20, upper: 21}}], {mode: jOrder.range});

Or, when you're looking for values starting with a given string:

filtered = table.where([{name: 'fir'}], {mode: jOrder.startof});

Filtering will be magnitudes faster this way than with nested loops.

Supposing that a filter removes the data if it doesn't match, I suggest, that you switch the two loops like so:

foreach(filter) {
    foreach(data) {
        check data with filter
    }
}

By doing so, the second filter doesn't have to work all data, but only the data that passed the first filter, and so on. Of course the tips above (like doing expensive checks last) are still true and should additionally be considered.

发布评论

评论列表(0)

  1. 暂无评论