I am creating an auto-suggest function for my website's search box. Every time the user presses a new key, JavaScript calls a web service on the server side, which gets the 10 most relevant keywords from the database and returns them to the JavaScript, which then populates the autosuggest list.

My function is not too slow, but compared to what live.com or google.com do, it is very slow. I tested them, and it really feels as if they are getting the keywords from my PC, not from their servers.

How do they get keywords this fast, when they surely have millions of times as many keywords as I do? Is there a well-known technique for doing this?

Also, using Firebug, I found that they are not calling a web service (or maybe they are calling one in a way I am not aware of), but I did find in the Net tab that a new GET request happens.
- They have more bandwidth, and they have more servers. Besides, when you search and get a full page of results, you get those results in under a second anyway, so a JS list of 10 will most likely come back even faster. – Darren Kopp, Apr 21, 2009 at 20:47
8 Answers
Not sure where you're looking, but certainly on live.com I get a request for each letter.
As the network trace shows, there's very little coming back across the wire (about 500 bytes). That's what you're aiming for: a lean web service that returns the minimum you need to display back to the user.
Then on top of that, as the others have said, cache previous responses, etc.
Note that the results often aren't alphabetical, so if you don't display your ordering criteria, you can work on the principle of "something now is better than completely accurate later".
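To make the "lean payload" idea concrete, here is a minimal sketch of the client side of such a service; the /suggest endpoint name and the bare-array response shape are assumptions for illustration, not part of the original answer.

```javascript
// Request suggestions for a prefix; the server returns a bare JSON
// array of strings (e.g. ["car","carpet","cartoon"]) with no wrapper
// or metadata, keeping each response well under a kilobyte.
async function fetchSuggestions(prefix) {
  const response = await fetch("/suggest?q=" + encodeURIComponent(prefix));
  return response.json();
}
```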
Rather than making a request on every keypress, what if you only make a request every certain amount of time, and only if there has been a keypress in the interim? At something like 100 ms intervals it would still look "instant" but could put much less load on your server.

Also, are you having the client cache the keywords? If the user backspaces in the search field, the client shouldn't have to re-contact the server to get the keywords.

Finally, you can immediately filter your current list of keywords on every keypress without contacting the server (you'll just end up with fewer than 10 of them, because some or all of the ones you already have will not contain the letter just typed). That can fill in the "gaps" between actual requests for data and make it seem more instant.
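A minimal sketch of both ideas together, assuming the fetchSuggestions and renderSuggestions helpers exist (they are illustrative names, not from the original answer):

```javascript
let pendingTimer = null;   // non-null while a throttled request is queued
let latestValue = "";      // most recent input, used when the timer fires
let cachedKeywords = [];   // last keyword list returned by the server

function onKeyPress(value) {
  latestValue = value;

  // Instant feedback: narrow the keywords we already have on the client.
  renderSuggestions(cachedKeywords.filter(function (kw) {
    return kw.indexOf(value) === 0;
  }));

  // At most one server request per 100 ms, always using the newest input.
  if (pendingTimer === null) {
    pendingTimer = setTimeout(function () {
      pendingTimer = null;
      fetchSuggestions(latestValue).then(function (keywords) {
        cachedKeywords = keywords;
        renderSuggestions(keywords);
      });
    }, 100);
  }
}
```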
There's no reason to request search terms with every keypress. Google does it (1) because they can, and (2) because they're presenting terms across the corpus of the Internet.
In most web applications, there are a much smaller number of "common" search terms -- usually no more than a hundred or so, and as few as a dozen that are context appropriate.
You can retrieve the entire set of relevant terms and build a prefix map on the client side at the time the page loads. By matching the current search term against this prefix map, you can give suggestions far faster than Google.
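A rough sketch of such a client-side prefix map; the term list here is made up for illustration:

```javascript
// Build the map once at page load: every prefix of every term points
// to the list of terms that start with it.
function buildPrefixMap(allTerms) {
  const map = {};
  allTerms.forEach(function (term) {
    for (let len = 1; len <= term.length; len++) {
      const prefix = term.slice(0, len);
      (map[prefix] = map[prefix] || []).push(term);
    }
  });
  return map;
}

// Lookup is then a single hash access per keypress - no round trip.
const prefixMap = buildPrefixMap(["car", "carpet", "cat", "dog"]);
prefixMap["ca"]; // ["car", "carpet", "cat"]
```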
The limitation is that, at some point, you'll run out of suggested terms. But again, that's really not an issue: even Google runs out of suggestions for "transnormative" (a made-up word, but there are 191 results from a full search).
There are two main things that you can do:
- Use as much caching as possible on the server side. After all, search queries follow a power law: there are a few queries with many requests each and very many queries with only a few requests each. A perfect environment for caching.
- You would want to minimize the amount of data transmitted over the wire, and one way to do this is with a radix tree. If you need to transmit a list of 20 strings all sharing a common prefix, you don't need to transmit 20 separate strings: you can transmit the prefix once and then the 20 differing parts (see the sketch after this list).
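A minimal sketch of that prefix-compression idea on the wire (the function names and payload shape are illustrative assumptions, not a full radix tree):

```javascript
// Server side: send the shared prefix once plus the differing suffixes.
function compressSuggestions(prefix, terms) {
  return {
    prefix: prefix,
    suffixes: terms.map(function (t) { return t.slice(prefix.length); })
  };
}

// Client side: rebuild the full strings for display.
function expandSuggestions(payload) {
  return payload.suffixes.map(function (s) { return payload.prefix + s; });
}

// Yields { prefix: "car", suffixes: ["", "pet", "toon"] } instead of
// three full strings; the savings grow with longer lists and prefixes.
compressSuggestions("car", ["car", "carpet", "cartoon"]);
```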
The first thing I would suggest is to make sure your web service is caching the keywords in memory rather than hitting the DB each time, assuming of course that your data set is small enough to do this.
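A minimal sketch of such an in-memory cache on the service side (Node.js-style; queryDatabase is an assumed function standing in for your actual DB call):

```javascript
const suggestionCache = {}; // prefix -> keyword list, kept in memory

async function getSuggestions(prefix) {
  if (!(prefix in suggestionCache)) {
    suggestionCache[prefix] = await queryDatabase(prefix); // hit the DB once
  }
  return suggestionCache[prefix]; // later requests are memory-only
}
```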
Other than that, you would have to parallelize the query somehow across multiple servers, which is probably way more complicated than you want to get.
I found this blog post which talks about this point in detail:
Autocomplete Data Structures
First, you should probably rephrase your question. Any answer to "how can I be as fast as Google?" is bound to be "get used to disappointment."

Given that, you can still make your solution better. It seems like on every keypress you do a round trip to a service and to a database. I doubt Google is doing that. Perhaps you should focus on doing fewer round trips (caching, bringing back more results when you do have to go to the DB, etc.).
As someone suggested above, using ETags with a REST API could mean some additional caching for repeated queries. Check out this article on Joe Gregorio's blog for more info on ETags in a REST context.
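A hedged sketch of what ETag handling for a suggestion endpoint could look like (an Express-style handler is assumed here; the endpoint and the getSuggestions helper are illustrative):

```javascript
const crypto = require("crypto");

async function handleSuggest(req, res) {
  const prefix = req.query.q || "";
  const keywords = await getSuggestions(prefix); // assumed lookup helper
  const body = JSON.stringify(keywords);

  // Derive a validator from the response body itself.
  const etag = '"' + crypto.createHash("md5").update(body).digest("hex") + '"';

  // If the client already has this exact list, skip the payload entirely.
  if (req.headers["if-none-match"] === etag) {
    res.status(304).end();
    return;
  }

  res.set("ETag", etag).type("application/json").send(body);
}
```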