I am programming a web service that queries different databases for prices. The web service writes a new <Result> node into the XML page that is returned when calling http://service./xml.php?search=carpet as soon as it has results from the database. Unfortunately, the queries sent to the various databases take extremely long (up to 30 seconds in total). Obviously I don't want the user to wait 30 seconds for the complete XML before a table is built from the data; I want the results to load dynamically.
Let's assume a user searches for "Carpet". The databases will return multiple products such as "Red Carpet" and "Yellow Carpet". "Red Carpet" has two distributors that are dynamically loaded into the table of "Red Carpet". "Yellow Carpet" has only one distributor.
I need a price comparison table like the one shown in the picture below that dynamically adds a new table when a new article is returned and adds a new row to that table when a new distributor is found for a product.
Do you have suggestions on how to accomplish this? How do I receive only the data that has changed from my xml.php?
[Image: Price Comparison Table Structure]
XML Data
<?xml version="1.0" encoding="UTF-8"?>
<Results>
<!--Given back within 5 seconds-->
<Result>
<ArticleNumber>Red Carpet</ArticleNumber>
<Manufacturer>Big Carpet Inc</Manufacturer>
<Distributor>Amazonas</Distributor>
<Prices>
<Pricebreak>
<Quantity>1</Quantity>
<Price>$ 1.20</Price>
</Pricebreak>
<Pricebreak>
<Quantity>10</Quantity>
<Price>$ 1.00</Price>
</Pricebreak>
<Pricebreak>
<Quantity>100</Quantity>
<Price>$ 0.50</Price>
</Pricebreak>
</Prices>
</Result>
<!--Given back within another 10 seconds-->
<Result>
<ArticleNumber>Red Carpet</ArticleNumber>
<Manufacturer>Big Carpet Inc</Manufacturer>
<Distributor>Veritas</Distributor>
<Prices>
<Pricebreak>
<Quantity>1</Quantity>
<Price>$ 0.90</Price>
</Pricebreak>
<Pricebreak>
<Quantity>5</Quantity>
<Price>$ 0.70</Price>
</Pricebreak>
</Prices>
</Result>
<!--Given back within another 5 seconds-->
<Result>
<ArticleNumber>Yellow Carpet</ArticleNumber>
<Manufacturer>Smallrug Corporation</Manufacturer>
<Distributor>Veritas</Distributor>
<Prices>
<Pricebreak>
<Quantity>1</Quantity>
<Price>$ 3.90</Price>
</Pricebreak>
<Pricebreak>
<Quantity>10</Quantity>
<Price>$ 2.70</Price>
</Pricebreak>
</Prices>
</Result>
</Results>
asked May 14, 2011 at 21:40 by Dominik, edited May 14, 2011 at 21:50
- Your SQL is probably not optimized and the query is not efficient; showing the actual query may help us give you a better solution. – Ibu Commented May 14, 2011 at 21:45
- Show the code that you use to retrieve the content. – Ibu Commented May 14, 2011 at 21:48
- @Ibu, actually the queries are not to an SQL data source but to drivers that extract information from HTML websites. Unfortunately these queries take a long time and I have to run about 30 of them. I wrote "database" to reduce the information and keep it simple. – Dominik Commented May 14, 2011 at 21:48
- Are you using an external web service, or your own? – Ibu Commented May 14, 2011 at 21:55
- If you really need help, you are going to have to show some code. – Ibu Commented May 14, 2011 at 22:11
6 Answers
"...actually the queries are not to an sql data source but to drivers that get information out of html websites..."
If this is the case, I suggest that you do not do this work when your web page is called. Instead, have a separate script run as a cron task or service that polls this data periodically, at an interval that depends on how volatile the data is: every 5 minutes, every hour, or every day. Have that script populate your local database, which you can query much faster. This lets you load any data you wish as the user selects it, without waiting for the parser.
Effectively, you add a caching layer to your web service so that your app runs quickly and does not have to rely on on-demand parsing of the data.
Additionally, if an automatic process isn't available, you could trigger the update manually when you know that the data has changed.
Supposing that you can't modify the structure or improve the performance of the data retrieval, I think the only solution is to cache the results and keep a copy of the data in a local database.
With an asynchronous mode, I think you are on the right track.
You can develop a script that copies the results of the slow queries into the local database, which the front-end application can access much faster, and run this script as a cron job. If you can also handle updated records, you are on the crest of a wave ;)
I think this could help
Performance/Caching: Improve the speed of the "queries" that get your data, or implement an aggressive caching strategy. I'm assuming you're writing a web app with a technology stack that supports caching. memcached is a good general-purpose caching system that works with lots of web app stacks. If you have to rely on scraping the data from competitor websites, a cron job or task queue that handles this at a regular interval would help.
Automatic Updating: Effectively, what you're talking about is continuous-polling AJAX, where you keep a connection open from client to server and respond to new data by injecting the HTML into your page. Take a look at socket.io; it provides an abstraction that allows for continuous polling via a number of transport methods (long-polling XHR, Flash, WebSockets, etc.) and it supports all recent browsers (including IE 5.5 and up).
Rendering: Take a look at the template function that is part of underscore.js. It's a decent way to create a reusable markup template (i.e. your table row structure) and pass it some JSON to be rendered into a markup string. You can then use a library like jQuery or Prototype to inject the markup.
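As a rough sketch of the rendering step: assuming the <Result> nodes have already been parsed into plain objects (the field names articleNumber, distributor, and prices, and all function names below, are hypothetical), grouping results into one table per article and producing one row per distributor could look like this. It uses plain template literals rather than underscore.js's template function.

```javascript
// Escape text before placing it into markup.
function escapeHtml(text) {
  const map = { '&': '&amp;', '<': '&lt;', '>': '&gt;', '"': '&quot;', "'": '&#39;' };
  return String(text).replace(/[&<>"']/g, (ch) => map[ch]);
}

// Group parsed results by article so that each article gets its own table.
function groupByArticle(results) {
  const tables = new Map();
  for (const result of results) {
    if (!tables.has(result.articleNumber)) {
      tables.set(result.articleNumber, []);
    }
    tables.get(result.articleNumber).push(result);
  }
  return tables;
}

// Build the markup for one distributor row of an article table.
function renderDistributorRow(result) {
  const prices = result.prices
    .map((p) => `${p.quantity}: ${p.price}`)
    .join(', ');
  return `<tr><td>${escapeHtml(result.distributor)}</td><td>${escapeHtml(prices)}</td></tr>`;
}
```

The returned string can then be appended to the table of the matching article with whichever DOM library is in use.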
If you are really dedicated to receiving only the data that has changed, you would have to do 4 things:
- Version the XML.
- Have your JS periodically query the version number.
- If the version differs, the JS calls "updateFromCurrentVersion(currentVersion)".
- updateFromCurrentVersion returns the changes.
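The four steps above could be sketched roughly as follows. This is only an illustration of the bookkeeping, written in JavaScript for brevity even though the service in the question is PHP; the class name VersionedResults and its methods other than updateFromCurrentVersion are hypothetical, and it assumes results are only ever added, never modified.

```javascript
// Keeps every result together with the version at which it was added.
class VersionedResults {
  constructor() {
    this.version = 0;
    this.log = [];
  }

  // Called whenever one of the slow queries delivers a new result.
  add(result) {
    this.version += 1;
    this.log.push({ version: this.version, result });
    return this.version;
  }

  // Cheap call the client can poll to see whether anything changed.
  currentVersion() {
    return this.version;
  }

  // Returns only the results added after the version the client already has.
  updateFromCurrentVersion(clientVersion) {
    const changes = this.log
      .filter((entry) => entry.version > clientVersion)
      .map((entry) => entry.result);
    return { version: this.version, changes };
  }
}
```

The client stores the returned version and passes it back on the next call, so each result is transferred only once.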
I can think of at least 2 ways to get what you need done.
Method #1: On the server side, break up your web service output into two queries and result sets. Have the first query return a list of the IDs of the results, and have a second query return a single result by ID.
service./results.xml.php? returns: 1 2 etc ...
service./result.xml.php?id= returns: etc...
This will cause a lot of queries that will add overhead but will stop you from downloading multiple duplicate results over and over again.
What you can do then is poll the first query repeatedly, compare its IDs to the IDs currently displayed, and then call query #2 for the details you still need to display.
Method #2:
Have one query return all the results like you do now, but keep an ID field stored locally in your script that tracks which content has already been displayed, and add only the new data to the existing page through AJAX. In this case your JavaScript will have to pull and parse the XML in a loop.
The true hitch in this is how does your application determine that it has all the results and to stop looping?
The best way to do this may be to first run a stripped-down query that only gets the row count of your final result set, before sending the query that returns all the details you need.
example:
select compareitem from sourcetable where compareitem=
This will simply tell you how many items match your query, with the least amount of information getting pumped back to you. You then count the rows of that result set and store the count for later.
select * from sourcetable where compareitem=
Now you grab all the details, and I assume this takes longer. With the stored count, your display loop knows when to stop checking for results.
AJAX pseudo code for method #1:
- Set the count of currently displayed results to 0.
- Set the array of displayed IDs to null.
- Run the basic query to determine how many rows you are expecting in total (count each provider separately if you have multiple sources, then add them all up).
- Enter the loop:
  - Grab the XML with details.
  - Process the XML ID by ID to see whether each ID is displayed yet (if the array is null, this is the first iteration: display them all).
  - Add any ID that needs to be displayed as a member of the displayed IDs array.
  - Pump the details into the document.
  - Increment the count of currently displayed results by 1 for each result whose details were displayed.
  - Check whether you have displayed all the expected results (displayedids >= expectedresults). If true, exit the loop; if not, continue.
To optimize this, you will most likely want to add some kind of throttling, or you will simply spam the server for XML at an absurdly high rate; that is something you can test out.
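The loop bookkeeping and the throttling could be sketched along these lines. All names are hypothetical, and since the sample XML has no ID field, the sketch assumes that the combination of ArticleNumber and Distributor identifies a result. Fetching and parsing the XML is left out; the functions only decide what is new, when to stop, and how long to wait.

```javascript
// Assumed identity of a result: article plus distributor.
function resultKey(result) {
  return `${result.articleNumber}|${result.distributor}`;
}

// expectedResults is the total row count determined up front.
function createPollState(expectedResults) {
  return { expectedResults, displayedKeys: new Set() };
}

// Marks unseen results as displayed and returns only those.
function takeNewResults(state, polledResults) {
  const fresh = [];
  for (const result of polledResults) {
    const key = resultKey(result);
    if (!state.displayedKeys.has(key)) {
      state.displayedKeys.add(key);
      fresh.push(result);
    }
  }
  return fresh;
}

// True once every expected result has been displayed.
function isComplete(state) {
  return state.displayedKeys.size >= state.expectedResults;
}

// Wait longer after each poll that brought nothing new, capped at maxMs.
function nextDelayMs(emptyPolls, baseMs = 1000, maxMs = 8000) {
  return Math.min(baseMs * 2 ** emptyPolls, maxMs);
}
```

In the page, each iteration would request the XML, pass the parsed results to takeNewResults, append the returned items to the tables, and schedule the next request with setTimeout and nextDelayMs until isComplete returns true.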
I prefer caching the remote prices and making the calculations on the cached values; if they have expired, you can fetch them from the web service again. You can do this with APC, memcached, XCache, etc.
Or you can calculate them periodically with cron jobs (at an interval of 5 minutes or more, for example) and write them to your local database.
With these two options, visitors will not have to wait for this whole calculation every time they visit your service.
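The first option, a cache whose entries expire, could be sketched as below. This is a minimal in-memory illustration in JavaScript with hypothetical names (ExpiringCache, ttlMs); it is not the API of APC, memcached, or XCache, which provide the same idea as ready-made services.

```javascript
// In-memory cache whose entries expire after ttlMs milliseconds.
class ExpiringCache {
  // now is injectable so that expiry can be tested without waiting.
  constructor(ttlMs, now = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
    this.entries = new Map();
  }

  // Store the prices fetched for a search term together with the fetch time.
  set(term, prices) {
    this.entries.set(term, { prices, storedAt: this.now() });
  }

  // Return the cached prices, or undefined when missing or expired.
  get(term) {
    const entry = this.entries.get(term);
    if (!entry) {
      return undefined;
    }
    if (this.now() - entry.storedAt >= this.ttlMs) {
      this.entries.delete(term);
      return undefined;
    }
    return entry.prices;
  }
}
```

When get returns undefined, the caller would run the slow remote queries once, store the outcome with set, and serve later visitors from the cache until the entry expires.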