Recently I helped someone build a dropship-like solution. The product data from the provider was accessed using a web service. Initially I was expecting the number of items to be just a few hundred, carefully hand-picked. However, he chose to put in thousands of items. As a result, the cron job set up to pull the item data on a daily basis ran for a much longer duration and consumed far more CPU. His shared hosting solution couldn’t handle it, and the cron job kept getting killed due to the quota limitations.
I thought about ways to sync the product data without hitting the resource limits, but that meant a lot of redesign and would have cost him quite a bit upfront for the development. So clearly, not designing a program with the resources it has to work under in mind has a cost.
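One way such a sync could be kept within shared-hosting quotas is to process the items in small batches, pausing between batches to yield CPU. This is only a rough sketch of that idea, not the actual solution discussed above; `processItem` stands in for whatever fetches and saves a single product:

```typescript
// Split a list of items into fixed-size batches.
function chunk<T>(items: T[], size: number): T[][] {
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

const sleep = (ms: number) => new Promise((resolve) => setTimeout(resolve, ms));

// Process items batch by batch, pausing between batches so the job
// doesn't monopolize CPU and trip the host's quota limits.
async function syncInBatches<T>(
  items: T[],
  processItem: (item: T) => Promise<void>, // hypothetical per-item sync step
  batchSize = 50,
  pauseMs = 1000,
): Promise<number> {
  let processed = 0;
  for (const batch of chunk(items, batchSize)) {
    for (const item of batch) {
      await processItem(item);
      processed++;
    }
    await sleep(pauseMs); // back off between batches
  }
  return processed;
}
```

A further refinement would be to skip items whose data hasn’t changed since the last run, if the provider’s web service exposes a last-modified timestamp.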
But here is the kicker. He decided to go from shared to dedicated hosting, which cost him 20 times more per month than what he was paying! Yeah, that’s 20 times more! He would probably recoup the money he would have to spend optimizing his program for performance within a year if he could stay on the shared plan.
When the cost of performance tuning goes up, with hardware becoming cheaper and cheaper, it becomes cost-effective to live with sub-optimal programs most of the time. However, in cases like SaaS or hosting solutions where the hardware is rented, the extra cost adds up over time.
I recently found that the CONNECT BY clause performs much better in Oracle 11g than in Oracle 10g. In 10g, the explain plan showed either a FULL TABLE SCAN or an INDEX FULL SCAN as the last step. In 11g this is no longer the case, and the number of consistent gets is much lower.
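For anyone unfamiliar with the clause, a typical hierarchical query of the kind in question looks something like this (the table and columns are illustrative, not the ones I was testing with):

```sql
-- Walk a hypothetical employee hierarchy from the root manager down.
SELECT LEVEL, employee_id, manager_id
FROM   employees
START  WITH manager_id IS NULL
CONNECT BY PRIOR employee_id = manager_id;
```

Running EXPLAIN PLAN on a query like this in each version is an easy way to see the difference for yourself.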
Mashups are cool. Cool to develop. No denying it. But what about the person accessing your mashup? Does their browser hang? Is their bandwidth pounded with a lot of requests (be it AJAX or otherwise)?
A good mashup should pay attention to the usability aspect as well. Just pulling in information from lots of websites and assembling it on the fly on the client side may not work. Especially when the information fetched from the remote site has to be repeated for each row in a table of data, a naive implementation would definitely choke the browser.
Take the Online Ad Networks page, for example. It currently has mashups that fetch Google’s PageRank and Alexa’s traffic rank. However, if this information were loaded on the fly for each row in the table, not only would it be painful for the viewer, it would also rapidly fire several requests to the servers the data is being fetched from. So, instead, the information is loaded on demand, through onmouseover events.
In general, in cases where a lot of requests would have to be sent, instead of sending them by default, it’s better to send them on a user action. The choice is typically between an onclick and an onmouseover. If you want to spare the user too many clicks, go for onmouseover. However, if you want to be absolutely sure that the user actually intends to see the information you are providing, then an onclick is better. Note that either is still better than forcing the user to go to another page for the details and then come back to the summary page via the back button or a breadcrumb.
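The on-demand pattern above can be sketched as a small caching wrapper, so that hovering over the same row twice never re-fires the remote request. This is a generic sketch, not the actual code behind the Online Ad Networks page; `fetchRank` is a hypothetical stand-in for whatever call retrieves the rank data:

```typescript
type Fetcher<T> = (key: string) => Promise<T>;

// Wrap a fetcher so each key is requested at most once;
// concurrent and repeated lookups share the same in-flight promise.
function lazyCached<T>(fetcher: Fetcher<T>): Fetcher<T> {
  const cache = new Map<string, Promise<T>>();
  return (key: string) => {
    let hit = cache.get(key);
    if (!hit) {
      hit = fetcher(key); // fire the request only on first demand
      cache.set(key, hit);
    }
    return hit;
  };
}

// Browser-side wiring (illustrative only):
// const getRank = lazyCached(fetchRank);
// row.onmouseover = async () => {
//   row.title = String(await getRank(row.dataset.site));
// };
```

The point is that the request happens only when the user shows interest, and only once per site, no matter how many rows there are or how often the user hovers.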
Web application development is all about managing state and navigation. And for those few lucky sites that have high traffic (like wordpress.com), performance also matters. Two days back I happened to be looking at a portal application that was trying to display reports in real time. Since a portal typically has multiple portlets in it, each one showing a real-time report, that particular portal application was designed to initially show a loading icon until a portlet’s content was loaded, and then display it. One good thing with this approach, as opposed to sequentially building the entire page within a single request, is that the user can start looking at the content as soon as it’s queried and sent to the client. The downside is that serving each page takes multiple client requests, one per portlet, which is expensive.
Now, this particular portal page I happened to look at has drill-down pages for each report, and a group of people are constantly looking at the reports and the drill-downs. The drill-down is on a separate page, so there is a lot of navigation back and forth. And each time the user clicks from the drill-down page back to the main portal page, the portal page starts rendering all the portlets in real time again. This definitely didn’t look promising: the page took a long time to render, wasting a lot of the entire group’s time.
Now, I have recently been experimenting with GreyBox. What it does is let you open a detailed page in an embedded window without leaving the current page. When the user clicks on a link, it greys out the original page and overlays a new box containing the target link’s page. Hence the name, GreyBox, I would assume. The good thing with this approach is that one can look at the details without leaving the current page, which would be extremely useful in a scenario like a portal.