Monthly Archives: May 2007

MySQL: String concat multiple rows

If you are implementing a tagging system like what you see with WordPress, then you may have a requirement to fetch all the tags of each article in the blog when displaying the blog. As each article has many tags, one option is to first fetch all the articles and for each article fetch the list of tags. However, there is a better way to do this in one single SQL when using MySQL.

MySQL has the function group_concat that allows one to condense a set of rows into one single row by concatenating the list of values.

The SQL will be something like

select article_id,article_title,article_text,(select group_concat(tag) from tags t where t.article_id = a.id) tags from articles a …;

Here, the subselect used for the tags column is what I am talking about. It uses the group_concat to concatenate all the tags together. By default it uses ‘,’ as the separator and I believe that can be altered.

PHP & MySQL Web Development is a book for learning web development with PHP and MySQL.

20 Comments

Filed under MySQL, tagging

MySQL Regular Expressions

I have a table that stores a bunch of keywords. The keywords can be either with or without quotes (“). That is, the keywords could be PageRank, “Web 2.0” and startups. My requirement was to sort these keywords but ignoring the quotes. That is, instead of displaying them as “Web 2.0”, PageRank , startups, I wanted them to show up as PageRank, startups, “Web 2.0”

That’s when I started looking at MySQL regular expressions. A few interesting things. First and foremost, it looked odd to me that the syntax is something like

“select stringcolumn regex ‘some-pattern'”

instead of

“select regex(stringcolumn,’some-pattern’)”

The second thing is in how I had to write my sql to achieve the above requirement. I ended up doing something like

select …, case when keyword regex ‘”.*”‘ = 1 then substr(keyword,2,length(keyword)-2) else keyword end okw from … where … order by owk;

This felt quite ugly, but perhaps a better hypothetical syntax could have been like

regex(keyword,'”(.*)”‘,”nvl($1,$0)”)

where the 3rd argument is an SQL expression evaluated with special bind variables captured from the regular expression. (Ignore the usage of nvl which is a Oracle syntax, but you get the point).

1 Comment

Filed under MySQL

Open Source Or Otherwise, Some Problems Remain The Same

When it comes to Enterprise Applications which is commercially dominated by SAP and Oracle, there are several open source solutions being developed. Some of them are trying to be the complete suite of applications like Compiere and the others more best-of-breed point solutions like Coupa.

Just like the CIOs and IT staff face with the dilemma of picking the best-of-breed commercial solutions vs going with generic but wide range solutions, same is applicable for open source as well.

Moving from paper based tracking to any form of automated tracking certainly helps. But for companies trying to go beyond the basics and have well integrated automations from procurement to payment, order (or even a quote) capture to fulfillment, there will be integration challenges to deal with if one were to go with best-of-breed solutions. Infact, one can even imagine a configuration where some of the best-of-breed solutions are commercial while the others are open-source!

In such a scenario, who will be building the integrations for these disparate best-of-breed open source applications? Would it make sense to bring in the big-5/4 there? IMHO, then that would defeat the purpose of trying to use open source with the intention of saving the bottom-line.

In addition, with open-source applications, there is no restriction on the technology components. One team/company can choose to use Ruby-on-Rails, while the other perhaps Python and yet another Java and may be even Perl. So, what a nightmare that would be to integrate these various different technology components into one single harmonious application. Of course, this is true for commercial counter parts as well. However, most commercial software typically goes with the main-stream technology, which currently is either Java or .Net.

There is no denial that the WebServices and the SOA are supposed to address this language/platform problems. However, it is yet to be seen how successful this route is going to be as many earlier attempts at seamless distributed computing (such as CORBA and RMI) never really worked well mainly due to performance. But even with WebServices and SOA, most of the open source enterprise applications are currently at a stage where they need to catchup and implement a lot of functionality and the least focus they have is to ensuring that their application is interoperable with other applications. That is not to say that they are not interested in that, but with limited resources, there is only so much that can be done and that would typically be on the core competency, the product itself and how many features it has as a standalone application.

But who knows, with the commercial applications already betting and leading their way to SOA enabled applications, the open-source counterparts will consider it as a part of the survival strategy than a add-on bolt. I see the same issue with business intelligence also where the main stream commercial applications have some in-built capabilities while the open source counterparts mostly lack them.

Leave a comment

Filed under application integration, Enterprise Applications, open source

Open Source And Google PageRank

More and more companies are adopting the “open source” strategy (not philosophy) and how much that helps them in improving the top-line or bottom-line is yet to be seen. However, one thing that is likely to improve is the Google PageRank. This is because, every now and then some one or the other writes an article about open source in some context or the other (like open source enterprise applications, open source business intelligence applications, open source middleware etc) and your open source may likely to feature in one of those articles. That will sure improve your Google PageRank! So, if not anything, the least Open Source strategy will do for your company is to improve the Google PageRank and hence your website’s likelihood to feature as high as possible on the search results for certain keywords. Not bad huh!

Leave a comment

Filed under open source

Driving Search Volume Through Articles

Google continuously being No 1, and other search engines constantly losing the search engine battle, Yahoo! seems to be trying out a few new things. One I have been seeing for the last few days is to provide links to a few select words which when highlighted will
popup a small inline box (similar to contextual ads) letting the user to click that and get search results. I personally don’t like that. Now, today I see on their homepage the following
link about the most popular puppies which enumerates a list of top 20 most popular puppy breeds. Each of the breed name is a link that takes you to the search results for that breed. I again don’t see why one would want to do a search while reading an article, but the fact that you don’t know that you will be taken to the search results, makes you click it, perhaps with the hope of seeing more details like photos and other, only to be greeted with a link of ads, and web search results (and images).

Whether this technique is useful to the audience or not, it certainly is a good tactic to raise the search volume for Yahoo!

Leave a comment

Filed under search engine, search engine volume, Yahoo!

Alexa Is Obsolete?

Amazon surprised everyone in the recent quarter and it’s stock went up more than 50% since then. Not sure how much of their revenue is from their Web Services.

One of the web service they offer is Alexa Top Sites for which they charge $0.0025 per query. That’s about $0.25 for every 100 sites you need to figure out. I am not sure how many are really interested in figuring out traffic ranks by city and all as most websites with a rank above 100k pretty much are all one and the same (well, not exactly, but the traffic in these sites is so little, that there is practically not much difference in knowing if one is 303245 and the other is 523450).

Anyway, with the arrival of Quantcast, Alexa has a very tough competition ahead. Ofcourse, each company uses a different technique to arrive at the ranking (and hence, WordPress for example, ranks around 40 by Quantcast and 96 by Alexa as of this writing), but we all know that no ranking system is perfect. The good thing about Quantcast is, if you are the lucky enough to have a website within 100k ranking, then you can see some very interesting statistics about your audience. A lot of demographic information that is not available with Alexa.

What’s more, Quantcast is giving away the top 1 million sites for free! (check the bottom of the link for a download link). So, now who is interested in paying $0.25 for every 100 results to Alexa, rise your hands!

Excellent job Quantcast. Keep up the good work!

Leave a comment

Filed under alexa, quantcast, traffic rank

Finding color codes in a webpage

If you are like me, a programmer, and not an artistic web-designer, chances are your color sense when it comes to designing web pages is, put it mildly, not as good. If you work for a large corporation or a big design firm, then you can probably afford to have two different people with two different job roles. But otherwise, you are pretty much stuck with a single person doing both, coding and ui design of the web page. Ofcourse, the goal of this article is not to indicate that web designers are not necessary and programmers can do everything. It all depends on the potential for the product or website. If you know it can generate a lot of revenue, there is no need to skimp on the professional resources with unique skills.

Anyway, so one tool I found recently which I find very useful is Colorzilla. It’s an add-on to Firefox. Once you install it, it creates a little color-picker icon in the left-hand side of the status bar. Now, when you are in any webpage, you can click on the color-picker and as you move your cursor over the webpage, you will be able to see the color codes for the background of the region currently pointed by the mouse. This is a great tool to get color codes from some of the well designed websites.

Ofcourse, this only helps you pick the individual colors. You need to hone your skills or hire professional to still get the overall theme of your webpage. There is no “theme picker”, so to speak (unless you steal the entire design of a webpage, which is the last thing you should be doing).

1 Comment

Filed under color picker, web design

What about PermaPages?

As you keep writing more and more blog articles, the older articles disappear from the main page. However, they can still be accessed, using a unique url that never changes. These are called PermaLinks and they are important from a search engine perspective.

However, I would like to see the concept of PermaPages for blogs. Presently, on WordPress for example, the previous entries are accessed using https://poeticcode.wordpress.com/page/2/ and /page/3/ etc. Instead, they should be accessed as https://poeticcode.wordpress.com/page/<n>/ /page/<n-1>/ etc.

Why is this? This way, my first 10 blog articles will always get /page/1/ and the next 10 /page/2/ and so on. The latest is always /page/n/ or simply /page/ or even more simply just the homepage of the blog. With this, each set of 10 blog articles will collectively get indexed.

So, if you are implementing a system that has a rolling log of content, try to use the above scheme to create PermaPages. This will be good especially if you have AdSense as your 10 (or X number) of articles are indexed together and remain constant.

Leave a comment

Filed under AdSense, Web 2.0

Wikipedia is still Web 1.0

How many websites are there that takes any url and convert that into a tag cloud and give it back to you? ZoomClouds.com and TagCloud.com offer converting an RSS feed into a tag cloud. TagMyCloud.com converts plain text you paste in a textbox into a tag cloud. TagCrowd.com also converts plain text, but allows either pasting text or uploading a file. However, at present ToCloud.com is the only one offering converting any webpage into a tag cloud.

So, tocloud.com was listed on Wikipedia under http://en.wikipedia.org/wiki/Tag_Cloud under external links. Within the brief time it was on Wikipedia, several users visited the website and even built their own tag clouds and put them on their web pages. Check some of these users. However, one moderator removed the link with a comment “non-notable and advertisement”.

I don’t have any problem with moderating. But I do have objection to the type of moderation. First and foremost, wikipeida itself lives off of donations. So, what’s wrong for a service provider to generate revenue in whatever legal way possible (placing ads in this case), as long as there is some service that the end users are interested in? If the public doesn’t donate, what would happen to Wikipedia? Sure some bigwig corporations will run the show, but with some ulterior motives.

The second issue I have with the moderator’s comment is “non-notable”. IMHO, the comment sounds like Wikipedia is modeled towards “rich get richer” scheme. I mean, if Wikipedia wants to link only the notable links, what happens to all the people who are interested in building their own tag clouds? Just because any of the sites mentioned above are not popular doesn’t mean that the “Tag Cloud” concept is limited to the services provided by just del.icio.us and flickr. Is it? And who is a single moderator to be able to decide what is notable and what is not? This is where Wikipedia should grow into a Web 2.0 model. Let everyone vote whether an external link should belong in an article or not. Those that get lesser votes will go down and the rest will bubble up.

1 Comment

Filed under wikipedia

Mashup: Tag Cloud + Amazon Products

ToCloud.com has extended the Tag Cloud mashup with Google Suggest to now support exploring Amazon Products from the Tag Cloud. This gives an opportunity for bloggers to quickly check what kind of products correspond to their blogging content. This is useful for people considering placing affiliate links to generate additional revenue.

Here is an example of Photo.net tag cloud and when you click on each link, you can choose either Google Suggest or Amazon to display the related content for each of the words in the cloud. When Amazon is chosen, it is also possible to pick the product category.

Leave a comment

Filed under affiliates, keyword cloud, mashup, page cloud, tag cloud, Web 2.0, word cloud