Category Archives: Product Catalog

PIM – Even The Best Companies Struggle With Data Quality

PIM stands for Product Information Management. There are applications built specifically to manage product definitions, because online retailers need accurate and comprehensive data so that their visitors can make the right decision. Similarly, it’s in the manufacturers’ best interest to capture accurate and comprehensive data in a timely manner and provide it to their downstream supply-chain partners.

My notebook crashed recently, so I am shopping for a new one. After some research I ended up at the HP Pavilion DV6500T Entertainment 15.4″ Notebook PC and, funnily enough, the Technical Details section contained the following:

“Special Features

* These are the additional features. Yada yada yada. Wuba wuba wuba”

I wonder if this is funny placeholder TODO text left in by the manufacturer (HP) or by the retailer. Anyway, I will be doing more research on HP laptops.


Filed under Data Quality, DQ, PIM, Product Catalog

Product Catalog Search By Color

Today I happened to see a website that offered searching for products by color. I had actually seen this on another site a few months back, but I didn’t think much about the underlying technology. Today, my first reaction was “wow! are they hiring people to look at each product image and capture the colors?” Then I realized this can be done easily by processing the product image.

The idea is that every image is made of pixels, and the color of each pixel is available through an image API. So, one approach is to count the frequency of each color, order the colors by frequency, and finally pick the first N or those above some threshold.

However, as with any image processing, there are alternatives. For example, if the image is a JPEG instead of a GIF, there are too many distinct colors and the frequency of each individual color might be very low. So, treating all the colors that are very similar as one single color would help. Similarly, a color with high frequency could be just small specks scattered all over the image, which is not really useful. Or a ring with a small diamond in the middle could contain a color that covers very little area but is the most important one. So, picking colors based on clustering rather than purely on frequency is also a good choice. The one catch is that there needs to be a way to exclude the background color, which in most product images is white.
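The frequency approach, with similar shades merged and a white background skipped, can be sketched roughly as below. This is only an illustration under some assumptions: the pixels are already available as a list of (r, g, b) tuples from whatever image library is in use, and the function name, bucket size, and background tolerance are made up for the example. It merges shades by simple bucketing, which is a cheap stand-in for real clustering.

```python
from collections import Counter

def dominant_colors(pixels, top_n=3, bucket=32,
                    background=(255, 255, 255), bg_tolerance=40):
    """Return the top_n most frequent bucketed colors, ignoring the background.

    pixels is an iterable of (r, g, b) tuples with channels in 0..255.
    All parameter names and defaults are illustrative assumptions.
    """
    counts = Counter()
    for r, g, b in pixels:
        # Skip pixels close to the assumed background color (white by default).
        if max(abs(r - background[0]),
               abs(g - background[1]),
               abs(b - background[2])) <= bg_tolerance:
            continue
        # Snap each channel to the centre of its bucket so similar shades merge.
        key = tuple((c // bucket) * bucket + bucket // 2 for c in (r, g, b))
        counts[key] += 1
    return [color for color, _ in counts.most_common(top_n)]

# A mostly white image with two shades of red and a little blue.
sample = ([(255, 255, 255)] * 100 + [(200, 10, 10)] * 30
          + [(205, 12, 8)] * 20 + [(10, 10, 200)] * 10)
top_two = dominant_colors(sample, top_n=2)
```

Here the two reds land in the same bucket and outrank the blue, while the white pixels are never counted. Bucketing does nothing about the scattered-specks or small-diamond cases; those would need a spatial clustering step on top.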

Keeping all the above in mind, assume each product is associated with a few colors. The next step is to take the color that the user has picked and match it against the product colors within some delta, since an exact match is not always possible and a tolerance gives the user more choices.
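A minimal sketch of matching with a delta might look like the following. It assumes a hypothetical color index that maps each product id to its list of RGB colors, and it uses plain Euclidean distance in RGB space, which is only a rough proxy for how similar two colors look; the names and the default delta are invented for the example.

```python
import math

def search_by_color(color_index, query, delta=60.0):
    """Return product ids whose closest color is within delta of query.

    color_index: dict mapping product id -> list of (r, g, b) tuples.
    Results are ordered from the closest match to the farthest.
    """
    hits = []
    for product_id, colors in color_index.items():
        if not colors:
            continue  # a product with no indexed colors cannot match
        # Distance from the query to the nearest of the product's colors.
        best = min(math.dist(query, c) for c in colors)
        if best <= delta:
            hits.append((best, product_id))
    return [product_id for _, product_id in sorted(hits)]

index = {
    "red-pen": [(208, 16, 16)],
    "blue-pen": [(16, 16, 208)],
    "maroon-bag": [(176, 16, 48), (16, 16, 16)],
}
matches = search_by_color(index, (200, 20, 20))
```

A larger delta returns more products at the cost of looser matches. A perceptual color space such as CIELAB would likely give results closer to what a shopper expects than raw RGB distance.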

For a retailer, doing the above simply means processing the images already in the system and building a color index. A search engine, however, would first have to retrieve each product image before processing it.

Filed under Product Catalog, search engine, Search Indexing

Image Clouds Of Products In Procurement Software

I just came to know that v1.5 of an open source, highly user-friendly procurement software package has been released. One feature they tout is item tagging. Check out their tag clouds feature. This is all good. But here is one of the issues with empowering employees to create requisitions. Say there are 25 different pens in the system: how does one know which pen to order? Characteristics such as fountain pen vs. ball-pen, or red vs. blue, help in narrowing down, but after narrowing down to, say, 6 to 10, what next?

One way to solve this problem is to embed analytics in the application that show the most popular pens. If you want to go fancy in a Web 2.0 fashion, you can even do something like the Notebooks/Laptops Image Cloud, which is ordered by Amazon’s SalesRank and sized based on price. In other words, a tag cloud, but applied to product images.

Filed under Image Cloud, Procurement, Product Catalog, tag clouds

CloudStore – Product Catalogs using Image Clouds

If you liked the tag cloud / keyword cloud concept using text, think of what can be achieved using images instead! That is exactly what CloudStore – Online Shopping using Image Clouds from ToCloud does. The Digital SLR Cameras Image Cloud displays all the digital SLR cameras from Amazon as an Image Cloud. The cameras are ordered left-to-right and top-to-bottom by Amazon’s SalesRank, while the size of each image reflects the camera’s list price. So, the more expensive digital SLR cameras are shown big while the cheaper ones are shown small. Further, each image has a colored border. Green indicates that Amazon marks the price as “too low to display”, orange indicates that the sales price on Amazon is less than the list price, and yellow indicates that the list and sales prices are the same.
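The ordering, sizing, and border rules described above could be sketched as follows. This is not ToCloud's code, just one way to express the rules; the Product fields, the pixel range, and the use of None for a hidden price are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Product:
    name: str
    sales_rank: int              # lower rank means a better seller
    list_price: float
    sale_price: Optional[float]  # None stands in for "too low to display"

def border_color(p):
    if p.sale_price is None:
        return "green"
    if p.sale_price < p.list_price:
        return "orange"
    # Equal prices; a sale price above list is not covered by the post
    # and is treated the same way here.
    return "yellow"

def image_cloud(products, min_px=40, max_px=160):
    """Return (name, size_in_px, border) tuples in display order."""
    if not products:
        return []
    ordered = sorted(products, key=lambda p: p.sales_rank)
    prices = [p.list_price for p in ordered]
    lo, hi = min(prices), max(prices)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all prices match
    cloud = []
    for p in ordered:
        # Linear scale: cheapest item gets min_px, most expensive gets max_px.
        size = round(min_px + (p.list_price - lo) / span * (max_px - min_px))
        cloud.append((p.name, size, border_color(p)))
    return cloud

cameras = [
    Product("B", sales_rank=2, list_price=1000.0, sale_price=900.0),
    Product("A", sales_rank=1, list_price=500.0, sale_price=500.0),
    Product("C", sales_rank=3, list_price=1500.0, sale_price=None),
]
cloud = image_cloud(cameras)
```

A linear scale is the simplest choice; with a few very expensive items a logarithmic scale might keep the cheap ones from all collapsing to the minimum size.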

As far as I know, this is the first time the Web 2.0 concept of tag clouds has been applied to product catalogs. What’s cool is that it uses HTML image maps to show the user additional information about each product, and clicking on a product takes the user to its product details page on Amazon.
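For readers unfamiliar with image maps, here is a rough sketch of how one composite image can carry a tooltip and a link per product. The function, the region tuples, and the example URL are hypothetical; only the `map` and `area` elements and their `shape`, `coords`, `href`, `title`, and `alt` attributes are standard HTML.

```python
from html import escape

def image_map(name, regions):
    """Build an HTML <map> with one rectangular <area> per product.

    regions: list of (x, y, width, height, title, url) tuples giving each
    product's position inside the single composite image.
    """
    lines = [f'<map name="{escape(name)}">']
    for x, y, w, h, title, url in regions:
        # Rect coords are left,top,right,bottom in image pixels.
        coords = f"{x},{y},{x + w},{y + h}"
        lines.append(
            f'  <area shape="rect" coords="{coords}" '
            f'href="{escape(url)}" title="{escape(title)}" '
            f'alt="{escape(title)}">'
        )
    lines.append("</map>")
    return "\n".join(lines)

markup = image_map("cloud", [
    (0, 0, 40, 40, 'Camera "A"', "https://example.com/a?x=1&y=2"),
    (40, 0, 100, 100, "Camera B", "https://example.com/b"),
])
```

The image itself would then reference the map with `usemap="#cloud"`, so the browser fetches one picture instead of one per product.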

I have noticed another Image Cloud, listed on Wikipedia, that seems to have two drawbacks: 1) there are no semantics to the ordering of the images, and 2) each image in the Image Cloud is a separate image, which results in many HTTP requests. Still, that website is perhaps the first to come up with the concept of Image Clouds, while ToCloud is perhaps the first to use Image Clouds for product catalogs.

Filed under Image Cloud, Procurement, Product Catalog, tag cloud, Web 2.0