Cuil, the search engine company that is supposed to compete with Google, has just launched and I am already seeing traffic from it to my blog. How do I know? Well, WordPress is currently not filtering it from its referrers report. I wrote earlier in Where is Google in WordPress Referrers Report that WordPress filters out all visits coming from search engines. This new kid on the block has not yet been noticed, but maybe it’s a matter of days.
Monthly Archives: July 2008
Just found a job post title on craigslist that made me laugh.
“Miss NEW YORK CITY? Come work as a *SOFTWARE ENGINEER* in NYC!”
I have a requirement to make sure that a certain data file shipped along with the software has not been tampered with. So, I thought I could encrypt it. The plan was to use asymmetric encryption with private and public keys. Java has support for RSA/ECB/PKCS1PADDING, so I thought of using it. There are also CipherOutputStream and CipherInputStream classes in the javax.crypto package which I thought of using, and that’s when the problems started. After creating a cipher output stream with a cipher based on RSA/ECB/PKCS1PADDING, writing data to it and closing the stream, the output file had nothing in it.
I did some research on what CipherInputStream does and in the process learnt that ciphers are typically block ciphers or stream ciphers. Block ciphers operate on a fixed-size block while stream ciphers can work on a large stream of data. However, a block cipher can still process data of arbitrary length by using a mode of operation: ECB encrypts each block independently, while modes such as CFB, OFB and CTR effectively turn a block cipher into a stream cipher. So, in theory, it should be possible to apply the RSA cipher, which can only encrypt a fixed number of bytes, to a longer stream block by block, as ECB does.
So, I manually split the input stream into small blocks and implemented the stream encryption without using CipherOutputStream. For this, I first hardcoded a block size of 128 bytes, but that gave an error (it turns out the usable size is 128 – 11 = 117 bytes; 128 comes from the key size and 11 from the padding). After changing the encryption block size to 117, I could successfully encrypt the entire file. I didn’t like the fact that I had hardcoded the values. So, looking at the Cipher API, I decided to use getBlockSize, and that’s when I realized it returns 0 (one reason why the CipherOutputStream didn’t work). Hmm, how can this return zero? The documentation for this method says that for ciphers that are not block ciphers, the value is 0. What?
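The 117-byte limit can be checked with a small sketch like the one below. It assumes the default JDK provider and a 1024-bit key (chosen only to match the 128-byte figure above; it is too small for real use today). The class and method names are made up for illustration.

```java
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PublicKey;
import javax.crypto.Cipher;
import javax.crypto.IllegalBlockSizeException;

// Hypothetical demo class, not part of the original program.
final class RsaLimitDemo {
    static final String TRANSFORMATION = "RSA/ECB/PKCS1Padding";

    // PKCS#1 v1.5 encryption padding uses at least 11 bytes per RSA operation.
    static final int PKCS1_OVERHEAD = 11;

    // Largest plaintext (in bytes) that fits in one RSA operation for this key size.
    static int maxPlaintextBytes(int keySizeBits) {
        return keySizeBits / 8 - PKCS1_OVERHEAD;
    }

    // Tries to encrypt `length` zero bytes in a single RSA operation.
    static boolean fits(PublicKey key, int length) throws GeneralSecurityException {
        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, key);
        try {
            cipher.doFinal(new byte[length]);
            return true;
        } catch (IllegalBlockSizeException tooLong) {
            return false; // assumption: the provider rejects oversized input this way
        }
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(1024); // mirrors the 128-byte key size in the post
        KeyPair pair = generator.generateKeyPair();

        Cipher cipher = Cipher.getInstance(TRANSFORMATION);
        cipher.init(Cipher.ENCRYPT_MODE, pair.getPublic());
        System.out.println("getBlockSize() = " + cipher.getBlockSize());
        System.out.println("limit = " + maxPlaintextBytes(1024));
        System.out.println("117 bytes fit: " + fits(pair.getPublic(), 117));
        System.out.println("118 bytes fit: " + fits(pair.getPublic(), 118));
    }
}
```

Deriving the chunk size from the key length in this way avoids hardcoding 117, since getBlockSize cannot be relied on for RSA.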
Well, it turns out that while it’s possible to split the input stream into small chunks and encrypt them using RSA, the typical usage is to encrypt just a single small chunk. It seems only symmetric keys are used for encrypting streams of data. Typically, a symmetric session key is generated to encrypt the data, and this key itself is encrypted using public/private key encryption. Since the session key is small, it fits within the block size of an RSA cipher (yes, an RSA cipher does have a block size, though it’s not intended to be used as a block cipher).
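A minimal sketch of that session-key pattern might look like the following. The choice of AES-GCM for the bulk data and the names HybridEncryptionSketch and Envelope are my assumptions for illustration; the RSA transformation is the one mentioned above.

```java
import java.security.GeneralSecurityException;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.SecureRandom;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;
import javax.crypto.spec.GCMParameterSpec;

// Hypothetical sketch: bulk data under a fresh AES key, AES key under RSA.
final class HybridEncryptionSketch {
    // Everything the receiver needs besides the RSA private key.
    static final class Envelope {
        final byte[] wrappedKey;
        final byte[] iv;
        final byte[] ciphertext;

        Envelope(byte[] wrappedKey, byte[] iv, byte[] ciphertext) {
            this.wrappedKey = wrappedKey;
            this.iv = iv;
            this.ciphertext = ciphertext;
        }
    }

    static Envelope encrypt(PublicKey recipient, byte[] data) throws GeneralSecurityException {
        // 1. Generate a one-off symmetric session key.
        KeyGenerator keyGenerator = KeyGenerator.getInstance("AES");
        keyGenerator.init(128);
        SecretKey sessionKey = keyGenerator.generateKey();

        // 2. Encrypt the (arbitrarily long) data with the session key.
        byte[] iv = new byte[12];
        new SecureRandom().nextBytes(iv);
        Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
        aes.init(Cipher.ENCRYPT_MODE, sessionKey, new GCMParameterSpec(128, iv));
        byte[] ciphertext = aes.doFinal(data);

        // 3. Encrypt only the small session key with RSA.
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        rsa.init(Cipher.WRAP_MODE, recipient);
        return new Envelope(rsa.wrap(sessionKey), iv, ciphertext);
    }

    static byte[] decrypt(PrivateKey recipient, Envelope envelope) throws GeneralSecurityException {
        // Recover the session key with RSA, then decrypt the data with it.
        Cipher rsa = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        rsa.init(Cipher.UNWRAP_MODE, recipient);
        SecretKey sessionKey = (SecretKey) rsa.unwrap(envelope.wrappedKey, "AES", Cipher.SECRET_KEY);

        Cipher aes = Cipher.getInstance("AES/GCM/NoPadding");
        aes.init(Cipher.DECRYPT_MODE, sessionKey, new GCMParameterSpec(128, envelope.iv));
        return aes.doFinal(envelope.ciphertext);
    }
}
```

Only the 16-byte session key goes through RSA here, so the per-operation size limit never comes into play no matter how large the data is.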
So, finally I changed my strategy. My requirement has more to do with preventing tampering with the data than with preventing viewing of it. So, I used a SHA message digest: I computed the digest of the data file, then used the RSA cipher to encrypt the digest with the private key. The idea is to ship the data file, the encrypted message digest and the public key. At run time, the program first computes the digest of the data file and compares it with the digest obtained by decrypting the encrypted digest with the public key. If they match, the file has not been tampered with; otherwise it has.
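Digesting a file and encrypting the digest with the private key is essentially a digital signature, and Java's java.security.Signature class bundles both steps. The sketch below uses that class instead of the hand-rolled digest-plus-Cipher approach described above. The SHA256withRSA algorithm and the class name TamperCheckSketch are my choices for illustration, not what the original program used.

```java
import java.nio.charset.StandardCharsets;
import java.security.GeneralSecurityException;
import java.security.KeyPair;
import java.security.KeyPairGenerator;
import java.security.PrivateKey;
import java.security.PublicKey;
import java.security.Signature;

// Hypothetical sketch of the sign-at-build-time, verify-at-run-time idea.
final class TamperCheckSketch {
    static final String ALGORITHM = "SHA256withRSA"; // assumed algorithm choice

    // Done once by the vendor, who alone holds the private key.
    static byte[] sign(PrivateKey privateKey, byte[] fileContents) throws GeneralSecurityException {
        Signature signer = Signature.getInstance(ALGORITHM);
        signer.initSign(privateKey);
        signer.update(fileContents);
        return signer.sign();
    }

    // Done by the shipped program, which only carries the public key.
    static boolean isUntampered(PublicKey publicKey, byte[] fileContents, byte[] signature)
            throws GeneralSecurityException {
        Signature verifier = Signature.getInstance(ALGORITHM);
        verifier.initVerify(publicKey);
        verifier.update(fileContents);
        return verifier.verify(signature);
    }

    public static void main(String[] args) throws GeneralSecurityException {
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair pair = generator.generateKeyPair();

        byte[] original = "licensee=Example Corp".getBytes(StandardCharsets.UTF_8);
        byte[] signature = sign(pair.getPrivate(), original);

        byte[] tampered = "licensee=Someone Else".getBytes(StandardCharsets.UTF_8);
        System.out.println("original accepted: " + isUntampered(pair.getPublic(), original, signature));
        System.out.println("tampered accepted: " + isUntampered(pair.getPublic(), tampered, signature));
    }
}
```

In a real deployment the data file would be read from disk and the signature shipped alongside it; the in-memory byte arrays here just keep the example self-contained.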
BTW, in case you are wondering why I didn’t go with the approach of encrypting the data with a symmetric key and encrypting the symmetric key with the public/private keys: it has a security issue. Unlike two trusted parties communicating with each other this way and trying to prevent a third party from reading the message, here the issue is that I, the first party, can’t trust the second party. For example, once the program is run, it would be possible to identify the symmetric key used to encrypt the data by inspecting the RAM, then use that symmetric key to encrypt a different piece of data and overwrite the old encrypted data file. The program would happily accept the tampered data because it could successfully decrypt the data file using the same symmetric key, which doesn’t change (once you ship the software, the key remains the same). Note that the purpose of encrypting the message digest above is not to hide it, because again, by looking at the RAM, it would be possible to figure it out. The point is that no one else can compute the encrypted value of the message digest of a tampered file, since that can only be done by me using the private key.
Security is an interesting area. There are a handful of tools, and which tool to use depends on the use case. The data file in the above use case could just as well be a software license containing the details of the party that licensed the software. One doesn’t care so much that the license details are visible, only that those visible details have not been tampered with.
In the pre-Google era no one cared about inbound links. Everyone cared about stuffing keywords within their own pages. That has changed due to PageRank. Companies spend a lot of money on online advertising, search engine optimization, link exchange, link submission, blogging and writing articles and even purchasing links. Some of the practices are frowned upon and potentially penalized while the others are genuine and well rewarded.
Here is a new strategy. What if you offer a “lite” version of your product/service for free with the understanding that the receiving party will link to your website? Those non-paying customers could even write about their experience with your product/service, making it a genuine backlink with appropriate content. Obviously, it’s not possible to offer this type of transaction for life, so perhaps you do this for a limited time, until you get enough backlinks to boost your website to a PageRank of 5 or 6. Sounds like a cool strategy, doesn’t it?
Well, this bright idea is not mine; I just came across it at this website. If you are wondering why anyone would want to do this rather than just sell their product and spend money on advertising, the reason is obviously to increase visitors through organic search rather than paid advertising. In the long run, who wants to pay Google or other online advertisers, especially if the rates are high in some categories?