Deduped storage with SQL front-end

Came across this very interesting piece on Forbes:

The unique thing about this product is the ability to run SQL queries against the stored data. That obviously adds some overhead, but much less than working with tapes. With very high data reduction ratios, the product claims to be a cost-effective big-data storage container for medium- to long-term storage that can be queried much more easily than data retrieved from tapes.

However, tapes have certain economics and cater to specific operational models that are tricky to match with an appliance, so it will be interesting to watch how RainStor fares.

Also, whenever I hear claims of extreme compression above 90% effectiveness, I start to add salt. Compression can only remove as much redundancy as exists within the data. Of course, some compression algorithms are better at finding redundancies than others, and compression combined with other things like rzip, deduplication and content-specific data transformation filters can effectively take out global redundancies from large datasets. Still, none of these techniques is magical. If the data does not contain redundancies, they will fail to reduce the data volume. What tends to happen in the real world, though, is that business data tends to be structured with repeating content, and successive snapshots of data tend to have a lot in common with previous ones. So we can potentially see a lot of reduction. One can only determine this by doing a thorough evaluation on the actual data.
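To make the point about redundancy concrete, here is a minimal sketch using Python's standard `zlib` module. It has nothing to do with how RainStor works internally. The record layout, the helper name `reduction_ratio` and the sizes are all made up for illustration. It compresses a block of highly repetitive, structured records and an equally sized block of random bytes, then compares how much each shrinks.

```python
import os
import zlib


def reduction_ratio(data: bytes) -> float:
    """Fraction of bytes removed by zlib at its highest level (may be slightly negative)."""
    compressed = zlib.compress(data, 9)
    return 1.0 - len(compressed) / len(data)


# Hypothetical "business" data: one record layout repeated with small variations.
structured = b"".join(
    b"2013-01-01,store=%04d,status=OK,amount=100.00\n" % (i % 50)
    for i in range(20000)
)

# Random bytes of the same size stand in for data with no redundancy.
random_data = os.urandom(len(structured))

structured_ratio = reduction_ratio(structured)
random_ratio = reduction_ratio(random_data)

print(f"structured: {structured_ratio:.1%} reduction")
print(f"random:     {random_ratio:.1%} reduction")
```

The repetitive records should shrink dramatically, while the random block should not shrink at all. Real datasets sit somewhere in between, which is why measuring on your own data matters more than any headline ratio.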


