I recently came across this old blog by Nimble storage co-founder Umesh Maheshwari: http://www.nimblestorage.com/blog/technology/better-than-dedupe-unduped/
The post has intriguing views on inline compression vs Dedupe and the approach of using snapshots on primary storage as a short-term backup mechanism is quite cool. COW snapshots by definition avoid duplicates the instant they are created. In addition consecutive snapshots will share common blocks with no special effort. This aspect not much different from ZFS. It has the same real-time compression and snapshot features and the same benefits apply there as well.
However it is the conclusions and the graphs that I humbly find missing some points and even misleading to an extent. Deduplication does not only remove duplicates from successive backups but it can remove internal duplicates within the volume. It helps compression in turn. In all my tests with Pcompress I have found Deduplication providing additional gain when added to standard compression. See the entry for Pcompress here: http://www.mattmahoney.net/dc/10gb.html. ZFS for that matter provides deduplication as well, though it has scalability bottlenecks.
While inline compression is cool, compression works within a limited window. The window size varies with the algorithm. For example Zlib uses a 32KB window, Bzip2 uses 900KB max, LZMA can use a gigabyte window size. Repeating patterns or redundancy within the window are compressed. Deduplication finds duplicate blocks within the entire dataset of gigabytes to hundreds of terabytes. There is no theoretical window size limitation (Though practical scaling considerations can put a limit). So I really cannot digest that just snapshotting + inline compression will be superior. Deduplication + snapshotting + inline compression will provide greater capacity optimization. For example see the section on “Compression” here: http://www.ciosolutions.com/Nimble+Storage+vs+Netapp+-+CASL+WAFL
Now if Post-Process Variable-Block Similarity based deduplication (of the type I added into Pcompress) can be added to ZFS, things will be very interesting.
- Nimble Storage says “me too!” (neuralytix.com)
- Nimble Storage Named a 2013 Emerging Technology Vendor by CRN (virtual-strategy.com)