Pcompress 2.1 released with fixes and performance enhancements

I just uploaded a new release of Pcompress with a load of fixes and performance tweaks. You can see the download and some details of the changes here: https://code.google.com/p/pcompress/downloads/detail?name=pcompress-2.1.tar.bz2&can=2&q=

A couple of the key changes are improved Global Dedupe accuracy and the ability to set the dedupe block hash independently of the data verification hash. To be conservative, the default block hash is the proven SHA256. This can, however, be changed via an environment variable called ‘PCOMPRESS_CHUNK_HASH_GLOBAL’. SKEIN is one of the supported alternatives. It is a solid NIST SHA3 finalist with a good amount of cryptanalysis done and no practical weakness found, and it is also faster than SHA256. These choices give a massive margin of safety against random hash collisions and unexpected data corruption, considering that other commercial and open-source dedupe offerings tend to use weaker options like SHA1 (collision attack found, see below), Tiger24 or even the non-cryptographic Murmur3-128, all for the sake of performance. To be fair, some of them did not have many choices when development on those products started. In addition, even with a collision attack it is still impractical to build a working exploit that corrupts stored data in a dedupe storage engine that uses SHA1, such as Data Domain.
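To put a rough number on the "margin of safety" against random collisions, here is a minimal sketch using the standard birthday-bound approximation. It is not part of Pcompress; the function name, the 1 PiB dataset and the 4 KiB average chunk size are assumptions chosen only for illustration, and the estimate covers accidental collisions only, not deliberate attacks.

```python
import math


def collision_probability(num_chunks: int, hash_bits: int) -> float:
    """Birthday-bound estimate of at least one random collision.

    Uses p ~= 1 - exp(-n * (n - 1) / 2^(bits + 1)), which assumes the hash
    output is uniformly distributed. Hypothetical helper, not Pcompress code.
    """
    exponent = -num_chunks * (num_chunks - 1) / 2.0 ** (hash_bits + 1)
    # -expm1(x) equals 1 - exp(x) but stays accurate for very small |x|.
    return -math.expm1(exponent)


if __name__ == "__main__":
    # Assumed workload: 1 PiB of unique data split into 4 KiB chunks.
    chunks = 2 ** 50 // 2 ** 12
    for name, bits in [("128-bit hash", 128), ("160-bit hash", 160), ("256-bit hash", 256)]:
        print(f"{name}: p ~= {collision_probability(chunks, bits):.3e}")
```

Each extra bit of hash output halves the estimate, which is why a 256-bit hash leaves so much more headroom than a 128-bit or 160-bit one at the same chunk count.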

The Segmented Global Dedupe algorithm used for scalability now gives around 95% of the data reduction efficiency of simple full chunk index based dedupe.


3 thoughts on “Pcompress 2.1 released with fixes and performance enhancements”

  1. Slowpoke

    Thanks for this nice and useful program!
    Did a quick test on Ubuntu 12.10, pcompress crashed with these switches 😦 If need be you can get the archive here: http://sourceforge.net/projects/mingw-w64/files/Toolchains%20targetting%20Win64/Automated%20Builds/

    % pcompress -E -D -L -P -B 1 -M -C -c adapt2 -l 14 -s 1280m mingw-w64-bin_x86_64-linux_20130505.tar
    Scaling to 1 thread
    Max allowed chunk size for LIBBSC is: 1073741824
    Error compressing file: mingw-w64-bin_x86_64-linux_20130505.tar

    Compression Statistics
    Total chunks : 0
    Best compressed chunk : 0 B(0.00%)
    Worst compressed chunk : 0 B(0.00%)
    Floating point exception (core dumped)

    (Linux 3.5.0-28-generic #48-Ubuntu SMP Tue Apr 23 23:03:38 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux, gcc (Ubuntu/Linaro 4.7.2-2ubuntu1) 4.7.2, Core-i5/16GB RAM)

    1. moinakg Post author

      Thanks for letting me know. This looks like a bug in an invalid input handling routine! ‘Adapt2’ mode uses libbsc which can take a max segment size of 1024m. Obviously ‘-s 1280m’ exceeds that. But then the ‘Floating point exception’ looks like a simple division by zero bug while printing the statistics.

  2. Pingback: Pcompress 2.1 - Linux mint, centos, ubuntu - OSWorld.pl - mały świat wielkich systemów!
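
The crash reported in the first comment comes down to two missing checks: rejecting a segment size above the libbsc limit up front, and not dividing by a zero chunk count when printing statistics. The sketch below illustrates both in Python. It is not the actual Pcompress fix (Pcompress is written in C), and the constant and function names are hypothetical; only the 1073741824-byte limit is taken from the log output above.

```python
# Limit printed in the log above: "Max allowed chunk size for LIBBSC is: 1073741824"
LIBBSC_MAX_SEGMENT = 1024 * 1024 * 1024


def validate_segment_size(requested: int, limit: int = LIBBSC_MAX_SEGMENT) -> None:
    """Fail early with a clear message instead of erroring mid-compression."""
    if requested > limit:
        raise ValueError(
            f"segment size {requested} exceeds the maximum of {limit} bytes"
        )


def average_chunk_size(total_bytes: int, total_chunks: int) -> float:
    """Return the mean chunk size, or 0.0 when no chunk was processed."""
    if total_chunks == 0:
        # Guard for the error path, where the chunk count is still zero.
        return 0.0
    return total_bytes / total_chunks


if __name__ == "__main__":
    try:
        validate_segment_size(1280 * 1024 * 1024)  # the '-s 1280m' case
    except ValueError as err:
        print(f"Invalid input: {err}")
    print(f"Average chunk size: {average_chunk_size(0, 0):.2f} B")
```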
