File Compression: No-Brainer?

July 28th, 2012 - 01:48 pm ET by (PeteCresswell)
I'm troubleshooting some problems with my media server and
something that just dawned on me is blocking (cluster size). The
discs are blocked at 4K and the media server's recommendation is 64K.

That being the case, I will reformat/reblock the discs.

The Question: Should I leave compression enabled?

Somewhere along the line, I got the idea that there is a
crossover point between physical disc speed and memory speed:
compression saves net time once memory speed is above a
certain point for a given physical disc speed. With 2+ GHz
processors and SATA drives, that crossover point has long been
passed and, accordingly, turning on compression should be SOP.

??
Pete Cresswell

Replies

#1 Paul
July 28th, 2012 - 03:30 pm ET
http://www.ntfs.com/ntfs-compressed.htm

"The compression algorithms in NTFS are designed to support
cluster sizes of up to 4 KB. When the cluster size is greater
than 4 KB on an NTFS volume, none of the NTFS compression
functions are available."

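I don't know if that's sensitive to OS version or not (i.e. whether
it has changed at all over the years). If it still holds, then
reformatting with 64K clusters means compression would not be
available on those volumes anyway.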

The algorithm used could be a good one. One article says
NTFS compression is LZ77, and the description of GZIP
says it uses some flavor of LZ77 as well. In my limited testing,
GZIP is about the best at getting some degree of compression
with good processing speed. There are compression methods which
achieve higher compression, but they are much more computationally
expensive. There is also a multi-threaded version of GZIP,
which can speed things up further. I doubt the NTFS compression
is that fancy (i.e. that it uses a few cores).

You could bench with GZIP and see what kind of speed your
machine can manage. With the 7ZIP package, I think that if you do
an "Add To Archive" and compress something, you can choose
the GZIP algorithm. If you need a single-core compressor run
with GZIP, that might be a way to do it (short of getting
a copy of the actual GZIP for Windows and using that, of course).
That would give you some idea of how fast an LZ77 compressor
can do its job.
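
If installing GZIP or 7ZIP is a nuisance, a rough single-core figure can also be had from a few lines of Python. This is only a sketch under some assumptions: it uses Python's zlib module (DEFLATE, the same LZ77 family as GZIP, but not the compressor NTFS uses), it works entirely in memory so disc speed does not enter into it, and the function name deflate_throughput is made up for illustration.

```python
import os
import time
import zlib


def deflate_throughput(data: bytes, level: int = 6) -> tuple[float, float]:
    """Return (MB/s of input consumed, compression ratio) for one zlib pass."""
    start = time.perf_counter()
    compressed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    megabytes = len(data) / (1024 * 1024)
    # Guard against a zero reading from a coarse timer on tiny inputs.
    speed = megabytes / elapsed if elapsed > 0 else float("inf")
    ratio = len(data) / len(compressed)
    return speed, ratio


if __name__ == "__main__":
    size = 16 * 1024 * 1024  # 16 MB sample held in memory; an arbitrary choice
    samples = (("zeros", bytes(size)), ("random", os.urandom(size)))
    for label, data in samples:
        speed, ratio = deflate_throughput(data)
        print(f"{label}: {speed:.0f} MB/s, ratio {ratio:.2f}")
```

The zeros case should show a very large ratio and the random case a ratio of about 1.0. The MB/s figures depend entirely on the machine, so treat them as a ballpark for one core and nothing more.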

Then the next question becomes: what kind of file do you test
with? A file which is trivially compressible (a file of zeros)?
A file which can't be compressed? If you can't think of a
way to make test files, this is a possibility. I'm hoping
this would make a 1GB test file in each case.

dd if=/dev/zero of=C:\easycompress.bin bs=1M count=1000
dd if=/dev/random of=C:\hardcompress.bin bs=1M count=1000
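
If a dd port for Windows isn't handy, the same two files can be produced with Python. A minimal sketch, assuming os.urandom is an acceptable stand-in for /dev/random; the helper name make_test_file and its parameters are hypothetical.

```python
import os


def make_test_file(path: str, size_bytes: int, compressible: bool,
                   chunk_bytes: int = 1024 * 1024) -> None:
    """Write size_bytes of zeros (compressible) or OS random bytes (not)."""
    remaining = size_bytes
    with open(path, "wb") as out:
        while remaining > 0:
            n = min(chunk_bytes, remaining)
            # bytes(n) is n zero bytes; os.urandom(n) is effectively incompressible.
            out.write(bytes(n) if compressible else os.urandom(n))
            remaining -= n


# Example calls for roughly 1GB each (left commented out because they take
# a while and use 2GB of disc space):
# make_test_file(r"C:\easycompress.bin", 1000 * 1024 * 1024, compressible=True)
# make_test_file(r"C:\hardcompress.bin", 1000 * 1024 * 1024, compressible=False)
```

Writing in 1 MB chunks keeps memory use small no matter how big the file is.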

I would think that normally you choose to use compression when space
saving is paramount and performance is a secondary issue. For example,
I had a 500GB disk and needed to do some temporary work with
600GB of files. There wasn't much of a debate as to whether I needed
to enable compression on the drive, since the project wasn't
going anywhere without it. That's the only time I've used
compression. I didn't really care whether it ran at 20MB/sec or
125MB/sec; it was just going to take as long as was needed
until it was finished.
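
The crossover idea from the original question can be put into rough numbers. The sketch below is a back-of-envelope model, not a measurement. It assumes disc reads and decompression overlap perfectly, so the slower of the two sets the pace. The function names are made up, and the 1.5 ratio and 400 MB/sec decompression speed are invented for illustration (only the 20 and 125 figures echo the speeds mentioned above).

```python
def effective_read_mbps(disk_mbps: float, ratio: float,
                        decompress_mbps: float) -> float:
    """Logical MB/s delivered when reading compressed data.

    Assumes disc reads and decompression overlap perfectly, so the
    slower stage is the bottleneck.
    """
    # Each MB read from disc expands to `ratio` MB of file data.
    disk_limited = disk_mbps * ratio
    return min(disk_limited, decompress_mbps)


def compression_helps(disk_mbps: float, ratio: float,
                      decompress_mbps: float) -> bool:
    """True if compressed reads beat plain reads under this model."""
    return effective_read_mbps(disk_mbps, ratio, decompress_mbps) > disk_mbps


if __name__ == "__main__":
    print(effective_read_mbps(125, 1.5, 400))  # 187.5: disc-limited, a net win
    print(effective_read_mbps(125, 1.5, 20))   # 20: CPU-limited, a net loss
    print(compression_helps(125, 1.0, 400))    # False: nothing gained at ratio 1.0
```

Under this model the crossover sits where the decompression speed equals the plain disc speed. Past that point the gain is capped by the ratio, so data that barely compresses (a ratio near 1.0) sees no benefit however fast the processor is.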

Paul
