git.vger.kernel.org archive mirror
* Compression speed for large files
@ 2006-07-03 11:13 Joachim B Haga
  2006-07-03 12:03 ` Alex Riesen
  2006-07-03 21:45 ` Compression speed for large files Jeff King
  0 siblings, 2 replies; 21+ messages in thread
From: Joachim B Haga @ 2006-07-03 11:13 UTC (permalink / raw)
  To: git

I'm looking at doing version control of data files, potentially very large,
often binary. In git, committing large files is very slow; I have tested with
a 45MB file, which takes about 1 minute to check in (on an Intel Core Duo 2GHz).

Now, most of the time is spent compressing the file. Would it be a good idea
to change the Z_BEST_COMPRESSION flag passed to zlib, at least for large files?
I have measured the time spent by git-commit with different flags in sha1_file.c:

  method                 time (s)  object size (kB)
  Z_BEST_COMPRESSION     62.0      17136
  Z_DEFAULT_COMPRESSION  10.4      16536
  Z_BEST_SPEED            4.8      17071

In this case Z_BEST_COMPRESSION also compresses worse than the other two levels,
but that's not the major issue: the time is. Here are a couple of other data
points, measured with gzip -9, -6 and -1 (comparable to the Z_ flags above):

129MB ASCII data file
  method    time (s)  object size (kB)
  gzip -9   158       23066
  gzip -6    18       23619
  gzip -1     6       32304

3MB ASCII data file
  method    time (s)  object size (kB)
  gzip -9   2.2         887
  gzip -6   0.7         912
  gzip -1   0.3        1134
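
For anyone who wants to repeat this kind of measurement on their own data, here is a rough sketch using Python's zlib bindings. It is not how the numbers above were produced (those came from git-commit and gzip); the `measure` helper and the synthetic sample data are assumptions made for illustration, and absolute timings will differ from the C code path in git.

```python
import time
import zlib

# The three zlib levels compared in the tables above.
LEVELS = {
    "Z_BEST_COMPRESSION": zlib.Z_BEST_COMPRESSION,
    "Z_DEFAULT_COMPRESSION": zlib.Z_DEFAULT_COMPRESSION,
    "Z_BEST_SPEED": zlib.Z_BEST_SPEED,
}


def measure(data: bytes) -> dict:
    """Return {flag name: (seconds, compressed size in bytes)} per level.

    Hypothetical helper, not part of git or zlib.
    """
    results = {}
    for name, level in LEVELS.items():
        start = time.perf_counter()
        compressed = zlib.compress(data, level)
        elapsed = time.perf_counter() - start
        # Round-trip check so a timing is never reported for bad output.
        assert zlib.decompress(compressed) == data
        results[name] = (elapsed, len(compressed))
    return results


if __name__ == "__main__":
    # Synthetic stand-in for an ASCII data file; to test a real file,
    # read its bytes with open(path, "rb").read() instead.
    sample = b"".join(b"%d %f\n" % (i, i * 0.001) for i in range(200_000))
    for name, (secs, size) in measure(sample).items():
        print(f"{name:<22} {secs:8.3f} s  {size // 1024:8d} kB")
```

Running it on a real data file rather than the synthetic sample is the only way to see whether the time/size trade-off matches the tables above.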

So: is it a good idea to change to faster compression, at least for larger
files? From my (limited) testing I would suggest using Z_BEST_COMPRESSION only
for small files (perhaps <1MB?) and Z_DEFAULT_COMPRESSION/Z_BEST_SPEED for
larger ones.
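
The size-based choice suggested above could look roughly like the following sketch. The 1MB cutoff is only the guess from the previous paragraph, and `pick_level` and `compress_object` are hypothetical names used for illustration; this is not git's actual code.

```python
import zlib

# Hypothetical cutoff, taken from the "<1MB?" guess above; not a tuned value.
SMALL_FILE_LIMIT = 1 << 20


def pick_level(size_bytes: int) -> int:
    """Choose a zlib level from the input size (illustrative policy only)."""
    if size_bytes < SMALL_FILE_LIMIT:
        # Small inputs: the extra time spent on best compression is negligible.
        return zlib.Z_BEST_COMPRESSION
    # Large inputs: fall back to zlib's default time/size trade-off.
    return zlib.Z_DEFAULT_COMPRESSION


def compress_object(data: bytes) -> bytes:
    """Compress data with a level that depends on its length."""
    return zlib.compress(data, pick_level(len(data)))
```

Swapping Z_DEFAULT_COMPRESSION for Z_BEST_SPEED in the large-file branch would trade more space for time, as the gzip -1 rows above suggest.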


-j.



Thread overview: 21+ messages
2006-07-03 11:13 Compression speed for large files Joachim B Haga
2006-07-03 12:03 ` Alex Riesen
2006-07-03 12:42   ` Elrond
2006-07-03 13:44     ` Joachim B Haga
2006-07-03 13:32   ` Joachim Berdal Haga
     [not found]     ` <Pine.LNX.4.64.0607031030150.1213@localhost.localdomain>
2006-07-03 14:33     ` Nicolas Pitre
2006-07-03 14:54       ` Yakov Lerner
2006-07-03 15:17         ` Johannes Schindelin
2006-07-03 16:31       ` Linus Torvalds
2006-07-03 18:59         ` [PATCH] Make zlib compression level configurable, and change default Joachim B Haga
2006-07-03 19:33           ` Linus Torvalds
2006-07-03 19:50             ` Linus Torvalds
2006-07-03 20:11             ` Joachim B Haga
2006-07-03 19:02         ` [PATCH] Use configurable zlib compression level everywhere Joachim B Haga
2006-07-03 19:43           ` Junio C Hamano
2006-07-07 21:53             ` David Lang
2006-07-08  2:10               ` Johannes Schindelin
2006-07-03 21:45 ` Compression speed for large files Jeff King
2006-07-03 22:25   ` Joachim Berdal Haga
2006-07-03 23:02     ` Linus Torvalds
2006-07-04  5:42       ` Joachim Berdal Haga
