From: Jeff King <peff@peff.net>
To: "Dirk Süsserott" <newsletter@dirk.my1.cc>
Cc: Git Mailing List <git@vger.kernel.org>
Subject: Re: How to prevent Git from compressing certain files?
Date: Thu, 23 Jul 2009 06:12:27 -0400
Message-ID: <20090723101227.GA4247@coredump.intra.peff.net>
In-Reply-To: <4A676D4B.7040004@dirk.my1.cc>
On Wed, Jul 22, 2009 at 09:49:31PM +0200, Dirk Süsserott wrote:
> Somewhere I read that Git isn't supposed to efficiently handle binary
> files. Of course, I don't want to merge my files, just store them with
> their history and git-push them to some "safe place".
Git handles binary files better than many systems. The downsides are:
  - you can't do file-level diffing and merging very well, for obvious
    reasons (though actually, git is better than most; it makes it easy
    to look at both sides individually and pick the one you want).
  - really _big_ files can give lousy performance. Git assumes single
    files can fit into memory, which means files in the gigabyte range
    (or hundreds of megabytes if your machine is old :) ) can be awful.
    It also means that things like inexact rename detection and finding
    delta candidates can be slow.
> I figured that pushing and git gc'ing both try to compress those files
> (or differences) really hard. Works great for "regular" files, but is
> pointless with jpegs.
>
> Question: Is there a way to prevent Git from trying to compress certain
> files based on their extension?
There are actually two types of compression that git uses: delta
compression between similar objects in packs, and zlib compression of
loose objects and objects within packs.
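
To see the two storage forms for yourself, something like the following
rough sketch should work. The temporary repository, the file name, and the
identity passed to commit are made up for illustration; count-objects only
reports how many objects are loose versus packed, not how well each one
compressed.

```shell
# Throwaway repository; the path and file name are made up for illustration.
repo=$(mktemp -d)
git init -q "$repo"
printf 'hello\n' > "$repo/notes.txt"
git -C "$repo" add notes.txt
git -C "$repo" -c user.name=Example -c user.email=example@example.com \
    -c commit.gpgsign=false commit -q -m 'add notes'

# The blob, tree and commit each start out as a zlib-compressed loose object.
loose_before=$(git -C "$repo" count-objects -v | sed -n 's/^count: //p')

# gc repacks them into a packfile, which is where delta compression can apply.
git -C "$repo" gc -q
loose_after=$(git -C "$repo" count-objects -v | sed -n 's/^count: //p')
in_pack=$(git -C "$repo" count-objects -v | sed -n 's/^in-pack: //p')

echo "loose before gc: $loose_before, loose after gc: $loose_after, in pack: $in_pack"
```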
You almost certainly don't want zlib compression on your jpegs, as they
are already compressed. You can turn off zlib compression entirely by
setting core.compression to 0. Unfortunately, this turns off compression
for _all_ objects, which means in a mixed-use repo you won't be
compressing your text (and even in a photos-only repo, you are not
compressing your commit messages).
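
As a minimal sketch, assuming you want the setting in one repository only
(the temporary path below is just a placeholder for your photo repo):

```shell
# Placeholder repository; substitute the path to your own repo.
repo=$(mktemp -d)
git init -q "$repo"

# Level 0 means "store without zlib compression"; it applies to every object.
git -C "$repo" config core.compression 0

# Read the value back to confirm it was written to the repo's config.
compression=$(git -C "$repo" config --get core.compression)
echo "core.compression=$compression"
```

Git also has pack.compression and core.looseCompression, which set the level
separately for packed and loose objects, but neither of those is per-path
either.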
Delta compression between two jpegs, or between two versions of a jpeg
where the image data itself was modified, is unlikely to be useful.
However, if you use EXIF metadata in the file, then you will save a lot
of space between versions with the same image data, but different
metadata. So it's worth leaving delta compression on in that case, and
probably turning it off otherwise.
As Jakub mentioned, you can unset the delta gitattribute for just your
jpegs (e.g., "*.jpg -delta" in .gitattributes). You can also turn off
deltas entirely by setting pack.window to 0, though you may be losing
some benefit on your non-blob objects.
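
Roughly, the two options look like this. The temporary repository, the file
names, and the choice of extensions are assumptions for illustration; in
practice you would pick one of the two approaches, not both.

```shell
# Placeholder repository and file names, made up for illustration.
repo=$(mktemp -d)
git init -q "$repo"

# Option 1: unset the delta attribute for jpegs only.
printf '%s\n' '*.jpg -delta' '*.jpeg -delta' > "$repo/.gitattributes"
: > "$repo/photo.jpg"
: > "$repo/notes.txt"

# check-attr reports how the attribute resolves for a given path.
jpg_attr=$(git -C "$repo" check-attr delta -- photo.jpg)
txt_attr=$(git -C "$repo" check-attr delta -- notes.txt)
echo "$jpg_attr"
echo "$txt_attr"

# Option 2: disable delta search for the whole repository instead.
git -C "$repo" config pack.window 0
window=$(git -C "$repo" config --get pack.window)
echo "pack.window=$window"
```

The attribute only influences packing from then on; objects that are already
stored as deltas in existing packs are not rewritten until a full repack.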
-Peff