Git development
 help / color / mirror / Atom feed
From: Junio C Hamano <junkio@cox.net>
To: merlyn@stonehenge.com (Randal L. Schwartz)
Cc: git@vger.kernel.org
Subject: Re: How should I handle binary file with GIT
Date: Wed, 05 Apr 2006 11:34:55 -0700	[thread overview]
Message-ID: <7vslor27n4.fsf@assigned-by-dhcp.cox.net> (raw)
In-Reply-To: 86wte4rq3d.fsf@blue.stonehenge.com

merlyn@stonehenge.com (Randal L. Schwartz) writes:

> I think the issue is related to being able to cherry-pick and merge
> when binaries are involved.  I've been worried about that myself.
> How well are binaries supported these days for all the operations
> we're taking for granted?  When is a "diff" expected to be a real
> "diff" and not just "binary files differ"?

First of all, binary files are handled by cherry-pick and merge
without needing to involve "diff"+"patch" (which is not so
useful for binary files anyway).  They use 3-way read-tree merge
which compares the object names and leave the index unmerged if
there are conflicting changes, so you should be able to sort it
out by running up to three "git-cat-file blob $sha1".

What involves "diff"+"patch" are rebases and processing mailed-in
patches as in the example by the original poster.

In our diff output, we record the blob object name of preimage
and postimage, along with filemode, on the "index" line.
git-apply does not do anything with it by default, but if:

 - --binary flag is given,

 - the postimage blob is already available locally, and,

 - the file the patch is being applied to is the same as the
   recorded preimage,

then the file is _replaced_ with the postimage.

This is good enough for git-rebase (which uses format-patch
piped to am) and is safe (we do not "apply delta" -- only
replace when the file "being patched" matches the recorded
preimage).  It does not do any good for transferring a postimage
that the person who applies the patch does not yet have.

I think "applying delta" to a binary file is not very useful
thing to do.  Depending on the nature of the file being patched,
it may produce a perfectly good result, but verifying if the
result makes sense by the end user and hand-fixing it if does
not, which can be done for text files, is near impossible for
binary files.  "replace with postimage only when you are
applying to the same preimage" rule would be the only practical,
sane thing.

If we wanted to use the patch+diff (i.e. "format-patch,
send-email, and then am" workflow) to transfer new version of
binary files to a recipient, which I think is useful in some
projects, the sanest way to handle this is probably to add
Nico's delta, going from preimage to postimage, encoded for
safer transport, to our diff output.  For safety and sanity, we
will not "apply" the patch unless the patched file exactly
matches the preimage that is recorded in the diff, and as long
as the recipient has the preimage, such a patch would be able to
reproduce the postimage and hopefully be smaller than
transferring the whole thing.

We've been trying to keep our diff output reversible (e.g. we
show what the filemode of the preimage is), so if we take the
above route, it probably should record deltas for both going
from preimage to postimage _and_ going the other way (unless
xdelta can be applied in-reverse, which I do not think is the
case).

Of course, to be _completely_ generic, you could include both
compressed then uuencoded preimage and postimage, and let the
recipient sort it out.  An advantage of that approach is that
the applicability of such a "patch" improves as the tools to
apply it improve, after the patch was originally generated.  I
however think that is only a theoretical advantage, not a very
practical one.

  parent reply	other threads:[~2006-04-05 18:35 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2006-04-05  7:30 How should I handle binary file with GIT moreau francis
2006-04-05  8:14 ` Junio C Hamano
2006-04-05 12:21   ` moreau francis
2006-04-05 13:25     ` Nicolas Pitre
2006-04-05 13:35       ` moreau francis
2006-04-05 13:06   ` Nicolas Pitre
2006-04-05 13:18   ` moreau francis
2006-04-05 19:23     ` Marco Roeland
2006-04-05 15:11   ` Jakub Narebski
2006-04-05 15:32     ` Nicolas Pitre
2006-04-05 15:37       ` Randal L. Schwartz
2006-04-05 15:55         ` Shawn Pearce
2006-04-05 16:25           ` Nicolas Pitre
2006-04-05 16:21         ` Nicolas Pitre
2006-04-05 18:34         ` Junio C Hamano [this message]
2006-04-05 18:51           ` Randal L. Schwartz
2006-04-05 19:31           ` Nicolas Pitre
2006-04-05 20:20             ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=7vslor27n4.fsf@assigned-by-dhcp.cox.net \
    --to=junkio@cox.net \
    --cc=git@vger.kernel.org \
    --cc=merlyn@stonehenge.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox