All of lore.kernel.org
 help / color / mirror / Atom feed
From: "Randall S. Becker" <rsbecker@nexbridge.com>
To: "'Junio C Hamano'" <gitster@pobox.com>
Cc: "'Johannes Sixt'" <j6t@kdbg.org>, <git@vger.kernel.org>
Subject: RE: [Question] Signature calculation ignoring parts of binary files
Date: Thu, 13 Sep 2018 13:55:13 -0400	[thread overview]
Message-ID: <005e01d44b8a$ef1221e0$cd3665a0$@nexbridge.com> (raw)
In-Reply-To: <xmqqva79ffpv.fsf@gitster-ct.c.googlers.com>

On September 13, 2018 1:52 PM, Junio C Hamano wrote:
> Junio C Hamano <gitster@pobox.com> writes:
> 
> > "Randall S. Becker" <rsbecker@nexbridge.com> writes:
> >
> >> The scenario is slightly different.
> >> 1. Person A gives me a new binary file-1 with fingerprint A1. This
> >> goes into git unchanged.
> >> 2. Person B gives me binary file-2 with fingerprint B2. This does not
> >> go into git yet.
> >> 3. We attempt a git diff between the committed file-1 and uncommitted
> >> file-2 using a textconv implementation that strips what we don't need
to
> compare.
> >> 4. If file-1 and file-2 have no difference when textconv is used,
> >> file-2 is not added and not committed. It is discarded with impunity,
> >> never to be seen again, although we might whine a lot at the user for
> >> attempting to put
> >> file-2 in - but that's not git's issue.
> >
> > You are forgetting that Git is a distributed version control system,
> > aren't you?  Person A and B can introduce their "moral equivalent but
> > bytewise different" copies to their repository under the same object
> > name, and you can pull from them--what happens?
> >
> > It is fundamental that one object name given to Git identifies one
> > specific byte sequence contained in an object uniquely.  Once you
> > broke that, you no longer have Git.
> 
> Having said all that, if you want to keep the original with frills but
somehow
> give these bytewise different things that reduce to the same essence (e.g.
> when passed thru a filter like textconv), I suspect a better approach
might be
> to store both the "original" and the result of passing the "original"
through
> the filter in the object database.  In the above example, you'll get two
> "original"
> objects from person A and person B, plus one "canonical" object that are
> bytewise different from either of these two originals, but what they
reduce
> to when you use the filter on them.  Then you record the fact that to
derive
> the "essence" object, you can reduce either person A's or person B's
> "original" through the filter, perhaps by using "git notes" attached to
the
> "essence" object, recording the object names of these originals (the
reason
> why using notes in this direction is because you can mechanically
determine
> which "essence"
> object any given "original" object reduces to---it is just the matter of
passing
> it through the filter.  But there can be more than one "original" that
reduces
> to the same "essence").

I like that idea. It turns the reduced object into a contract. Thanks.


      reply	other threads:[~2018-09-13 17:55 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-09-12 19:16 [Question] Signature calculation ignoring parts of binary files Randall S. Becker
2018-09-12 20:48 ` Johannes Sixt
2018-09-12 20:53   ` Randall S. Becker
2018-09-12 22:20     ` Randall S. Becker
2018-09-12 22:59       ` Junio C Hamano
2018-09-13 12:19         ` Randall S. Becker
2018-09-13 15:03           ` Junio C Hamano
2018-09-13 15:38             ` Randall S. Becker
2018-09-13 17:51             ` Junio C Hamano
2018-09-13 17:55               ` Randall S. Becker [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to='005e01d44b8a$ef1221e0$cd3665a0$@nexbridge.com' \
    --to=rsbecker@nexbridge.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=j6t@kdbg.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.