From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: Elliot Wolk <elliot.wolk@gmail.com>,
	Robin Rosenberg <robin.rosenberg@dewire.com>,
	git@vger.kernel.org
Subject: Re: move detection doesnt take filename into account
Date: Wed, 09 Jul 2014 15:18:43 -0700
Message-ID: <xmqqa98i9nwc.fsf@gitster.dls.corp.google.com>
In-Reply-To: <20140709220337.GF25854@sigill.intra.peff.net> (Jeff King's message of "Wed, 9 Jul 2014 18:03:37 -0400")

Jeff King <peff@peff.net> writes:

> I think the hash here does not collide in that way. It really is just
> the last sixteen characters shoved into a uint32_t.

All bytes overlap with their adjacent bytes because the hash is shifted
by only 2 bits, not 8 bits, when a new byte is brought in.  We can
say that the topmost two bits of the result must have come from the
last character, but other than these, more than one input byte can
set or unset each bit position, so two names that a human would not
consider "similar" would be given the same hash, no?
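
To make the overlap concrete, here is a minimal sketch in Python.  It
is only a transliteration of the hash as described in this thread
(shift the running value right by 2 bits, add the new byte at the
top of a 32-bit word); the function name and the exact set of skipped
whitespace characters are assumptions, not a copy of the C code.

```python
# Sketch only: models the name hash discussed in this thread.
# name_hash and the whitespace set below are illustrative assumptions.

def name_hash(name: str) -> int:
    h = 0
    for byte in name.encode("utf-8"):
        if byte in b" \t\n\r":      # whitespace is skipped (assumed set)
            continue
        # The new byte lands in bits 24..31; everything seen so far
        # slides down by only 2 bits, so neighbouring bytes share
        # 6 bit positions and their values are added together.
        h = ((h >> 2) + (byte << 24)) & 0xFFFFFFFF
    return h

# The last two bytes c1, c2 contribute (c1 + 4*c2) << 22, so any two
# tails with the same c1 + 4*c2 collide.  Here ord('e') + 4*ord('a')
# and ord('a') + 4*ord('b') are both 489.
print(hex(name_hash("Makefile.ea")), hex(name_hash("Makefile.ab")))
```

The two names printed above get the same value even though their
last two characters are both different, which is the kind of
collision meant here.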

That is useful for the delta code because it only needs similar
things to be grouped together; it does not mind if things that
are not similar are also mixed into a group, as the end result is
primarily determined by the similarity of the actual contents, not
pathnames.

What is under discussion here is the other way around; we
know two paths have contents that are equally similar to a third one
and want to break the tie between them using how similar their
pathnames are to that third one.
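
As a rough illustration of that tie-break, the sketch below picks a
rename source by content score first and uses pathname similarity
only to decide between equal scores.  Everything in it is
hypothetical: the function names, the (path, score) tuples, and the
"same basename" measure are stand-ins chosen for the example, not
what the rename detection code does.

```python
# Sketch only: tie-breaking equal content scores by pathname similarity.
# All names and the similarity measure here are hypothetical.

def basename(path: str) -> str:
    return path.rsplit("/", 1)[-1]

def name_similarity(a: str, b: str) -> int:
    # Crude stand-in: 1 if the final path components match, else 0.
    return 1 if basename(a) == basename(b) else 0

def pick_source(dst: str, candidates: list[tuple[str, int]]) -> str:
    # candidates holds (source path, content similarity score) pairs.
    # Content score dominates; the name score only breaks ties.
    best = max(candidates, key=lambda c: (c[1], name_similarity(c[0], dst)))
    return best[0]

print(pick_source("lib/util.c", [("src/main.c", 100), ("src/util.c", 100)]))
```

Unlike the hash, a direct comparison like this never reports two
unrelated names as equal, which is what a tie-breaker needs.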


Thread overview: 11+ messages
2014-06-30  6:38 move detection doesnt take filename into account Elliot Wolk
2014-07-01  9:16 ` Robin Rosenberg
2014-07-01 14:40   ` Elliot Wolk
2014-07-01 14:57   ` Junio C Hamano
2014-07-01 15:05     ` Elliot Wolk
2014-07-01 17:08       ` Junio C Hamano
2014-07-09  6:45         ` Jeff King
2014-07-09 15:51           ` Junio C Hamano
2014-07-09 22:03             ` Jeff King
2014-07-09 22:18               ` Junio C Hamano [this message]
2014-07-10  3:53                 ` Jeff King