git.vger.kernel.org archive mirror
* bug in git notes fanout
@ 2010-11-02 22:06 Shawn Pearce
  2010-11-02 23:36 ` Johan Herland
From: Shawn Pearce @ 2010-11-02 22:06 UTC
  To: git, Johan Herland

Why doesn't the fan-out work for this case?

  git notes --ref buggy.fanout add -m test HEAD
  perl -e 'for($i=0;$i<1024;$i++){printf "%s %s%4.4x\n", $ARGV[0], "0"x36, $i}' \
    $(git rev-parse HEAD) | git notes --ref buggy.fanout copy --stdin
  git ls-tree refs/notes/buggy.fanout
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	0000000000000000000000000000000000000000
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	0000000000000000000000000000000000000001
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	0000000000000000000000000000000000000002
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	0000000000000000000000000000000000000003
...
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	000000000000000000000000000000000000018c
...
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	00000000000000000000000000000000000003e1
...
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	00000000000000000000000000000000000003e2
...
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	00000000000000000000000000000000000003fd
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	00000000000000000000000000000000000003fe
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	00000000000000000000000000000000000003ff
100644 blob 9daeafb9864cf43055ae93beb0afd6c7d144bfa4	638d3d9244720e0f07f22a953d25db878e9ad3f5


I thought the entire point of the notes fanout being 0/40, 2/38,
2/2/36, 2/2/2/34 was to prevent git from having a big linear search
within any single tree when a large number of notes are added to a
note branch.
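
For reference, the fanout levels above can be sketched as a path
computation. This is only an illustration of the 0/40, 2/38, 2/2/36
naming; note_path and its fanout parameter are hypothetical names, not
anything taken from the C code or JGit.

```python
def note_path(sha1_hex, fanout):
    """Turn a 40-digit hex name into a notes path.

    fanout is the number of leading 2-digit directory levels:
    0 gives 0/40, 1 gives 2/38, 2 gives 2/2/36, and so on.
    (Hypothetical helper, for illustration only.)
    """
    parts = [sha1_hex[2 * i:2 * i + 2] for i in range(fanout)]
    parts.append(sha1_hex[2 * fanout:])  # whatever is left names the leaf
    return "/".join(parts)


name = "638d3d9244720e0f07f22a953d25db878e9ad3f5"
for level in range(4):
    print(level, note_path(name, level))
```

Whatever the level, joining the path components back together always
yields the original 40-digit name.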

IIRC when the notes stuff was being debated on this list we insisted
that the fan-out algorithm had to be consistent, ensuring that if I
create a note for 638d3d92 and Junio creates a note on the same
object, they will wind up at the same path in the notes, and we won't
have doppelganger notes (two different notes in the same tree with the
same object).

Basically this issue arises because we are adding notes support to
JGit.  Our planned implementation would have inserted
00000000000000000000000000000000000003ff into
00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/00/03/ff because
there are so many notes that at each depth we wind up with >255
entries, and split the tree again.  But that results in a very
different structure from what the C implementation is doing today.
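
A rough sketch of the count-based splitting described above, applied
to the 1025 notes from the reproduction. This is not JGit code: the
layout function, the SPLIT_THRESHOLD name, and the rule that a final
2-digit component is never split further are assumptions made for
illustration.

```python
from collections import defaultdict

SPLIT_THRESHOLD = 255  # assumption: split once a tree holds more than 255 entries


def layout(names, prefix=""):
    """Map each full hex name to its path under count-based splitting.

    names holds the remaining hex digits of every note in one tree;
    prefix is the directory path leading to that tree.
    """
    # Stay flat while the tree is small, or when only a single 2-digit
    # component is left to act as the leaf name (assumed stopping rule).
    if len(names) <= SPLIT_THRESHOLD or len(names[0]) <= 2:
        return {prefix.replace("/", "") + n: prefix + n for n in names}
    buckets = defaultdict(list)
    for n in names:
        buckets[n[:2]].append(n[2:])  # group by the next two hex digits
    paths = {}
    for head, rest in buckets.items():
        paths.update(layout(rest, prefix + head + "/"))
    return paths


head = "638d3d9244720e0f07f22a953d25db878e9ad3f5"
names = ["0" * 36 + "%04x" % i for i in range(1024)] + [head]
paths = layout(names)
print(paths["0" * 36 + "03ff"])
print(paths[head])
```

Under these assumptions the path depends only on how many notes share
each prefix, not on which hex digits happen to be in use.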

As far as I can tell from the C code, it won't split unless every hex
digit in the 0..f range is used.  That is just a weird heuristic, and
actual paths used depend on the actual SHA-1s that we add to the tree.
 If I get "unlucky" and add 5 commits that start with "f" and never
add one that starts with "0", I won't get fan-out, even though my
notes tree has more than 256 items in it.
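
To make that concrete, here is a toy model of the heuristic as I read
it. It is not the C code; would_split is a hypothetical name, and the
only rule it encodes is "split when every leading hex digit 0..f is in
use".

```python
HEX_DIGITS = set("0123456789abcdef")


def would_split(names):
    """Model of the heuristic as read above (not the actual C code):
    split only if every hex digit 0..f occurs as a leading digit."""
    return {n[0] for n in names} == HEX_DIGITS


head = "638d3d9244720e0f07f22a953d25db878e9ad3f5"
repro = ["0" * 36 + "%04x" % i for i in range(1024)] + [head]
spread = ["%x" % d + "0" * 39 for d in range(16)]

print(len(repro), would_split(repro))
print(len(spread), would_split(spread))
```

In this model the 1025-entry tree from the reproduction never splits,
because only the leading digits "0" and "6" are in use, while a
16-entry tree with one name per leading digit does.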

*sigh*

This is almost as bad as the "let's sort names in trees by name *and* type!".

-- 
Shawn.
