git.vger.kernel.org archive mirror
From: "Shawn O. Pearce" <spearce@spearce.org>
To: Geert Bosch <bosch@adacore.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Nicolas Pitre <nico@cam.org>, Dana How <danahow@gmail.com>,
	git@vger.kernel.org
Subject: Re: [RFC] Packing large repositories
Date: Tue, 3 Apr 2007 01:39:59 -0400
Message-ID: <20070403053959.GH15922@spearce.org>
In-Reply-To: <64E16DEF-E572-4384-9E68-42EBBCE678B1@adacore.com>

Geert Bosch <bosch@adacore.com> wrote:
> Actually, I had implemented this first, using two Newton-Raphson
> iterations and then binary search. Just one iteration is
> too little, and one iteration plus binary search is often no win.
> Two iterations followed by binary search cuts the number of steps in
> half for the Linux kernel. Two iterations followed by linear search
> is often worse, because of "unlucky" cases that end up doing many
> probes. Still, during the 5-8 probes in moderately large repositories
> (1M objects), each probe pretty much requires its own cache line:
> very cache unfriendly.
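
For readers following along, the lookup strategy quoted above can be
sketched roughly as follows. This is not Geert's code, which does not
appear in this message. It is a minimal illustration that assumes the
"Newton-Raphson" steps amount to linear interpolation over uniformly
distributed SHA-1 values, followed by plain binary search. The names
lookup, make_keys and KEY_SPACE are hypothetical.

```python
import hashlib

KEY_SPACE = 1 << 160  # SHA-1 values are 160-bit integers


def make_keys(n):
    """Build a sorted list of n SHA-1 values (as ints) to stand in for a pack index."""
    return sorted(
        int.from_bytes(hashlib.sha1(str(i).encode()).digest(), "big")
        for i in range(n)
    )


def lookup(keys, target, interp_steps=2):
    """Search sorted `keys` for `target`.

    Does up to `interp_steps` interpolation probes, then binary search.
    Returns (index, probes); index is -1 when the target is absent.
    """
    lo, hi = 0, len(keys)           # candidate index range [lo, hi)
    lo_key, hi_key = 0, KEY_SPACE   # key values known to bound that range
    probes = 0
    steps_left = interp_steps
    while lo < hi:
        if steps_left > 0:
            steps_left -= 1
            # Guess the position assuming keys are spread evenly
            # between lo_key and hi_key.
            mid = lo + (target - lo_key) * (hi - lo) // (hi_key - lo_key)
            mid = min(max(mid, lo), hi - 1)
        else:
            mid = (lo + hi) // 2
        probes += 1                 # each probe reads one index entry
        key = keys[mid]
        if key == target:
            return mid, probes
        if key < target:
            lo, lo_key = mid + 1, key
        else:
            hi, hi_key = mid, key
    return -1, probes
```

Calling lookup(keys, keys[i], interp_steps=0) gives a pure binary
search, so the probe counts of the two strategies can be compared on
the same data. Each probe touches a different part of the sorted
table, which is the cache behaviour the quoted text complains about.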

If Nico and I can ever find the time to get our ideas for pack v4
coded into something executable, I think you will find this is less
of an issue than you think.

We're hoping to change enough of the commit and tree traversal
code that the "tight" loops around chasing tree, parent, and blob
pointers can be done using strictly pack offsets and completely
avoid these SHA-1 lookups.  Thus the only time we'd fall into the
above-mentioned SHA-1 lookup path is on initial entry to a revision
walk, or when spanning to another packfile.  This would mean most
workloads should only hit that code once per command line argument.
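
As a rough illustration of the idea in the paragraph above, here is a
toy model, not the pack v4 design itself (which is only described in
outline here). It assumes each commit record stores its parents as
offsets into the same pack, so that only the entry point of a walk
needs a SHA-1 lookup. ToyPack, Commit, find_offset and walk are
hypothetical names, and packs, trees and blobs are heavily simplified.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    sha: str
    parent_offsets: list = field(default_factory=list)  # offsets in the same pack


class ToyPack:
    """A single pack: objects addressed by offset, plus a SHA-1 index."""

    def __init__(self):
        self.by_offset = {}   # offset -> Commit
        self.index = {}       # sha -> offset (the costly lookup path)
        self.sha_lookups = 0  # how many times the index was consulted

    def add(self, offset, sha, parent_offsets=()):
        self.by_offset[offset] = Commit(sha, list(parent_offsets))
        self.index[sha] = offset

    def find_offset(self, sha):
        self.sha_lookups += 1
        return self.index[sha]


def walk(pack, start_sha):
    """Visit every commit reachable from start_sha, following offsets only."""
    stack = [pack.find_offset(start_sha)]  # the only SHA-1 lookup
    seen, order = set(), []
    while stack:
        offset = stack.pop()
        if offset in seen:
            continue
        seen.add(offset)
        commit = pack.by_offset[offset]
        order.append(commit.sha)
        stack.extend(commit.parent_offsets)  # no index access needed
    return order
```

In this model a walk over any number of commits increments
sha_lookups exactly once. A real implementation would also have to
fall back to the index when a referenced object lives in another
packfile, as noted above.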

-- 
Shawn.

Thread overview: 15+ messages
2007-03-28  7:05 [RFC] Packing large repositories Dana How
2007-03-28 16:53 ` Linus Torvalds
2007-03-30  6:23   ` Shawn O. Pearce
2007-03-30 13:01     ` Nicolas Pitre
2007-03-31 11:04       ` Geert Bosch
2007-03-31 18:36         ` Linus Torvalds
2007-03-31 19:02           ` Nicolas Pitre
2007-03-31 20:54           ` Junio C Hamano
2007-03-31 21:20           ` Linus Torvalds
2007-03-31 21:56             ` Linus Torvalds
2007-04-02  6:22           ` Geert Bosch
2007-04-03  5:39             ` Shawn O. Pearce [this message]
2007-03-31 18:51         ` Nicolas Pitre
2007-04-02 21:19   ` Dana How
2007-04-02  1:39 ` Sam Vilain
