From: Nicolas Pitre <nico@cam.org>
To: Junio C Hamano <junkio@cox.net>
Cc: git@vger.kernel.org
Subject: [PATCH] always start looking up objects in the last used pack first
Date: Wed, 30 May 2007 22:48:13 -0400 (EDT) [thread overview]
Message-ID: <alpine.LFD.0.99.0705302152180.11491@xanadu.home> (raw)
Jon Smirl said:
| Once an object reference hits a pack file it is very likely that
| following references will hit the same pack file. So first place to
| look for an object is the same place the previous object was found.
This is indeed a good heuristic so here it is. The search always start
with the pack where the last object lookup succeeded. If the wanted
object is not available there then the search continues with the normal
pack ordering.
To test this I split the Linux repository into 66 packs and performed a
"time git-rev-list --objects --all > /dev/null". Best results are as
follows:
Pack Sort w/o this patch w/ this patch
-------------------------------------------------------------
recent objects last 26.4s 20.9s
recent objects first 24.9s 18.4s
This shows that the pack order based on object age has some influence,
but that the last-used-pack heuristic is even more significant in
reducing object lookup.
Signed-off-by: Nicolas Pitre <nico@cam.org> --- Note: the
--max-pack-size to git-repack currently produces packs with old objects
after those containing recent objects. The pack sort based on
filesystem timestamp is therefore backward for those. This needs to be
fixed of course, but at least it made me think about this variable for
the test.
diff --git a/sha1_file.c b/sha1_file.c
index a3637d7..aa6d499 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1687,20 +1688,25 @@ static int matches_pack_name(struct packed_git *p, const char *ig)
static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e, const char **ignore_packed)
{
+ static struct packed_git *last_found = (void *)1;
struct packed_git *p;
off_t offset;
prepare_packed_git();
+ if (!packed_git)
+ return 0;
+ p = (last_found == (void *)1) ? packed_git : last_found;
- for (p = packed_git; p; p = p->next) {
+ do {
if (ignore_packed) {
const char **ig;
for (ig = ignore_packed; *ig; ig++)
if (!matches_pack_name(p, *ig))
break;
if (*ig)
- continue;
+ goto next;
}
+
offset = find_pack_entry_one(sha1, p);
if (offset) {
/*
@@ -1713,14 +1719,23 @@ static int find_pack_entry(const unsigned char *sha1, struct pack_entry *e, cons
*/
if (p->pack_fd == -1 && open_packed_git(p)) {
error("packfile %s cannot be accessed", p->pack_name);
- continue;
+ goto next;
}
e->offset = offset;
e->p = p;
hashcpy(e->sha1, sha1);
+ last_found = p;
return 1;
}
- }
+
+ next:
+ if (p == last_found)
+ p = packed_git;
+ else
+ p = p->next;
+ if (p == last_found)
+ p = p->next;
+ } while (p);
return 0;
}
next reply other threads:[~2007-05-31 2:48 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2007-05-31 2:48 Nicolas Pitre [this message]
2007-05-31 3:24 ` [PATCH] always start looking up objects in the last used pack first Nicolas Pitre
2007-05-31 5:02 ` Shawn O. Pearce
2007-05-31 15:39 ` Nicolas Pitre
2007-06-02 15:00 ` Dana How
2007-06-02 14:53 ` Dana How
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.0.99.0705302152180.11491@xanadu.home \
--to=nico@cam.org \
--cc=git@vger.kernel.org \
--cc=junkio@cox.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).