From: Nicolas Pitre <nico@cam.org>
To: Junio C Hamano <junkio@cox.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 2/2] Implement a simple delta_base cache
Date: Sat, 17 Mar 2007 21:13:57 -0400 (EDT)
Message-ID: <alpine.LFD.0.83.0703172053020.18328@xanadu.home>
In-Reply-To: <7vfy83qyxh.fsf@assigned-by-dhcp.cox.net>

On Sat, 17 Mar 2007, Junio C Hamano wrote:
> When unpacking a depth-3 deltified object A, the code finds the
> target object A (which is a delta), asks for its base B and puts B
> in the cache after using it to reconstitute A. While doing so,
> the first-generation base B is also a delta so its base C (which
> is a non-delta) is found and placed in the cache. When A is
> returned, the cache has B and C. If you ask for B at this
> point, we read the delta, pick up its base C from the cache,
> apply, and return while putting C back in the cache. If you ask
> for A after that, we do not read from the cache, although it is
> available.
>
> Which feels a bit wasteful at first sight, and we *could* make
> read_packed_sha1() also steal from the cache, but after thinking
> about it a bit, I am not sure if it is worth it. The contract
> between read_packed_sha1() and read_sha1_file() and its callers
> is that the returned data belongs to the caller and it is a
> responsibility for the caller to free the buffer, and also the
> caller is free to modify it, so stealing from the cache from
> that codepath means an extra allocation and memcpy.

So?

A malloc() + memcpy() will always be faster than mmap() + malloc() +
inflate(). If the data is already there it is certainly better to copy
it straight away.

With the patch below I can do 'git log drivers/scsi/ > /dev/null' about
7% faster. I bet it might be even more on those platforms with bad
mmap() support.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---
diff --git a/sha1_file.c b/sha1_file.c
index a7e3a2a..ee64865 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1372,7 +1372,7 @@ static unsigned long pack_entry_hash(struct packed_git *p, off_t base_offset)
}
static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
- unsigned long *base_size, enum object_type *type)
+ unsigned long *base_size, enum object_type *type, int keep_cache)
{
void *ret;
unsigned long hash = pack_entry_hash(p, base_offset);
@@ -1384,7 +1384,13 @@ static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
return unpack_entry(p, base_offset, type, base_size);
found_cache_entry:
- ent->data = NULL;
+ if (!keep_cache)
+ ent->data = NULL;
+ else {
+ ret = xmalloc(ent->size + 1);
+ memcpy(ret, ent->data, ent->size);
+ ((char *)ret)[ent->size] = 0;
+ }
*type = ent->type;
*base_size = ent->size;
return ret;
@@ -1418,7 +1424,7 @@ static void *unpack_delta_entry(struct packed_git *p,
off_t base_offset;
base_offset = get_delta_base(p, w_curs, &curpos, *type, obj_offset);
- base = cache_or_unpack_entry(p, base_offset, &base_size, type);
+ base = cache_or_unpack_entry(p, base_offset, &base_size, type, 0);
if (!base)
die("failed to read delta base object"
" at %"PRIuMAX" from %s",
@@ -1615,7 +1621,7 @@ static void *read_packed_sha1(const unsigned char *sha1,
if (!find_pack_entry(sha1, &e, NULL))
return NULL;
else
- return unpack_entry(e.p, e.offset, type, size);
+ return cache_or_unpack_entry(e.p, e.offset, size, type, 1);
}
/*
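
The two lookup modes that the keep_cache flag selects can be sketched in
isolation. This is only a minimal illustration, not the sha1_file.c code:
the one-slot struct cache_slot and the cache_lookup() helper are
hypothetical names, plain malloc() stands in for xmalloc(), and an
allocation failure is simply reported as a miss.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical one-slot cache; the real delta_base cache is a hashed table. */
struct cache_slot {
	void *data;          /* NULL when the slot is empty */
	unsigned long size;  /* number of payload bytes in data */
};

/*
 * Return the cached buffer, or NULL on a miss.
 * keep_cache == 0: hand the cached buffer itself to the caller and
 *                  empty the slot ("steal"); no allocation, no copy.
 * keep_cache != 0: leave the slot intact and return a NUL-terminated
 *                  copy that the caller owns and may modify or free.
 */
static void *cache_lookup(struct cache_slot *ent, unsigned long *size,
			  int keep_cache)
{
	void *ret = ent->data;

	if (!ret)
		return NULL;	/* miss: the caller has to unpack the object */
	if (!keep_cache) {
		ent->data = NULL;
	} else {
		ret = malloc(ent->size + 1);
		if (!ret)
			return NULL;	/* treated as a miss in this sketch */
		memcpy(ret, ent->data, ent->size);
		((char *)ret)[ent->size] = 0;
	}
	*size = ent->size;
	return ret;
}
```

In the copy mode the caller pays for one allocation and one memcpy(), but
the slot stays populated, so a later delta resolution that needs the same
base can still find it there.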