git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Nicolas Pitre <nico@cam.org>
To: Junio C Hamano <junkio@cox.net>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: [PATCH 2/2] Implement a simple delta_base cache
Date: Sat, 17 Mar 2007 21:13:57 -0400 (EDT)	[thread overview]
Message-ID: <alpine.LFD.0.83.0703172053020.18328@xanadu.home> (raw)
In-Reply-To: <7vfy83qyxh.fsf@assigned-by-dhcp.cox.net>

On Sat, 17 Mar 2007, Junio C Hamano wrote:

> When unpacking a depth-3 deltified object A, the code finds the
> target object A (which is a delta), ask for its base B and put B
> in the cache after using it to reconstitute A.  While doing so,
> the first-generation base B is also a delta so its base C (which
> is a non-delta) is found and placed in the cache.  When A is
> returned, the cache has B and C.  If you ask for B at this
> point, we read the delta, pick up its base C from the cache,
> apply, and return while putting C back in the cache.  If you ask
> for A after that, we do not read from the cache, although it is
> available.
> 
> Which feels a bit wasteful at first sight, and we *could* make
> read_packed_sha1() also steal from the cache, but after thinking
> about it a bit, I am not sure if it is worth it.  The contract
> between read_packed_sha1() and read_sha1_file() and its callers
> is that the returned data belongs to the caller and it is a
> responsibility for the caller to free the buffer, and also the
> caller is free to modify it, so stealing from the cache from
> that codepath means an extra allocation and memcpy.

So?

A malloc() + memcpy() will always be faster than mmap() + malloc() + 
inflate().  If the data is already there it is certainly better to copy 
it straight away.

With the patch below I can do 'git log drivers/scsi/ > /dev/null' about 
7% faster.  I bet it might be even more on those platforms with bad 
mmap() support.

Signed-off-by: Nicolas Pitre <nico@cam.org>
---
diff --git a/sha1_file.c b/sha1_file.c
index a7e3a2a..ee64865 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1372,7 +1372,7 @@ static unsigned long pack_entry_hash(struct packed_git *p, off_t base_offset)
 }
 
 static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
-	unsigned long *base_size, enum object_type *type)
+	unsigned long *base_size, enum object_type *type, int keep_cache)
 {
 	void *ret;
 	unsigned long hash = pack_entry_hash(p, base_offset);
@@ -1384,7 +1384,13 @@ static void *cache_or_unpack_entry(struct packed_git *p, off_t base_offset,
 	return unpack_entry(p, base_offset, type, base_size);
 
 found_cache_entry:
-	ent->data = NULL;
+	if (!keep_cache)
+		ent->data = NULL;
+	else {
+		ret = xmalloc(ent->size + 1);
+		memcpy(ret, ent->data, ent->size);
+		((char *)ret)[ent->size] = 0;
+	}
 	*type = ent->type;
 	*base_size = ent->size;
 	return ret;
@@ -1418,7 +1424,7 @@ static void *unpack_delta_entry(struct packed_git *p,
 	off_t base_offset;
 
 	base_offset = get_delta_base(p, w_curs, &curpos, *type, obj_offset);
-	base = cache_or_unpack_entry(p, base_offset, &base_size, type);
+	base = cache_or_unpack_entry(p, base_offset, &base_size, type, 0);
 	if (!base)
 		die("failed to read delta base object"
 		    " at %"PRIuMAX" from %s",
@@ -1615,7 +1621,7 @@ static void *read_packed_sha1(const unsigned char *sha1,
 	if (!find_pack_entry(sha1, &e, NULL))
 		return NULL;
 	else
-		return unpack_entry(e.p, e.offset, type, size);
+		return cache_or_unpack_entry(e.p, e.offset, size, type, 1);
 }
 
 /*

  parent reply	other threads:[~2007-03-18  1:14 UTC|newest]

Thread overview: 79+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2007-03-16  1:04 cleaner/better zlib sources? Linus Torvalds
2007-03-16  1:10 ` Shawn O. Pearce
2007-03-16  1:11 ` Jeff Garzik
2007-03-16  1:14   ` Matt Mackall
2007-03-16  1:46   ` Linus Torvalds
2007-03-16  1:54     ` Linus Torvalds
2007-03-16  2:43       ` Davide Libenzi
2007-03-16  2:56         ` Linus Torvalds
2007-03-16  3:16           ` Davide Libenzi
2007-03-16 16:21             ` Linus Torvalds
2007-03-16 16:24               ` Davide Libenzi
2007-03-16 16:35                 ` Linus Torvalds
2007-03-16 19:21                   ` Davide Libenzi
2007-03-17  0:01                     ` Linus Torvalds
2007-03-17  1:11                       ` Linus Torvalds
2007-03-17  3:28                         ` Nicolas Pitre
2007-03-17  5:19                           ` Shawn O. Pearce
2007-03-17 17:55                           ` Linus Torvalds
2007-03-17 19:40                             ` Linus Torvalds
2007-03-17 19:42                               ` [PATCH 1/2] Make trivial wrapper functions around delta base generation and freeing Linus Torvalds
2007-03-17 19:44                               ` [PATCH 2/2] Implement a simple delta_base cache Linus Torvalds
2007-03-17 21:45                                 ` Linus Torvalds
2007-03-17 22:37                                   ` Junio C Hamano
2007-03-17 23:09                                     ` Linus Torvalds
2007-03-17 23:54                                       ` Linus Torvalds
2007-03-18  1:13                                     ` Nicolas Pitre [this message]
2007-03-18  7:47                                       ` Junio C Hamano
2007-03-17 23:12                                   ` Junio C Hamano
2007-03-17 23:24                                     ` Linus Torvalds
2007-03-17 23:52                                       ` Jon Smirl
2007-03-18  1:14                                   ` Morten Welinder
2007-03-18  1:29                                     ` Linus Torvalds
2007-03-18  1:38                                       ` Nicolas Pitre
2007-03-18  1:55                                         ` Linus Torvalds
2007-03-18  2:03                                           ` Nicolas Pitre
2007-03-18  2:20                                             ` Linus Torvalds
2007-03-18  3:00                                               ` Nicolas Pitre
2007-03-18  3:31                                                 ` Linus Torvalds
2007-03-18  5:30                                                   ` Julian Phillips
2007-03-18 17:23                                                     ` Linus Torvalds
2007-03-18 10:53                                                   ` Robin Rosenberg
2007-03-18 17:34                                                     ` Linus Torvalds
2007-03-18 18:29                                                       ` Robin Rosenberg
2007-03-18 21:25                                                         ` Shawn O. Pearce
2007-03-19 13:16                                                         ` David Brodsky
2007-03-20  6:35                                                           ` Robin Rosenberg
2007-03-20  9:13                                                             ` David Brodsky
2007-03-21  2:37                                                               ` Linus Torvalds
2007-03-21  2:54                                                                 ` Nicolas Pitre
2007-03-18  3:06                                               ` [PATCH 3/2] Avoid unnecessary strlen() calls Linus Torvalds
2007-03-18  9:45                                                 ` Junio C Hamano
2007-03-18 15:54                                                   ` Linus Torvalds
2007-03-18 15:57                                                     ` Linus Torvalds
2007-03-18 21:38                                                       ` Shawn O. Pearce
2007-03-18 21:48                                                         ` Linus Torvalds
2007-03-20  3:05                                                     ` Johannes Schindelin
2007-03-20  3:29                                                       ` Shawn O. Pearce
2007-03-20  3:40                                                         ` Shawn O. Pearce
2007-03-20  4:11                                                           ` Linus Torvalds
2007-03-20  4:18                                                             ` Shawn O. Pearce
2007-03-20  4:45                                                               ` Linus Torvalds
2007-03-20  5:44                                                             ` Junio C Hamano
2007-03-20  3:16                                                     ` Junio C Hamano
2007-03-20  4:31                                                       ` Linus Torvalds
2007-03-20  4:39                                                         ` Shawn O. Pearce
2007-03-20  4:57                                                           ` Linus Torvalds
2007-03-18  1:44                                       ` [PATCH 2/2] Implement a simple delta_base cache Linus Torvalds
2007-03-18  6:28                                     ` Avi Kivity
2007-03-17 22:44                                 ` Linus Torvalds
2007-03-16 16:35               ` cleaner/better zlib sources? Jeff Garzik
2007-03-16 16:42                 ` Matt Mackall
2007-03-16 16:51                 ` Linus Torvalds
2007-03-16 17:12                 ` Nicolas Pitre
2007-03-16 23:22                 ` Shawn O. Pearce
2007-03-16 17:06               ` Nicolas Pitre
2007-03-16 17:51                 ` Linus Torvalds
2007-03-16 18:09                   ` Nicolas Pitre
2007-03-16  1:33 ` Davide Libenzi
2007-03-16  2:06   ` Davide Libenzi

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.0.83.0703172053020.18328@xanadu.home \
    --to=nico@cam.org \
    --cc=git@vger.kernel.org \
    --cc=junkio@cox.net \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).