git.vger.kernel.org archive mirror
* [PATCH] Optimize sha1_object_info for loose objects, not concurrent repacks
@ 2008-08-05 20:08 Steven Grimm
  2008-08-05 20:18 ` Shawn O. Pearce
  0 siblings, 1 reply; 2+ messages in thread
From: Steven Grimm @ 2008-08-05 20:08 UTC
  To: git

When dealing with a repository with lots of loose objects, sha1_object_info
would rescan the packs directory every time an unpacked object was referenced
before finally giving up and looking for the loose object. This caused a lot
of extra unnecessary system calls during git pack-objects; the code was
rereading the entire pack directory once for each loose object file.

This patch looks for a loose object before falling back to rescanning the
pack directory, rather than the other way around.

Signed-off-by: Steven Grimm <koreth@midwinter.com>
---

	I discovered this by running strace on a pack-objects that was
	taking especially long to run; it was making more system calls
	to scan the pack directory than to do stuff with the loose
	objects, which didn't seem right.

 sha1_file.c |    9 ++++++++-
 1 files changed, 8 insertions(+), 1 deletions(-)

diff --git a/sha1_file.c b/sha1_file.c
index e281c14..32e4664 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -1929,11 +1929,18 @@ static int sha1_loose_object_info(const unsigned char *sha1, unsigned long *size
 int sha1_object_info(const unsigned char *sha1, unsigned long *sizep)
 {
 	struct pack_entry e;
+	int status;
 
 	if (!find_pack_entry(sha1, &e, NULL)) {
+		/* Most likely it's a loose object. */
+		status = sha1_loose_object_info(sha1, sizep);
+		if (status >= 0)
+			return status;
+
+		/* Not a loose object; someone else may have just packed it. */
 		reprepare_packed_git();
 		if (!find_pack_entry(sha1, &e, NULL))
-			return sha1_loose_object_info(sha1, sizep);
+			return status;
 	}
 	return packed_object_info(e.p, e.offset, sizep);
 }
-- 
1.6.0.rc1.66.gc78d7


* Re: [PATCH] Optimize sha1_object_info for loose objects, not concurrent repacks
  2008-08-05 20:08 [PATCH] Optimize sha1_object_info for loose objects, not concurrent repacks Steven Grimm
@ 2008-08-05 20:18 ` Shawn O. Pearce
  0 siblings, 0 replies; 2+ messages in thread
From: Shawn O. Pearce @ 2008-08-05 20:18 UTC
  To: Steven Grimm; +Cc: git

Steven Grimm <koreth@midwinter.com> wrote:
> When dealing with a repository with lots of loose objects, sha1_object_info
> would rescan the packs directory every time an unpacked object was referenced
> before finally giving up and looking for the loose object. This caused a lot
> of extra unnecessary system calls during git pack-objects; the code was
> rereading the entire pack directory once for each loose object file.
> 
> This patch looks for a loose object before falling back to rescanning the
> pack directory, rather than the other way around.
> 
> Signed-off-by: Steven Grimm <koreth@midwinter.com>

Heh.  Cute bug.

ACK.

> diff --git a/sha1_file.c b/sha1_file.c
> index e281c14..32e4664 100644
> --- a/sha1_file.c
> +++ b/sha1_file.c
> @@ -1929,11 +1929,18 @@ static int sha1_loose_object_info(const unsigned char *sha1, unsigned long *size
>  int sha1_object_info(const unsigned char *sha1, unsigned long *sizep)
>  {
>  	struct pack_entry e;
> +	int status;
>  
>  	if (!find_pack_entry(sha1, &e, NULL)) {
> +		/* Most likely it's a loose object. */
> +		status = sha1_loose_object_info(sha1, sizep);
> +		if (status >= 0)
> +			return status;
> +
> +		/* Not a loose object; someone else may have just packed it. */
>  		reprepare_packed_git();
>  		if (!find_pack_entry(sha1, &e, NULL))
> -			return sha1_loose_object_info(sha1, sizep);
> +			return status;
>  	}
>  	return packed_object_info(e.p, e.offset, sizep);
>  }

-- 
Shawn.
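
The effect of reordering the lookups can be sketched with a small mock, which is not git's actual code. Everything here is a hypothetical stand-in: struct mock_repo, the mock_* helpers, object_info_old/object_info_new and count_rescans are invented for illustration, objects are plain integer ids rather than SHA-1s, and the "rescan" only bumps a counter (it never discovers new packs). Under those assumptions it shows how often the pack directory would be rescanned with each ordering.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a repository; none of this is git's real API. */
struct mock_repo {
	int n_packed;	/* ids [0, n_packed) live in already-known packs */
	int n_loose;	/* ids [n_packed, n_packed + n_loose) are loose */
	int rescans;	/* number of simulated pack directory rescans */
};

static bool mock_find_pack_entry(const struct mock_repo *r, int id)
{
	return id >= 0 && id < r->n_packed;
}

/* Returns 0 if the id is a loose object, -1 otherwise. */
static int mock_loose_object_info(const struct mock_repo *r, int id)
{
	bool loose = id >= r->n_packed && id < r->n_packed + r->n_loose;
	return loose ? 0 : -1;
}

/* Stands in for the expensive rescan; this mock never finds new packs. */
static void mock_reprepare_packed_git(struct mock_repo *r)
{
	r->rescans++;
}

/* Pre-patch order: packs, rescan, packs again, then loose. */
static int object_info_old(struct mock_repo *r, int id)
{
	if (!mock_find_pack_entry(r, id)) {
		mock_reprepare_packed_git(r);
		if (!mock_find_pack_entry(r, id))
			return mock_loose_object_info(r, id);
	}
	return 0;
}

/* Patched order: packs, loose, and only then rescan. */
static int object_info_new(struct mock_repo *r, int id)
{
	int status;

	if (!mock_find_pack_entry(r, id)) {
		status = mock_loose_object_info(r, id);
		if (status >= 0)
			return status;
		mock_reprepare_packed_git(r);
		if (!mock_find_pack_entry(r, id))
			return status;
	}
	return 0;
}

/* Look up every object once; return how many rescans that triggered. */
static int count_rescans(int (*lookup)(struct mock_repo *, int),
			 int n_packed, int n_loose)
{
	struct mock_repo r = { n_packed, n_loose, 0 };
	int id;

	assert(n_packed >= 0 && n_loose >= 0);
	for (id = 0; id < n_packed + n_loose; id++)
		lookup(&r, id);
	return r.rescans;
}
```

Calling count_rescans(object_info_old, 10, 1000) models the old behavior, one rescan per loose object, while count_rescans(object_info_new, 10, 1000) triggers none. In this mock the patched ordering pays for a rescan only when an object is in neither the known packs nor the loose store, which corresponds to the concurrent-repack case named in the subject line.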

