All of lore.kernel.org
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Nguyen Thai Ngoc Duy <pclouds@gmail.com>
Cc: David Michael Barr <b@rr-dav.id.au>,
	Git Mailing List <git@vger.kernel.org>
Subject: Re: [RFC] pack-objects: compression level for non-blobs
Date: Sat, 29 Dec 2012 04:05:58 -0500	[thread overview]
Message-ID: <20121229090558.GA31291@sigill.intra.peff.net> (raw)
In-Reply-To: <20121229052747.GA14928@sigill.intra.peff.net>

On Sat, Dec 29, 2012 at 12:27:47AM -0500, Jeff King wrote:

> > I think I tried the partial decompression for commit header and it did
> > not help much (or I misremember it, not so sure).
> 
> I'll see if I can dig up the reference, as it was something I was going
> to look at next.

I tried the simple patch below, but it actually made things slower!  I'm
assuming it is because the streaming setup is not micro-optimized very
well. A custom read_sha1_until_blank_line() could probably do better.

diff --git a/commit.c b/commit.c
index e8eb0ae..efd6c06 100644
--- a/commit.c
+++ b/commit.c
@@ -8,6 +8,7 @@
 #include "notes.h"
 #include "gpg-interface.h"
 #include "mergesort.h"
+#include "streaming.h"
 
 static struct commit_extra_header *read_commit_extra_header_lines(const char *buf, size_t len, const char **);
 
@@ -306,6 +307,39 @@ int parse_commit_buffer(struct commit *item, const void *buffer, unsigned long s
 	return 0;
 }
 
+static void *read_commit_header(const unsigned char *sha1,
+				enum object_type *type,
+				unsigned long *size)
+{
+	static const int chunk_size = 256;
+	struct strbuf buf = STRBUF_INIT;
+	struct git_istream *st;
+
+	st = open_istream(sha1, type, size, NULL);
+	if (!st)
+		return NULL;
+	while (1) {
+		size_t start = buf.len;
+		ssize_t readlen;
+
+		strbuf_grow(&buf, chunk_size);
+		readlen = read_istream(st, buf.buf + start, chunk_size);
+		buf.buf[start + readlen + 1] = '\0';
+		buf.len += readlen;
+
+		if (readlen < 0) {
+			close_istream(st);
+			strbuf_release(&buf);
+			return NULL;
+		}
+		if (!readlen || strstr(buf.buf + start, "\n\n"))
+			break;
+	}
+
+	close_istream(st);
+	return strbuf_detach(&buf, size);
+}
+
 int parse_commit(struct commit *item)
 {
 	enum object_type type;
@@ -317,7 +351,11 @@ int parse_commit(struct commit *item)
 		return -1;
 	if (item->object.parsed)
 		return 0;
-	buffer = read_sha1_file(item->object.sha1, &type, &size);
+
+	if (!save_commit_buffer)
+		buffer = read_commit_header(item->object.sha1, &type, &size);
+	else
+		buffer = read_sha1_file(item->object.sha1, &type, &size);
 	if (!buffer)
 		return error("Could not read %s",
 			     sha1_to_hex(item->object.sha1));

  reply	other threads:[~2012-12-29  9:06 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-11-26  6:25 [RFC] pack-objects: compression level for non-blobs David Michael Barr
2012-11-26 12:35 ` David Michael Barr
2012-12-29  0:41 ` Jeff King
2012-12-29  4:34   ` Nguyen Thai Ngoc Duy
2012-12-29  5:07     ` Jeff King
2012-12-29  5:25       ` Nguyen Thai Ngoc Duy
2012-12-29  5:27         ` Jeff King
2012-12-29  9:05           ` Jeff King [this message]
2012-12-29  9:48             ` Jeff King
2012-12-30 12:05           ` Jeff King
2012-12-30 12:53             ` Nguyen Thai Ngoc Duy
2012-12-30 21:31               ` Jeff King
2012-12-31 18:06                 ` Shawn Pearce
2013-01-01  4:15                   ` Duy Nguyen
2013-01-01 12:10                     ` Duy Nguyen
2013-01-01 17:17                       ` Shawn Pearce
2013-01-01 23:47                         ` Junio C Hamano
2013-01-02  2:23                         ` Duy Nguyen
2013-01-01 20:02                       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20121229090558.GA31291@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=b@rr-dav.id.au \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.