From: Ian Kumlien <pomac@vapor.com>
To: "Nguyễn Thái Ngọc Duy" <pclouds@gmail.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 2/2] index-pack: reduce memory usage when the pack has large blobs
Date: Fri, 24 Feb 2012 15:30:42 +0100
Message-ID: <20120224143042.GE9526@pomac.netswarm.net>
In-Reply-To: <1330086201-13916-2-git-send-email-pclouds@gmail.com>
On Fri, Feb 24, 2012 at 07:23:21PM +0700, Nguyễn Thái Ngọc Duy wrote:
> This command unpacks every non-delta object in order to:
>
> 1. calculate sha-1
> 2. do byte-to-byte sha-1 collision test if we happen to have objects
> with the same sha-1
> 3. validate object content in strict mode
>
> All this requires the entire object to stay in memory, which is bad news for
> giant blobs. This patch lowers memory consumption by not saving the
> object in memory whenever possible, calculating SHA-1 while unpacking
> the object.
>
> This patch assumes that the collision test is rarely needed. The
> collision test will be done later in a second pass if necessary, which
> puts the entire object back in memory again (we could even do the
> collision test without putting the entire object back in memory, by
> comparing as we unpack it).
>
> In strict mode, it always keeps non-blob objects in memory for
> validation (blobs do not need data validation). "--strict --verify"
> also keeps blobs in memory.
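As a rough illustration of the "calculating SHA-1 while unpacking" idea in the quoted description, here is a minimal sketch of hashing data chunk by chunk so that the whole object never has to sit in memory. It is not the code from the patch. To stay dependency-free it swaps in FNV-1a for SHA-1, and `struct stream_hash`, `stream_hash_init`, `stream_hash_update` and `hash_in_chunks` are hypothetical names. In git the update step would be a `git_SHA1_Update()` call on each buffer produced by the inflate loop.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for git_SHA_CTX; FNV-1a replaces SHA-1 here
 * only so that the sketch needs no crypto library. */
struct stream_hash {
	uint64_t state;
};

static void stream_hash_init(struct stream_hash *h)
{
	h->state = 0xcbf29ce484222325ULL; /* FNV-1a 64-bit offset basis */
}

/* Fold one chunk into the running hash. After this call the chunk
 * can be discarded, which is what keeps memory usage low. */
static void stream_hash_update(struct stream_hash *h,
			       const void *buf, size_t len)
{
	const unsigned char *p = buf;
	size_t i;

	for (i = 0; i < len; i++) {
		h->state ^= p[i];
		h->state *= 0x100000001b3ULL; /* FNV-1a 64-bit prime */
	}
}

/* Hash `len` bytes of `data`, feeding at most `chunk` bytes at a time.
 * In a real unpacker each chunk would come out of the inflate loop
 * instead of a buffer that is already fully in memory. */
static uint64_t hash_in_chunks(const unsigned char *data, size_t len,
			       size_t chunk)
{
	struct stream_hash h;
	size_t off = 0;

	if (chunk == 0)
		chunk = 1; /* avoid an endless loop on a zero chunk size */

	stream_hash_init(&h);
	while (off < len) {
		size_t n = len - off < chunk ? len - off : chunk;

		stream_hash_update(&h, data + off, n);
		off += n;
	}
	return h.state;
}
```

The result does not depend on the chunk size: `hash_in_chunks(data, len, 4)` and `hash_in_chunks(data, len, len)` return the same value, which is the property that lets the hash be computed during unpacking.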
I applied both patches to git master, with some manual tinkering, so I
might have missed some change that caused this to break.
But I get a segmentation fault, and I thought I'd send you a
small trace before I even start looking into this:
0xb7eb5b43 in SHA1_Update () from /lib/i686/cmov/libcrypto.so.0.9.8
(gdb) bt
#0 0xb7eb5b43 in SHA1_Update () from /lib/i686/cmov/libcrypto.so.0.9.8
#1 0x08116a2d in write_sha1_file_prepare
#2 0x08116a83 in hash_sha1_file
#3 0x0807c2a6 in sha1_object
#4 0x0807d74a in parse_pack_objects
#5 0x0807de6f in cmd_index_pack
#6 0x0804be97 in run_builtin
#7 handle_internal_command
#8 0x0804c0ad in run_argv
#9 main
Sorry about the censorship, but I don't know how sensitive this data
is...
sha1_file.c:2343
---
static void write_sha1_file_prepare(const void *buf, unsigned long len,
                                    const char *type, unsigned char *sha1,
                                    char *hdr, int *hdrlen)
{
	git_SHA_CTX c;

	/* Generate the header */
	*hdrlen = sprintf(hdr, "%s %lu", type, len)+1;

	/* Sha1.. */
	git_SHA1_Init(&c);
	git_SHA1_Update(&c, hdr, *hdrlen);
	git_SHA1_Update(&c, buf, len); <== this line fails.
	git_SHA1_Final(sha1, &c);
}
---
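For readers unfamiliar with the header step in the function above: the `+1` makes the hashed header include the terminating NUL, so a 12-byte blob gets the 8-byte header "blob 12\0". A minimal sketch of just that step follows. `format_object_header` is a hypothetical helper, not a git function, and it uses `snprintf` with an explicit buffer size where the original uses `sprintf`.

```c
#include <stdio.h>

/* Hypothetical helper mirroring the header step of
 * write_sha1_file_prepare(): writes "<type> <len>" into hdr and
 * returns the header length including the trailing NUL, or -1 if
 * the buffer is too small. */
static int format_object_header(char *hdr, size_t hdr_size,
				const char *type, unsigned long len)
{
	int n = snprintf(hdr, hdr_size, "%s %lu", type, len);

	if (n < 0 || (size_t)n >= hdr_size)
		return -1; /* formatting error or truncated output */
	return n + 1; /* the NUL is part of what gets hashed */
}
```

Calling `format_object_header(hdr, sizeof(hdr), "blob", 12UL)` with a 32-byte `hdr` fills it with "blob 12" and returns 8.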
Just keep sending patches; I have at least one git repo to test them on. ;)