From: Liu Yubao <yubao.liu@gmail.com>
To: "Shawn O. Pearce" <spearce@spearce.org>
Cc: Junio C Hamano <gitster@pobox.com>, git list <git@vger.kernel.org>
Subject: Re: [PATCH 3/5] optimize parse_sha1_header() a little by detecting object type
Date: Wed, 03 Dec 2008 12:06:19 +0800
Message-ID: <493605BB.8020705@gmail.com>
In-Reply-To: <20081202155300.GL23984@spearce.org>
Shawn O. Pearce wrote:
> Liu Yubao <yubao.liu@gmail.com> wrote:
>> diff --git a/sha1_file.c b/sha1_file.c
>> index dccc455..79062f0 100644
>> --- a/sha1_file.c
>> +++ b/sha1_file.c
>> @@ -1099,7 +1099,8 @@ static void *map_sha1_file(const unsigned char *sha1, unsigned long *size)
>>
>> if (!fstat(fd, &st)) {
>> *size = xsize_t(st.st_size);
>> - map = xmmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
>> + if (*size > 0)
>> + map = xmmap(NULL, *size, PROT_READ, MAP_PRIVATE, fd, 0);
>> }
>> close(fd);
>> }
>
> This has nothing to do with this change description. Why are we
> returning NULL from map_sha1_file when the file length is 0 bytes?
> No loose object should ever be an empty file, there must always be
> some sort of type header present. So it probably is an error to
> have a 0 length file here. But that bug is a different change.
>
It is also defensive programming for uncompressed loose objects, because the
mapped memory will be passed to parse_sha1_header() directly, without first
going through inflateInit(). In fact, unpack_sha1_header() in the current code
calls legacy_loose_object() without checking mapsize first, so if it ever
encounters a corrupted, empty loose object (very unlikely), it will crash.
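
To make the concern concrete, here is a minimal standalone sketch of a
size-aware check. It is not the code in sha1_file.c: the function name
looks_like_zlib_header() and its mapsize parameter are hypothetical, and the
two-byte test is only modelled on the zlib header rule from RFC 1950.

```c
#include <stddef.h>

/*
 * Hypothetical size-aware check, for illustration only.
 * Returns 1 if the buffer starts with a plausible zlib (RFC 1950) header,
 * 0 otherwise, including when the buffer is too short to tell.
 */
static int looks_like_zlib_header(const unsigned char *map, size_t mapsize)
{
	unsigned int word;

	if (!map || mapsize < 2)
		return 0;	/* never read map[0]/map[1] of an empty mapping */

	word = (map[0] << 8) + map[1];
	/* CM == 8 (deflate), CINFO <= 7, and the 16-bit header is a multiple of 31 */
	return (map[0] & 0x8F) == 0x08 && (word % 31) == 0;
}
```

With a guard like this, a zero-length (or NULL) mapping is rejected before any
byte of it is dereferenced.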
>> @@ -1257,6 +1258,8 @@ static int parse_sha1_header(const char *hdr, unsigned long length, unsigned lon
>> * terminating '\0' that we add), and is followed by
>> * a space, at least one byte for size, and a '\0'.
>> */
>> + if ('b' != *hdr && 'c' != *hdr && 't' != *hdr) /* blob/commit/tag/tree */
>> + return -1;
>> i = 0;
>> while (hdr < hdr_end - 2) {
>> char c = *hdr++;
>
> Oh. I wouldn't do that. Its a cute trick and it works to quickly
> determine if the header is an uncompressed header vs. a zlib header
> vs. a new-style loose object header (which git cannot write anymore,
> but it still can read). But its just asking for trouble when/if a
> new object type was ever added to the type table.
>
I couldn't agree more; it's just a trick. I considered adding a function
seems_likely_uncompressed_loose_object(), but I didn't, because this patch
series is only my first attempt and I don't know yet whether the idea of
supporting uncompressed loose objects is attractive enough.
> Given that we know that no type name can be more than 10 bytes and
> if you use my patch from earlier today you can be certain hdr has a
> '\0' terminator, so you could write a function to test for the type
> against the hdr, stopping on either ' ' or '\0'. Or find the first
> ' ' in the first 10 bytes (which is what this loop does anyway) and
> then test that against the type name table.
>
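
A rough sketch of what such a helper could look like, assuming a local table
of type names. The type_names[] array here is a stand-in written out so the
example is self-contained, not git's real type table, and the function is the
hypothetical seems_likely_uncompressed_loose_object() mentioned above rather
than anything in the posted series.

```c
#include <stddef.h>
#include <string.h>

/* Stand-in for the real type-name table, listed here for illustration. */
static const char *const type_names[] = { "blob", "commit", "tag", "tree" };

/*
 * Hypothetical helper: returns 1 if hdr (with length bytes available)
 * begins with a known type name followed by a space, 0 otherwise.
 */
static int seems_likely_uncompressed_loose_object(const char *hdr, size_t length)
{
	size_t i, n;

	if (!hdr)
		return 0;
	for (i = 0; i < sizeof(type_names) / sizeof(type_names[0]); i++) {
		n = strlen(type_names[i]);
		/* the name and its ' ' separator must both lie inside the buffer */
		if (length > n && !memcmp(hdr, type_names[i], n) && hdr[n] == ' ')
			return 1;
	}
	return 0;
}
```

Because the test is driven by the table, adding a new object type would only
require a new table entry instead of another hard-coded first-letter check.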