From: mkoegler@auto.tuwien.ac.at (Martin Koegler)
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] parse_commit_buffer: don't parse invalid commits
Date: Mon, 7 Jan 2008 08:40:09 +0100 [thread overview]
Message-ID: <20080107074009.GA32710@auto.tuwien.ac.at> (raw)
In-Reply-To: <7vbq7y4ns6.fsf@gitster.siamese.dyndns.org>
On Sun, Jan 06, 2008 at 02:00:57PM -0800, Junio C Hamano wrote:
> Martin Koegler <mkoegler@auto.tuwien.ac.at> writes:
> > diff --git a/commit.c b/commit.c
> > index f074811..ffa0894 100644
> > --- a/commit.c
> > +++ b/commit.c
> > @@ -48,19 +48,33 @@ struct commit *lookup_commit(const unsigned char *sha1)
> > return check_commit(obj, sha1, 0);
> > }
> >
> > -static unsigned long parse_commit_date(const char *buf)
> > +static unsigned long parse_commit_date(const char *buf, const char* tail)
>
> Should be "const char *tail" in our codebase.
Will fix.
> > {
> > unsigned long date;
> > + char datebuf[20];
> > + unsigned long len;
> >
> > + if (buf + 6 >= tail)
> > + return 0;
> > if (memcmp(buf, "author", 6))
> > return 0;
>
> Even though buf, which is a result from read_sha1_file(), is
> always terminated with an extra NUL (outside its object size),
> if a bogus commit object ends with "author" (and without the
> author information) this part will pass, and ...
fsck_commit (builtin-fsck.c) does quite redundant checks (except that
it checks for author too). Should I make parse_commit fail, if there
is no author and commiter line? This way, we could remove
| if (memcmp(buffer, "tree ", 5))
| return objerror(&commit->object, "invalid format - expected 'tree' line");
| if (get_sha1_hex(buffer+5, tree_sha1) || buffer[45] != '\n')
| return objerror(&commit->object, "invalid 'tree' line format - bad sha1");
| buffer += 46;
| while (!memcmp(buffer, "parent ", 7)) {
| if (get_sha1_hex(buffer+7, sha1) || buffer[47] != '\n')
| return objerror(&commit->object, "invalid 'parent' line format - bad sha1");
| buffer += 48;
| }
| if (memcmp(buffer, "author ", 7))
| return objerror(&commit->object, "invalid format - expected 'author' line");
from fsck_commit.
> > - while (*buf++ != '\n')
> > + while (buf < tail && *buf++ != '\n')
> > /* nada */;
> > + if (buf + 9 >= tail)
> > + return 0;
>
> ... you catch that here. That seems like a good change.
>
> > if (memcmp(buf, "committer", 9))
> > return 0;
> > - while (*buf++ != '>')
> > + while (buf < tail && *buf++ != '>')
> > /* nada */;
> > - date = strtoul(buf, NULL, 10);
> > + if (buf >= tail)
> > + return 0;
>
> Likewise here.
>
> > + len = tail - buf;
> > + if (len > sizeof(datebuf) - 1)
> > + len = sizeof(datebuf) - 1;
>
> Broken indentation.
Will fix.
> > + memcpy(datebuf, buf, len);
> > + datebuf[len] = 0;
> > + date = strtoul(datebuf, NULL, 10);
>
> However, as long as buf at this point hasn't go beyond tail,
> which you already checked, I think we can rely on strtoul()
> stopping at the NUL at the end of buffer (that is one beyond
> tail), without this extra memcpy(). Am I mistaken?
No.
parse_commit_buffer is only called from parse_commit (safe, as it uses
a buffer by read_sha1_file) and parse_object_buffer (safe, as it is
called by parse_object (safe as it uses a buffer by read_sha1_file)
and get_obj (in builtin-for-each-ref.c, safe as it uses a buffer by
read_sha1_file)).
> > @@ -275,7 +289,7 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
> > n_refs++;
> > }
> > }
> > - item->date = parse_commit_date(bufptr);
> > + item->date = parse_commit_date(bufptr, tail);
> >
> > if (track_object_refs) {
> > unsigned i = 0;
> > --
> > 1.4.4.4
>
> When already somewhat deep in the rc cycle, looking at a patch
> from somebody who uses 1.4.4.4 makes me look at the patch a bit
> more carefully than usual ;-)
stg is much simpler to use for such patches. My distribution [Linus
already called it insafe for it's git version] ships stg 0.11, which
is sufficient for this. I simply use stg with the shipped git 1.4.4.4
to avoid any compatibiltiy problems.
mfg Martin Kögler
PS:
The function of get_obj in builtin-for-each-ref.c looks quite similar to
parse_object.
next prev parent reply other threads:[~2008-01-07 7:40 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-06 19:03 [PATCH] parse_tag_buffer: don't parse invalid tags Martin Koegler
2008-01-06 19:03 ` [PATCH] parse_commit_buffer: don't parse invalid commits Martin Koegler
2008-01-06 22:00 ` Junio C Hamano
2008-01-07 7:40 ` Martin Koegler [this message]
-- strict thread matches above, loose matches on Subject: below --
2008-01-14 21:20 Martin Koegler
2008-01-15 7:32 ` Johannes Sixt
2008-01-19 17:35 Martin Koegler
2008-01-19 19:52 ` Junio C Hamano
2008-01-20 16:11 ` Martin Koegler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080107074009.GA32710@auto.tuwien.ac.at \
--to=mkoegler@auto.tuwien.ac.at \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).