From: mkoegler@auto.tuwien.ac.at (Martin Koegler)
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] parse_commit_buffer: don't parse invalid commits
Date: Mon, 7 Jan 2008 08:40:09 +0100 [thread overview]
Message-ID: <20080107074009.GA32710@auto.tuwien.ac.at> (raw)
In-Reply-To: <7vbq7y4ns6.fsf@gitster.siamese.dyndns.org>
On Sun, Jan 06, 2008 at 02:00:57PM -0800, Junio C Hamano wrote:
> Martin Koegler <mkoegler@auto.tuwien.ac.at> writes:
> > diff --git a/commit.c b/commit.c
> > index f074811..ffa0894 100644
> > --- a/commit.c
> > +++ b/commit.c
> > @@ -48,19 +48,33 @@ struct commit *lookup_commit(const unsigned char *sha1)
> > return check_commit(obj, sha1, 0);
> > }
> >
> > -static unsigned long parse_commit_date(const char *buf)
> > +static unsigned long parse_commit_date(const char *buf, const char* tail)
>
> Should be "const char *tail" in our codebase.
Will fix.
> > {
> > unsigned long date;
> > + char datebuf[20];
> > + unsigned long len;
> >
> > + if (buf + 6 >= tail)
> > + return 0;
> > if (memcmp(buf, "author", 6))
> > return 0;
>
> Even though buf, which is a result from read_sha1_file(), is
> always terminated with an extra NUL (outside its object size),
> if a bogus commit object ends with "author" (and without the
> author information) this part will pass, and ...
fsck_commit (builtin-fsck.c) does quite redundant checks (except that
it checks for author too). Should I make parse_commit fail, if there
is no author and commiter line? This way, we could remove
| if (memcmp(buffer, "tree ", 5))
| return objerror(&commit->object, "invalid format - expected 'tree' line");
| if (get_sha1_hex(buffer+5, tree_sha1) || buffer[45] != '\n')
| return objerror(&commit->object, "invalid 'tree' line format - bad sha1");
| buffer += 46;
| while (!memcmp(buffer, "parent ", 7)) {
| if (get_sha1_hex(buffer+7, sha1) || buffer[47] != '\n')
| return objerror(&commit->object, "invalid 'parent' line format - bad sha1");
| buffer += 48;
| }
| if (memcmp(buffer, "author ", 7))
| return objerror(&commit->object, "invalid format - expected 'author' line");
from fsck_commit.
> > - while (*buf++ != '\n')
> > + while (buf < tail && *buf++ != '\n')
> > /* nada */;
> > + if (buf + 9 >= tail)
> > + return 0;
>
> ... you catch that here. That seems like a good change.
>
> > if (memcmp(buf, "committer", 9))
> > return 0;
> > - while (*buf++ != '>')
> > + while (buf < tail && *buf++ != '>')
> > /* nada */;
> > - date = strtoul(buf, NULL, 10);
> > + if (buf >= tail)
> > + return 0;
>
> Likewise here.
>
> > + len = tail - buf;
> > + if (len > sizeof(datebuf) - 1)
> > + len = sizeof(datebuf) - 1;
>
> Broken indentation.
Will fix.
> > + memcpy(datebuf, buf, len);
> > + datebuf[len] = 0;
> > + date = strtoul(datebuf, NULL, 10);
>
> However, as long as buf at this point hasn't go beyond tail,
> which you already checked, I think we can rely on strtoul()
> stopping at the NUL at the end of buffer (that is one beyond
> tail), without this extra memcpy(). Am I mistaken?
No.
parse_commit_buffer is only called from parse_commit (safe, as it uses
a buffer by read_sha1_file) and parse_object_buffer (safe, as it is
called by parse_object (safe as it uses a buffer by read_sha1_file)
and get_obj (in builtin-for-each-ref.c, safe as it uses a buffer by
read_sha1_file)).
> > @@ -275,7 +289,7 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
> > n_refs++;
> > }
> > }
> > - item->date = parse_commit_date(bufptr);
> > + item->date = parse_commit_date(bufptr, tail);
> >
> > if (track_object_refs) {
> > unsigned i = 0;
> > --
> > 1.4.4.4
>
> When already somewhat deep in the rc cycle, looking at a patch
> from somebody who uses 1.4.4.4 makes me look at the patch a bit
> more carefully than usual ;-)
stg is much simpler to use for such patches. My distribution [Linus
already called it insafe for it's git version] ships stg 0.11, which
is sufficient for this. I simply use stg with the shipped git 1.4.4.4
to avoid any compatibiltiy problems.
mfg Martin Kögler
PS:
The function of get_obj in builtin-for-each-ref.c looks quite similar to
parse_object.
next prev parent reply other threads:[~2008-01-07 7:40 UTC|newest]
Thread overview: 9+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-01-06 19:03 [PATCH] parse_tag_buffer: don't parse invalid tags Martin Koegler
2008-01-06 19:03 ` [PATCH] parse_commit_buffer: don't parse invalid commits Martin Koegler
2008-01-06 22:00 ` Junio C Hamano
2008-01-07 7:40 ` Martin Koegler [this message]
-- strict thread matches above, loose matches on Subject: below --
2008-01-14 21:20 Martin Koegler
2008-01-15 7:32 ` Johannes Sixt
2008-01-19 17:35 Martin Koegler
2008-01-19 19:52 ` Junio C Hamano
2008-01-20 16:11 ` Martin Koegler
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20080107074009.GA32710@auto.tuwien.ac.at \
--to=mkoegler@auto.tuwien.ac.at \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.