git.vger.kernel.org archive mirror
From: mkoegler@auto.tuwien.ac.at (Martin Koegler)
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org
Subject: Re: [PATCH] parse_commit_buffer: don't parse invalid commits
Date: Mon, 7 Jan 2008 08:40:09 +0100
Message-ID: <20080107074009.GA32710@auto.tuwien.ac.at>
In-Reply-To: <7vbq7y4ns6.fsf@gitster.siamese.dyndns.org>

On Sun, Jan 06, 2008 at 02:00:57PM -0800, Junio C Hamano wrote:
> Martin Koegler <mkoegler@auto.tuwien.ac.at> writes:
> > diff --git a/commit.c b/commit.c
> > index f074811..ffa0894 100644
> > --- a/commit.c
> > +++ b/commit.c
> > @@ -48,19 +48,33 @@ struct commit *lookup_commit(const unsigned char *sha1)
> >  	return check_commit(obj, sha1, 0);
> >  }
> >  
> > -static unsigned long parse_commit_date(const char *buf)
> > +static unsigned long parse_commit_date(const char *buf, const char* tail)
> 
> Should be "const char *tail" in our codebase.

Will fix.

> >  {
> >  	unsigned long date;
> > +	char datebuf[20];
> > +	unsigned long len;
> >  
> > +	if (buf + 6 >= tail)
> > +		return 0;
> >  	if (memcmp(buf, "author", 6))
> >  		return 0;
> 
> Even though buf, which is a result from read_sha1_file(), is
> always terminated with an extra NUL (outside its object size),
> if a bogus commit object ends with "author" (and without the
> author information) this part will pass, and ...

fsck_commit (builtin-fsck.c) does largely redundant checks (except that
it also checks for the author line). Should I make parse_commit fail if
there is no author and committer line? That way, we could remove
|        if (memcmp(buffer, "tree ", 5))
|                return objerror(&commit->object, "invalid format - expected 'tree' line");
|        if (get_sha1_hex(buffer+5, tree_sha1) || buffer[45] != '\n')
|                return objerror(&commit->object, "invalid 'tree' line format - bad sha1");
|        buffer += 46;
|        while (!memcmp(buffer, "parent ", 7)) {
|                if (get_sha1_hex(buffer+7, sha1) || buffer[47] != '\n')
|                        return objerror(&commit->object, "invalid 'parent' line format - bad sha1");
|                buffer += 48;
|        }
|        if (memcmp(buffer, "author ", 7))
|                return objerror(&commit->object, "invalid format - expected 'author' line");
from fsck_commit.
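
To make the question concrete, here is a minimal sketch of what such a
bounded header check could look like if it lived in the parser. It is
not part of the patch: sketch_is_sha1_line and
sketch_check_commit_header are hypothetical names, the hex check is a
stand-in for get_sha1_hex, and requiring a space after "committer" is
an assumption.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: 40 hex digits followed by '\n', all before tail. */
static int sketch_is_sha1_line(const char *p, const char *tail)
{
	int i;

	if (tail - p < 41)
		return 0;
	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)p[i]))
			return 0;
	return p[40] == '\n';
}

/*
 * Hypothetical check: returns 0 if the buffer starts with a tree line,
 * zero or more parent lines, an author line and a committer line;
 * returns -1 otherwise. Never reads at or beyond tail.
 */
static int sketch_check_commit_header(const char *buf, const char *tail)
{
	if (tail - buf < 5 || memcmp(buf, "tree ", 5))
		return -1;
	if (!sketch_is_sha1_line(buf + 5, tail))
		return -1;
	buf += 46;	/* "tree " + 40 hex digits + '\n' */

	while (tail - buf >= 7 && !memcmp(buf, "parent ", 7)) {
		if (!sketch_is_sha1_line(buf + 7, tail))
			return -1;
		buf += 48;	/* "parent " + 40 hex digits + '\n' */
	}

	if (tail - buf < 7 || memcmp(buf, "author ", 7))
		return -1;
	/* Skip to the end of the author line. */
	buf = memchr(buf, '\n', tail - buf);
	if (!buf)
		return -1;
	buf++;

	if (tail - buf < 10 || memcmp(buf, "committer ", 10))
		return -1;
	return 0;
}
```

With something like this in the parser, the fsck side would only need
to report the failure rather than repeat the checks.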


> > -	while (*buf++ != '\n')
> > +	while (buf < tail && *buf++ != '\n')
> >  		/* nada */;
> > +	if (buf + 9 >= tail)
> > +		return 0;
> 
> ... you catch that here.  That seems like a good change.
> 
> >  	if (memcmp(buf, "committer", 9))
> >  		return 0;
> > -	while (*buf++ != '>')
> > +	while (buf < tail && *buf++ != '>')
> >  		/* nada */;
> > -	date = strtoul(buf, NULL, 10);
> > +	if (buf >= tail)
> > +		return 0;
> 
> Likewise here.
> 
> > +	len = tail - buf;
> > +	if (len > sizeof(datebuf) - 1)
> > +	  len = sizeof(datebuf) - 1;
> 
> Broken indentation.

Will fix.

> > +	memcpy(datebuf, buf, len);
> > +	datebuf[len] = 0;
> > +	date = strtoul(datebuf, NULL, 10);
> 
> However, as long as buf at this point hasn't gone beyond tail,
> which you already checked, I think we can rely on strtoul()
> stopping at the NUL at the end of buffer (that is one beyond
> tail), without this extra memcpy().  Am I mistaken?

No, you are not mistaken.

parse_commit_buffer is only called from parse_commit (safe, as it uses
a buffer from read_sha1_file) and parse_object_buffer (safe, as it is
called by parse_object (safe, as it uses a buffer from read_sha1_file)
and get_obj (in builtin-for-each-ref.c, safe, as it uses a buffer from
read_sha1_file)).
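
For illustration, a sketch of the date parser without the memcpy. It
assumes *tail is the extra NUL that read_sha1_file appends, which is
what lets strtoul() stop safely. The name sketch_parse_commit_date is
hypothetical, and the bounds checks are written as "tail - buf <= N"
(equivalent to "buf + N >= tail") so that no pointer beyond the buffer
is ever formed.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Sketch only. buf points at the "author" line, tail is one past the
 * last byte of the object, and *tail is assumed to be '\0'.
 * Returns 0 if the author or committer line is missing or truncated.
 */
static unsigned long sketch_parse_commit_date(const char *buf, const char *tail)
{
	if (tail - buf <= 6)
		return 0;
	if (memcmp(buf, "author", 6))
		return 0;
	/* Skip the rest of the author line, never walking past tail. */
	while (buf < tail && *buf++ != '\n')
		/* nada */;
	if (tail - buf <= 9)
		return 0;
	if (memcmp(buf, "committer", 9))
		return 0;
	/* The timestamp follows the closing '>' of the e-mail address. */
	while (buf < tail && *buf++ != '>')
		/* nada */;
	if (buf >= tail)
		return 0;
	/* buf < tail and *tail == '\0', so strtoul() stops inside the buffer. */
	return strtoul(buf, NULL, 10);
}
```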

> > @@ -275,7 +289,7 @@ int parse_commit_buffer(struct commit *item, void *buffer, unsigned long size)
> >  			n_refs++;
> >  		}
> >  	}
> > -	item->date = parse_commit_date(bufptr);
> > +	item->date = parse_commit_date(bufptr, tail);
> >  
> >  	if (track_object_refs) {
> >  		unsigned i = 0;
> > -- 
> > 1.4.4.4
> 
> When already somewhat deep in the rc cycle, looking at a patch
> from somebody who uses 1.4.4.4 makes me look at the patch a bit
> more carefully than usual ;-)

stg is much simpler to use for such patches. My distribution [Linus
already called it insafe for its git version] ships stg 0.11, which
is sufficient for this. I simply use stg with the shipped git 1.4.4.4
to avoid any compatibility problems.

Regards, Martin Kögler
PS:
The function of get_obj in builtin-for-each-ref.c looks quite similar to
parse_object.
