Re: [PATCHv2 2/2] fast-import: tighten parsing of mark references

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Jonathan Nieder <jrnieder@gmail.com>
To: Pete Wyckoff <pw@padd.com>
Cc: git@vger.kernel.org, Dmitry Ivankov <divanorama@gmail.com>,
	David Barr <davidbarr@google.com>,
	Sverre Rabbelier <srabbelier@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Johan Herland <johan@herland.net>
Subject: Re: [PATCHv2 2/2] fast-import: tighten parsing of mark references
Date: Tue, 3 Apr 2012 09:20:01 -0500	[thread overview]
Message-ID: <20120403142001.GD15589@burratino> (raw)
In-Reply-To: <1333417910-17955-3-git-send-email-pw@padd.com>

(cc-ing Johan for noteimport code)
Pete Wyckoff wrote:

>                    Fast-import does not complain when garbage
> appears after a mark reference in some cases.

Thanks for fixing it.

[...]
> +++ b/fast-import.c
[...]
> @@ -2236,20 +2287,24 @@ static void file_change_m(struct branch *b)
>  
>  	if (*p == ':') {
>  		char *x;
> -		oe = find_mark(strtoumax(p + 1, &x, 10));
> +		oe = find_mark(parse_mark_ref_space(p, &x));
>  		hashcpy(sha1, oe->idx.sha1);
>  		p = x;

Simpler:

	if (*p == ':') {
		oe = find_mark(parse_mark_ref_space(p, &p));
		hashcpy(sha1, oe->idx.sha1);
	} else if ...

>  	} else if (!prefixcmp(p, "inline")) {
>  		inline_data = 1;
>  		p += 6;
> +		if (*p != ' ')
> +			die("Missing space after 'inline': %s",
> +			    command_buf.buf);
>  	} else {
>  		if (get_sha1_hex(p, sha1))
>  			die("Invalid SHA1: %s", command_buf.buf);

If I write

	M 100644 inliness some/path/to/file

was my mistake actually leaving out a space after 'inline' or
was it using an invalid <dataref>?

I think the latter, so I would suggest

	} else if (!prefixcmp(p, "inline ")) {
		inline_data = 1;
		p += strlen("inline");	/* advance to space */
	} else {
		if (get_sha1_hex(p, sha1))
			...

[...]
>  	}
> -	if (*p++ != ' ')
> -		die("Missing space after SHA1: %s", command_buf.buf);
> +	++p;  /* skip space */

I guess I'd suggest

	assert(*p == ' ');
	p++;

as defense against coders introducing additional cases that
are not as careful.

> @@ -2408,20 +2463,24 @@ static void note_change_n(struct branch *b, unsigned char *old_fanout)
>  	/* <dataref> or 'inline' */
>  	if (*p == ':') {
>  		char *x;
> -		oe = find_mark(strtoumax(p + 1, &x, 10));
> +		oe = find_mark(parse_mark_ref_space(p, &x));
>  		hashcpy(sha1, oe->idx.sha1);
>  		p = x;

Likewise (btw, why doesn't this share code with the filemodify case?):

	if (*p == ':') {
		oe = find_mark(parse_mark_with_trailing_space(p, &p));
		hashcpy(sha1, oe->idx.sha1);
	} else if ...

and so on.

[...]
> @@ -2430,7 +2489,7 @@ static void note_change_n(struct branch *b, unsigned char *old_fanout)
>  			die("Can't add a note on empty branch.");
>  		hashcpy(commit_sha1, s->sha1);
>  	} else if (*p == ':') {
> -		uintmax_t commit_mark = strtoumax(p + 1, NULL, 10);
> +		uintmax_t commit_mark = parse_mark_ref_eol(p);
>  		struct object_entry *commit_oe = find_mark(commit_mark);
>  		if (commit_oe->type != OBJ_COMMIT)
>  			die("Mark :%" PRIuMAX " not a commit", commit_mark);
> @@ -2537,7 +2596,7 @@ static int parse_from(struct branch *b)
>  		hashcpy(b->branch_tree.versions[0].sha1, t);
>  		hashcpy(b->branch_tree.versions[1].sha1, t);
>  	} else if (*from == ':') {
> -		uintmax_t idnum = strtoumax(from + 1, NULL, 10);
> +		uintmax_t idnum = parse_mark_ref_eol(from);

The title feature.  Nice.

[...]
> @@ -2945,9 +2999,7 @@ static struct object_entry *parse_treeish_dataref(const char **p)
>  
>  	if (**p == ':') {	/* <mark> */
>  		char *endptr;
> -		e = find_mark(strtoumax(*p + 1, &endptr, 10));
> -		if (endptr == *p + 1)
> -			die("Invalid mark: %s", command_buf.buf);
> +		e = find_mark(parse_mark_ref_space(*p, &endptr));
>  		if (!e)
>  			die("Unknown mark: %s", command_buf.buf);
>  		*p = endptr;

Simpler:

	if (**p == ':') {
		e = find_mark(parse_mark_...(*p, p));
		if (!e)
			die(...);
	} else {

> @@ -2955,9 +3007,12 @@ static struct object_entry *parse_treeish_dataref(const char **p)
>  	} else {	/* <sha1> */
>  		if (get_sha1_hex(*p, sha1))
>  			die("Invalid SHA1: %s", command_buf.buf);
> -		e = find_object(sha1);
>  		*p += 40;
> +		if (**p != ' ')
> +			die("Missing space after SHA1: %s", command_buf.buf);
> +		e = find_object(sha1);

This seems dangerous.  What if a new caller arises that wants to
parse a <dataref> representing a tree-ish at the end of the line?

So I think checking the character after the tree-ish should still
be the caller's responsibility.

>  	}
> +	*p += 1;  /* skip space */

If other patches in flight use the same function, they would expect
*p to point to the space when parse_treeish_dataref returns.  If we
wanted to change that (as mentioned above I don't think we ought to)
then the function's name should be changed to force such new callers
not to compile.

> @@ -3008,8 +3063,6 @@ static void parse_ls(struct branch *b)
>  		root = new_tree_entry();
>  		hashcpy(root->versions[1].sha1, e->idx.sha1);
>  		load_tree(root);
> -		if (*p++ != ' ')
> -			die("Missing space after tree-ish: %s", command_buf.buf);

(here's the caller).

Except where noted above, this looks good.

Thanks and hope that helps,
Jonathan

next prev parent reply	other threads:[~2012-04-03 14:20 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-01 22:54 [PATCH] fast-import: catch garbage after marks in from/merge Pete Wyckoff
2012-04-01 23:12 ` Jonathan Nieder
2012-04-02  0:13   ` Pete Wyckoff
2012-04-02  6:56     ` Dmitry Ivankov
2012-04-02 16:16       ` Junio C Hamano
2012-04-02 15:43     ` Jonathan Nieder
2012-04-02 16:15 ` Junio C Hamano
2012-04-03  1:51 ` [PATCHv2 0/2] fast-import: tighten parsing of mark references Pete Wyckoff
2012-04-03  1:51   ` [PATCHv2 1/2] fast-import: test behavior of garbage after " Pete Wyckoff
2012-04-03 14:00     ` Jonathan Nieder
2012-04-04  0:46       ` Pete Wyckoff
2012-04-04  5:43         ` Jonathan Nieder
2012-04-03  1:51   ` [PATCHv2 2/2] fast-import: tighten parsing of " Pete Wyckoff
2012-04-03 14:20     ` Jonathan Nieder [this message]
2012-04-04  1:20       ` Pete Wyckoff
2012-04-04  5:32         ` Jonathan Nieder
2012-04-03  2:00   ` [PATCHv2 0/2] " Sverre Rabbelier
2012-04-05  1:51   ` [PATCHv3] " Pete Wyckoff
2012-04-05  2:24     ` Jonathan Nieder
2012-04-05 17:20       ` Junio C Hamano
2012-04-07 22:59     ` [PATCHv4] fast-import: tighten parsing of datarefs Pete Wyckoff
2012-04-10 21:40       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120403142001.GD15589@burratino \
    --to=jrnieder@gmail.com \
    --cc=davidbarr@google.com \
    --cc=divanorama@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johan@herland.net \
    --cc=pw@padd.com \
    --cc=srabbelier@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).