Re: [PATCHv2 2/2] fast-import: tighten parsing of mark references

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jonathan Nieder <jrnieder@gmail.com>
To: Pete Wyckoff <pw@padd.com>
Cc: git@vger.kernel.org, Dmitry Ivankov <divanorama@gmail.com>,
	David Barr <davidbarr@google.com>,
	Sverre Rabbelier <srabbelier@gmail.com>,
	Junio C Hamano <gitster@pobox.com>,
	Johan Herland <johan@herland.net>
Subject: Re: [PATCHv2 2/2] fast-import: tighten parsing of mark references
Date: Tue, 3 Apr 2012 09:20:01 -0500	[thread overview]
Message-ID: <20120403142001.GD15589@burratino> (raw)
In-Reply-To: <1333417910-17955-3-git-send-email-pw@padd.com>

(cc-ing Johan for noteimport code)
Pete Wyckoff wrote:

>                    Fast-import does not complain when garbage
> appears after a mark reference in some cases.

Thanks for fixing it.

[...]
> +++ b/fast-import.c
[...]
> @@ -2236,20 +2287,24 @@ static void file_change_m(struct branch *b)
>  
>  	if (*p == ':') {
>  		char *x;
> -		oe = find_mark(strtoumax(p + 1, &x, 10));
> +		oe = find_mark(parse_mark_ref_space(p, &x));
>  		hashcpy(sha1, oe->idx.sha1);
>  		p = x;

Simpler:

	if (*p == ':') {
		oe = find_mark(parse_mark_ref_space(p, &p));
		hashcpy(sha1, oe->idx.sha1);
	} else if ...

>  	} else if (!prefixcmp(p, "inline")) {
>  		inline_data = 1;
>  		p += 6;
> +		if (*p != ' ')
> +			die("Missing space after 'inline': %s",
> +			    command_buf.buf);
>  	} else {
>  		if (get_sha1_hex(p, sha1))
>  			die("Invalid SHA1: %s", command_buf.buf);

If I write

	M 100644 inliness some/path/to/file

was my mistake actually leaving out a space after 'inline' or
was it using an invalid <dataref>?

I think the latter, so I would suggest

	} else if (!prefixcmp(p, "inline ")) {
		inline_data = 1;
		p += strlen("inline");	/* advance to space */
	} else {
		if (get_sha1_hex(p, sha1))
			...

[...]
>  	}
> -	if (*p++ != ' ')
> -		die("Missing space after SHA1: %s", command_buf.buf);
> +	++p;  /* skip space */

I guess I'd suggest

	assert(*p == ' ');
	p++;

as defense against coders introducing additional cases that
are not as careful.

> @@ -2408,20 +2463,24 @@ static void note_change_n(struct branch *b, unsigned char *old_fanout)
>  	/* <dataref> or 'inline' */
>  	if (*p == ':') {
>  		char *x;
> -		oe = find_mark(strtoumax(p + 1, &x, 10));
> +		oe = find_mark(parse_mark_ref_space(p, &x));
>  		hashcpy(sha1, oe->idx.sha1);
>  		p = x;

Likewise (btw, why doesn't this share code with the filemodify case?):

	if (*p == ':') {
		oe = find_mark(parse_mark_with_trailing_space(p, &p));
		hashcpy(sha1, oe->idx.sha1);
	} else if ...

and so on.

[...]
> @@ -2430,7 +2489,7 @@ static void note_change_n(struct branch *b, unsigned char *old_fanout)
>  			die("Can't add a note on empty branch.");
>  		hashcpy(commit_sha1, s->sha1);
>  	} else if (*p == ':') {
> -		uintmax_t commit_mark = strtoumax(p + 1, NULL, 10);
> +		uintmax_t commit_mark = parse_mark_ref_eol(p);
>  		struct object_entry *commit_oe = find_mark(commit_mark);
>  		if (commit_oe->type != OBJ_COMMIT)
>  			die("Mark :%" PRIuMAX " not a commit", commit_mark);
> @@ -2537,7 +2596,7 @@ static int parse_from(struct branch *b)
>  		hashcpy(b->branch_tree.versions[0].sha1, t);
>  		hashcpy(b->branch_tree.versions[1].sha1, t);
>  	} else if (*from == ':') {
> -		uintmax_t idnum = strtoumax(from + 1, NULL, 10);
> +		uintmax_t idnum = parse_mark_ref_eol(from);

The title feature.  Nice.

[...]
> @@ -2945,9 +2999,7 @@ static struct object_entry *parse_treeish_dataref(const char **p)
>  
>  	if (**p == ':') {	/* <mark> */
>  		char *endptr;
> -		e = find_mark(strtoumax(*p + 1, &endptr, 10));
> -		if (endptr == *p + 1)
> -			die("Invalid mark: %s", command_buf.buf);
> +		e = find_mark(parse_mark_ref_space(*p, &endptr));
>  		if (!e)
>  			die("Unknown mark: %s", command_buf.buf);
>  		*p = endptr;

Simpler:

	if (**p == ':') {
		e = find_mark(parse_mark_...(*p, p));
		if (!e)
			die(...);
	} else {

> @@ -2955,9 +3007,12 @@ static struct object_entry *parse_treeish_dataref(const char **p)
>  	} else {	/* <sha1> */
>  		if (get_sha1_hex(*p, sha1))
>  			die("Invalid SHA1: %s", command_buf.buf);
> -		e = find_object(sha1);
>  		*p += 40;
> +		if (**p != ' ')
> +			die("Missing space after SHA1: %s", command_buf.buf);
> +		e = find_object(sha1);

This seems dangerous.  What if a new caller arises that wants to
parse a <dataref> representing a tree-ish at the end of the line?

So I think checking the character after the tree-ish should still
be the caller's responsibility.

>  	}
> +	*p += 1;  /* skip space */

If other patches in flight use the same function, they would expect
*p to point to the space when parse_treeish_dataref returns.  If we
wanted to change that (as mentioned above I don't think we ought to)
then the function's name should be changed to force such new callers
not to compile.

> @@ -3008,8 +3063,6 @@ static void parse_ls(struct branch *b)
>  		root = new_tree_entry();
>  		hashcpy(root->versions[1].sha1, e->idx.sha1);
>  		load_tree(root);
> -		if (*p++ != ' ')
> -			die("Missing space after tree-ish: %s", command_buf.buf);

(here's the caller).

Except where noted above, this looks good.

Thanks and hope that helps,
Jonathan

next prev parent reply	other threads:[~2012-04-03 14:20 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-04-01 22:54 [PATCH] fast-import: catch garbage after marks in from/merge Pete Wyckoff
2012-04-01 23:12 ` Jonathan Nieder
2012-04-02  0:13   ` Pete Wyckoff
2012-04-02  6:56     ` Dmitry Ivankov
2012-04-02 16:16       ` Junio C Hamano
2012-04-02 15:43     ` Jonathan Nieder
2012-04-02 16:15 ` Junio C Hamano
2012-04-03  1:51 ` [PATCHv2 0/2] fast-import: tighten parsing of mark references Pete Wyckoff
2012-04-03  1:51   ` [PATCHv2 1/2] fast-import: test behavior of garbage after " Pete Wyckoff
2012-04-03 14:00     ` Jonathan Nieder
2012-04-04  0:46       ` Pete Wyckoff
2012-04-04  5:43         ` Jonathan Nieder
2012-04-03  1:51   ` [PATCHv2 2/2] fast-import: tighten parsing of " Pete Wyckoff
2012-04-03 14:20     ` Jonathan Nieder [this message]
2012-04-04  1:20       ` Pete Wyckoff
2012-04-04  5:32         ` Jonathan Nieder
2012-04-03  2:00   ` [PATCHv2 0/2] " Sverre Rabbelier
2012-04-05  1:51   ` [PATCHv3] " Pete Wyckoff
2012-04-05  2:24     ` Jonathan Nieder
2012-04-05 17:20       ` Junio C Hamano
2012-04-07 22:59     ` [PATCHv4] fast-import: tighten parsing of datarefs Pete Wyckoff
2012-04-10 21:40       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20120403142001.GD15589@burratino \
    --to=jrnieder@gmail.com \
    --cc=davidbarr@google.com \
    --cc=divanorama@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=johan@herland.net \
    --cc=pw@padd.com \
    --cc=srabbelier@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.