From: Jonathan Nieder <jrnieder@gmail.com>
To: Pete Wyckoff <pw@padd.com>
Cc: git@vger.kernel.org, Dmitry Ivankov <divanorama@gmail.com>,
David Barr <davidbarr@google.com>,
Sverre Rabbelier <srabbelier@gmail.com>,
Junio C Hamano <gitster@pobox.com>,
Johan Herland <johan@herland.net>
Subject: Re: [PATCHv2 2/2] fast-import: tighten parsing of mark references
Date: Tue, 3 Apr 2012 09:20:01 -0500
Message-ID: <20120403142001.GD15589@burratino>
In-Reply-To: <1333417910-17955-3-git-send-email-pw@padd.com>

(cc-ing Johan for noteimport code)

Pete Wyckoff wrote:

> Fast-import does not complain when garbage
> appears after a mark reference in some cases.

Thanks for fixing it.

[...]
> +++ b/fast-import.c
[...]
> @@ -2236,20 +2287,24 @@ static void file_change_m(struct branch *b)
>
>  	if (*p == ':') {
>  		char *x;
> -		oe = find_mark(strtoumax(p + 1, &x, 10));
> +		oe = find_mark(parse_mark_ref_space(p, &x));
>  		hashcpy(sha1, oe->idx.sha1);
>  		p = x;

Simpler:

	if (*p == ':') {
		oe = find_mark(parse_mark_ref_space(p, &p));
		hashcpy(sha1, oe->idx.sha1);
	} else if ...

>  	} else if (!prefixcmp(p, "inline")) {
>  		inline_data = 1;
>  		p += 6;
> +		if (*p != ' ')
> +			die("Missing space after 'inline': %s",
> +			    command_buf.buf);
>  	} else {
>  		if (get_sha1_hex(p, sha1))
>  			die("Invalid SHA1: %s", command_buf.buf);

If I write

	M 100644 inliness some/path/to/file

was my mistake actually leaving out a space after 'inline' or
was it using an invalid <dataref>?

I think the latter, so I would suggest

	} else if (!prefixcmp(p, "inline ")) {
		inline_data = 1;
		p += strlen("inline");	/* advance to space */
	} else {
		if (get_sha1_hex(p, sha1))
			...

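To make the difference concrete, here is a minimal standalone sketch (not code from fast-import.c). The enum and the function classify_dataref() are hypothetical names invented for illustration, and strncmp() stands in for git's prefixcmp(). It shows that matching "inline " with its trailing space lets "inliness" fall through to the branch that reports an invalid <dataref>.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical classification of the <dataref> field; not from fast-import.c. */
enum dataref_kind { DATAREF_MARK, DATAREF_INLINE, DATAREF_OTHER };

/*
 * Classify the token at p. "inline" counts only when it is followed by a
 * space, so "inliness" ends up as DATAREF_OTHER and would be diagnosed as
 * an invalid <dataref> rather than as a missing space.
 */
static enum dataref_kind classify_dataref(const char *p, const char **end)
{
	assert(p && end);

	if (*p == ':') {
		*end = p;	/* mark parsing is out of scope for this sketch */
		return DATAREF_MARK;
	}
	if (!strncmp(p, "inline ", strlen("inline "))) {
		*end = p + strlen("inline");	/* advance to the space */
		return DATAREF_INLINE;
	}
	*end = p;	/* caller would try get_sha1_hex() and die on failure */
	return DATAREF_OTHER;
}
```

With this shape, the "Missing space after 'inline'" message is no longer needed: whatever does not start with "inline " is handled by the SHA-1 branch.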
[...]
>  	}
> -	if (*p++ != ' ')
> -		die("Missing space after SHA1: %s", command_buf.buf);
> +	++p; /* skip space */

I guess I'd suggest

	assert(*p == ' ');
	p++;

as defense against coders introducing additional cases that
are not as careful.

> @@ -2408,20 +2463,24 @@ static void note_change_n(struct branch *b, unsigned char *old_fanout)
>  	/* <dataref> or 'inline' */
>  	if (*p == ':') {
>  		char *x;
> -		oe = find_mark(strtoumax(p + 1, &x, 10));
> +		oe = find_mark(parse_mark_ref_space(p, &x));
>  		hashcpy(sha1, oe->idx.sha1);
>  		p = x;

Likewise (btw, why doesn't this share code with the filemodify case?):

	if (*p == ':') {
		oe = find_mark(parse_mark_with_trailing_space(p, &p));
		hashcpy(sha1, oe->idx.sha1);
	} else if ...

and so on.

[...]
> @@ -2430,7 +2489,7 @@ static void note_change_n(struct branch *b, unsigned char *old_fanout)
>  			die("Can't add a note on empty branch.");
>  		hashcpy(commit_sha1, s->sha1);
>  	} else if (*p == ':') {
> -		uintmax_t commit_mark = strtoumax(p + 1, NULL, 10);
> +		uintmax_t commit_mark = parse_mark_ref_eol(p);
>  		struct object_entry *commit_oe = find_mark(commit_mark);
>  		if (commit_oe->type != OBJ_COMMIT)
>  			die("Mark :%" PRIuMAX " not a commit", commit_mark);
> @@ -2537,7 +2596,7 @@ static int parse_from(struct branch *b)
>  		hashcpy(b->branch_tree.versions[0].sha1, t);
>  		hashcpy(b->branch_tree.versions[1].sha1, t);
>  	} else if (*from == ':') {
> -		uintmax_t idnum = strtoumax(from + 1, NULL, 10);
> +		uintmax_t idnum = parse_mark_ref_eol(from);
The title feature. Nice.
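
For readers following along, a small sketch of why the old calls were too lenient: strtoumax() with a NULL end pointer stops at the first non-digit and silently ignores the rest. The function parse_mark_eol_sketch() below is a hypothetical stand-in; the patch's real parse_mark_ref_eol() is not shown in this message and may well differ (for example, it reports errors with die() rather than a return code).

```c
#include <assert.h>
#include <ctype.h>
#include <inttypes.h>
#include <stdint.h>

/*
 * Hypothetical strict parser for a mark reference that must end the line.
 * Returns 0 and stores the mark number in *mark on success, or -1 if the
 * input is not ':' followed by digits and nothing else.
 */
static int parse_mark_eol_sketch(const char *p, uintmax_t *mark)
{
	char *end;

	assert(p && mark);

	/* strtoumax() would accept leading spaces or a sign; reject them. */
	if (*p != ':' || !isdigit((unsigned char)p[1]))
		return -1;
	*mark = strtoumax(p + 1, &end, 10);
	if (*end != '\0')
		return -1;	/* garbage after the digits */
	return 0;
}
```

A "from :42x" line is then rejected, whereas strtoumax(from + 1, NULL, 10) would have quietly returned 42.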
[...]
> @@ -2945,9 +2999,7 @@ static struct object_entry *parse_treeish_dataref(const char **p)
>
>  	if (**p == ':') {	/* <mark> */
>  		char *endptr;
> -		e = find_mark(strtoumax(*p + 1, &endptr, 10));
> -		if (endptr == *p + 1)
> -			die("Invalid mark: %s", command_buf.buf);
> +		e = find_mark(parse_mark_ref_space(*p, &endptr));
>  		if (!e)
>  			die("Unknown mark: %s", command_buf.buf);
>  		*p = endptr;

Simpler:

	if (**p == ':') {
		e = find_mark(parse_mark_...(*p, p));
		if (!e)
			die(...);
	} else {

> @@ -2955,9 +3007,12 @@ static struct object_entry *parse_treeish_dataref(const char **p)
>  	} else {	/* <sha1> */
>  		if (get_sha1_hex(*p, sha1))
>  			die("Invalid SHA1: %s", command_buf.buf);
> -		e = find_object(sha1);
>  		*p += 40;
> +		if (**p != ' ')
> +			die("Missing space after SHA1: %s", command_buf.buf);
> +		e = find_object(sha1);

This seems dangerous.  What if a new caller arises that wants to
parse a <dataref> representing a tree-ish at the end of the line?
So I think checking the character after the tree-ish should still
be the caller's responsibility.

>  	}
> +	*p += 1; /* skip space */

If other patches in flight use the same function, they would expect
*p to point to the space when parse_treeish_dataref returns.  If we
wanted to change that (as mentioned above I don't think we ought to)
then the function's name should be changed to force such new callers
not to compile.
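
A minimal sketch of the division of labour argued for here, with all names hypothetical (parse_hex_sketch(), parse_then_space() and parse_then_eol() are not functions in fast-import.c, and the return codes replace die() for the sake of a standalone example). The parser consumes the object name and leaves *p on whatever follows; each caller then decides which delimiter is acceptable.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

#define HEX_LEN 40	/* length of a hex SHA-1 */

/*
 * Hypothetical parser: consumes exactly HEX_LEN hex digits and leaves *p
 * on the character that follows, without judging what that character is.
 * Returns 0 on success, -1 if the input is not HEX_LEN hex digits.
 */
static int parse_hex_sketch(const char **p)
{
	size_t i;

	assert(p && *p);

	for (i = 0; i < HEX_LEN; i++)
		if (!isxdigit((unsigned char)(*p)[i]))
			return -1;
	*p += HEX_LEN;
	return 0;
}

/* Caller in the style of "ls": a path must follow, so require a space. */
static int parse_then_space(const char **p)
{
	if (parse_hex_sketch(p) || **p != ' ')
		return -1;
	(*p)++;	/* skip space */
	return 0;
}

/* Hypothetical future caller: the tree-ish is the last thing on the line. */
static int parse_then_eol(const char **p)
{
	if (parse_hex_sketch(p) || **p != '\0')
		return -1;
	return 0;
}
```

Both callers reuse the same parser unchanged, which would not be possible if the parser itself insisted on a trailing space.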
> @@ -3008,8 +3063,6 @@ static void parse_ls(struct branch *b)
>  		root = new_tree_entry();
>  		hashcpy(root->versions[1].sha1, e->idx.sha1);
>  		load_tree(root);
> -		if (*p++ != ' ')
> -			die("Missing space after tree-ish: %s", command_buf.buf);

(here's the caller).

Except where noted above, this looks good.

Thanks and hope that helps,
Jonathan