* [BUG] rev-list doesn't validate arguments to -n option @ 2023-12-07 22:12 Britton Kerin 2023-12-08 20:36 ` Junio C Hamano 0 siblings, 1 reply; 6+ messages in thread From: Britton Kerin @ 2023-12-07 22:12 UTC (permalink / raw) To: git It tolerates non-numeric arguments and garbage after a number: For example: $ # -n 1 means same as -n 0: $ git rev-list -n q newest_commit $ git rev-list -n 0 newest_commit $ # Garbage after number is tolerated: $ git rev-list -n 1q newest_commit 3be33f83695088d968cf084a1a08bdcde25a8d7a $ git rev-list -n 2q newest_commit 3be33f83695088d968cf084a1a08bdcde25a8d7a 286e62e1b68e39334978e6222cbff187ecc17bcf ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [BUG] rev-list doesn't validate arguments to -n option 2023-12-07 22:12 [BUG] rev-list doesn't validate arguments to -n option Britton Kerin @ 2023-12-08 20:36 ` Junio C Hamano 2023-12-08 22:35 ` [PATCH] revision: parse integer arguments to --max-count, --skip, etc., more carefully Junio C Hamano 0 siblings, 1 reply; 6+ messages in thread From: Junio C Hamano @ 2023-12-08 20:36 UTC (permalink / raw) To: git; +Cc: Britton Kerin Britton Kerin <britton.kerin@gmail.com> writes: > It tolerates non-numeric arguments and garbage after a number: > > For example: > > $ # -n 1 means same as -n 0: > $ git rev-list -n q newest_commit > $ git rev-list -n 0 newest_commit > $ # Garbage after number is tolerated: > $ git rev-list -n 1q newest_commit > 3be33f83695088d968cf084a1a08bdcde25a8d7a > $ git rev-list -n 2q newest_commit > 3be33f83695088d968cf084a1a08bdcde25a8d7a > 286e62e1b68e39334978e6222cbff187ecc17bcf Indeed, unlike the option parsing for most commands, "rev-list" and "log" family of commands in the oldest part of the system still use hand-crafted option parsing and most notably use atoi() instead of a more robust strtol() family of functions. The same issue is present for parsing things like "--max-count=1q" and "--skip=1q". Perhaps a change along this line, plus a few new tests, would suffice. There are others (like "--max-age" we see in the post context of the patch) that use atoi() but they probably should not share the same machinery (most importantly, the type of the value) as "--max-count", so I didn't touch it in the below. revision.c | 21 ++++++++++++++------- 1 file changed, 14 insertions(+), 7 deletions(-) diff --git c/revision.c w/revision.c index 00d5c29bfc..99e838bbd1 100644 --- c/revision.c +++ w/revision.c @@ -2223,6 +2223,15 @@ static void add_message_grep(struct rev_info *revs, const char *pattern) add_grep(revs, pattern, GREP_PATTERN_BODY); } +static int parse_count(const char *arg) +{ + int count; + + if (strtol_i(arg, 10, &count) < 0 || count < 0) + die("'%s': not a non-negative integer", arg); + return count; +} + static int handle_revision_opt(struct rev_info *revs, int argc, const char **argv, int *unkc, const char **unkv, const struct setup_revision_opt* opt) @@ -2249,26 +2258,24 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg } if ((argcount = parse_long_opt("max-count", argv, &optarg))) { - revs->max_count = atoi(optarg); + revs->max_count = parse_count(optarg); revs->no_walk = 0; return argcount; } else if ((argcount = parse_long_opt("skip", argv, &optarg))) { - revs->skip_count = atoi(optarg); + revs->skip_count = parse_count(optarg); return argcount; } else if ((*arg == '-') && isdigit(arg[1])) { /* accept -<digit>, like traditional "head" */ - if (strtol_i(arg + 1, 10, &revs->max_count) < 0 || - revs->max_count < 0) - die("'%s': not a non-negative integer", arg + 1); + revs->max_count = parse_count(arg + 1); revs->no_walk = 0; } else if (!strcmp(arg, "-n")) { if (argc <= 1) return error("-n requires an argument"); - revs->max_count = atoi(argv[1]); + revs->max_count = parse_count(argv[1]); revs->no_walk = 0; return 2; } else if (skip_prefix(arg, "-n", &optarg)) { - revs->max_count = atoi(optarg); + revs->max_count = parse_count(optarg); revs->no_walk = 0; } else if ((argcount = parse_long_opt("max-age", argv, &optarg))) { revs->max_age = atoi(optarg); ^ permalink raw reply related [flat|nested] 6+ messages in thread
* [PATCH] revision: parse integer arguments to --max-count, --skip, etc., more carefully 2023-12-08 20:36 ` Junio C Hamano @ 2023-12-08 22:35 ` Junio C Hamano 2023-12-12 1:30 ` Jeff King 0 siblings, 1 reply; 6+ messages in thread From: Junio C Hamano @ 2023-12-08 22:35 UTC (permalink / raw) To: git; +Cc: Britton Kerin The "rev-list" and other commands in the "log" family, being the oldest part of the system, use their own custom argument parsers, and integer values of some options are parsed with atoi(), which allows a non-digit after the number (e.g., "1q") to be silently ignored. As a natural consequence, an argument that does not begin with a digit (e.g., "q") silently becomes zero, too. Switch to use strtol_i() and parse_timestamp() appropriately to catch bogus input. Note that one may naïvely expect that --max-count, --skip, etc., to only take non-negative values, but we must allow them to also take negative values, as an escape hatch to countermand a limit set by an earlier option on the command line; the underlying variables are initialized to (-1) and "--max-count=-1", for example, is a legitimate way to reinitialize the limit. Signed-off-by: Junio C Hamano <gitster@pobox.com> --- revision.c | 41 ++++++++++++++++++++++++++++---------- t/t6005-rev-list-count.sh | 15 +++++++++++++- t/t6009-rev-list-parent.sh | 11 ++++++++++ 3 files changed, 55 insertions(+), 12 deletions(-) diff --git a/revision.c b/revision.c index 0ae1c76db3..8cda6e43d4 100644 --- a/revision.c +++ b/revision.c @@ -2214,6 +2214,27 @@ static void add_message_grep(struct rev_info *revs, const char *pattern) add_grep(revs, pattern, GREP_PATTERN_BODY); } +static int parse_count(const char *arg) +{ + int count; + + if (strtol_i(arg, 10, &count) < 0) + die("'%s': not an integer", arg); + return count; +} + +static timestamp_t parse_age(const char *arg) +{ + timestamp_t num; + char *p; + + errno = 0; + num = parse_timestamp(arg, &p, 10); + if (errno || *p || p == arg) + die("'%s': not a number of seconds since epoch", arg); + return num; +} + static int handle_revision_opt(struct rev_info *revs, int argc, const char **argv, int *unkc, const char **unkv, const struct setup_revision_opt* opt) @@ -2240,29 +2261,27 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg } if ((argcount = parse_long_opt("max-count", argv, &optarg))) { - revs->max_count = atoi(optarg); + revs->max_count = parse_count(optarg); revs->no_walk = 0; return argcount; } else if ((argcount = parse_long_opt("skip", argv, &optarg))) { - revs->skip_count = atoi(optarg); + revs->skip_count = parse_count(optarg); return argcount; } else if ((*arg == '-') && isdigit(arg[1])) { /* accept -<digit>, like traditional "head" */ - if (strtol_i(arg + 1, 10, &revs->max_count) < 0 || - revs->max_count < 0) - die("'%s': not a non-negative integer", arg + 1); + revs->max_count = parse_count(arg + 1); revs->no_walk = 0; } else if (!strcmp(arg, "-n")) { if (argc <= 1) return error("-n requires an argument"); - revs->max_count = atoi(argv[1]); + revs->max_count = parse_count(argv[1]); revs->no_walk = 0; return 2; } else if (skip_prefix(arg, "-n", &optarg)) { - revs->max_count = atoi(optarg); + revs->max_count = parse_count(optarg); revs->no_walk = 0; } else if ((argcount = parse_long_opt("max-age", argv, &optarg))) { - revs->max_age = atoi(optarg); + revs->max_age = parse_age(optarg); return argcount; } else if ((argcount = parse_long_opt("since", argv, &optarg))) { revs->max_age = approxidate(optarg); @@ -2274,7 +2293,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg revs->max_age = approxidate(optarg); return argcount; } else if ((argcount = parse_long_opt("min-age", argv, &optarg))) { - revs->min_age = atoi(optarg); + revs->min_age = parse_age(optarg); return argcount; } else if ((argcount = parse_long_opt("before", argv, &optarg))) { revs->min_age = approxidate(optarg); @@ -2362,11 +2381,11 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg } else if (!strcmp(arg, "--no-merges")) { revs->max_parents = 1; } else if (skip_prefix(arg, "--min-parents=", &optarg)) { - revs->min_parents = atoi(optarg); + revs->min_parents = parse_count(optarg); } else if (!strcmp(arg, "--no-min-parents")) { revs->min_parents = 0; } else if (skip_prefix(arg, "--max-parents=", &optarg)) { - revs->max_parents = atoi(optarg); + revs->max_parents = parse_count(optarg); } else if (!strcmp(arg, "--no-max-parents")) { revs->max_parents = -1; } else if (!strcmp(arg, "--boundary")) { diff --git a/t/t6005-rev-list-count.sh b/t/t6005-rev-list-count.sh index 0729f800c3..62538f5a02 100755 --- a/t/t6005-rev-list-count.sh +++ b/t/t6005-rev-list-count.sh @@ -18,13 +18,23 @@ test_expect_success 'no options' ' ' test_expect_success '--max-count' ' + test_must_fail git rev-list --max-count=1q HEAD 2>error && + grep "not an integer" error && + test_stdout_line_count = 0 git rev-list HEAD --max-count=0 && test_stdout_line_count = 3 git rev-list HEAD --max-count=3 && test_stdout_line_count = 5 git rev-list HEAD --max-count=5 && - test_stdout_line_count = 5 git rev-list HEAD --max-count=10 + test_stdout_line_count = 5 git rev-list HEAD --max-count=10 && + test_stdout_line_count = 5 git rev-list HEAD --max-count=-1 ' test_expect_success '--max-count all forms' ' + test_must_fail git rev-list -1q HEAD 2>error && + grep "not an integer" error && + test_must_fail git rev-list --1 HEAD && + test_must_fail git rev-list -n 1q HEAD 2>error && + grep "not an integer" error && + test_stdout_line_count = 1 git rev-list HEAD --max-count=1 && test_stdout_line_count = 1 git rev-list HEAD -1 && test_stdout_line_count = 1 git rev-list HEAD -n1 && @@ -32,6 +42,9 @@ test_expect_success '--max-count all forms' ' ' test_expect_success '--skip' ' + test_must_fail git rev-list --skip 1q HEAD 2>error && + grep "not an integer" error && + test_stdout_line_count = 5 git rev-list HEAD --skip=0 && test_stdout_line_count = 2 git rev-list HEAD --skip=3 && test_stdout_line_count = 0 git rev-list HEAD --skip=5 && diff --git a/t/t6009-rev-list-parent.sh b/t/t6009-rev-list-parent.sh index 5a67bbc760..9c9a8459af 100755 --- a/t/t6009-rev-list-parent.sh +++ b/t/t6009-rev-list-parent.sh @@ -62,6 +62,17 @@ test_expect_success 'setup roots, merges and octopuses' ' git checkout main ' +test_expect_success 'parse --max-parents & --min-parents' ' + test_must_fail git rev-list --max-parents=1q HEAD 2>error && + grep "not an integer" error && + + test_must_fail git rev-list --min-parents=1q HEAD 2>error && + grep "not an integer" error && + + git rev-list --max-parents=1 --min-parents=1 HEAD && + git rev-list --max-parents=-1 --min-parents=-1 HEAD +' + test_expect_success 'rev-list roots' ' check_revlist "--max-parents=0" one five -- 2.43.0 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: [PATCH] revision: parse integer arguments to --max-count, --skip, etc., more carefully 2023-12-08 22:35 ` [PATCH] revision: parse integer arguments to --max-count, --skip, etc., more carefully Junio C Hamano @ 2023-12-12 1:30 ` Jeff King 2023-12-12 15:09 ` Junio C Hamano 0 siblings, 1 reply; 6+ messages in thread From: Jeff King @ 2023-12-12 1:30 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Britton Kerin On Sat, Dec 09, 2023 at 07:35:23AM +0900, Junio C Hamano wrote: > The "rev-list" and other commands in the "log" family, being the > oldest part of the system, use their own custom argument parsers, > and integer values of some options are parsed with atoi(), which > allows a non-digit after the number (e.g., "1q") to be silently > ignored. As a natural consequence, an argument that does not begin > with a digit (e.g., "q") silently becomes zero, too. > > Switch to use strtol_i() and parse_timestamp() appropriately to > catch bogus input. > > Note that one may naïvely expect that --max-count, --skip, etc., to > only take non-negative values, but we must allow them to also take > negative values, as an escape hatch to countermand a limit set by an > earlier option on the command line; the underlying variables are > initialized to (-1) and "--max-count=-1", for example, is a > legitimate way to reinitialize the limit. This all looks pretty reasonable to me. I couldn't help but think, though, that surely we have some helpers for this already? But the closest seems to be git_parse_int(), which also allows unit factors. I'm not sure if allowing "-n 1k" would be a feature or a bug. ;) I guess "strtol_i()" maybe is that helper already, though I did not even know it existed. Looks like it goes back to 2007, and is seldom used. I wonder if there are more spots that could benefit. I don't think there is any such helper for timestamps, but the checks in your parser look good (strtol_i() checks for overflow as we cast to int, but I don't think we need to do the same here since timestamp_t and parse_timestamp() should be matched). -Peff ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] revision: parse integer arguments to --max-count, --skip, etc., more carefully 2023-12-12 1:30 ` Jeff King @ 2023-12-12 15:09 ` Junio C Hamano 2023-12-12 22:05 ` Jeff King 0 siblings, 1 reply; 6+ messages in thread From: Junio C Hamano @ 2023-12-12 15:09 UTC (permalink / raw) To: Jeff King; +Cc: git, Britton Kerin Jeff King <peff@peff.net> writes: > This all looks pretty reasonable to me. > > I couldn't help but think, though, that surely we have some helpers for > this already? But the closest seems to be git_parse_int(), which also > allows unit factors. I'm not sure if allowing "-n 1k" would be a feature > or a bug. ;) The change in question is meant to be a pure fix to replace a careless use of atoi(). I do not mind to see a separate patch to add such a feature later on top. > I guess "strtol_i()" maybe is that helper already, though I did not even > know it existed. Looks like it goes back to 2007, and is seldom used. I didn't know about it, either ;-) I used it only because there is one use of it, among places that used atoi() I wanted to tighten. > I wonder if there are more spots that could benefit. "git grep -e 'atoi('" would give somebody motivated a decent set of #microproject ideas, but many hits are not suited for strtol_i(), which is designed to parse an integer at the end of a string. Some places use atoi() immediately followed by strspn() to skip over digits, which means they are parsing an integer and want to continue reading after the integer, which is incompatible with what strtol_i() wants to do. They need either a separate helper or an updated strtol_i() that optionally allows you to parse the prefix and report where the integer ended, e.g., something like: #define strtol_i(s,b,r) strtol_i2((s), (b), (r), NULL) static inline int strtol_i2(char const *s, int base, int *result, char **endp) { long ul; char *dummy = NULL; if (!endp) endp = &dummy; errno = 0; ul = strtol(s, &endp, base); if (errno || /* * if we are told to parse to the end of the string by * passing NULL to endp, it is an error to have any * remaining character after the digits. */ (dummy && *dummy) || endp == s || (int) ul != ul) return -1; *result = ul; return 0; } perhaps. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: [PATCH] revision: parse integer arguments to --max-count, --skip, etc., more carefully 2023-12-12 15:09 ` Junio C Hamano @ 2023-12-12 22:05 ` Jeff King 0 siblings, 0 replies; 6+ messages in thread From: Jeff King @ 2023-12-12 22:05 UTC (permalink / raw) To: Junio C Hamano; +Cc: git, Britton Kerin On Tue, Dec 12, 2023 at 07:09:02AM -0800, Junio C Hamano wrote: > Jeff King <peff@peff.net> writes: > > > This all looks pretty reasonable to me. > > > > I couldn't help but think, though, that surely we have some helpers for > > this already? But the closest seems to be git_parse_int(), which also > > allows unit factors. I'm not sure if allowing "-n 1k" would be a feature > > or a bug. ;) > > The change in question is meant to be a pure fix to replace a careless > use of atoi(). I do not mind to see a separate patch to add such a > feature later on top. Oh, I mostly meant that I would have turned to git_parse_int() as that already-existing helper, but it is not suitable because of the extra unit-handling. I think your patch draws the line in the right place. > > I wonder if there are more spots that could benefit. > > "git grep -e 'atoi('" would give somebody motivated a decent set of > #microproject ideas, but many hits are not suited for strtol_i(), > which is designed to parse an integer at the end of a string. Some > places use atoi() immediately followed by strspn() to skip over > digits, which means they are parsing an integer and want to continue > reading after the integer, which is incompatible with what > strtol_i() wants to do. They need either a separate helper or an > updated strtol_i() that optionally allows you to parse the prefix > and report where the integer ended, e.g., something like: Yeah, I agree this might be a good microproject (or leftoverbits) area, and the semantics for the helper you propose make sense to me. -Peff ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2023-12-12 22:05 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2023-12-07 22:12 [BUG] rev-list doesn't validate arguments to -n option Britton Kerin 2023-12-08 20:36 ` Junio C Hamano 2023-12-08 22:35 ` [PATCH] revision: parse integer arguments to --max-count, --skip, etc., more carefully Junio C Hamano 2023-12-12 1:30 ` Jeff King 2023-12-12 15:09 ` Junio C Hamano 2023-12-12 22:05 ` Jeff King
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).