git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 0/6] parse-options: long option parsing fixes and cleanup
@ 2024-02-24 21:29 René Scharfe
  2024-02-24 21:29 ` [PATCH 1/6] parse-options: recognize abbreviated negated option with arg René Scharfe
                   ` (7 more replies)
  0 siblings, 8 replies; 23+ messages in thread
From: René Scharfe @ 2024-02-24 21:29 UTC (permalink / raw)
  To: git

Fix two corner cases in patches 1 and 4, refactor the code in patches 2
and 3 to prepare the second fix, bonus patches 5 and 6 simplify the code
and improve readability.

  parse-options: recognize abbreviated negated option with arg
  parse-options: set arg of abbreviated option lazily
  parse-options: factor out register_abbrev() and struct parsed_option
  parse-options: detect ambiguous self-negation
  parse-options: normalize arg and long_name before comparison
  parse-options: rearrange long_name matching code

 parse-options.c               | 137 ++++++++++++++++++----------------
 t/t0040-parse-options.sh      |  16 ++++
 t/t1502-rev-parse-parseopt.sh |  11 +++
 3 files changed, 100 insertions(+), 64 deletions(-)

--
2.44.0


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH 1/6] parse-options: recognize abbreviated negated option with arg
  2024-02-24 21:29 [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
@ 2024-02-24 21:29 ` René Scharfe
  2024-02-24 21:29 ` [PATCH 2/6] parse-options: set arg of abbreviated option lazily René Scharfe
                   ` (6 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: René Scharfe @ 2024-02-24 21:29 UTC (permalink / raw)
  To: git

Giving an argument to an option that doesn't take one causes Git to
report that error specifically:

   $ git rm --dry-run=bogus
   error: option `dry-run' takes no value

The same is true when the option is negated or abbreviated:

   $ git rm --no-dry-run=bogus
   error: option `no-dry-run' takes no value

   $ git rm --dry=bogus
   error: option `dry-run' takes no value

Not so when doing both, though:

   $ git rm --no-dry=bogus
   error: unknown option `no-dry=bogus'
   usage: git rm [-f | --force] [-n] [-r] [--cached] [--ignore-unmatch]

(Rest of the usage message omitted.)

Improve consistency and usefulness of the error message by recognizing
abbreviated negated options even if they have a (most likely bogus)
argument.  With this patch we get:

   $ git rm --no-dry=bogus
   error: option `no-dry-run' takes no value
---
 parse-options.c          |  5 +++--
 t/t0040-parse-options.sh | 16 ++++++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 63a99dea6e..e4ce33ea48 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -391,7 +391,7 @@ static enum parse_opt_result parse_long_opt(
 					ambiguous_option = abbrev_option;
 					ambiguous_flags = abbrev_flags;
 				}
-				if (!(flags & OPT_UNSET) && *arg_end)
+				if (*arg_end)
 					p->opt = arg_end + 1;
 				abbrev_option = options;
 				abbrev_flags = flags ^ opt_flags;
@@ -412,7 +412,8 @@ static enum parse_opt_result parse_long_opt(
 			if (!skip_prefix(arg + 3, long_name, &rest)) {
 				/* abbreviated and negated? */
 				if (allow_abbrev &&
-				    starts_with(long_name, arg + 3))
+				    !strncmp(long_name, arg + 3,
+					     arg_end - arg - 3))
 					goto is_abbreviated;
 				else
 					continue;
diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh
index ec974867e4..8bb2a8b453 100755
--- a/t/t0040-parse-options.sh
+++ b/t/t0040-parse-options.sh
@@ -210,6 +210,22 @@ test_expect_success 'superfluous value provided: boolean' '
 	test_cmp expect actual
 '

+test_expect_success 'superfluous value provided: boolean, abbreviated' '
+	cat >expect <<-\EOF &&
+	error: option `yes'\'' takes no value
+	EOF
+	test_expect_code 129 env GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=false \
+	test-tool parse-options --ye=hi 2>actual &&
+	test_cmp expect actual &&
+
+	cat >expect <<-\EOF &&
+	error: option `no-yes'\'' takes no value
+	EOF
+	test_expect_code 129 env GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=false \
+	test-tool parse-options --no-ye=hi 2>actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'superfluous value provided: cmdmode' '
 	cat >expect <<-\EOF &&
 	error: option `mode1'\'' takes no value
--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 2/6] parse-options: set arg of abbreviated option lazily
  2024-02-24 21:29 [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
  2024-02-24 21:29 ` [PATCH 1/6] parse-options: recognize abbreviated negated option with arg René Scharfe
@ 2024-02-24 21:29 ` René Scharfe
  2024-02-24 21:29 ` [PATCH 3/6] parse-options: factor out register_abbrev() and struct parsed_option René Scharfe
                   ` (5 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: René Scharfe @ 2024-02-24 21:29 UTC (permalink / raw)
  To: git

Postpone setting the opt pointer until we're about to call get_value(),
which uses it.  There's no point in setting it eagerly for every
abbreviated candidate option, which may turn out to be ambiguous.
Removing this assignment from the loop doesn't noticeably improve the
performance, but allows further simplification.
---
 parse-options.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index e4ce33ea48..056c6b30e9 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -391,8 +391,6 @@ static enum parse_opt_result parse_long_opt(
 					ambiguous_option = abbrev_option;
 					ambiguous_flags = abbrev_flags;
 				}
-				if (*arg_end)
-					p->opt = arg_end + 1;
 				abbrev_option = options;
 				abbrev_flags = flags ^ opt_flags;
 				continue;
@@ -441,8 +439,11 @@ static enum parse_opt_result parse_long_opt(
 			abbrev_option->long_name);
 		return PARSE_OPT_HELP;
 	}
-	if (abbrev_option)
+	if (abbrev_option) {
+		if (*arg_end)
+			p->opt = arg_end + 1;
 		return get_value(p, abbrev_option, abbrev_flags);
+	}
 	return PARSE_OPT_UNKNOWN;
 }

--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 3/6] parse-options: factor out register_abbrev() and struct parsed_option
  2024-02-24 21:29 [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
  2024-02-24 21:29 ` [PATCH 1/6] parse-options: recognize abbreviated negated option with arg René Scharfe
  2024-02-24 21:29 ` [PATCH 2/6] parse-options: set arg of abbreviated option lazily René Scharfe
@ 2024-02-24 21:29 ` René Scharfe
  2024-02-24 23:13   ` Junio C Hamano
  2024-02-24 21:29 ` [PATCH 4/6] parse-options: detect ambiguous self-negation René Scharfe
                   ` (4 subsequent siblings)
  7 siblings, 1 reply; 23+ messages in thread
From: René Scharfe @ 2024-02-24 21:29 UTC (permalink / raw)
  To: git

Add a function, register_abbrev(), for storing the necessary details for
remembering an abbreviated and thus potentially ambiguous option.  Call
it instead of sharing the code using goto, to make the control flow more
explicit.

Conveniently collect these details in the new struct parsed_option to
reduce the number of necessary function arguments.
---
 parse-options.c | 83 +++++++++++++++++++++++++++++--------------------
 1 file changed, 49 insertions(+), 34 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 056c6b30e9..398ebaef14 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -350,14 +350,40 @@ static int is_alias(struct parse_opt_ctx_t *ctx,
 	return 0;
 }

+struct parsed_option {
+	const struct option *option;
+	enum opt_parsed flags;
+};
+
+static void register_abbrev(struct parse_opt_ctx_t *p,
+			    const struct option *option, enum opt_parsed flags,
+			    struct parsed_option *abbrev,
+			    struct parsed_option *ambiguous)
+{
+	if (p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT)
+		return;
+	if (abbrev->option &&
+	    !is_alias(p, abbrev->option, option)) {
+		/*
+		 * If this is abbreviated, it is
+		 * ambiguous. So when there is no
+		 * exact match later, we need to
+		 * error out.
+		 */
+		ambiguous->option = abbrev->option;
+		ambiguous->flags = abbrev->flags;
+	}
+	abbrev->option = option;
+	abbrev->flags = flags;
+}
+
 static enum parse_opt_result parse_long_opt(
 	struct parse_opt_ctx_t *p, const char *arg,
 	const struct option *options)
 {
 	const char *arg_end = strchrnul(arg, '=');
-	const struct option *abbrev_option = NULL, *ambiguous_option = NULL;
-	enum opt_parsed abbrev_flags = OPT_LONG, ambiguous_flags = OPT_LONG;
-	int allow_abbrev = !(p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT);
+	struct parsed_option abbrev = { .option = NULL, .flags = OPT_LONG };
+	struct parsed_option ambiguous = { .option = NULL, .flags = OPT_LONG };

 	for (; options->type != OPTION_END; options++) {
 		const char *rest, *long_name = options->long_name;
@@ -377,31 +403,20 @@ static enum parse_opt_result parse_long_opt(
 			rest = NULL;
 		if (!rest) {
 			/* abbreviated? */
-			if (allow_abbrev &&
-			    !strncmp(long_name, arg, arg_end - arg)) {
-is_abbreviated:
-				if (abbrev_option &&
-				    !is_alias(p, abbrev_option, options)) {
-					/*
-					 * If this is abbreviated, it is
-					 * ambiguous. So when there is no
-					 * exact match later, we need to
-					 * error out.
-					 */
-					ambiguous_option = abbrev_option;
-					ambiguous_flags = abbrev_flags;
-				}
-				abbrev_option = options;
-				abbrev_flags = flags ^ opt_flags;
+			if (!strncmp(long_name, arg, arg_end - arg)) {
+				register_abbrev(p, options, flags ^ opt_flags,
+						&abbrev, &ambiguous);
 				continue;
 			}
 			/* negation allowed? */
 			if (options->flags & PARSE_OPT_NONEG)
 				continue;
 			/* negated and abbreviated very much? */
-			if (allow_abbrev && starts_with("no-", arg)) {
+			if (starts_with("no-", arg)) {
 				flags |= OPT_UNSET;
-				goto is_abbreviated;
+				register_abbrev(p, options, flags ^ opt_flags,
+						&abbrev, &ambiguous);
+				continue;
 			}
 			/* negated? */
 			if (!starts_with(arg, "no-"))
@@ -409,12 +424,12 @@ static enum parse_opt_result parse_long_opt(
 			flags |= OPT_UNSET;
 			if (!skip_prefix(arg + 3, long_name, &rest)) {
 				/* abbreviated and negated? */
-				if (allow_abbrev &&
-				    !strncmp(long_name, arg + 3,
+				if (!strncmp(long_name, arg + 3,
 					     arg_end - arg - 3))
-					goto is_abbreviated;
-				else
-					continue;
+					register_abbrev(p, options,
+							flags ^ opt_flags,
+							&abbrev, &ambiguous);
+				continue;
 			}
 		}
 		if (*rest) {
@@ -425,24 +440,24 @@ static enum parse_opt_result parse_long_opt(
 		return get_value(p, options, flags ^ opt_flags);
 	}

-	if (disallow_abbreviated_options && (ambiguous_option || abbrev_option))
+	if (disallow_abbreviated_options && (ambiguous.option || abbrev.option))
 		die("disallowed abbreviated or ambiguous option '%.*s'",
 		    (int)(arg_end - arg), arg);

-	if (ambiguous_option) {
+	if (ambiguous.option) {
 		error(_("ambiguous option: %s "
 			"(could be --%s%s or --%s%s)"),
 			arg,
-			(ambiguous_flags & OPT_UNSET) ?  "no-" : "",
-			ambiguous_option->long_name,
-			(abbrev_flags & OPT_UNSET) ?  "no-" : "",
-			abbrev_option->long_name);
+			(ambiguous.flags & OPT_UNSET) ?  "no-" : "",
+			ambiguous.option->long_name,
+			(abbrev.flags & OPT_UNSET) ?  "no-" : "",
+			abbrev.option->long_name);
 		return PARSE_OPT_HELP;
 	}
-	if (abbrev_option) {
+	if (abbrev.option) {
 		if (*arg_end)
 			p->opt = arg_end + 1;
-		return get_value(p, abbrev_option, abbrev_flags);
+		return get_value(p, abbrev.option, abbrev.flags);
 	}
 	return PARSE_OPT_UNKNOWN;
 }
--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 4/6] parse-options: detect ambiguous self-negation
  2024-02-24 21:29 [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
                   ` (2 preceding siblings ...)
  2024-02-24 21:29 ` [PATCH 3/6] parse-options: factor out register_abbrev() and struct parsed_option René Scharfe
@ 2024-02-24 21:29 ` René Scharfe
  2024-02-24 21:29 ` [PATCH 5/6] parse-options: normalize arg and long_name before comparison René Scharfe
                   ` (3 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: René Scharfe @ 2024-02-24 21:29 UTC (permalink / raw)
  To: git

Git currently does not detect the ambiguity of an option that starts
with "no" like --notes and its negated form if given just --n or --no.
All Git commands with such options have other negatable options, and
we detect the ambiguity with them, so that's currently only a potential
problem for scripts that use git rev-parse --parseopt.

Let's fix it nevertheless, as there's no need for that confusion.  To
detect the ambiguity we have to loosen the check in register_abbrev(),
as an option is considered an alias of itself.  Add non-matching
negation flags as a criterion to recognize an option being ambiguous
with its negated form.

And we need to keep going after finding a non-negated option as an
abbreviated candidate and perform the negation checks in the same
loop.
---
 parse-options.c               |  3 +--
 t/t1502-rev-parse-parseopt.sh | 11 +++++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 398ebaef14..008c0f32cf 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -363,7 +363,7 @@ static void register_abbrev(struct parse_opt_ctx_t *p,
 	if (p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT)
 		return;
 	if (abbrev->option &&
-	    !is_alias(p, abbrev->option, option)) {
+	    !(abbrev->flags == flags && is_alias(p, abbrev->option, option))) {
 		/*
 		 * If this is abbreviated, it is
 		 * ambiguous. So when there is no
@@ -406,7 +406,6 @@ static enum parse_opt_result parse_long_opt(
 			if (!strncmp(long_name, arg, arg_end - arg)) {
 				register_abbrev(p, options, flags ^ opt_flags,
 						&abbrev, &ambiguous);
-				continue;
 			}
 			/* negation allowed? */
 			if (options->flags & PARSE_OPT_NONEG)
diff --git a/t/t1502-rev-parse-parseopt.sh b/t/t1502-rev-parse-parseopt.sh
index f0737593c3..b754b9fd74 100755
--- a/t/t1502-rev-parse-parseopt.sh
+++ b/t/t1502-rev-parse-parseopt.sh
@@ -322,4 +322,15 @@ check_invalid_long_option optionspec-neg --no-positive-only
 check_invalid_long_option optionspec-neg --negative
 check_invalid_long_option optionspec-neg --no-no-negative

+test_expect_success 'ambiguous: --no matches both --noble and --no-noble' '
+	cat >spec <<-\EOF &&
+	some-command [options]
+	--
+	noble The feudal switch.
+	EOF
+	test_expect_code 129 env GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=false \
+	git rev-parse --parseopt -- <spec 2>err --no &&
+	grep "error: ambiguous option: no (could be --noble or --no-noble)" err
+'
+
 test_done
--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 5/6] parse-options: normalize arg and long_name before comparison
  2024-02-24 21:29 [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
                   ` (3 preceding siblings ...)
  2024-02-24 21:29 ` [PATCH 4/6] parse-options: detect ambiguous self-negation René Scharfe
@ 2024-02-24 21:29 ` René Scharfe
  2024-02-24 21:29 ` [PATCH 6/6] parse-options: rearrange long_name matching code René Scharfe
                   ` (2 subsequent siblings)
  7 siblings, 0 replies; 23+ messages in thread
From: René Scharfe @ 2024-02-24 21:29 UTC (permalink / raw)
  To: git

Strip "no-" from arg and long_name before comparing them.  This way we
no longer have to repeat the comparison with an offset of 3 for negated
arguments.

Note that we must not modify the "flags" value, which tracks whether arg
is negated, inside the loop.  When registering "--n", "--no" or "--no-"
as abbreviation for any negative option, we used to OR it with OPT_UNSET
and end the loop.  We can simply hard-code OPT_UNSET and leave flags
unchanged instead.
---
 parse-options.c | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 008c0f32cf..d45efa4e5c 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -382,28 +382,42 @@ static enum parse_opt_result parse_long_opt(
 	const struct option *options)
 {
 	const char *arg_end = strchrnul(arg, '=');
+	const char *arg_start = arg;
+	enum opt_parsed flags = OPT_LONG;
+	int arg_starts_with_no_no = 0;
 	struct parsed_option abbrev = { .option = NULL, .flags = OPT_LONG };
 	struct parsed_option ambiguous = { .option = NULL, .flags = OPT_LONG };

+	if (skip_prefix(arg_start, "no-", &arg_start)) {
+		if (skip_prefix(arg_start, "no-", &arg_start))
+			arg_starts_with_no_no = 1;
+		else
+			flags |= OPT_UNSET;
+	}
+
 	for (; options->type != OPTION_END; options++) {
 		const char *rest, *long_name = options->long_name;
-		enum opt_parsed flags = OPT_LONG, opt_flags = OPT_LONG;
+		enum opt_parsed opt_flags = OPT_LONG;
+		int allow_unset = !(options->flags & PARSE_OPT_NONEG);

 		if (options->type == OPTION_SUBCOMMAND)
 			continue;
 		if (!long_name)
 			continue;

-		if (!starts_with(arg, "no-") &&
-		    !(options->flags & PARSE_OPT_NONEG) &&
-		    skip_prefix(long_name, "no-", &long_name))
+		if (skip_prefix(long_name, "no-", &long_name))
 			opt_flags |= OPT_UNSET;
+		else if (arg_starts_with_no_no)
+			continue;
+
+		if (((flags ^ opt_flags) & OPT_UNSET) && !allow_unset)
+			continue;

-		if (!skip_prefix(arg, long_name, &rest))
+		if (!skip_prefix(arg_start, long_name, &rest))
 			rest = NULL;
 		if (!rest) {
 			/* abbreviated? */
-			if (!strncmp(long_name, arg, arg_end - arg)) {
+			if (!strncmp(long_name, arg_start, arg_end - arg_start)) {
 				register_abbrev(p, options, flags ^ opt_flags,
 						&abbrev, &ambiguous);
 			}
@@ -412,24 +426,10 @@ static enum parse_opt_result parse_long_opt(
 				continue;
 			/* negated and abbreviated very much? */
 			if (starts_with("no-", arg)) {
-				flags |= OPT_UNSET;
-				register_abbrev(p, options, flags ^ opt_flags,
+				register_abbrev(p, options, OPT_UNSET ^ opt_flags,
 						&abbrev, &ambiguous);
-				continue;
-			}
-			/* negated? */
-			if (!starts_with(arg, "no-"))
-				continue;
-			flags |= OPT_UNSET;
-			if (!skip_prefix(arg + 3, long_name, &rest)) {
-				/* abbreviated and negated? */
-				if (!strncmp(long_name, arg + 3,
-					     arg_end - arg - 3))
-					register_abbrev(p, options,
-							flags ^ opt_flags,
-							&abbrev, &ambiguous);
-				continue;
 			}
+			continue;
 		}
 		if (*rest) {
 			if (*rest != '=')
--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH 6/6] parse-options: rearrange long_name matching code
  2024-02-24 21:29 [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
                   ` (4 preceding siblings ...)
  2024-02-24 21:29 ` [PATCH 5/6] parse-options: normalize arg and long_name before comparison René Scharfe
@ 2024-02-24 21:29 ` René Scharfe
  2024-02-24 21:42 ` [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
  2024-03-03 12:19 ` [PATCH v2 " René Scharfe
  7 siblings, 0 replies; 23+ messages in thread
From: René Scharfe @ 2024-02-24 21:29 UTC (permalink / raw)
  To: git

Move the code for handling a full match of long_name first and get rid
of negations.  Reduce the indent of the code for matching abbreviations
and remove unnecessary curly braces.  Combine the checks for whether
negation is allowed and whether arg is "n", "no" or "no-" because they
belong together and avoid a continue statement.  The result is shorter,
more readable code.
---
 parse-options.c | 37 +++++++++++++++----------------------
 1 file changed, 15 insertions(+), 22 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index d45efa4e5c..30b9e68f8a 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -413,30 +413,23 @@ static enum parse_opt_result parse_long_opt(
 		if (((flags ^ opt_flags) & OPT_UNSET) && !allow_unset)
 			continue;

-		if (!skip_prefix(arg_start, long_name, &rest))
-			rest = NULL;
-		if (!rest) {
-			/* abbreviated? */
-			if (!strncmp(long_name, arg_start, arg_end - arg_start)) {
-				register_abbrev(p, options, flags ^ opt_flags,
-						&abbrev, &ambiguous);
-			}
-			/* negation allowed? */
-			if (options->flags & PARSE_OPT_NONEG)
+		if (skip_prefix(arg_start, long_name, &rest)) {
+			if (*rest == '=')
+				p->opt = rest + 1;
+			else if (*rest)
 				continue;
-			/* negated and abbreviated very much? */
-			if (starts_with("no-", arg)) {
-				register_abbrev(p, options, OPT_UNSET ^ opt_flags,
-						&abbrev, &ambiguous);
-			}
-			continue;
+			return get_value(p, options, flags ^ opt_flags);
 		}
-		if (*rest) {
-			if (*rest != '=')
-				continue;
-			p->opt = rest + 1;
-		}
-		return get_value(p, options, flags ^ opt_flags);
+
+		/* abbreviated? */
+		if (!strncmp(long_name, arg_start, arg_end - arg_start))
+			register_abbrev(p, options, flags ^ opt_flags,
+					&abbrev, &ambiguous);
+
+		/* negated and abbreviated very much? */
+		if (allow_unset && starts_with("no-", arg))
+			register_abbrev(p, options, OPT_UNSET ^ opt_flags,
+					&abbrev, &ambiguous);
 	}

 	if (disallow_abbreviated_options && (ambiguous.option || abbrev.option))
--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH 0/6] parse-options: long option parsing fixes and cleanup
  2024-02-24 21:29 [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
                   ` (5 preceding siblings ...)
  2024-02-24 21:29 ` [PATCH 6/6] parse-options: rearrange long_name matching code René Scharfe
@ 2024-02-24 21:42 ` René Scharfe
  2024-03-03 12:19 ` [PATCH v2 " René Scharfe
  7 siblings, 0 replies; 23+ messages in thread
From: René Scharfe @ 2024-02-24 21:42 UTC (permalink / raw)
  To: git

Am 24.02.24 um 22:29 schrieb René Scharfe:
> Fix two corner cases in patches 1 and 4, refactor the code in patches 2
> and 3 to prepare the second fix, bonus patches 5 and 6 simplify the code
> and improve readability.

Forgot to sign-off, d'oh!  Will wait for feedback and sign-off in v2..

>
>   parse-options: recognize abbreviated negated option with arg
>   parse-options: set arg of abbreviated option lazily
>   parse-options: factor out register_abbrev() and struct parsed_option
>   parse-options: detect ambiguous self-negation
>   parse-options: normalize arg and long_name before comparison
>   parse-options: rearrange long_name matching code
>
>  parse-options.c               | 137 ++++++++++++++++++----------------
>  t/t0040-parse-options.sh      |  16 ++++
>  t/t1502-rev-parse-parseopt.sh |  11 +++
>  3 files changed, 100 insertions(+), 64 deletions(-)
>
> --
> 2.44.0
>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH 3/6] parse-options: factor out register_abbrev() and struct parsed_option
  2024-02-24 21:29 ` [PATCH 3/6] parse-options: factor out register_abbrev() and struct parsed_option René Scharfe
@ 2024-02-24 23:13   ` Junio C Hamano
  0 siblings, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2024-02-24 23:13 UTC (permalink / raw)
  To: René Scharfe; +Cc: git

René Scharfe <l.s.r@web.de> writes:

>  static enum parse_opt_result parse_long_opt(
>  	struct parse_opt_ctx_t *p, const char *arg,
>  	const struct option *options)
>  {
>  	const char *arg_end = strchrnul(arg, '=');
> -	const struct option *abbrev_option = NULL, *ambiguous_option = NULL;
> -	enum opt_parsed abbrev_flags = OPT_LONG, ambiguous_flags = OPT_LONG;
> -	int allow_abbrev = !(p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT);
> +	struct parsed_option abbrev = { .option = NULL, .flags = OPT_LONG };
> +	struct parsed_option ambiguous = { .option = NULL, .flags = OPT_LONG };

There is this "allow_abbrev" thing ...

> ...
>  			if (!skip_prefix(arg + 3, long_name, &rest)) {
>  				/* abbreviated and negated? */
> -				if (allow_abbrev &&
> -				    !strncmp(long_name, arg + 3,
> +				if (!strncmp(long_name, arg + 3,
>  					     arg_end - arg - 3))
> -					goto is_abbreviated;
> -				else
> -					continue;
> +					register_abbrev(p, options,
> +							flags ^ opt_flags,
> +							&abbrev, &ambiguous);
> +				continue;
>  			}
>  		}

... whose use goes away completely from the loop.  We still call
register_abbrev(), but the helper function becomes no-op when
p->flags had KEEP_UNKNOWN_OPT bit set, so everything cancels out.

OK.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 0/6] parse-options: long option parsing fixes and cleanup
  2024-02-24 21:29 [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
                   ` (6 preceding siblings ...)
  2024-02-24 21:42 ` [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
@ 2024-03-03 12:19 ` René Scharfe
  2024-03-03 12:19   ` [PATCH v2 1/6] parse-options: recognize abbreviated negated option with arg René Scharfe
                     ` (6 more replies)
  7 siblings, 7 replies; 23+ messages in thread
From: René Scharfe @ 2024-03-03 12:19 UTC (permalink / raw)
  To: git

Fix two corner cases in patches 1 and 4, refactor the code in patches 2
and 3 to prepare the second fix, bonus patches 5 and 6 simplify the code
and improve readability.

Changes since v1:
* Added sign-off.

  parse-options: recognize abbreviated negated option with arg
  parse-options: set arg of abbreviated option lazily
  parse-options: factor out register_abbrev() and struct parsed_option
  parse-options: detect ambiguous self-negation
  parse-options: normalize arg and long_name before comparison
  parse-options: rearrange long_name matching code

 parse-options.c               | 137 ++++++++++++++++++----------------
 t/t0040-parse-options.sh      |  16 ++++
 t/t1502-rev-parse-parseopt.sh |  11 +++
 3 files changed, 100 insertions(+), 64 deletions(-)

--
2.44.0


^ permalink raw reply	[flat|nested] 23+ messages in thread

* [PATCH v2 1/6] parse-options: recognize abbreviated negated option with arg
  2024-03-03 12:19 ` [PATCH v2 " René Scharfe
@ 2024-03-03 12:19   ` René Scharfe
  2024-03-12 21:42     ` Josh Steadmon
  2024-03-03 12:19   ` [PATCH v2 2/6] parse-options: set arg of abbreviated option lazily René Scharfe
                     ` (5 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: René Scharfe @ 2024-03-03 12:19 UTC (permalink / raw)
  To: git

Giving an argument to an option that doesn't take one causes Git to
report that error specifically:

   $ git rm --dry-run=bogus
   error: option `dry-run' takes no value

The same is true when the option is negated or abbreviated:

   $ git rm --no-dry-run=bogus
   error: option `no-dry-run' takes no value

   $ git rm --dry=bogus
   error: option `dry-run' takes no value

Not so when doing both, though:

   $ git rm --no-dry=bogus
   error: unknown option `no-dry=bogus'
   usage: git rm [-f | --force] [-n] [-r] [--cached] [--ignore-unmatch]

(Rest of the usage message omitted.)

Improve consistency and usefulness of the error message by recognizing
abbreviated negated options even if they have a (most likely bogus)
argument.  With this patch we get:

   $ git rm --no-dry=bogus
   error: option `no-dry-run' takes no value

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 parse-options.c          |  5 +++--
 t/t0040-parse-options.sh | 16 ++++++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 63a99dea6e..e4ce33ea48 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -391,7 +391,7 @@ static enum parse_opt_result parse_long_opt(
 					ambiguous_option = abbrev_option;
 					ambiguous_flags = abbrev_flags;
 				}
-				if (!(flags & OPT_UNSET) && *arg_end)
+				if (*arg_end)
 					p->opt = arg_end + 1;
 				abbrev_option = options;
 				abbrev_flags = flags ^ opt_flags;
@@ -412,7 +412,8 @@ static enum parse_opt_result parse_long_opt(
 			if (!skip_prefix(arg + 3, long_name, &rest)) {
 				/* abbreviated and negated? */
 				if (allow_abbrev &&
-				    starts_with(long_name, arg + 3))
+				    !strncmp(long_name, arg + 3,
+					     arg_end - arg - 3))
 					goto is_abbreviated;
 				else
 					continue;
diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh
index ec974867e4..8bb2a8b453 100755
--- a/t/t0040-parse-options.sh
+++ b/t/t0040-parse-options.sh
@@ -210,6 +210,22 @@ test_expect_success 'superfluous value provided: boolean' '
 	test_cmp expect actual
 '

+test_expect_success 'superfluous value provided: boolean, abbreviated' '
+	cat >expect <<-\EOF &&
+	error: option `yes'\'' takes no value
+	EOF
+	test_expect_code 129 env GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=false \
+	test-tool parse-options --ye=hi 2>actual &&
+	test_cmp expect actual &&
+
+	cat >expect <<-\EOF &&
+	error: option `no-yes'\'' takes no value
+	EOF
+	test_expect_code 129 env GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=false \
+	test-tool parse-options --no-ye=hi 2>actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'superfluous value provided: cmdmode' '
 	cat >expect <<-\EOF &&
 	error: option `mode1'\'' takes no value
--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 2/6] parse-options: set arg of abbreviated option lazily
  2024-03-03 12:19 ` [PATCH v2 " René Scharfe
  2024-03-03 12:19   ` [PATCH v2 1/6] parse-options: recognize abbreviated negated option with arg René Scharfe
@ 2024-03-03 12:19   ` René Scharfe
  2024-03-03 12:19   ` [PATCH v2 3/6] parse-options: factor out register_abbrev() and struct parsed_option René Scharfe
                     ` (4 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: René Scharfe @ 2024-03-03 12:19 UTC (permalink / raw)
  To: git

Postpone setting the opt pointer until we're about to call get_value(),
which uses it.  There's no point in setting it eagerly for every
abbreviated candidate option, which may turn out to be ambiguous.
Removing this assignment from the loop doesn't noticeably improve the
performance, but allows further simplification.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 parse-options.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index e4ce33ea48..056c6b30e9 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -391,8 +391,6 @@ static enum parse_opt_result parse_long_opt(
 					ambiguous_option = abbrev_option;
 					ambiguous_flags = abbrev_flags;
 				}
-				if (*arg_end)
-					p->opt = arg_end + 1;
 				abbrev_option = options;
 				abbrev_flags = flags ^ opt_flags;
 				continue;
@@ -441,8 +439,11 @@ static enum parse_opt_result parse_long_opt(
 			abbrev_option->long_name);
 		return PARSE_OPT_HELP;
 	}
-	if (abbrev_option)
+	if (abbrev_option) {
+		if (*arg_end)
+			p->opt = arg_end + 1;
 		return get_value(p, abbrev_option, abbrev_flags);
+	}
 	return PARSE_OPT_UNKNOWN;
 }

--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 3/6] parse-options: factor out register_abbrev() and struct parsed_option
  2024-03-03 12:19 ` [PATCH v2 " René Scharfe
  2024-03-03 12:19   ` [PATCH v2 1/6] parse-options: recognize abbreviated negated option with arg René Scharfe
  2024-03-03 12:19   ` [PATCH v2 2/6] parse-options: set arg of abbreviated option lazily René Scharfe
@ 2024-03-03 12:19   ` René Scharfe
  2024-03-12 21:43     ` Josh Steadmon
  2024-03-03 12:19   ` [PATCH v2 4/6] parse-options: detect ambiguous self-negation René Scharfe
                     ` (3 subsequent siblings)
  6 siblings, 1 reply; 23+ messages in thread
From: René Scharfe @ 2024-03-03 12:19 UTC (permalink / raw)
  To: git

Add a function, register_abbrev(), for storing the necessary details for
remembering an abbreviated and thus potentially ambiguous option.  Call
it instead of sharing the code using goto, to make the control flow more
explicit.

Conveniently collect these details in the new struct parsed_option to
reduce the number of necessary function arguments.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 parse-options.c | 83 +++++++++++++++++++++++++++++--------------------
 1 file changed, 49 insertions(+), 34 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 056c6b30e9..398ebaef14 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -350,14 +350,40 @@ static int is_alias(struct parse_opt_ctx_t *ctx,
 	return 0;
 }

+struct parsed_option {
+	const struct option *option;
+	enum opt_parsed flags;
+};
+
+static void register_abbrev(struct parse_opt_ctx_t *p,
+			    const struct option *option, enum opt_parsed flags,
+			    struct parsed_option *abbrev,
+			    struct parsed_option *ambiguous)
+{
+	if (p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT)
+		return;
+	if (abbrev->option &&
+	    !is_alias(p, abbrev->option, option)) {
+		/*
+		 * If this is abbreviated, it is
+		 * ambiguous. So when there is no
+		 * exact match later, we need to
+		 * error out.
+		 */
+		ambiguous->option = abbrev->option;
+		ambiguous->flags = abbrev->flags;
+	}
+	abbrev->option = option;
+	abbrev->flags = flags;
+}
+
 static enum parse_opt_result parse_long_opt(
 	struct parse_opt_ctx_t *p, const char *arg,
 	const struct option *options)
 {
 	const char *arg_end = strchrnul(arg, '=');
-	const struct option *abbrev_option = NULL, *ambiguous_option = NULL;
-	enum opt_parsed abbrev_flags = OPT_LONG, ambiguous_flags = OPT_LONG;
-	int allow_abbrev = !(p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT);
+	struct parsed_option abbrev = { .option = NULL, .flags = OPT_LONG };
+	struct parsed_option ambiguous = { .option = NULL, .flags = OPT_LONG };

 	for (; options->type != OPTION_END; options++) {
 		const char *rest, *long_name = options->long_name;
@@ -377,31 +403,20 @@ static enum parse_opt_result parse_long_opt(
 			rest = NULL;
 		if (!rest) {
 			/* abbreviated? */
-			if (allow_abbrev &&
-			    !strncmp(long_name, arg, arg_end - arg)) {
-is_abbreviated:
-				if (abbrev_option &&
-				    !is_alias(p, abbrev_option, options)) {
-					/*
-					 * If this is abbreviated, it is
-					 * ambiguous. So when there is no
-					 * exact match later, we need to
-					 * error out.
-					 */
-					ambiguous_option = abbrev_option;
-					ambiguous_flags = abbrev_flags;
-				}
-				abbrev_option = options;
-				abbrev_flags = flags ^ opt_flags;
+			if (!strncmp(long_name, arg, arg_end - arg)) {
+				register_abbrev(p, options, flags ^ opt_flags,
+						&abbrev, &ambiguous);
 				continue;
 			}
 			/* negation allowed? */
 			if (options->flags & PARSE_OPT_NONEG)
 				continue;
 			/* negated and abbreviated very much? */
-			if (allow_abbrev && starts_with("no-", arg)) {
+			if (starts_with("no-", arg)) {
 				flags |= OPT_UNSET;
-				goto is_abbreviated;
+				register_abbrev(p, options, flags ^ opt_flags,
+						&abbrev, &ambiguous);
+				continue;
 			}
 			/* negated? */
 			if (!starts_with(arg, "no-"))
@@ -409,12 +424,12 @@ static enum parse_opt_result parse_long_opt(
 			flags |= OPT_UNSET;
 			if (!skip_prefix(arg + 3, long_name, &rest)) {
 				/* abbreviated and negated? */
-				if (allow_abbrev &&
-				    !strncmp(long_name, arg + 3,
+				if (!strncmp(long_name, arg + 3,
 					     arg_end - arg - 3))
-					goto is_abbreviated;
-				else
-					continue;
+					register_abbrev(p, options,
+							flags ^ opt_flags,
+							&abbrev, &ambiguous);
+				continue;
 			}
 		}
 		if (*rest) {
@@ -425,24 +440,24 @@ static enum parse_opt_result parse_long_opt(
 		return get_value(p, options, flags ^ opt_flags);
 	}

-	if (disallow_abbreviated_options && (ambiguous_option || abbrev_option))
+	if (disallow_abbreviated_options && (ambiguous.option || abbrev.option))
 		die("disallowed abbreviated or ambiguous option '%.*s'",
 		    (int)(arg_end - arg), arg);

-	if (ambiguous_option) {
+	if (ambiguous.option) {
 		error(_("ambiguous option: %s "
 			"(could be --%s%s or --%s%s)"),
 			arg,
-			(ambiguous_flags & OPT_UNSET) ?  "no-" : "",
-			ambiguous_option->long_name,
-			(abbrev_flags & OPT_UNSET) ?  "no-" : "",
-			abbrev_option->long_name);
+			(ambiguous.flags & OPT_UNSET) ?  "no-" : "",
+			ambiguous.option->long_name,
+			(abbrev.flags & OPT_UNSET) ?  "no-" : "",
+			abbrev.option->long_name);
 		return PARSE_OPT_HELP;
 	}
-	if (abbrev_option) {
+	if (abbrev.option) {
 		if (*arg_end)
 			p->opt = arg_end + 1;
-		return get_value(p, abbrev_option, abbrev_flags);
+		return get_value(p, abbrev.option, abbrev.flags);
 	}
 	return PARSE_OPT_UNKNOWN;
 }
--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 4/6] parse-options: detect ambiguous self-negation
  2024-03-03 12:19 ` [PATCH v2 " René Scharfe
                     ` (2 preceding siblings ...)
  2024-03-03 12:19   ` [PATCH v2 3/6] parse-options: factor out register_abbrev() and struct parsed_option René Scharfe
@ 2024-03-03 12:19   ` René Scharfe
  2024-03-03 12:19   ` [PATCH v2 5/6] parse-options: normalize arg and long_name before comparison René Scharfe
                     ` (2 subsequent siblings)
  6 siblings, 0 replies; 23+ messages in thread
From: René Scharfe @ 2024-03-03 12:19 UTC (permalink / raw)
  To: git

Git currently does not detect the ambiguity of an option that starts
with "no" like --notes and its negated form if given just --n or --no.
All Git commands with such options have other negatable options, and
we detect the ambiguity with them, so that's currently only a potential
problem for scripts that use git rev-parse --parseopt.

Let's fix it nevertheless, as there's no need for that confusion.  To
detect the ambiguity we have to loosen the check in register_abbrev(),
as an option is considered an alias of itself.  Add non-matching
negation flags as a criterion to recognize an option being ambiguous
with its negated form.

And we need to keep going after finding a non-negated option as an
abbreviated candidate and perform the negation checks in the same
loop.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 parse-options.c               |  3 +--
 t/t1502-rev-parse-parseopt.sh | 11 +++++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 398ebaef14..008c0f32cf 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -363,7 +363,7 @@ static void register_abbrev(struct parse_opt_ctx_t *p,
 	if (p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT)
 		return;
 	if (abbrev->option &&
-	    !is_alias(p, abbrev->option, option)) {
+	    !(abbrev->flags == flags && is_alias(p, abbrev->option, option))) {
 		/*
 		 * If this is abbreviated, it is
 		 * ambiguous. So when there is no
@@ -406,7 +406,6 @@ static enum parse_opt_result parse_long_opt(
 			if (!strncmp(long_name, arg, arg_end - arg)) {
 				register_abbrev(p, options, flags ^ opt_flags,
 						&abbrev, &ambiguous);
-				continue;
 			}
 			/* negation allowed? */
 			if (options->flags & PARSE_OPT_NONEG)
diff --git a/t/t1502-rev-parse-parseopt.sh b/t/t1502-rev-parse-parseopt.sh
index f0737593c3..b754b9fd74 100755
--- a/t/t1502-rev-parse-parseopt.sh
+++ b/t/t1502-rev-parse-parseopt.sh
@@ -322,4 +322,15 @@ check_invalid_long_option optionspec-neg --no-positive-only
 check_invalid_long_option optionspec-neg --negative
 check_invalid_long_option optionspec-neg --no-no-negative

+test_expect_success 'ambiguous: --no matches both --noble and --no-noble' '
+	cat >spec <<-\EOF &&
+	some-command [options]
+	--
+	noble The feudal switch.
+	EOF
+	test_expect_code 129 env GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=false \
+	git rev-parse --parseopt -- <spec 2>err --no &&
+	grep "error: ambiguous option: no (could be --noble or --no-noble)" err
+'
+
 test_done
--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 5/6] parse-options: normalize arg and long_name before comparison
  2024-03-03 12:19 ` [PATCH v2 " René Scharfe
                     ` (3 preceding siblings ...)
  2024-03-03 12:19   ` [PATCH v2 4/6] parse-options: detect ambiguous self-negation René Scharfe
@ 2024-03-03 12:19   ` René Scharfe
  2024-03-12 21:43     ` Josh Steadmon
  2024-03-03 12:19   ` [PATCH v2 6/6] parse-options: rearrange long_name matching code René Scharfe
  2024-03-12 21:45   ` [PATCH v2 0/6] parse-options: long option parsing fixes and cleanup Josh Steadmon
  6 siblings, 1 reply; 23+ messages in thread
From: René Scharfe @ 2024-03-03 12:19 UTC (permalink / raw)
  To: git

Strip "no-" from arg and long_name before comparing them.  This way we
no longer have to repeat the comparison with an offset of 3 for negated
arguments.

Note that we must not modify the "flags" value, which tracks whether arg
is negated, inside the loop.  When registering "--n", "--no" or "--no-"
as abbreviation for any negative option, we used to OR it with OPT_UNSET
and end the loop.  We can simply hard-code OPT_UNSET and leave flags
unchanged instead.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 parse-options.c | 44 ++++++++++++++++++++++----------------------
 1 file changed, 22 insertions(+), 22 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index 008c0f32cf..d45efa4e5c 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -382,28 +382,42 @@ static enum parse_opt_result parse_long_opt(
 	const struct option *options)
 {
 	const char *arg_end = strchrnul(arg, '=');
+	const char *arg_start = arg;
+	enum opt_parsed flags = OPT_LONG;
+	int arg_starts_with_no_no = 0;
 	struct parsed_option abbrev = { .option = NULL, .flags = OPT_LONG };
 	struct parsed_option ambiguous = { .option = NULL, .flags = OPT_LONG };

+	if (skip_prefix(arg_start, "no-", &arg_start)) {
+		if (skip_prefix(arg_start, "no-", &arg_start))
+			arg_starts_with_no_no = 1;
+		else
+			flags |= OPT_UNSET;
+	}
+
 	for (; options->type != OPTION_END; options++) {
 		const char *rest, *long_name = options->long_name;
-		enum opt_parsed flags = OPT_LONG, opt_flags = OPT_LONG;
+		enum opt_parsed opt_flags = OPT_LONG;
+		int allow_unset = !(options->flags & PARSE_OPT_NONEG);

 		if (options->type == OPTION_SUBCOMMAND)
 			continue;
 		if (!long_name)
 			continue;

-		if (!starts_with(arg, "no-") &&
-		    !(options->flags & PARSE_OPT_NONEG) &&
-		    skip_prefix(long_name, "no-", &long_name))
+		if (skip_prefix(long_name, "no-", &long_name))
 			opt_flags |= OPT_UNSET;
+		else if (arg_starts_with_no_no)
+			continue;
+
+		if (((flags ^ opt_flags) & OPT_UNSET) && !allow_unset)
+			continue;

-		if (!skip_prefix(arg, long_name, &rest))
+		if (!skip_prefix(arg_start, long_name, &rest))
 			rest = NULL;
 		if (!rest) {
 			/* abbreviated? */
-			if (!strncmp(long_name, arg, arg_end - arg)) {
+			if (!strncmp(long_name, arg_start, arg_end - arg_start)) {
 				register_abbrev(p, options, flags ^ opt_flags,
 						&abbrev, &ambiguous);
 			}
@@ -412,24 +426,10 @@ static enum parse_opt_result parse_long_opt(
 				continue;
 			/* negated and abbreviated very much? */
 			if (starts_with("no-", arg)) {
-				flags |= OPT_UNSET;
-				register_abbrev(p, options, flags ^ opt_flags,
+				register_abbrev(p, options, OPT_UNSET ^ opt_flags,
 						&abbrev, &ambiguous);
-				continue;
-			}
-			/* negated? */
-			if (!starts_with(arg, "no-"))
-				continue;
-			flags |= OPT_UNSET;
-			if (!skip_prefix(arg + 3, long_name, &rest)) {
-				/* abbreviated and negated? */
-				if (!strncmp(long_name, arg + 3,
-					     arg_end - arg - 3))
-					register_abbrev(p, options,
-							flags ^ opt_flags,
-							&abbrev, &ambiguous);
-				continue;
 			}
+			continue;
 		}
 		if (*rest) {
 			if (*rest != '=')
--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* [PATCH v2 6/6] parse-options: rearrange long_name matching code
  2024-03-03 12:19 ` [PATCH v2 " René Scharfe
                     ` (4 preceding siblings ...)
  2024-03-03 12:19   ` [PATCH v2 5/6] parse-options: normalize arg and long_name before comparison René Scharfe
@ 2024-03-03 12:19   ` René Scharfe
  2024-03-12 21:44     ` Josh Steadmon
  2024-03-12 21:45   ` [PATCH v2 0/6] parse-options: long option parsing fixes and cleanup Josh Steadmon
  6 siblings, 1 reply; 23+ messages in thread
From: René Scharfe @ 2024-03-03 12:19 UTC (permalink / raw)
  To: git

Move the code for handling a full match of long_name first and get rid
of negations.  Reduce the indent of the code for matching abbreviations
and remove unnecessary curly braces.  Combine the checks for whether
negation is allowed and whether arg is "n", "no" or "no-" because they
belong together and avoid a continue statement.  The result is shorter,
more readable code.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
 parse-options.c | 37 +++++++++++++++----------------------
 1 file changed, 15 insertions(+), 22 deletions(-)

diff --git a/parse-options.c b/parse-options.c
index d45efa4e5c..30b9e68f8a 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -413,30 +413,23 @@ static enum parse_opt_result parse_long_opt(
 		if (((flags ^ opt_flags) & OPT_UNSET) && !allow_unset)
 			continue;

-		if (!skip_prefix(arg_start, long_name, &rest))
-			rest = NULL;
-		if (!rest) {
-			/* abbreviated? */
-			if (!strncmp(long_name, arg_start, arg_end - arg_start)) {
-				register_abbrev(p, options, flags ^ opt_flags,
-						&abbrev, &ambiguous);
-			}
-			/* negation allowed? */
-			if (options->flags & PARSE_OPT_NONEG)
+		if (skip_prefix(arg_start, long_name, &rest)) {
+			if (*rest == '=')
+				p->opt = rest + 1;
+			else if (*rest)
 				continue;
-			/* negated and abbreviated very much? */
-			if (starts_with("no-", arg)) {
-				register_abbrev(p, options, OPT_UNSET ^ opt_flags,
-						&abbrev, &ambiguous);
-			}
-			continue;
+			return get_value(p, options, flags ^ opt_flags);
 		}
-		if (*rest) {
-			if (*rest != '=')
-				continue;
-			p->opt = rest + 1;
-		}
-		return get_value(p, options, flags ^ opt_flags);
+
+		/* abbreviated? */
+		if (!strncmp(long_name, arg_start, arg_end - arg_start))
+			register_abbrev(p, options, flags ^ opt_flags,
+					&abbrev, &ambiguous);
+
+		/* negated and abbreviated very much? */
+		if (allow_unset && starts_with("no-", arg))
+			register_abbrev(p, options, OPT_UNSET ^ opt_flags,
+					&abbrev, &ambiguous);
 	}

 	if (disallow_abbreviated_options && (ambiguous.option || abbrev.option))
--
2.44.0


^ permalink raw reply related	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 1/6] parse-options: recognize abbreviated negated option with arg
  2024-03-03 12:19   ` [PATCH v2 1/6] parse-options: recognize abbreviated negated option with arg René Scharfe
@ 2024-03-12 21:42     ` Josh Steadmon
  0 siblings, 0 replies; 23+ messages in thread
From: Josh Steadmon @ 2024-03-12 21:42 UTC (permalink / raw)
  To: René Scharfe; +Cc: git

On 2024.03.03 13:19, René Scharfe wrote:
> Giving an argument to an option that doesn't take one causes Git to
> report that error specifically:
> 
>    $ git rm --dry-run=bogus
>    error: option `dry-run' takes no value
> 
> The same is true when the option is negated or abbreviated:
> 
>    $ git rm --no-dry-run=bogus
>    error: option `no-dry-run' takes no value
> 
>    $ git rm --dry=bogus
>    error: option `dry-run' takes no value
> 
> Not so when doing both, though:
> 
>    $ git rm --no-dry=bogus
>    error: unknown option `no-dry=bogus'
>    usage: git rm [-f | --force] [-n] [-r] [--cached] [--ignore-unmatch]
> 
> (Rest of the usage message omitted.)
> 
> Improve consistency and usefulness of the error message by recognizing
> abbreviated negated options even if they have a (most likely bogus)
> argument.  With this patch we get:
> 
>    $ git rm --no-dry=bogus
>    error: option `no-dry-run' takes no value
> 
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  parse-options.c          |  5 +++--
>  t/t0040-parse-options.sh | 16 ++++++++++++++++
>  2 files changed, 19 insertions(+), 2 deletions(-)
> 
> diff --git a/parse-options.c b/parse-options.c
> index 63a99dea6e..e4ce33ea48 100644
> --- a/parse-options.c
> +++ b/parse-options.c
> @@ -391,7 +391,7 @@ static enum parse_opt_result parse_long_opt(
>  					ambiguous_option = abbrev_option;
>  					ambiguous_flags = abbrev_flags;
>  				}
> -				if (!(flags & OPT_UNSET) && *arg_end)
> +				if (*arg_end)
>  					p->opt = arg_end + 1;
>  				abbrev_option = options;
>  				abbrev_flags = flags ^ opt_flags;

So this part allows us to recognize when we've attached an option onto a
negated flag (like "--no-foo=bar"). Looks good.


> @@ -412,7 +412,8 @@ static enum parse_opt_result parse_long_opt(
>  			if (!skip_prefix(arg + 3, long_name, &rest)) {
>  				/* abbreviated and negated? */
>  				if (allow_abbrev &&
> -				    starts_with(long_name, arg + 3))
> +				    !strncmp(long_name, arg + 3,
> +					     arg_end - arg - 3))
>  					goto is_abbreviated;
>  				else
>  					continue;

And now we fix the abbreviation match. So if arg="no-foo=bar", (arg+3) =
"foo=bar" and thus never matches a long_name. We fix that by only
comparing up to the first equal sign. Also looks good.

> diff --git a/t/t0040-parse-options.sh b/t/t0040-parse-options.sh
> index ec974867e4..8bb2a8b453 100755
> --- a/t/t0040-parse-options.sh
> +++ b/t/t0040-parse-options.sh
> @@ -210,6 +210,22 @@ test_expect_success 'superfluous value provided: boolean' '
>  	test_cmp expect actual
>  '
> 
> +test_expect_success 'superfluous value provided: boolean, abbreviated' '
> +	cat >expect <<-\EOF &&
> +	error: option `yes'\'' takes no value
> +	EOF
> +	test_expect_code 129 env GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=false \
> +	test-tool parse-options --ye=hi 2>actual &&
> +	test_cmp expect actual &&
> +
> +	cat >expect <<-\EOF &&
> +	error: option `no-yes'\'' takes no value
> +	EOF
> +	test_expect_code 129 env GIT_TEST_DISALLOW_ABBREVIATED_OPTIONS=false \
> +	test-tool parse-options --no-ye=hi 2>actual &&
> +	test_cmp expect actual
> +'
> +
>  test_expect_success 'superfluous value provided: cmdmode' '
>  	cat >expect <<-\EOF &&
>  	error: option `mode1'\'' takes no value
> --
> 2.44.0

Test looks good as well.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/6] parse-options: factor out register_abbrev() and struct parsed_option
  2024-03-03 12:19   ` [PATCH v2 3/6] parse-options: factor out register_abbrev() and struct parsed_option René Scharfe
@ 2024-03-12 21:43     ` Josh Steadmon
  2024-03-13 17:48       ` Junio C Hamano
  2024-03-13 18:47       ` René Scharfe
  0 siblings, 2 replies; 23+ messages in thread
From: Josh Steadmon @ 2024-03-12 21:43 UTC (permalink / raw)
  To: René Scharfe; +Cc: git

I found this change to be hard to follow, although I'm not sure anything
actually needs to be changed. Thinking aloud below, apologies for being
verbose.

On 2024.03.03 13:19, René Scharfe wrote:
> Add a function, register_abbrev(), for storing the necessary details for
> remembering an abbreviated and thus potentially ambiguous option.  Call
> it instead of sharing the code using goto, to make the control flow more
> explicit.
> 
> Conveniently collect these details in the new struct parsed_option to
> reduce the number of necessary function arguments.
> 
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  parse-options.c | 83 +++++++++++++++++++++++++++++--------------------
>  1 file changed, 49 insertions(+), 34 deletions(-)
> 
> diff --git a/parse-options.c b/parse-options.c
> index 056c6b30e9..398ebaef14 100644
> --- a/parse-options.c
> +++ b/parse-options.c
> @@ -350,14 +350,40 @@ static int is_alias(struct parse_opt_ctx_t *ctx,
>  	return 0;
>  }
> 
> +struct parsed_option {
> +	const struct option *option;
> +	enum opt_parsed flags;
> +};

We're bundling up the state for previously-examined options that "arg"
might expand to. Looks good.


> +static void register_abbrev(struct parse_opt_ctx_t *p,
> +			    const struct option *option, enum opt_parsed flags,
> +			    struct parsed_option *abbrev,
> +			    struct parsed_option *ambiguous)
> +{

Here we're defining a function to separate out the logic that we
previously jumped to with "goto is_abbreviated;". Looks good.


> +	if (p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT)
> +		return;

This is the (negation of the) allow_abbrev condition that was removed
below. Looks good.

> +	if (abbrev->option &&
> +	    !is_alias(p, abbrev->option, option)) {
> +		/*
> +		 * If this is abbreviated, it is
> +		 * ambiguous. So when there is no
> +		 * exact match later, we need to
> +		 * error out.
> +		 */
> +		ambiguous->option = abbrev->option;
> +		ambiguous->flags = abbrev->flags;

Couldn't this just be "*ambiguous = *abbrev;" ?


> +	}
> +	abbrev->option = option;
> +	abbrev->flags = flags;
> +}

So, we've found a candidate that matches our abbreviation. If this is
the second (or later) candidate, then we've got an ambiguous
abbreviation, which we'll need to warn about later. Since we're
overwriting both "ambiguous" and "abbrev", we'll only refer to the last
two candidates seen, but that seems fine.

>  static enum parse_opt_result parse_long_opt(
>  	struct parse_opt_ctx_t *p, const char *arg,
>  	const struct option *options)
>  {
>  	const char *arg_end = strchrnul(arg, '=');
> -	const struct option *abbrev_option = NULL, *ambiguous_option = NULL;
> -	enum opt_parsed abbrev_flags = OPT_LONG, ambiguous_flags = OPT_LONG;
> -	int allow_abbrev = !(p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT);
> +	struct parsed_option abbrev = { .option = NULL, .flags = OPT_LONG };
> +	struct parsed_option ambiguous = { .option = NULL, .flags = OPT_LONG };
> 
>  	for (; options->type != OPTION_END; options++) {
>  		const char *rest, *long_name = options->long_name;
> @@ -377,31 +403,20 @@ static enum parse_opt_result parse_long_opt(
>  			rest = NULL;
>  		if (!rest) {
>  			/* abbreviated? */
> -			if (allow_abbrev &&
> -			    !strncmp(long_name, arg, arg_end - arg)) {
> -is_abbreviated:
> -				if (abbrev_option &&
> -				    !is_alias(p, abbrev_option, options)) {
> -					/*
> -					 * If this is abbreviated, it is
> -					 * ambiguous. So when there is no
> -					 * exact match later, we need to
> -					 * error out.
> -					 */
> -					ambiguous_option = abbrev_option;
> -					ambiguous_flags = abbrev_flags;
> -				}
> -				abbrev_option = options;
> -				abbrev_flags = flags ^ opt_flags;
> +			if (!strncmp(long_name, arg, arg_end - arg)) {
> +				register_abbrev(p, options, flags ^ opt_flags,
> +						&abbrev, &ambiguous);

This part I found hard to follow; the change itself is a simple
replacement of the original logic with our new function, but I don't
understand the original logic in the first place. Why do we XOR flags
and opt_flags here?

These are both bitflags which can hold a combination of OPT_SHORT and/or
OPT_UNSET (i.e., we've passed --no-foo). Since we're only looking at
args that we already know are in long form, the only one we should need
to worry about is whether OPT_UNSET is true or false. And indeed, we
never set OPT_SHORT in this function.

IIUC, "flags" corresponds to "arg", the flag we're trying to parse, and
"opt_flags" corresponds to "options", the current item in the list of
all options that we're trying to match against.

So OPT_UNSET will be true in "flags" if either "arg" starts with "no-"
or if it is a prefix of "no-".

OPT_UNSET will be true in "opt_flags" if all of the following are true:
* "arg" does not start with "no-"
* PARSE_OPT_NONEG is not set in options->flags
* options->long_name starts with "no-"

So OPT_UNSET can never be set at the same time in both "flags" and
"opt_flags", and thus the XOR makes a bit more sense: either the
argument we're parsing or the canonical form of the candidate that we're
matching against is negated. We don't care which one, but we need to
know about the negation later, to either set the value properly, or to
report potential ambiguities with the "no-" prefix.

>  				continue;
>  			}
>  			/* negation allowed? */
>  			if (options->flags & PARSE_OPT_NONEG)
>  				continue;
>  			/* negated and abbreviated very much? */
> -			if (allow_abbrev && starts_with("no-", arg)) {
> +			if (starts_with("no-", arg)) {
>  				flags |= OPT_UNSET;
> -				goto is_abbreviated;
> +				register_abbrev(p, options, flags ^ opt_flags,
> +						&abbrev, &ambiguous);
> +				continue;

This is a slight change: we might set OPT_UNSET in flags where before we
were prevented by the "allow_abbrev" condition, but register_abbrev will
still be a no-op in that case, so I don't think this changes any
behavior.

All the other changes in this patch are straightforward. So, despite
having to puzzle out the original logic, everything here looks good.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 5/6] parse-options: normalize arg and long_name before comparison
  2024-03-03 12:19   ` [PATCH v2 5/6] parse-options: normalize arg and long_name before comparison René Scharfe
@ 2024-03-12 21:43     ` Josh Steadmon
  0 siblings, 0 replies; 23+ messages in thread
From: Josh Steadmon @ 2024-03-12 21:43 UTC (permalink / raw)
  To: René Scharfe; +Cc: git

On 2024.03.03 13:19, René Scharfe wrote:
> Strip "no-" from arg and long_name before comparing them.  This way we
> no longer have to repeat the comparison with an offset of 3 for negated
> arguments.
> 
> Note that we must not modify the "flags" value, which tracks whether arg
> is negated, inside the loop.  When registering "--n", "--no" or "--no-"
> as abbreviation for any negative option, we used to OR it with OPT_UNSET
> and end the loop.  We can simply hard-code OPT_UNSET and leave flags
> unchanged instead.
> 
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  parse-options.c | 44 ++++++++++++++++++++++----------------------
>  1 file changed, 22 insertions(+), 22 deletions(-)
> 
> diff --git a/parse-options.c b/parse-options.c
> index 008c0f32cf..d45efa4e5c 100644
> --- a/parse-options.c
> +++ b/parse-options.c
> @@ -382,28 +382,42 @@ static enum parse_opt_result parse_long_opt(
>  	const struct option *options)
>  {
>  	const char *arg_end = strchrnul(arg, '=');
> +	const char *arg_start = arg;
> +	enum opt_parsed flags = OPT_LONG;
> +	int arg_starts_with_no_no = 0;
>  	struct parsed_option abbrev = { .option = NULL, .flags = OPT_LONG };
>  	struct parsed_option ambiguous = { .option = NULL, .flags = OPT_LONG };
> 
> +	if (skip_prefix(arg_start, "no-", &arg_start)) {
> +		if (skip_prefix(arg_start, "no-", &arg_start))
> +			arg_starts_with_no_no = 1;
> +		else
> +			flags |= OPT_UNSET;
> +	}
> +
>  	for (; options->type != OPTION_END; options++) {
>  		const char *rest, *long_name = options->long_name;
> -		enum opt_parsed flags = OPT_LONG, opt_flags = OPT_LONG;
> +		enum opt_parsed opt_flags = OPT_LONG;
> +		int allow_unset = !(options->flags & PARSE_OPT_NONEG);
> 
>  		if (options->type == OPTION_SUBCOMMAND)
>  			continue;
>  		if (!long_name)
>  			continue;
> 
> -		if (!starts_with(arg, "no-") &&
> -		    !(options->flags & PARSE_OPT_NONEG) &&
> -		    skip_prefix(long_name, "no-", &long_name))
> +		if (skip_prefix(long_name, "no-", &long_name))
>  			opt_flags |= OPT_UNSET;
> +		else if (arg_starts_with_no_no)
> +			continue;

We only allow double negation ("--no-no-foo") if the canonical form
starts with "--no-". Makes sense.


> +		if (((flags ^ opt_flags) & OPT_UNSET) && !allow_unset)
> +			continue;

I don't think the "& OPT_UNSET" is required here, as we never set any
flags other than OPT_UNSET, but I think it's fine to keep it for
clarity.


> -		if (!skip_prefix(arg, long_name, &rest))
> +		if (!skip_prefix(arg_start, long_name, &rest))
>  			rest = NULL;
>  		if (!rest) {
>  			/* abbreviated? */
> -			if (!strncmp(long_name, arg, arg_end - arg)) {
> +			if (!strncmp(long_name, arg_start, arg_end - arg_start)) {
>  				register_abbrev(p, options, flags ^ opt_flags,
>  						&abbrev, &ambiguous);
>  			}
> @@ -412,24 +426,10 @@ static enum parse_opt_result parse_long_opt(
>  				continue;
>  			/* negated and abbreviated very much? */
>  			if (starts_with("no-", arg)) {
> -				flags |= OPT_UNSET;
> -				register_abbrev(p, options, flags ^ opt_flags,
> +				register_abbrev(p, options, OPT_UNSET ^ opt_flags,
>  						&abbrev, &ambiguous);
> -				continue;
> -			}
> -			/* negated? */
> -			if (!starts_with(arg, "no-"))
> -				continue;
> -			flags |= OPT_UNSET;
> -			if (!skip_prefix(arg + 3, long_name, &rest)) {
> -				/* abbreviated and negated? */
> -				if (!strncmp(long_name, arg + 3,
> -					     arg_end - arg - 3))
> -					register_abbrev(p, options,
> -							flags ^ opt_flags,
> -							&abbrev, &ambiguous);
> -				continue;
>  			}
> +			continue;
>  		}
>  		if (*rest) {
>  			if (*rest != '=')
> --
> 2.44.0
> 
> 

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 6/6] parse-options: rearrange long_name matching code
  2024-03-03 12:19   ` [PATCH v2 6/6] parse-options: rearrange long_name matching code René Scharfe
@ 2024-03-12 21:44     ` Josh Steadmon
  0 siblings, 0 replies; 23+ messages in thread
From: Josh Steadmon @ 2024-03-12 21:44 UTC (permalink / raw)
  To: René Scharfe; +Cc: git

On 2024.03.03 13:19, René Scharfe wrote:
> Move the code for handling a full match of long_name first and get rid
> of negations.  Reduce the indent of the code for matching abbreviations
> and remove unnecessary curly braces.  Combine the checks for whether
> negation is allowed and whether arg is "n", "no" or "no-" because they
> belong together and avoid a continue statement.  The result is shorter,
> more readable code.
> 
> Signed-off-by: René Scharfe <l.s.r@web.de>
> ---
>  parse-options.c | 37 +++++++++++++++----------------------
>  1 file changed, 15 insertions(+), 22 deletions(-)

This version is much more readable, thanks for the cleanup!

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 0/6] parse-options: long option parsing fixes and cleanup
  2024-03-03 12:19 ` [PATCH v2 " René Scharfe
                     ` (5 preceding siblings ...)
  2024-03-03 12:19   ` [PATCH v2 6/6] parse-options: rearrange long_name matching code René Scharfe
@ 2024-03-12 21:45   ` Josh Steadmon
  6 siblings, 0 replies; 23+ messages in thread
From: Josh Steadmon @ 2024-03-12 21:45 UTC (permalink / raw)
  To: René Scharfe; +Cc: git

On 2024.03.03 13:19, René Scharfe wrote:
> Fix two corner cases in patches 1 and 4, refactor the code in patches 2
> and 3 to prepare the second fix, bonus patches 5 and 6 simplify the code
> and improve readability.
> 
> Changes since v1:
> * Added sign-off.
> 
>   parse-options: recognize abbreviated negated option with arg
>   parse-options: set arg of abbreviated option lazily
>   parse-options: factor out register_abbrev() and struct parsed_option
>   parse-options: detect ambiguous self-negation
>   parse-options: normalize arg and long_name before comparison
>   parse-options: rearrange long_name matching code
> 
>  parse-options.c               | 137 ++++++++++++++++++----------------
>  t/t0040-parse-options.sh      |  16 ++++
>  t/t1502-rev-parse-parseopt.sh |  11 +++
>  3 files changed, 100 insertions(+), 64 deletions(-)
> 
> --
> 2.44.0
> 
> 

This series is a good cleanup of some code that at least to me was hard
to figure out on first read. Thanks for the series!

Reviewed-by: Josh Steadmon <steadmon@google.com>

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/6] parse-options: factor out register_abbrev() and struct parsed_option
  2024-03-12 21:43     ` Josh Steadmon
@ 2024-03-13 17:48       ` Junio C Hamano
  2024-03-13 18:47       ` René Scharfe
  1 sibling, 0 replies; 23+ messages in thread
From: Junio C Hamano @ 2024-03-13 17:48 UTC (permalink / raw)
  To: Josh Steadmon; +Cc: René Scharfe, git

Josh Steadmon <steadmon@google.com> writes:

> I found this change to be hard to follow, although I'm not sure anything
> actually needs to be changed. Thinking aloud below, apologies for being
> verbose.

Thanks for carefully following the code.  It unfortunately has to
get long, but this is the kind of review that I would appreciate
most if I were the author, pointing out what is easy to understand
and more importantly what is harder to follow.

^ permalink raw reply	[flat|nested] 23+ messages in thread

* Re: [PATCH v2 3/6] parse-options: factor out register_abbrev() and struct parsed_option
  2024-03-12 21:43     ` Josh Steadmon
  2024-03-13 17:48       ` Junio C Hamano
@ 2024-03-13 18:47       ` René Scharfe
  1 sibling, 0 replies; 23+ messages in thread
From: René Scharfe @ 2024-03-13 18:47 UTC (permalink / raw)
  To: Josh Steadmon, git

Am 12.03.24 um 22:43 schrieb Josh Steadmon:
> I found this change to be hard to follow, although I'm not sure anything
> actually needs to be changed. Thinking aloud below, apologies for being
> verbose.

Thank you for taking the time to review and share!

> On 2024.03.03 13:19, René Scharfe wrote:
>> Add a function, register_abbrev(), for storing the necessary details for
>> remembering an abbreviated and thus potentially ambiguous option.  Call
>> it instead of sharing the code using goto, to make the control flow more
>> explicit.
>>
>> Conveniently collect these details in the new struct parsed_option to
>> reduce the number of necessary function arguments.
>>
>> Signed-off-by: René Scharfe <l.s.r@web.de>
>> ---
>>  parse-options.c | 83 +++++++++++++++++++++++++++++--------------------
>>  1 file changed, 49 insertions(+), 34 deletions(-)
>>
>> diff --git a/parse-options.c b/parse-options.c
>> index 056c6b30e9..398ebaef14 100644
>> --- a/parse-options.c
>> +++ b/parse-options.c
>> @@ -350,14 +350,40 @@ static int is_alias(struct parse_opt_ctx_t *ctx,
>>  	return 0;
>>  }
>>
>> +struct parsed_option {
>> +	const struct option *option;
>> +	enum opt_parsed flags;
>> +};
>
> We're bundling up the state for previously-examined options that "arg"
> might expand to. Looks good.
>
>
>> +static void register_abbrev(struct parse_opt_ctx_t *p,
>> +			    const struct option *option, enum opt_parsed flags,
>> +			    struct parsed_option *abbrev,
>> +			    struct parsed_option *ambiguous)
>> +{
>
> Here we're defining a function to separate out the logic that we
> previously jumped to with "goto is_abbreviated;". Looks good.
>
>
>> +	if (p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT)
>> +		return;
>
> This is the (negation of the) allow_abbrev condition that was removed
> below. Looks good.
>
>> +	if (abbrev->option &&
>> +	    !is_alias(p, abbrev->option, option)) {
>> +		/*
>> +		 * If this is abbreviated, it is
>> +		 * ambiguous. So when there is no
>> +		 * exact match later, we need to
>> +		 * error out.
>> +		 */
>> +		ambiguous->option = abbrev->option;
>> +		ambiguous->flags = abbrev->flags;
>
> Couldn't this just be "*ambiguous = *abbrev;" ?

It could, but I wanted to keep it as close to the original as possible.
I was hoping the changes made here would be mechanical enough to be
reviewable without requiring a deeper understanding of what the code
actually does, but that didn't quite work out it seems.  Glad you made
it through, though! :)

>
>
>> +	}
>> +	abbrev->option = option;
>> +	abbrev->flags = flags;
>> +}
>
> So, we've found a candidate that matches our abbreviation. If this is
> the second (or later) candidate, then we've got an ambiguous
> abbreviation, which we'll need to warn about later. Since we're
> overwriting both "ambiguous" and "abbrev", we'll only refer to the last
> two candidates seen, but that seems fine.
>
>>  static enum parse_opt_result parse_long_opt(
>>  	struct parse_opt_ctx_t *p, const char *arg,
>>  	const struct option *options)
>>  {
>>  	const char *arg_end = strchrnul(arg, '=');
>> -	const struct option *abbrev_option = NULL, *ambiguous_option = NULL;
>> -	enum opt_parsed abbrev_flags = OPT_LONG, ambiguous_flags = OPT_LONG;
>> -	int allow_abbrev = !(p->flags & PARSE_OPT_KEEP_UNKNOWN_OPT);
>> +	struct parsed_option abbrev = { .option = NULL, .flags = OPT_LONG };
>> +	struct parsed_option ambiguous = { .option = NULL, .flags = OPT_LONG };
>>
>>  	for (; options->type != OPTION_END; options++) {
>>  		const char *rest, *long_name = options->long_name;
>> @@ -377,31 +403,20 @@ static enum parse_opt_result parse_long_opt(
>>  			rest = NULL;
>>  		if (!rest) {
>>  			/* abbreviated? */
>> -			if (allow_abbrev &&
>> -			    !strncmp(long_name, arg, arg_end - arg)) {
>> -is_abbreviated:
>> -				if (abbrev_option &&
>> -				    !is_alias(p, abbrev_option, options)) {
>> -					/*
>> -					 * If this is abbreviated, it is
>> -					 * ambiguous. So when there is no
>> -					 * exact match later, we need to
>> -					 * error out.
>> -					 */
>> -					ambiguous_option = abbrev_option;
>> -					ambiguous_flags = abbrev_flags;
>> -				}
>> -				abbrev_option = options;
>> -				abbrev_flags = flags ^ opt_flags;
>> +			if (!strncmp(long_name, arg, arg_end - arg)) {
>> +				register_abbrev(p, options, flags ^ opt_flags,
>> +						&abbrev, &ambiguous);
>
> This part I found hard to follow; the change itself is a simple
> replacement of the original logic with our new function, but I don't
> understand the original logic in the first place. Why do we XOR flags
> and opt_flags here?
>
> These are both bitflags which can hold a combination of OPT_SHORT and/or
> OPT_UNSET (i.e., we've passed --no-foo). Since we're only looking at
> args that we already know are in long form, the only one we should need
> to worry about is whether OPT_UNSET is true or false. And indeed, we
> never set OPT_SHORT in this function.

Right.  We do set OPT_LONG as well, but its value is 0, so that's just
for show.

> IIUC, "flags" corresponds to "arg", the flag we're trying to parse, and
> "opt_flags" corresponds to "options", the current item in the list of
> all options that we're trying to match against.
>
> So OPT_UNSET will be true in "flags" if either "arg" starts with "no-"
> or if it is a prefix of "no-".
>
> OPT_UNSET will be true in "opt_flags" if all of the following are true:
> * "arg" does not start with "no-"
> * PARSE_OPT_NONEG is not set in options->flags
> * options->long_name starts with "no-"

Right.

> So OPT_UNSET can never be set at the same time in both "flags" and
> "opt_flags",

It can if arg is "no" or "n" (the "negated and abbreviated very much?"
path), PARSE_OPT_NONEG is unset and long_name starts with "no-".

> and thus the XOR makes a bit more sense: either the
> argument we're parsing or the canonical form of the candidate that we're
> matching against is negated. We don't care which one, but we need to
> know about the negation later, to either set the value properly, or to
> report potential ambiguities with the "no-" prefix.

Right.  The flags track whether the user negated an option, but with
the twist that both arguments given by the user and long names in the
option definition can start with "no-":

arg        option       negated
---------- ------------ -------
--index     --index     no
--index     --no-index  yes
--no-index  --index     yes
--no-index  --no-index  no

"negated" means "the opposite meaning of the defined option" here,
not "presence of no- somewhere".  --index asks for the opposite
effect of what an option defined as --no-index would do.

If OPT_UNSET is set in both, the XOR cancels it out.  E.g. "--no"
directly matches "--no-index" as an abbreviation, without negating it.
(It also matches all other negatable options, so that's only usable
for commands that have a single one of them..)

>
>>  				continue;
>>  			}
>>  			/* negation allowed? */
>>  			if (options->flags & PARSE_OPT_NONEG)
>>  				continue;
>>  			/* negated and abbreviated very much? */
>> -			if (allow_abbrev && starts_with("no-", arg)) {
>> +			if (starts_with("no-", arg)) {
>>  				flags |= OPT_UNSET;
>> -				goto is_abbreviated;
>> +				register_abbrev(p, options, flags ^ opt_flags,
>> +						&abbrev, &ambiguous);
>> +				continue;
>
> This is a slight change: we might set OPT_UNSET in flags where before we
> were prevented by the "allow_abbrev" condition, but register_abbrev will
> still be a no-op in that case, so I don't think this changes any
> behavior.

Indeed.  The continue statement starts the loop over and initializes
flags anew, and the allow_abbrev condition is checked at the top of
register_abbrev(), as you noted.

> All the other changes in this patch are straightforward. So, despite
> having to puzzle out the original logic, everything here looks good.

^ permalink raw reply	[flat|nested] 23+ messages in thread

end of thread, other threads:[~2024-03-13 18:47 UTC | newest]

Thread overview: 23+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-02-24 21:29 [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
2024-02-24 21:29 ` [PATCH 1/6] parse-options: recognize abbreviated negated option with arg René Scharfe
2024-02-24 21:29 ` [PATCH 2/6] parse-options: set arg of abbreviated option lazily René Scharfe
2024-02-24 21:29 ` [PATCH 3/6] parse-options: factor out register_abbrev() and struct parsed_option René Scharfe
2024-02-24 23:13   ` Junio C Hamano
2024-02-24 21:29 ` [PATCH 4/6] parse-options: detect ambiguous self-negation René Scharfe
2024-02-24 21:29 ` [PATCH 5/6] parse-options: normalize arg and long_name before comparison René Scharfe
2024-02-24 21:29 ` [PATCH 6/6] parse-options: rearrange long_name matching code René Scharfe
2024-02-24 21:42 ` [PATCH 0/6] parse-options: long option parsing fixes and cleanup René Scharfe
2024-03-03 12:19 ` [PATCH v2 " René Scharfe
2024-03-03 12:19   ` [PATCH v2 1/6] parse-options: recognize abbreviated negated option with arg René Scharfe
2024-03-12 21:42     ` Josh Steadmon
2024-03-03 12:19   ` [PATCH v2 2/6] parse-options: set arg of abbreviated option lazily René Scharfe
2024-03-03 12:19   ` [PATCH v2 3/6] parse-options: factor out register_abbrev() and struct parsed_option René Scharfe
2024-03-12 21:43     ` Josh Steadmon
2024-03-13 17:48       ` Junio C Hamano
2024-03-13 18:47       ` René Scharfe
2024-03-03 12:19   ` [PATCH v2 4/6] parse-options: detect ambiguous self-negation René Scharfe
2024-03-03 12:19   ` [PATCH v2 5/6] parse-options: normalize arg and long_name before comparison René Scharfe
2024-03-12 21:43     ` Josh Steadmon
2024-03-03 12:19   ` [PATCH v2 6/6] parse-options: rearrange long_name matching code René Scharfe
2024-03-12 21:44     ` Josh Steadmon
2024-03-12 21:45   ` [PATCH v2 0/6] parse-options: long option parsing fixes and cleanup Josh Steadmon

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).