[Bug] duplicated long-form options go unnoticed

public inbox for git@vger.kernel.org
 help / color / mirror / Atom feed

* [Bug] duplicated long-form options go unnoticed
@ 2026-02-27  0:13 Junio C Hamano
  2026-02-27 19:27 ` [PATCH 1/2] pack-objects: remove duplicate --stdin-packs definition René Scharfe
  2026-02-27 19:27 ` [PATCH 2/2] parseopt: check for duplicate long names and numerical options René Scharfe
  0 siblings, 2 replies; 12+ messages in thread
From: Junio C Hamano @ 2026-02-27  0:13 UTC (permalink / raw)
  To: git

If you make this stupid change to builtin/cat-file.c, rebuild your
git and run "git cat-file --batch-check", without anybody helping
you notice that your change to add a duplicated long command to the
options table is a nonsense.  There should be some way to help the
developer.

The most expensive would be a run-time check in parse_options_check(),
which is not very advisable, but it may be OK to have one hidden behind
a conditional debugging option (like exporting GIT_PARSEOPT_PARFNOID
variable).

 builtin/cat-file.c | 1 +
 1 file changed, 1 insertion(+)

diff --git i/builtin/cat-file.c w/builtin/cat-file.c
index b6f12f41d6..eaa53b2b29 100644
--- i/builtin/cat-file.c
+++ w/builtin/cat-file.c
@@ -1091,6 +1091,7 @@ int cmd_cat_file(int argc,
 			N_("like --batch, but don't emit <contents>"),
 			PARSE_OPT_OPTARG | PARSE_OPT_NONEG,
 			batch_option_callback),
+		OPT_BOOL(0, "batch-check", &batch, N_("batch")),
 		OPT_BOOL_F('z', NULL, &input_nul_terminated, N_("stdin is NUL-terminated"),
 			PARSE_OPT_HIDDEN),
 		OPT_BOOL('Z', NULL, &nul_terminated, N_("stdin and stdout is NUL-terminated")),

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 1/2] pack-objects: remove duplicate --stdin-packs definition
  2026-02-27  0:13 [Bug] duplicated long-form options go unnoticed Junio C Hamano
@ 2026-02-27 19:27 ` René Scharfe
  2026-02-27 19:27 ` [PATCH 2/2] parseopt: check for duplicate long names and numerical options René Scharfe
  1 sibling, 0 replies; 12+ messages in thread
From: René Scharfe @ 2026-02-27 19:27 UTC (permalink / raw)
  To: Junio C Hamano, git, Taylor Blau

cd846bacc7 (pack-objects: introduce '--stdin-packs=follow', 2025-06-23)
added a new definition of the option --stdin-packs that accepts an
argument.  It kept the old definition, which still shows up in the short
help, but is shadowed by the new one.  Remove it.

Hinted-at-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
 builtin/pack-objects.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index cfb03d4c09..1ea823f1fb 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4939,8 +4939,6 @@ int cmd_pack_objects(int argc,
 		OPT_CALLBACK_F(0, "stdin-packs", &stdin_packs, N_("mode"),
 			     N_("read packs from stdin"),
 			     PARSE_OPT_OPTARG, parse_stdin_packs_mode),
-		OPT_BOOL(0, "stdin-packs", &stdin_packs,
-			 N_("read packs from stdin")),
 		OPT_BOOL(0, "stdout", &pack_to_stdout,
 			 N_("output pack to stdout")),
 		OPT_BOOL(0, "include-tag", &include_tag,
-- 
2.53.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH 2/2] parseopt: check for duplicate long names and numerical options
  2026-02-27  0:13 [Bug] duplicated long-form options go unnoticed Junio C Hamano
  2026-02-27 19:27 ` [PATCH 1/2] pack-objects: remove duplicate --stdin-packs definition René Scharfe
@ 2026-02-27 19:27 ` René Scharfe
  2026-02-27 22:50   ` Jeff King
  2026-02-28  9:19   ` [PATCH v2 " René Scharfe
  1 sibling, 2 replies; 12+ messages in thread
From: René Scharfe @ 2026-02-27 19:27 UTC (permalink / raw)
  To: Junio C Hamano, git

We already check for duplicate short names.  Check for and report
duplicate long names and numerical options as well.

Signed-off-by: René Scharfe <l.s.r@web.de>
---
The check clearly has a cost, but I have a hard time measuring it.
We already do lots of (kinda cheap) checks.  Turning them on only
in DEVELOPER builds (and ideally demonstrating a speedup) left as
an exercise for interested readers (with stronger benchmark-fu)..

 parse-options.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/parse-options.c b/parse-options.c
index c9cafc21b9..51b72eee11 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -5,6 +5,7 @@
 #include "gettext.h"
 #include "strbuf.h"
 #include "string-list.h"
+#include "strmap.h"
 #include "utf8.h"
 
 static int disallow_abbreviated_options;
@@ -641,6 +642,8 @@ static void check_typos(const char *arg, const struct option *options)
 static void parse_options_check(const struct option *opts)
 {
 	char short_opts[128];
+	struct strset long_names = STRSET_INIT;
+	bool saw_number_option = false;
 	void *subcommand_value = NULL;
 
 	memset(short_opts, '\0', sizeof(short_opts));
@@ -655,6 +658,16 @@ static void parse_options_check(const struct option *opts)
 			else if (short_opts[opts->short_name]++)
 				optbug(opts, "short name already used");
 		}
+		if (opts->long_name) {
+			if (strset_contains(&long_names, opts->long_name))
+				optbug(opts, "long name already used");
+			strset_add(&long_names, opts->long_name);
+		}
+		if (opts->type == OPTION_NUMBER) {
+			if (saw_number_option)
+				optbug(opts, "duplicate numerical option");
+			saw_number_option = true;
+		}
 		if (opts->flags & PARSE_OPT_NODASH &&
 		    ((opts->flags & PARSE_OPT_OPTARG) ||
 		     !(opts->flags & PARSE_OPT_NOARG) ||
@@ -712,6 +725,7 @@ static void parse_options_check(const struct option *opts)
 			optbug(opts, "multi-word argh should use dash to separate words");
 	}
 	BUG_if_bug("invalid 'struct option'");
+	strset_clear(&long_names);
 }
 
 static int has_subcommands(const struct option *options)
-- 
2.53.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] parseopt: check for duplicate long names and numerical options
  2026-02-27 19:27 ` [PATCH 2/2] parseopt: check for duplicate long names and numerical options René Scharfe
@ 2026-02-27 22:50   ` Jeff King
  2026-02-27 23:08     ` Jeff King
  2026-02-28  9:19   ` [PATCH v2 " René Scharfe
  1 sibling, 1 reply; 12+ messages in thread
From: Jeff King @ 2026-02-27 22:50 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, git

On Fri, Feb 27, 2026 at 08:27:02PM +0100, René Scharfe wrote:

> The check clearly has a cost, but I have a hard time measuring it.
> We already do lots of (kinda cheap) checks.  Turning them on only
> in DEVELOPER builds (and ideally demonstrating a speedup) left as
> an exercise for interested readers (with stronger benchmark-fu)..

I agree it is probably not introducing a measurable slowdown. If we were
to make it conditional, I'd suggest a run-time toggle (so we could turn
it on for all test scripts, but not regular use).

That said...

> @@ -655,6 +658,16 @@ static void parse_options_check(const struct option *opts)
>  			else if (short_opts[opts->short_name]++)
>  				optbug(opts, "short name already used");
>  		}
> +		if (opts->long_name) {
> +			if (strset_contains(&long_names, opts->long_name))
> +				optbug(opts, "long name already used");
> +			strset_add(&long_names, opts->long_name);
> +		}

...if you want to micro-optimize, note that the return value of
strset_add() tells you whether the item was already in the set. That can
save one hash of the string.

Probably the allocation for each element is the dominating cost, though,
and it doesn't help with that.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] parseopt: check for duplicate long names and numerical options
  2026-02-27 22:50   ` Jeff King
@ 2026-02-27 23:08     ` Jeff King
  2026-02-27 23:28       ` Junio C Hamano
  2026-02-28  9:19       ` René Scharfe
  0 siblings, 2 replies; 12+ messages in thread
From: Jeff King @ 2026-02-27 23:08 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, git

On Fri, Feb 27, 2026 at 05:50:56PM -0500, Jeff King wrote:

> On Fri, Feb 27, 2026 at 08:27:02PM +0100, René Scharfe wrote:
> 
> > The check clearly has a cost, but I have a hard time measuring it.
> > We already do lots of (kinda cheap) checks.  Turning them on only
> > in DEVELOPER builds (and ideally demonstrating a speedup) left as
> > an exercise for interested readers (with stronger benchmark-fu)..
> 
> I agree it is probably not introducing a measurable slowdown. If we were
> to make it conditional, I'd suggest a run-time toggle (so we could turn
> it on for all test scripts, but not regular use).

Just for fun, I was going to write a script that generated a test-tool
parse-options list with 100k entries. But then I realized we already
have something like that!

If you do this:

  (
    echo usage
    echo --
    for i in $(seq 100000); do
      echo "opt$i option $i"
    done
  ) >input

then hyperfine reports (before and after your patches):

  Benchmark 1: ./git.old rev-parse --parseopt -- --opt42 <input
    Time (mean ± σ):      22.2 ms ±   0.4 ms    [User: 16.6 ms, System: 5.6 ms]
    Range (min … max):    21.5 ms …  23.9 ms    127 runs
  
  Benchmark 2: ./git.new rev-parse --parseopt -- --opt42 <input
    Time (mean ± σ):      32.5 ms ±   0.5 ms    [User: 23.8 ms, System: 8.6 ms]
    Range (min … max):    31.7 ms …  34.8 ms    89 runs
  
  Summary
    ./git.old rev-parse --parseopt -- --opt42 <input ran
      1.46 ± 0.03 times faster than ./git.new rev-parse --parseopt -- --opt42 <input

So it is measurable (even with the extra per-option costs to generate
the option structs in the first place). Looks like on the order of 10ms
for 100k options, or about 100ns per option. If you imagine that most
option lists are smaller than 100, we're talking about probably the
equivalent of 50-100 syscalls. If we are really looking to
micro-optimize startup time, I suspect there's pretty low-hanging fruit
to be found of that magnitude.

> > +		if (opts->long_name) {
> > +			if (strset_contains(&long_names, opts->long_name))
> > +				optbug(opts, "long name already used");
> > +			strset_add(&long_names, opts->long_name);
> > +		}
> 
> ...if you want to micro-optimize, note that the return value of
> strset_add() tells you whether the item was already in the set. That can
> save one hash of the string.
> 
> Probably the allocation for each element is the dominating cost, though,
> and it doesn't help with that.

Doing this:

diff --git a/parse-options.c b/parse-options.c
index 51b72eee11..f056a4471e 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -659,9 +659,8 @@ static void parse_options_check(const struct option *opts)
 				optbug(opts, "short name already used");
 		}
 		if (opts->long_name) {
-			if (strset_contains(&long_names, opts->long_name))
+			if (!strset_add(&long_names, opts->long_name))
 				optbug(opts, "long name already used");
-			strset_add(&long_names, opts->long_name);
 		}
 		if (opts->type == OPTION_NUMBER) {
 			if (saw_number_option)

seems to shave off ~1% of my benchmark. Not that exciting, but hey, it's
one line shorter to boot.

-Peff

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] parseopt: check for duplicate long names and numerical options
  2026-02-27 23:08     ` Jeff King
@ 2026-02-27 23:28       ` Junio C Hamano
  2026-02-28  9:19       ` René Scharfe
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2026-02-27 23:28 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, git

Jeff King <peff@peff.net> writes:

> Doing this:
>
> diff --git a/parse-options.c b/parse-options.c
> index 51b72eee11..f056a4471e 100644
> --- a/parse-options.c
> +++ b/parse-options.c
> @@ -659,9 +659,8 @@ static void parse_options_check(const struct option *opts)
>  				optbug(opts, "short name already used");
>  		}
>  		if (opts->long_name) {
> -			if (strset_contains(&long_names, opts->long_name))
> +			if (!strset_add(&long_names, opts->long_name))
>  				optbug(opts, "long name already used");
> -			strset_add(&long_names, opts->long_name);
>  		}
>  		if (opts->type == OPTION_NUMBER) {
>  			if (saw_number_option)
>
> seems to shave off ~1% of my benchmark. Not that exciting, but hey, it's
> one line shorter to boot.

Yeah, it is the right thing not to hash the same thing twice which
is totally unnecessary.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH 2/2] parseopt: check for duplicate long names and numerical options
  2026-02-27 23:08     ` Jeff King
  2026-02-27 23:28       ` Junio C Hamano
@ 2026-02-28  9:19       ` René Scharfe
  1 sibling, 0 replies; 12+ messages in thread
From: René Scharfe @ 2026-02-28  9:19 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On 2/28/26 12:08 AM, Jeff King wrote:
> On Fri, Feb 27, 2026 at 05:50:56PM -0500, Jeff King wrote:
> 
>> On Fri, Feb 27, 2026 at 08:27:02PM +0100, René Scharfe wrote:
>>
>>> The check clearly has a cost, but I have a hard time measuring it.
>>> We already do lots of (kinda cheap) checks.  Turning them on only
>>> in DEVELOPER builds (and ideally demonstrating a speedup) left as
>>> an exercise for interested readers (with stronger benchmark-fu)..
>>
>> I agree it is probably not introducing a measurable slowdown. If we were
>> to make it conditional, I'd suggest a run-time toggle (so we could turn
>> it on for all test scripts, but not regular use).

Good idea.  We could piggy-back on -h.

> Just for fun, I was going to write a script that generated a test-tool
> parse-options list with 100k entries. But then I realized we already
> have something like that!
> 
> If you do this:
> 
>   (
>     echo usage
>     echo --
>     for i in $(seq 100000); do
>       echo "opt$i option $i"
>     done
>   ) >input
> 
> then hyperfine reports (before and after your patches):
> 
>   Benchmark 1: ./git.old rev-parse --parseopt -- --opt42 <input
>     Time (mean ± σ):      22.2 ms ±   0.4 ms    [User: 16.6 ms, System: 5.6 ms]
>     Range (min … max):    21.5 ms …  23.9 ms    127 runs
>   
>   Benchmark 2: ./git.new rev-parse --parseopt -- --opt42 <input
>     Time (mean ± σ):      32.5 ms ±   0.5 ms    [User: 23.8 ms, System: 8.6 ms]
>     Range (min … max):    31.7 ms …  34.8 ms    89 runs
>   
>   Summary
>     ./git.old rev-parse --parseopt -- --opt42 <input ran
>       1.46 ± 0.03 times faster than ./git.new rev-parse --parseopt -- --opt42 <input
> 
> So it is measurable (even with the extra per-option costs to generate
> the option structs in the first place). Looks like on the order of 10ms
> for 100k options, or about 100ns per option. If you imagine that most
> option lists are smaller than 100, we're talking about probably the
> equivalent of 50-100 syscalls. If we are really looking to
> micro-optimize startup time, I suspect there's pretty low-hanging fruit
> to be found of that magnitude.

Interesting.  I don't like this percentage.  We won't have that many
options, ever, but we'd pay that small cost on every git invocation,
which add up.  The beneficiaries are just a handful of developers who
duplicate options, which seems like a bad deal.

>>> +		if (opts->long_name) {
>>> +			if (strset_contains(&long_names, opts->long_name))
>>> +				optbug(opts, "long name already used");
>>> +			strset_add(&long_names, opts->long_name);
>>> +		}
>>
>> ...if you want to micro-optimize, note that the return value of
>> strset_add() tells you whether the item was already in the set. That can
>> save one hash of the string.

Makes sense, good call.

>> Probably the allocation for each element is the dominating cost, though,
>> and it doesn't help with that.

My knee-jerk reaction is to use a fixed-size array and sort.  Gets rid
of allocations, needs some more CPU cycles and memory accesses.  That
would then either bug out on experiments like yours or detect
duplicates only in the first N long name options.  Not sure if it's
worth the limitations.

René


^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v2 2/2] parseopt: check for duplicate long names and numerical options
  2026-02-27 19:27 ` [PATCH 2/2] parseopt: check for duplicate long names and numerical options René Scharfe
  2026-02-27 22:50   ` Jeff King
@ 2026-02-28  9:19   ` René Scharfe
  2026-02-28 10:58     ` Jeff King
  1 sibling, 1 reply; 12+ messages in thread
From: René Scharfe @ 2026-02-28  9:19 UTC (permalink / raw)
  To: Junio C Hamano, git; +Cc: Jeff King

We already check for duplicate short names.  Check for and report
duplicate long names and numerical options as well.

Perform the slightly expensive string duplicate check only when showing
the usage to keep the cost of normal invocations low.  t0012-help.sh
covers it.

Helped-by: Jeff King <peff@peff.net>
Signed-off-by: René Scharfe <l.s.r@web.de>
---
Changes since v1:
- Removed strset_contains() call.
- Only check for duplicate long names when showing the usage.

 parse-options.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/parse-options.c b/parse-options.c
index c9cafc21b9..0214c106d4 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -5,6 +5,7 @@
 #include "gettext.h"
 #include "strbuf.h"
 #include "string-list.h"
+#include "strmap.h"
 #include "utf8.h"
 
 static int disallow_abbreviated_options;
@@ -641,6 +642,7 @@ static void check_typos(const char *arg, const struct option *options)
 static void parse_options_check(const struct option *opts)
 {
 	char short_opts[128];
+	bool saw_number_option = false;
 	void *subcommand_value = NULL;
 
 	memset(short_opts, '\0', sizeof(short_opts));
@@ -655,6 +657,11 @@ static void parse_options_check(const struct option *opts)
 			else if (short_opts[opts->short_name]++)
 				optbug(opts, "short name already used");
 		}
+		if (opts->type == OPTION_NUMBER) {
+			if (saw_number_option)
+				optbug(opts, "duplicate numerical option");
+			saw_number_option = true;
+		}
 		if (opts->flags & PARSE_OPT_NODASH &&
 		    ((opts->flags & PARSE_OPT_OPTARG) ||
 		     !(opts->flags & PARSE_OPT_NOARG) ||
@@ -714,6 +721,19 @@ static void parse_options_check(const struct option *opts)
 	BUG_if_bug("invalid 'struct option'");
 }
 
+static void parse_options_check_harder(const struct option *opts)
+{
+	struct strset long_names = STRSET_INIT;
+	for (; opts->type != OPTION_END; opts++) {
+		if (opts->long_name) {
+			if (!strset_add(&long_names, opts->long_name))
+				optbug(opts, "long name already used");
+		}
+	}
+	BUG_if_bug("invalid 'struct option'");
+	strset_clear(&long_names);
+}
+
 static int has_subcommands(const struct option *options)
 {
 	for (; options->type != OPTION_END; options++)
@@ -1339,6 +1359,8 @@ static enum parse_opt_result usage_with_options_internal(struct parse_opt_ctx_t
 	const char *prefix = usage_prefix;
 	int saw_empty_line = 0;
 
+	parse_options_check_harder(opts);
+
 	if (!usagestr)
 		return PARSE_OPT_HELP;
 
-- 
2.53.0

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/2] parseopt: check for duplicate long names and numerical options
  2026-02-28  9:19   ` [PATCH v2 " René Scharfe
@ 2026-02-28 10:58     ` Jeff King
  2026-02-28 11:28       ` René Scharfe
  2026-03-01 14:33       ` Junio C Hamano
  0 siblings, 2 replies; 12+ messages in thread
From: Jeff King @ 2026-02-28 10:58 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, git

On Sat, Feb 28, 2026 at 10:19:16AM +0100, René Scharfe wrote:

> Perform the slightly expensive string duplicate check only when showing
> the usage to keep the cost of normal invocations low.  t0012-help.sh
> covers it.

Nice, this seems like the perfect compromise to me. We get a runtime
switch that kicks in at the moment we want, and we don't even have to
pollute the world with a new switch or environment variable.

> +static void parse_options_check_harder(const struct option *opts)
> +{
> +	struct strset long_names = STRSET_INIT;
> +	for (; opts->type != OPTION_END; opts++) {
> +		if (opts->long_name) {
> +			if (!strset_add(&long_names, opts->long_name))
> +				optbug(opts, "long name already used");
> +		}
> +	}
> +	BUG_if_bug("invalid 'struct option'");
> +	strset_clear(&long_names);
> +}

I confirmed on my silly pathological case that invoking rev-parse with a
real option shows no slowdown, and we now pay the same 10ms cost to show
"-h".

Your other email made me wonder how the sorted-array solution might
perform (patch below). It shaves off 2ms of those 10. Probably not worth
caring about for "-h" output (which is already spending another 5-10ms
to generate the output, versus a normal parse).

-Peff

-- >8 --
diff --git a/parse-options.c b/parse-options.c
index 0214c106d4..1ea7efd5a3 100644
--- a/parse-options.c
+++ b/parse-options.c
@@ -721,17 +721,39 @@ static void parse_options_check(const struct option *opts)
 	BUG_if_bug("invalid 'struct option'");
 }
 
+static int qsort_strcmp(const void *va, const void *vb)
+{
+	const char *a = *(const char **)va;
+	const char *b = *(const char **)vb;
+	return strcmp(a, b);
+}
+
 static void parse_options_check_harder(const struct option *opts)
 {
-	struct strset long_names = STRSET_INIT;
-	for (; opts->type != OPTION_END; opts++) {
-		if (opts->long_name) {
-			if (!strset_add(&long_names, opts->long_name))
-				optbug(opts, "long name already used");
-		}
+	const struct option *p;
+	const char **long_names;
+	size_t i, len;
+
+	len = 0;
+	for (p = opts; p->type != OPTION_END; p++) {
+		if (p->long_name)
+			len++;
 	}
-	BUG_if_bug("invalid 'struct option'");
-	strset_clear(&long_names);
+
+	ALLOC_ARRAY(long_names, len);
+	i = 0;
+	for (p = opts; p->type != OPTION_END; p++) {
+		if (p->long_name)
+			long_names[i++] = p->long_name;
+	}
+
+	QSORT(long_names, len, qsort_strcmp);
+	for (i = 1; i < len; i++) {
+		if (!strcmp(long_names[i], long_names[i-1]))
+			BUG("long name %s used twice", long_names[i]);
+	}
+
+	free(long_names);
 }
 
 static int has_subcommands(const struct option *options)

^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/2] parseopt: check for duplicate long names and numerical options
  2026-02-28 10:58     ` Jeff King
@ 2026-02-28 11:28       ` René Scharfe
  2026-03-02 18:24         ` Jeff King
  2026-03-01 14:33       ` Junio C Hamano
  1 sibling, 1 reply; 12+ messages in thread
From: René Scharfe @ 2026-02-28 11:28 UTC (permalink / raw)
  To: Jeff King; +Cc: Junio C Hamano, git

On 2/28/26 11:58 AM, Jeff King wrote:
> On Sat, Feb 28, 2026 at 10:19:16AM +0100, René Scharfe wrote:
> 
>> +static void parse_options_check_harder(const struct option *opts)
>> +{
>> +	struct strset long_names = STRSET_INIT;
>> +	for (; opts->type != OPTION_END; opts++) {
>> +		if (opts->long_name) {
>> +			if (!strset_add(&long_names, opts->long_name))
>> +				optbug(opts, "long name already used");
>> +		}
>> +	}
>> +	BUG_if_bug("invalid 'struct option'");
>> +	strset_clear(&long_names);
>> +}
> 
> I confirmed on my silly pathological case that invoking rev-parse with a
> real option shows no slowdown, and we now pay the same 10ms cost to show
> "-h".
> 
> Your other email made me wonder how the sorted-array solution might
> perform (patch below). It shaves off 2ms of those 10. Probably not worth
> caring about for "-h" output (which is already spending another 5-10ms
> to generate the output, versus a normal parse).
Curious; sorting performs worse on my machine (Apple M1, 1 is 2cc719175,
2 is patch 2 v2, 3 is your patch on top):

Benchmark 1: ./git_main rev-parse --parseopt -- -h <input
  Time (mean ± σ):      77.5 ms ±   0.4 ms    [User: 73.1 ms, System: 3.5 ms]
  Range (min … max):    76.8 ms …  78.5 ms    37 runs

  Warning: Ignoring non-zero exit code.

Benchmark 2: ./git_strset rev-parse --parseopt -- -h <input
  Time (mean ± σ):      82.6 ms ±   0.3 ms    [User: 77.7 ms, System: 3.9 ms]
  Range (min … max):    82.1 ms …  83.7 ms    34 runs

  Warning: Ignoring non-zero exit code.

Benchmark 3: ./git_qsort rev-parse --parseopt -- -h <input
  Time (mean ± σ):      85.6 ms ±   0.2 ms    [User: 81.2 ms, System: 3.5 ms]
  Range (min … max):    85.3 ms …  86.5 ms    33 runs

  Warning: Ignoring non-zero exit code.

Summary
  ./git_main rev-parse --parseopt -- -h <input ran
    1.07 ± 0.01 times faster than ./git_strset rev-parse --parseopt -- -h <input
    1.10 ± 0.01 times faster than ./git_qsort rev-parse --parseopt -- -h <input


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/2] parseopt: check for duplicate long names and numerical options
  2026-02-28 10:58     ` Jeff King
  2026-02-28 11:28       ` René Scharfe
@ 2026-03-01 14:33       ` Junio C Hamano
  1 sibling, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2026-03-01 14:33 UTC (permalink / raw)
  To: Jeff King; +Cc: René Scharfe, git

Jeff King <peff@peff.net> writes:

> On Sat, Feb 28, 2026 at 10:19:16AM +0100, René Scharfe wrote:
>
>> Perform the slightly expensive string duplicate check only when showing
>> the usage to keep the cost of normal invocations low.  t0012-help.sh
>> covers it.
>
> Nice, this seems like the perfect compromise to me. We get a runtime
> switch that kicks in at the moment we want, and we don't even have to
> pollute the world with a new switch or environment variable.

Yes, absolutely.  This is very nice.

^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v2 2/2] parseopt: check for duplicate long names and numerical options
  2026-02-28 11:28       ` René Scharfe
@ 2026-03-02 18:24         ` Jeff King
  0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2026-03-02 18:24 UTC (permalink / raw)
  To: René Scharfe; +Cc: Junio C Hamano, git

On Sat, Feb 28, 2026 at 12:28:39PM +0100, René Scharfe wrote:

> > Your other email made me wonder how the sorted-array solution might
> > perform (patch below). It shaves off 2ms of those 10. Probably not worth
> > caring about for "-h" output (which is already spending another 5-10ms
> > to generate the output, versus a normal parse).
> Curious; sorting performs worse on my machine (Apple M1, 1 is 2cc719175,
> 2 is patch 2 v2, 3 is your patch on top):

Interesting. Different architectures, I guess (mine's an i9). It makes
me feel better about not trying to micro-optimize the last couple
nanoseconds, though. ;)

> Benchmark 1: ./git_main rev-parse --parseopt -- -h <input
>   Time (mean ± σ):      77.5 ms ±   0.4 ms    [User: 73.1 ms, System: 3.5 ms]
>   Range (min … max):    76.8 ms …  78.5 ms    37 runs
> 
>   Warning: Ignoring non-zero exit code.
> 
> Benchmark 2: ./git_strset rev-parse --parseopt -- -h <input
>   Time (mean ± σ):      82.6 ms ±   0.3 ms    [User: 77.7 ms, System: 3.9 ms]
>   Range (min … max):    82.1 ms …  83.7 ms    34 runs
> 
>   Warning: Ignoring non-zero exit code.

Interesting that your absolute times are much higher than mine (by a
factor of 4), but the absolute cost of the strset addition is smaller.
I'm not sure if that's another architecture difference, or maybe just
the other unrelated parts of the process startup are more expensive on
macOS (syscalls, filesystem access, etc).

Anyway, now that it is only used for "-h" I don't think we need to care
that much.

-Peff

^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-03-02 18:24 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-02-27  0:13 [Bug] duplicated long-form options go unnoticed Junio C Hamano
2026-02-27 19:27 ` [PATCH 1/2] pack-objects: remove duplicate --stdin-packs definition René Scharfe
2026-02-27 19:27 ` [PATCH 2/2] parseopt: check for duplicate long names and numerical options René Scharfe
2026-02-27 22:50   ` Jeff King
2026-02-27 23:08     ` Jeff King
2026-02-27 23:28       ` Junio C Hamano
2026-02-28  9:19       ` René Scharfe
2026-02-28  9:19   ` [PATCH v2 " René Scharfe
2026-02-28 10:58     ` Jeff King
2026-02-28 11:28       ` René Scharfe
2026-03-02 18:24         ` Jeff King
2026-03-01 14:33       ` Junio C Hamano

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox