* [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear)
@ 2025-09-19 10:18 Lauri Niskanen
2025-09-19 13:11 ` Kristoffer Haugsbakk
0 siblings, 1 reply; 26+ messages in thread
From: Lauri Niskanen @ 2025-09-19 10:18 UTC (permalink / raw)
To: git
What did you do before the bug happened?
Create a repo and a stash entry, then invoke `git stash show -p` with
an invalid option:
git init repro
cd repro
touch a
git add a
git commit -m init
echo x >> a
git stash
git stash show -p --invalid
What did you expect to happen?
Git should print a usage / option error and exit cleanly without crashing.
What happened instead?
free(): double free detected in tcache 2
GDB backtrace:
#0 __pthread_kill_implementation (threadid=<optimized out>,
signo=signo@entry=6, no_tid=no_tid@entry=0) at pthread_kill.c:44
#1 0x00007ffff7c98a13 in __pthread_kill_internal (threadid=<optimized
out>, signo=6) at pthread_kill.c:89
#2 0x00007ffff7c3e410 in __GI_raise (sig=sig@entry=6) at
../sysdeps/posix/raise.c:26
#3 0x00007ffff7c2557a in __GI_abort () at abort.c:77
#4 0x00007ffff7c26613 in __libc_message_impl
(fmt=fmt@entry=0x7ffff7db4355 "%s\n") at
../sysdeps/posix/libc_fatal.c:138
#5 0x00007ffff7ca2d65 in malloc_printerr
(str=str@entry=0x7ffff7db7d78 "free(): double free detected in tcache
2") at malloc.c:5892
#6 0x00007ffff7ca82e8 in tcache_double_free_verify (e=<optimized
out>) at malloc.c:3350
#7 0x00007ffff7ca80a5 in __GI___libc_free (mem=<optimized out>) at
malloc.c:3547
#8 0x00005555558570ce in strvec_clear (array=0x7fffffffc580) at
/usr/src/debug/git/git/strvec.c:134
#9 0x000055555567d9f9 in show_stash (argc=<optimized out>,
argv=<optimized out>, prefix=<optimized out>, repo=<optimized out>) at
builtin/stash.c:1047
#10 0x0000555555686f25 in cmd_stash (argc=3, argv=0x5555559bd260,
prefix=0x0, repo=0x555555992ae0 <the_repo.lto_priv>) at
builtin/stash.c:2410
#11 0x000055555555fd43 in run_builtin (p=0x555555984698
<commands.lto_priv+2904>, argc=<optimized out>, argv=<optimized out>,
repo=0x555555992ae0 <the_repo.lto_priv>) at
/usr/src/debug/git/git/git.c:480
#12 handle_builtin (args=args@entry=0x7fffffffe1d0) at
/usr/src/debug/git/git/git.c:746
#13 0x00005555555608a2 in run_argv (args=0x7fffffffe1d0) at
/usr/src/debug/git/git/git.c:813
#14 cmd_main (argc=<optimized out>, argv=<optimized out>) at
/usr/src/debug/git/git/git.c:953
#15 0x000055555555d784 in main (argc=5, argv=0x7fffffffe518) at
/usr/src/debug/git/git/common-main.c:9
Valgrind:
==2507916== Memcheck, a memory error detector
==2507916== Copyright (C) 2002-2024, and GNU GPL'd, by Julian Seward et al.
==2507916== Using Valgrind-3.25.1 and LibVEX; rerun with -h for copyright info
==2507916== Command: git stash show -p --invalid
==2507916==
==2507916== Invalid free() / delete / delete[] / realloc()
==2507916== at 0x4CB18EF: free (vg_replace_malloc.c:989)
==2507916== by 0x43030CD: strvec_clear (strvec.c:134)
==2507916== by 0x41299F8: show_stash.lto_priv.0 (stash.c:1047)
==2507916== by 0x4132F24: cmd_stash (stash.c:2410)
==2507916== by 0x400BD42: UnknownInlinedFun (git.c:480)
==2507916== by 0x400BD42: handle_builtin (git.c:746)
==2507916== by 0x400C8A1: UnknownInlinedFun (git.c:813)
==2507916== by 0x400C8A1: cmd_main (git.c:953)
==2507916== by 0x4009783: main (common-main.c:9)
==2507916== Address 0x4ff4540 is 0 bytes inside a block of size 8 free'd
==2507916== at 0x4CB18EF: free (vg_replace_malloc.c:989)
==2507916== by 0x43030CD: strvec_clear (strvec.c:134)
==2507916== by 0x41299F8: show_stash.lto_priv.0 (stash.c:1047)
==2507916== by 0x4132F24: cmd_stash (stash.c:2410)
==2507916== by 0x400BD42: UnknownInlinedFun (git.c:480)
==2507916== by 0x400BD42: handle_builtin (git.c:746)
==2507916== by 0x400C8A1: UnknownInlinedFun (git.c:813)
==2507916== by 0x400C8A1: cmd_main (git.c:953)
==2507916== by 0x4009783: main (common-main.c:9)
==2507916== Block was alloc'd at
==2507916== at 0x4CAE7A8: malloc (vg_replace_malloc.c:446)
==2507916== by 0x4E6E2AF: strdup (strdup.c:42)
==2507916== by 0x4129624: UnknownInlinedFun (wrapper.c:43)
==2507916== by 0x4129624: UnknownInlinedFun (strvec.c:25)
==2507916== by 0x4129624: show_stash.lto_priv.0 (stash.c:994)
==2507916== by 0x4132F24: cmd_stash (stash.c:2410)
==2507916== by 0x400BD42: UnknownInlinedFun (git.c:480)
==2507916== by 0x400BD42: handle_builtin (git.c:746)
==2507916== by 0x400C8A1: UnknownInlinedFun (git.c:813)
==2507916== by 0x400C8A1: cmd_main (git.c:953)
==2507916== by 0x4009783: main (common-main.c:9)
==2507916==
usage: git stash show [-u | --include-untracked | --only-untracked]
[<diff-options>] [<stash>]
-u, --[no-]include-untracked
include untracked files in the stash
--only-untracked only show untracked files in the stash
==2507916==
==2507916== HEAP SUMMARY:
==2507916== in use at exit: 696,349 bytes in 371 blocks
==2507916== total heap usage: 750 allocs, 380 frees, 1,221,944 bytes allocated
==2507916==
==2507916== LEAK SUMMARY:
==2507916== definitely lost: 3 bytes in 1 blocks
==2507916== indirectly lost: 0 bytes in 0 blocks
==2507916== possibly lost: 120 bytes in 3 blocks
==2507916== still reachable: 696,226 bytes in 367 blocks
==2507916== suppressed: 0 bytes in 0 blocks
==2507916== Rerun with --leak-check=full to see details of leaked memory
==2507916==
==2507916== For lists of detected and suppressed errors, rerun with: -s
==2507916== ERROR SUMMARY: 1 errors from 1 contexts (suppressed: 0 from 0)
[System Info]
git version:
git version 2.51.0
cpu: x86_64
built from commit: c44beea485f0f2feaf460e2ac87fdd5608d63cf0
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
libcurl: 8.15.0
OpenSSL: OpenSSL 3.5.2 5 Aug 2025
zlib-ng: 2.2.5
SHA-1: SHA1_DC
SHA-256: SHA256_BLK
default-ref-format: files
default-hash: sha1
uname: Linux 6.16.7-arch1-1 #1 SMP PREEMPT_DYNAMIC Thu, 11 Sep 2025
17:42:36 +0000 x86_64
compiler info: gnuc: 15.2
libc info: glibc: 2.42
$SHELL (typically, interactive shell): /usr/bin/zsh
[Enabled Hooks]
--
Lauri Niskanen
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear)
2025-09-19 10:18 [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear) Lauri Niskanen
@ 2025-09-19 13:11 ` Kristoffer Haugsbakk
2025-09-19 16:00 ` Junio C Hamano
0 siblings, 1 reply; 26+ messages in thread
From: Kristoffer Haugsbakk @ 2025-09-19 13:11 UTC (permalink / raw)
To: Lauri Niskanen, git; +Cc: Patrick Steinhardt
On Fri, Sep 19, 2025, at 12:18, Lauri Niskanen wrote:
> What did you do before the bug happened?
>
> Create a repo and a stash entry, then invoke `git stash show -p` with
> an invalid option:
>
> git init repro
> cd repro
> touch a
> git add a
> git commit -m init
> echo x >> a
> git stash
> git stash show -p --invalid
>
>
> What did you expect to happen?
>
> Git should print a usage / option error and exit cleanly without crashing.
>
>
> What happened instead?
>
> free(): double free detected in tcache 2
This bisects to 748bd094 (builtin/stash: fix leak in `show_stash()`,
2024-06-11) for me.
> GDB backtrace:
>[snip]
--
Kristoffer Haugsbakk
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear)
2025-09-19 13:11 ` Kristoffer Haugsbakk
@ 2025-09-19 16:00 ` Junio C Hamano
2025-09-19 16:48 ` Jeff King
2025-09-19 16:58 ` Junio C Hamano
0 siblings, 2 replies; 26+ messages in thread
From: Junio C Hamano @ 2025-09-19 16:00 UTC (permalink / raw)
To: Kristoffer Haugsbakk; +Cc: Lauri Niskanen, git, Patrick Steinhardt
"Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com> writes:
>> What happened instead?
>>
>> free(): double free detected in tcache 2
>
> This bisects to 748bd094 (builtin/stash: fix leak in `show_stash()`,
> 2024-06-11) for me.
Good eyes, both of you.
This comes from the fact that the program calls setup_revisions() on
the "rest of the command line args" and have it parse as much as it
understands, while leaving the remainder that it did not understand
there, with this call.
argc = setup_revisions(revision_args.nr, revision_args.v, &rev, NULL);
if (argc > 1)
goto usage;
If you give "git stash show -p --no-such-option", revision_args
before setup_revisions() is called would have
.v = { "show", "-p", "--no-such-option", NULL }
and upon returning from the call, we would have
.v = { "show", "--no-such-option", "--no-such-option", NULL }
with argc == 2. Note that .nr is only passed by value, so across
the call to setup_revisions(), it will stay .nr == 3.
This mongering of .v[] is because the callee parses out and removes
"-p" as an argv element that it understood shifting the remainder in
the array, with the expectation that the caller would use the array
it passed (i.e. .v[]) and the return value from the call (i.e. 2 in
argc) for further processing. It is not getting the strvec and no
access to .nr to adjust, so this is close to the best it can do.
So there are two ways to fix this.
The easier, more performant, and closer to the original design
around the revisions API is to do this:
diff --git c/builtin/stash.c w/builtin/stash.c
index f5ddee5c7f..b6312b1b70 100644
--- c/builtin/stash.c
+++ w/builtin/stash.c
@@ -1016,6 +1016,8 @@ static int show_stash(int argc, const char **argv, const char *prefix,
}
argc = setup_revisions(revision_args.nr, revision_args.v, &rev, NULL);
+ for (i = argc; i < revision_args.nr; i++)
+ revision_args.v[i] = NULL;
if (argc > 1)
goto usage;
if (!rev.diffopt.output_format) {
A less performant but may in the longer term safer alternative is to
change the caller-callee contract around setup_revisions() so that
the later "unused" slots in the argv array is NULLed before
returning to the caller, i.e. instead of leaving
.v = { "show", "--no-such-option", "--no-such-option", NULL }
in the revision_args.v[] array, teach setup_revisions() to leave
.v = { "show", "--no-such-option", NULL, NULL }
there (again, we cannot do anything about .nr that is only available
to the caller).
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear)
2025-09-19 16:00 ` Junio C Hamano
@ 2025-09-19 16:48 ` Jeff King
2025-09-19 17:13 ` Junio C Hamano
2025-09-19 16:58 ` Junio C Hamano
1 sibling, 1 reply; 26+ messages in thread
From: Jeff King @ 2025-09-19 16:48 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
On Fri, Sep 19, 2025 at 09:00:12AM -0700, Junio C Hamano wrote:
> The easier, more performant, and closer to the original design
> around the revisions API is to do this:
>
> diff --git c/builtin/stash.c w/builtin/stash.c
> index f5ddee5c7f..b6312b1b70 100644
> --- c/builtin/stash.c
> +++ w/builtin/stash.c
> @@ -1016,6 +1016,8 @@ static int show_stash(int argc, const char **argv, const char *prefix,
> }
>
> argc = setup_revisions(revision_args.nr, revision_args.v, &rev, NULL);
> + for (i = argc; i < revision_args.nr; i++)
> + revision_args.v[i] = NULL;
> if (argc > 1)
> goto usage;
> if (!rev.diffopt.output_format) {
I think we'll have leaked the string holding "-p" in this instance,
though. We probably need to pass in a setup_revision_opt struct with its
free_removed_argv_elements flag set.
That's true even without your patch, too, of course. I'm mildly
surprised that the test suite doesn't hit this in leak-checking mode,
since it is a problem any time we rearrange argv. E.g., I think:
git stash show -p --
leaks (I was surprised that "stash show -p --stat" didn't leak, but it
doesn't seem to rearrange?).
Another interesting thing about your patch above is that it fills the
strvec with a bunch of NULL entries. Which happens to work, because
free(NULL) is a noop, but I think may be subtly violating assumptions
made about strvecs. Probably:
revision_args.nr = setup_revisions(...);
fits my mental model better, though that is violating a different strvec
invariant now (that the .v[.nr] is always NULL). I think
setup_revisions() is a little sloppy not to set argv[argc] to NULL
itself.
> A less performant but may in the longer term safer alternative is to
> change the caller-callee contract around setup_revisions() so that
> the later "unused" slots in the argv array is NULLed before
> returning to the caller, i.e. instead of leaving
>
> .v = { "show", "--no-such-option", "--no-such-option", NULL }
>
> in the revision_args.v[] array, teach setup_revisions() to leave
>
> .v = { "show", "--no-such-option", NULL, NULL }
>
> there (again, we cannot do anything about .nr that is only available
> to the caller).
I think we should consider a fix like this. Grepping for the
free_removed_argv_elements option, there are a few other spots that
correctly use that flag, but aren't updating the strvec argc. E.g.,
bisect_rev_setup(). So they're going to run into the same problem.
I wonder if the best solution is a setup_revisions() wrapper for strvecs
that will:
- turn on the free_removed_argv_elements option automatically
- collect the return value of setup_revisions() and use it to fix
the .nr field of the strvec
- restore the NULL invariant at the end of the array (though I would
also be happy if setup_revisions() just did this itself)
-Peff
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear)
2025-09-19 16:00 ` Junio C Hamano
2025-09-19 16:48 ` Jeff King
@ 2025-09-19 16:58 ` Junio C Hamano
2025-09-19 17:20 ` Jeff King
1 sibling, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2025-09-19 16:58 UTC (permalink / raw)
To: Kristoffer Haugsbakk; +Cc: Lauri Niskanen, git, Patrick Steinhardt
Junio C Hamano <gitster@pobox.com> writes:
> So there are two ways to fix this.
>
> The easier, more performant, and closer to the original design
> around the revisions API is to do this:
>
> diff --git c/builtin/stash.c w/builtin/stash.c
> index f5ddee5c7f..b6312b1b70 100644
> --- c/builtin/stash.c
> +++ w/builtin/stash.c
> @@ -1016,6 +1016,8 @@ static int show_stash(int argc, const char **argv, const char *prefix,
> }
>
> argc = setup_revisions(revision_args.nr, revision_args.v, &rev, NULL);
> + for (i = argc; i < revision_args.nr; i++)
> + revision_args.v[i] = NULL;
> if (argc > 1)
> goto usage;
> if (!rev.diffopt.output_format) {
>
> A less performant but may in the longer term safer alternative is to
> change the caller-callee contract around setup_revisions() so that
> the later "unused" slots in the argv array is NULLed before
> returning to the caller, i.e. instead of leaving
>
> .v = { "show", "--no-such-option", "--no-such-option", NULL }
>
> in the revision_args.v[] array, teach setup_revisions() to leave
>
> .v = { "show", "--no-such-option", NULL, NULL }
>
> there (again, we cannot do anything about .nr that is only available
> to the caller).
For completeness, here is how the other approach may look like, but
I have made my share of off-by-one mistakes all over the place over
the ears, so somebody else needs to lend an eyeball and check it for
sanity.
----- >8 -----
Subject: revisions: do not leave duplicated cruft from setup_revisions()
The API contract between setup_revisions() and its callers is that
the function gets <argc, argv[]> parameters, parses what it can
recognise, and leaves the remainder in argv[] (shifting them to the
head, and return the number of arguments left. The caller then is
supposed to look at only the early part of argv[] it passed to the
function, up to the argc returned by the function.
Because by contract the caller is not supposed to touch the
remainder of the original argv[], the function left them intact. If
the incoming parameter were
argc = 3, argv[] = { "A", "-p", "--unknown", NULL }
and only "-p" is what the function understands, the function would
return
argc = 2, argv[] = { "A", "--unknown", <unspecified>, <unspecified> }
to the caller, and it is fine as long as the caller would not look
beyond what it is allowed to see. In practice, these <unspecified>
slots has the original pointer given by the caller, which meant that
the same instance of string "--unknown" appears at argv[1] as it
got copied from argv[2], and argv[2] is not cleared.
Recently however this bit us when the <argc, argv[]> pair is
prepared in a strvec instance. When trying to clear the strvec
that was passed to setup_revisions(), because some later array
elements are duplicates of earlier elements, strvec_clear() ended up
freeing the same string twice X-<.
We could fix individual callers (in this particular case, letting
strvec_clear() touch the original array .v[] beyond the updated argc
returned by setup_revisions() is a bug in the caller), but I see no
good reason to return argv[] array after modifying it without
cleaning up the later slots (other than the fact that the callee
saves a few cycles because it does not have to bother cleaning array
slots that no caller is supposed to touch, that is).
Teach the function to assign NULLs to the slot the caller is not
supposed to see to help avoiding this class of bugs in the future.
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---
revision.c | 6 ++++++
t/t3903-stash.sh | 5 +++++
2 files changed, 11 insertions(+)
diff --git c/revision.c w/revision.c
index 6ba8f67054..93948e7dc6 100644
--- c/revision.c
+++ w/revision.c
@@ -3174,6 +3174,12 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
revs->show_notes_given = 1;
}
+ /*
+ * NULL out the leftover args we did not understand, which has
+ * shallow copies in earlier slots in the array.
+ */
+ while (left < argc--)
+ argv[argc] = NULL;
return left;
}
diff --git c/t/t3903-stash.sh w/t/t3903-stash.sh
index 0bb4648e36..dd70deb3b3 100755
--- c/t/t3903-stash.sh
+++ w/t/t3903-stash.sh
@@ -69,6 +69,11 @@ test_expect_success 'stash some dirty working directory' '
setup_stash
'
+test_expect_success 'controlled error return on unrecognized option' '
+ test_expect_code 129 git stash show -p --no-such 2>usage &&
+ grep -e "^usage: git stash show" usage
+'
+
cat >expect <<EOF
diff --git a/file b/file
index 0cfbf08..00750ed 100644
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear)
2025-09-19 16:48 ` Jeff King
@ 2025-09-19 17:13 ` Junio C Hamano
0 siblings, 0 replies; 26+ messages in thread
From: Junio C Hamano @ 2025-09-19 17:13 UTC (permalink / raw)
To: Jeff King; +Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
Jeff King <peff@peff.net> writes:
> I think we'll have leaked the string holding "-p" in this instance,
> though. We probably need to pass in a setup_revision_opt struct with its
> free_removed_argv_elements flag set.
>
> That's true even without your patch, too, of course.
Yeah, while I was preparing the "alternative", I noticed that option
being paid attention to by the code, but you are right. Anybody who
passes strvec (which owns its contents, unlike the traditional "we
got this argv[] from the operating system" callers) needs to flip
that bit set, or they would leak.
> I'm mildly
> surprised that the test suite doesn't hit this in leak-checking mode,
> since it is a problem any time we rearrange argv. E.g., I think:
>
> git stash show -p --
>
> leaks (I was surprised that "stash show -p --stat" didn't leak, but it
> doesn't seem to rearrange?).
Yeah.
> I wonder if the best solution is a setup_revisions() wrapper for strvecs
> that will:
>
> - turn on the free_removed_argv_elements option automatically
>
> - collect the return value of setup_revisions() and use it to fix
> the .nr field of the strvec
>
> - restore the NULL invariant at the end of the array (though I would
> also be happy if setup_revisions() just did this itself)
That would be nice. I only did the third one in my "alternative"
patch I sent earlier.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear)
2025-09-19 16:58 ` Junio C Hamano
@ 2025-09-19 17:20 ` Jeff King
2025-09-19 18:15 ` Junio C Hamano
0 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2025-09-19 17:20 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
On Fri, Sep 19, 2025 at 09:58:25AM -0700, Junio C Hamano wrote:
> For completeness, here is how the other approach may look like, but
> I have made my share of off-by-one mistakes all over the place over
> the ears, so somebody else needs to lend an eyeball and check it for
> sanity.
I _thought_ there was an off-by-one at first, but I think what you have
is correct:
> + /*
> + * NULL out the leftover args we did not understand, which has
> + * shallow copies in earlier slots in the array.
> + */
> + while (left < argc--)
> + argv[argc] = NULL;
> return left;
We definitely want argv[left] to be NULL, which I thought at first did
not happen because of the "<". But because it is post-increment that
happens in the loop condition, it works.
I probably would have written:
while (left <= argc)
argv[argc--] = NULL;
which I think is the same (but I didn't test it, so it probably does
have an off-by-one!).
But really, I do not know that we need to NULL the whole thing. We have
given the caller the reduced argc. The only argv invariant we are
violating is that argv[argc] should be NULL (or in this case,
argv[left]). Anything after argv+left should be considered
uninitialized. So just:
argv[left] = NULL;
would be enough, I'd think.
> diff --git c/t/t3903-stash.sh w/t/t3903-stash.sh
> index 0bb4648e36..dd70deb3b3 100755
> --- c/t/t3903-stash.sh
> +++ w/t/t3903-stash.sh
> @@ -69,6 +69,11 @@ test_expect_success 'stash some dirty working directory' '
> setup_stash
> '
>
> +test_expect_success 'controlled error return on unrecognized option' '
> + test_expect_code 129 git stash show -p --no-such 2>usage &&
> + grep -e "^usage: git stash show" usage
> +'
This passes now, but fails with SANITIZE=leak. Along with a bunch of
other tests, as we are now overwriting entries in strvecs with NULL, so
it has no opportunity to free them. We need to respect the
setup_revision_opt to free.
I'm working up a few alternative patches.
-Peff
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear)
2025-09-19 17:20 ` Jeff King
@ 2025-09-19 18:15 ` Junio C Hamano
2025-09-19 19:56 ` Jeff King
0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2025-09-19 18:15 UTC (permalink / raw)
To: Jeff King; +Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
Jeff King <peff@peff.net> writes:
> But really, I do not know that we need to NULL the whole thing. We have
> given the caller the reduced argc. The only argv invariant we are
> violating is that argv[argc] should be NULL (or in this case,
> argv[left]). Anything after argv+left should be considered
> uninitialized. So just:
>
> argv[left] = NULL;
>
> would be enough, I'd think.
Even when strvec was passed and more than one element was eaten
after parsing? strvec_clear() goes by .nr not stopping at the first
NULL IIRC.
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear)
2025-09-19 18:15 ` Junio C Hamano
@ 2025-09-19 19:56 ` Jeff King
2025-09-19 22:33 ` [PATCH 0/6] fixing double-frees and leaks via setup_revisions() Jeff King
0 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2025-09-19 19:56 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
On Fri, Sep 19, 2025 at 11:15:13AM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > But really, I do not know that we need to NULL the whole thing. We have
> > given the caller the reduced argc. The only argv invariant we are
> > violating is that argv[argc] should be NULL (or in this case,
> > argv[left]). Anything after argv+left should be considered
> > uninitialized. So just:
> >
> > argv[left] = NULL;
> >
> > would be enough, I'd think.
>
> Even when strvec was passed and more than one element was eaten
> after parsing? strvec_clear() goes by .nr not stopping at the first
> NULL IIRC.
Yes, there is a big can of worms here. ;) It turns out that many spots
with strvecs were relying on leaving these entries untouched, and so
setting any of them to NULL causes leaks.
I think I've got it mostly worked out, but that's why I haven't sent
patches yet. Stay tuned.
-Peff
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 0/6] fixing double-frees and leaks via setup_revisions()
2025-09-19 19:56 ` Jeff King
@ 2025-09-19 22:33 ` Jeff King
2025-09-19 22:40 ` [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings Jeff King
` (5 more replies)
0 siblings, 6 replies; 26+ messages in thread
From: Jeff King @ 2025-09-19 22:33 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
On Fri, Sep 19, 2025 at 03:56:26PM -0400, Jeff King wrote:
> > > But really, I do not know that we need to NULL the whole thing. We have
> > > given the caller the reduced argc. The only argv invariant we are
> > > violating is that argv[argc] should be NULL (or in this case,
> > > argv[left]). Anything after argv+left should be considered
> > > uninitialized. So just:
> > >
> > > argv[left] = NULL;
> > >
> > > would be enough, I'd think.
> >
> > Even when strvec was passed and more than one element was eaten
> > after parsing? strvec_clear() goes by .nr not stopping at the first
> > NULL IIRC.
>
> Yes, there is a big can of worms here. ;) It turns out that many spots
> with strvecs were relying on leaving these entries untouched, and so
> setting any of them to NULL causes leaks.
>
> I think I've got it mostly worked out, but that's why I haven't sent
> patches yet. Stay tuned.
OK, here's what I've come up with. It passes the regular test suite, as
well as with LSan, ASan, and UBSan. There were definitely some ugly
gotchas in there.
It _almost_ makes me think that setup_revisions() should never munge
argv at all, and should just produce a separate strvec of unknown
options. That keeps the concerns separate, and the extra allocation cast
is trivial. But we'd have to modify every caller to handle the new world
order.
So anyway, here it is:
[1/6]: stash: tell setup_revisions() to free our allocated strings
This fixes a real (albeit tiny) leak.
[2/6]: revision: manage memory ownership of argv in setup_revisions()
This fixes the double-free that started us off, but also addresses
the related leak issues that our earlier discussion had.
[3/6]: revision: add wrapper to setup_revisions() from a strvec
[4/6]: treewide: use setup_revisions_from_strvec() when we have a strvec
[5/6]: treewide: pass strvecs around for setup_revisions_from_strvec()
The rest of this is optional. These three introduce a convenience
wrapper, which I do think makes the code a little shorter and
eliminates a possible footgun. But AFAIK it's not actually fixing
any bugs.
[6/6]: revision: retain argv NULL invariant in setup_revisions()
This one does the NULL thing we discussed earlier for all callers.
It turned up some interesting gotchas! One is that without the
wrapper from patches 3-5, it introduces leaks for strvec users which
assume (correctly) that setup_revisions() will never munge their
argv or need to free anything. And two, lots of callers pass a NULL
argv to setup_revisions. Yikes.
bisect.c | 5 +---
builtin/describe.c | 3 ++-
builtin/pack-objects.c | 6 ++---
builtin/rebase.c | 3 +--
builtin/stash.c | 4 +--
builtin/submodule--helper.c | 10 ++------
http-push.c | 2 +-
remote.c | 5 +---
revision.c | 49 ++++++++++++++++++++++++++++++++++---
revision.h | 2 ++
sequencer.c | 7 +++---
sequencer.h | 4 +--
shallow.c | 4 +--
shallow.h | 4 +--
submodule.c | 2 +-
t/t3903-stash.sh | 11 +++++++++
upload-pack.c | 7 +++---
17 files changed, 86 insertions(+), 42 deletions(-)
-Peff
^ permalink raw reply [flat|nested] 26+ messages in thread
* [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings
2025-09-19 22:33 ` [PATCH 0/6] fixing double-frees and leaks via setup_revisions() Jeff King
@ 2025-09-19 22:40 ` Jeff King
2025-09-22 15:45 ` Junio C Hamano
2025-09-19 22:45 ` [PATCH 2/6] revision: manage memory ownership of argv in setup_revisions() Jeff King
` (4 subsequent siblings)
5 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2025-09-19 22:40 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
In "git stash show", we do a first pass of parsing our command line
options by splitting them into revision args and stash args. These are
stored in strvecs, and we pass the revision args to setup_revisions().
But setup_revisions() may modify the argv we pass it, causing us to leak
some of the entries. In particular, if it sees a "--" string, that will
be dropped from argv. This is the same as other cases addressed by
f92dbdbc6a (revisions API: don't leak memory on argv elements that need
free()-ing, 2022-08-02), and we should fix it the same way: by passing
the free_removed_argv_elements option to setup_revisions().
I've added a test here which fails when built with SANITIZE=leak because
it calls "git stash show --". This by itself is not a very
interesting invocation, because there is nothing after the "--", and
thus the "--" is not really doing anything.
But I think the current parsing in show_stash() is a little
questionable. It splits the arguments into revision options and stash
options solely based on the presence of a leading dash, with no regard
to "--" at all. So:
git stash show -- foo
will take "foo" as a stash option before we even pass anything to
setup_revisions(). And something like:
git stash show -- 1
will show stash@{1}. But I would expect anything after the "--" to be a
pathspec. So in this example it would show only the part of the diff
that touched "foo". And something like:
git stash show -p 1 -- foo
would treat "1" as a stash and "foo" as a pathspec.
That may be something we want to fix, but I want to focus here on the
leak-fixing without changing behavior. So this test is a little odd, but
does what we want without locking us in to any particular behavior (we
only care that "--" by itself does not change the output nor leak).
Signed-off-by: Jeff King <peff@peff.net>
---
I wonder if anybody actually cares that "git stash show -- foo" will
treat "foo" as a stash. If not, then it would probably be a fairly easy
#leftoverbits project to teach it to stop there and retain everything
after as a rev argument (which would then treat it like a pathspec).
builtin/stash.c | 3 ++-
t/t3903-stash.sh | 6 ++++++
2 files changed, 8 insertions(+), 1 deletion(-)
diff --git a/builtin/stash.c b/builtin/stash.c
index f5ddee5c7f..e5ab3c4cf5 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -957,6 +957,7 @@ static void diff_include_untracked(const struct stash_info *info, struct diff_op
static int show_stash(int argc, const char **argv, const char *prefix,
struct repository *repo UNUSED)
{
+ struct setup_revision_opt opt = { .free_removed_argv_elements = 1 };
int i;
int ret = -1;
struct stash_info info = STASH_INFO_INIT;
@@ -1015,7 +1016,7 @@ static int show_stash(int argc, const char **argv, const char *prefix,
}
}
- argc = setup_revisions(revision_args.nr, revision_args.v, &rev, NULL);
+ argc = setup_revisions(revision_args.nr, revision_args.v, &rev, &opt);
if (argc > 1)
goto usage;
if (!rev.diffopt.output_format) {
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 0bb4648e36..1c9e589bbe 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1741,4 +1741,10 @@ test_expect_success 'submodules does not affect the branch recorded in stash mes
)
'
+test_expect_success 'stash show handles --' '
+ git stash show >expect &&
+ git stash show -- >actual &&
+ test_cmp expect actual
+'
+
test_done
--
2.51.0.568.g6b54b97edf
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 2/6] revision: manage memory ownership of argv in setup_revisions()
2025-09-19 22:33 ` [PATCH 0/6] fixing double-frees and leaks via setup_revisions() Jeff King
2025-09-19 22:40 ` [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings Jeff King
@ 2025-09-19 22:45 ` Jeff King
2025-09-19 22:48 ` [PATCH 3/6] revision: add wrapper to setup_revisions() from a strvec Jeff King
` (3 subsequent siblings)
5 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2025-09-19 22:45 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
The setup_revisions() function takes an argc/argv pair and consumes
arguments from it, returning a reduced argc count to the caller. But it
may also overwrite entries within the argv array, as it shifts unknown
options to the front of argv (so they can be found in the range of
0..argc-1 after we return).
For a normal argc/argv coming from the operating system, this is OK.
We don't need to worry about memory ownership of the strings in those
entries. But some callers pass in allocated strings from a strvec, and
we do need to care about those.
We faced a similar issue in f92dbdbc6a (revisions API: don't leak memory
on argv elements that need free()-ing, 2022-08-02), which added an
option for callers to tell us that elements need to be freed. But the
implementation within setup_revisions() was incomplete. It only covered
the case of dropping "--", but not the movement of unknown options.
When we shift argv entries around, we should free the elements we are
about to overwrite, so they are not leaked. For example, in:
git stash show -p --invalid
we will pass this to setup_revisions():
argc = 3, argv[] = { "show", "-p", "--invalid", NULL }
which will then return:
argc = 2, argv[] = { "show", "--invalid", "--invalid", NULL }
overwriting the "-p" entry, which is leaked unless we free it at that
moment.
You can see in the output above another potential problem. We now have
two copies of the "--invalid" string. If the caller does not respect the
new argc when free-ing the strings via strvec_clear(), we'll get a
double-free. And git-stash suffers from this, and will crash with the
above command.
So it seems at first glance that the solution is to just assign the
reduced argc to the strvec.nr field in the caller. Then it would stop
after freeing only any copied entries. But that's not always right
either!
Remember that we are reducing "argc" to account for elements we've
consumed. So if there isn't an invalid option, we'd turn:
argc = 2, argv[] = { "show", "-p", NULL }
into:
argc = 1, argv[] = { "show", "-p", NULL }
In that case strvec_clear() must keep looking past the shortened argc we
return to find the original "-p" to free. It needs to use the original
argc to do that.
We can solve this by turning our argv writes into strict moves, not
copies. When we shuffle an unknown option to the front, we'll overwrite
its old position with NULL. That leaves an argv array that may have NULL
"holes" in it.
So in the "--invalid" example above we get:
argc = 2, argv[] = { "show", "--invalid", NULL, NULL }
but something like "git stash -p --invalid -p" would yield:
argc = 3, argv[] = { "show", "--invalid", NULL, "-p", NULL }
because we move "--invalid" to overwrite the first "-p", but the second
one is quietly consumed. But strvec_clear() can handle that fine (it
iterates over the "nr" field, and passing NULL to free() is OK).
To ease the implementation, I've introduced a helper function. It's a
little hacky because it must take a double-pointer to set the old
position to NULL. Which in turn means we cannot pass "&arg", our local
alias for the current entry we're parsing, but instead "&argv[i]", the
pointer in the original array. And to make it even more confusing, we
delegate some of this work to handle_revision_opt(), which is passed a
subset of the argv array, so is always working on "&argv[0]".
Likewise, because handle_revision_opt() only receives the part of argv
left to parse, it receives the array to accumulate unknown options as a
separate unkc/unkv pair. But we're always working on the same argv
array, so our strategy works fine. I suspect this would be a bit more
obvious (and avoid some pointer cleverness) if all functions saw the
full argv array and worked with positions within it (and our new helper
would take two positions, a src and dst). But that would involve
refactoring handle_revision_opt(). I punted on that, as what's here is
not too ugly and is all contained within revision.c itself.
The new test demonstrates that "git stash show -p --invalid" no longer
crashes with a double-free (because we move instead of copy). And it
passes with SANITIZE=leak because we free "-p" before overwriting.
Signed-off-by: Jeff King <peff@peff.net>
---
revision.c | 24 +++++++++++++++++++++---
t/t3903-stash.sh | 5 +++++
2 files changed, 26 insertions(+), 3 deletions(-)
diff --git a/revision.c b/revision.c
index 6ba8f67054..6c140d969d 100644
--- a/revision.c
+++ b/revision.c
@@ -2321,6 +2321,24 @@ static timestamp_t parse_age(const char *arg)
return num;
}
+static void overwrite_argv(int *argc, const char **argv,
+ const char **value,
+ const struct setup_revision_opt *opt)
+{
+ /*
+ * Detect the case when we are overwriting ourselves. The assignment
+ * itself would be a noop either way, but this lets us avoid corner
+ * cases around the free() and NULL operations.
+ */
+ if (*value != argv[*argc]) {
+ if (opt && opt->free_removed_argv_elements)
+ free((char *)argv[*argc]);
+ argv[*argc] = *value;
+ *value = NULL;
+ }
+ (*argc)++;
+}
+
static int handle_revision_opt(struct rev_info *revs, int argc, const char **argv,
int *unkc, const char **unkv,
const struct setup_revision_opt* opt)
@@ -2342,7 +2360,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
starts_with(arg, "--branches=") || starts_with(arg, "--tags=") ||
starts_with(arg, "--remotes=") || starts_with(arg, "--no-walk="))
{
- unkv[(*unkc)++] = arg;
+ overwrite_argv(unkc, unkv, &argv[0], opt);
return 1;
}
@@ -2706,7 +2724,7 @@ static int handle_revision_opt(struct rev_info *revs, int argc, const char **arg
} else {
int opts = diff_opt_parse(&revs->diffopt, argv, argc, revs->prefix);
if (!opts)
- unkv[(*unkc)++] = arg;
+ overwrite_argv(unkc, unkv, &argv[0], opt);
return opts;
}
@@ -3018,7 +3036,7 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
if (!strcmp(arg, "--stdin")) {
if (revs->disable_stdin) {
- argv[left++] = arg;
+ overwrite_argv(&left, argv, &argv[i], opt);
continue;
}
if (revs->read_from_stdin++)
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 1c9e589bbe..7ebeb057d3 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1747,4 +1747,9 @@ test_expect_success 'stash show handles --' '
test_cmp expect actual
'
+test_expect_success 'controlled error return on unrecognized option' '
+ test_expect_code 129 git stash show -p --invalid 2>usage &&
+ grep -e "^usage: git stash show" usage
+'
+
test_done
--
2.51.0.568.g6b54b97edf
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 3/6] revision: add wrapper to setup_revisions() from a strvec
2025-09-19 22:33 ` [PATCH 0/6] fixing double-frees and leaks via setup_revisions() Jeff King
2025-09-19 22:40 ` [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings Jeff King
2025-09-19 22:45 ` [PATCH 2/6] revision: manage memory ownership of argv in setup_revisions() Jeff King
@ 2025-09-19 22:48 ` Jeff King
2025-09-20 5:10 ` Eric Sunshine
2025-09-19 22:49 ` [PATCH 4/6] treewide: use setup_revisions_from_strvec() when we have " Jeff King
` (2 subsequent siblings)
5 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2025-09-19 22:48 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
The setup_revisions() function was designed to take the argc/argv pair
from the operating system. But we sometimes construct our own argv using
a strvec and pass that in. There are a few gotchas that callers need to
deal with here:
1. You should always pass the free_removed_argv_elements option via
setup_revision_opt. Otherwise, entries may be leaked if
setup_revisions() re-shuffles options.
2. After setup_revisions() returns, the strvec state is odd. We get a
reduced argc from setup_revisions() telling us how many unknown
options were left in place. Entries after that in argv may be
retained, or may be NULL (depending on how the reshuffling
happened). But the strvec's "nr" field still represents the
original value, and some of the entries it thinks it is still
storing may be NULL. Callers must be careful with how they access
it.
Some callers deal with (1), but not all. In practice they are OK because
they do not pass any options that would cause setup_revisions() to
re-shuffle (namely unknown options which may be relayed from the user,
and the use of the "--" separator). But it's probably a good idea to
consistently pass this option anyway to future-proof ourselves against
the details of setup_revisions() changing.
No callers address (2), though I don't think there any visible bugs.
Most of them simply call strvec_clear() and never otherwise look at the
result. And in fact, if they naively set foo.nr to the argc returned by
setup_revisions(), that would cause leaks! Because setup_revisions()
does not free consumed options[1], we have to leave the "nr" field of
the strvec at its original value to find and free them during
strvec_clear().
So I don't think there are any bugs to fix here, but we can make things
safer and simpler for callers. Let's introduce a helper function that
sets the free_removed_argv_elements automatically and shrinks the strvec
to represent the retained options afterwards (taking care to free the
now-obsolete entries).
We'll start by converting all of the call-sites which the
free_removed_argv_elements option. There should be no behavior change
for them, except that their "shrunken" entries are cleaned up
immediately, rather than waiting for a strvec_clear() call.
[1] Arguably setup_revisions() should be doing this step for us if we
told it to free removed options, but there are many existing callers
which will be broken if it did. Introducing this helper is a
possible first step towards that.
Signed-off-by: Jeff King <peff@peff.net>
---
bisect.c | 5 +----
builtin/stash.c | 5 ++---
builtin/submodule--helper.c | 10 ++--------
remote.c | 5 +----
revision.c | 19 +++++++++++++++++++
revision.h | 2 ++
6 files changed, 27 insertions(+), 19 deletions(-)
diff --git a/bisect.c b/bisect.c
index f24474542e..a6dc76b15c 100644
--- a/bisect.c
+++ b/bisect.c
@@ -674,9 +674,6 @@ static void bisect_rev_setup(struct repository *r, struct rev_info *revs,
const char *bad_format, const char *good_format,
int read_paths)
{
- struct setup_revision_opt opt = {
- .free_removed_argv_elements = 1,
- };
int i;
repo_init_revisions(r, revs, prefix);
@@ -693,7 +690,7 @@ static void bisect_rev_setup(struct repository *r, struct rev_info *revs,
if (read_paths)
read_bisect_paths(rev_argv);
- setup_revisions(rev_argv->nr, rev_argv->v, revs, &opt);
+ setup_revisions_from_strvec(rev_argv, revs, NULL);
}
static void bisect_common(struct rev_info *revs)
diff --git a/builtin/stash.c b/builtin/stash.c
index e5ab3c4cf5..3b473c4489 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -957,7 +957,6 @@ static void diff_include_untracked(const struct stash_info *info, struct diff_op
static int show_stash(int argc, const char **argv, const char *prefix,
struct repository *repo UNUSED)
{
- struct setup_revision_opt opt = { .free_removed_argv_elements = 1 };
int i;
int ret = -1;
struct stash_info info = STASH_INFO_INIT;
@@ -1016,8 +1015,8 @@ static int show_stash(int argc, const char **argv, const char *prefix,
}
}
- argc = setup_revisions(revision_args.nr, revision_args.v, &rev, &opt);
- if (argc > 1)
+ setup_revisions_from_strvec(&revision_args, &rev, NULL);
+ if (revision_args.nr > 1)
goto usage;
if (!rev.diffopt.output_format) {
rev.diffopt.output_format = DIFF_FORMAT_PATCH;
diff --git a/builtin/submodule--helper.c b/builtin/submodule--helper.c
index 07a1935cbe..fcd73abe53 100644
--- a/builtin/submodule--helper.c
+++ b/builtin/submodule--helper.c
@@ -616,9 +616,6 @@ static void status_submodule(const char *path, const struct object_id *ce_oid,
struct rev_info rev = REV_INFO_INIT;
struct strbuf buf = STRBUF_INIT;
const char *git_dir;
- struct setup_revision_opt opt = {
- .free_removed_argv_elements = 1,
- };
if (validate_submodule_path(path) < 0)
die(NULL);
@@ -655,7 +652,7 @@ static void status_submodule(const char *path, const struct object_id *ce_oid,
repo_init_revisions(the_repository, &rev, NULL);
rev.abbrev = 0;
- setup_revisions(diff_files_args.nr, diff_files_args.v, &rev, &opt);
+ setup_revisions_from_strvec(&diff_files_args, &rev, NULL);
run_diff_files(&rev, 0);
if (!diff_result_code(&rev)) {
@@ -1094,9 +1091,6 @@ static int compute_summary_module_list(struct object_id *head_oid,
{
struct strvec diff_args = STRVEC_INIT;
struct rev_info rev;
- struct setup_revision_opt opt = {
- .free_removed_argv_elements = 1,
- };
struct module_cb_list list = MODULE_CB_LIST_INIT;
int ret = 0;
@@ -1114,7 +1108,7 @@ static int compute_summary_module_list(struct object_id *head_oid,
repo_init_revisions(the_repository, &rev, info->prefix);
rev.abbrev = 0;
precompose_argv_prefix(diff_args.nr, diff_args.v, NULL);
- setup_revisions(diff_args.nr, diff_args.v, &rev, &opt);
+ setup_revisions_from_strvec(&diff_args, &rev, NULL);
rev.diffopt.output_format = DIFF_FORMAT_NO_OUTPUT | DIFF_FORMAT_CALLBACK;
rev.diffopt.format_callback = submodule_summary_callback;
rev.diffopt.format_callback_data = &list;
diff --git a/remote.c b/remote.c
index 81d8fc017e..df9675cd33 100644
--- a/remote.c
+++ b/remote.c
@@ -2143,9 +2143,6 @@ static int stat_branch_pair(const char *branch_name, const char *base,
struct object_id oid;
struct commit *ours, *theirs;
struct rev_info revs;
- struct setup_revision_opt opt = {
- .free_removed_argv_elements = 1,
- };
struct strvec argv = STRVEC_INIT;
/* Cannot stat if what we used to build on no longer exists */
@@ -2180,7 +2177,7 @@ static int stat_branch_pair(const char *branch_name, const char *base,
strvec_push(&argv, "--");
repo_init_revisions(the_repository, &revs, NULL);
- setup_revisions(argv.nr, argv.v, &revs, &opt);
+ setup_revisions_from_strvec(&argv, &revs, NULL);
if (prepare_revision_walk(&revs))
die(_("revision walk setup failed"));
diff --git a/revision.c b/revision.c
index 6c140d969d..f50f5d8ea2 100644
--- a/revision.c
+++ b/revision.c
@@ -3195,6 +3195,25 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
return left;
}
+void setup_revisions_from_strvec(struct strvec *argv, struct rev_info *revs,
+ struct setup_revision_opt *opt)
+{
+ struct setup_revision_opt fallback_opt;
+ int ret;
+
+ if (!opt) {
+ memset(&fallback_opt, 0, sizeof(fallback_opt));
+ opt = &fallback_opt;
+ }
+ opt->free_removed_argv_elements = 1;
+
+ ret = setup_revisions(argv->nr, argv->v, revs, opt);
+
+ for (size_t i = ret; i < argv->nr; i++)
+ free((char *)argv->v[i]);
+ argv->nr = ret;
+}
+
static void release_revisions_cmdline(struct rev_cmdline_info *cmdline)
{
unsigned int i;
diff --git a/revision.h b/revision.h
index 21e288c5ba..a28e349044 100644
--- a/revision.h
+++ b/revision.h
@@ -441,6 +441,8 @@ struct setup_revision_opt {
};
int setup_revisions(int argc, const char **argv, struct rev_info *revs,
struct setup_revision_opt *);
+void setup_revisions_from_strvec(struct strvec *argv, struct rev_info *revs,
+ struct setup_revision_opt *);
/**
* Free data allocated in a "struct rev_info" after it's been
--
2.51.0.568.g6b54b97edf
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 4/6] treewide: use setup_revisions_from_strvec() when we have a strvec
2025-09-19 22:33 ` [PATCH 0/6] fixing double-frees and leaks via setup_revisions() Jeff King
` (2 preceding siblings ...)
2025-09-19 22:48 ` [PATCH 3/6] revision: add wrapper to setup_revisions() from a strvec Jeff King
@ 2025-09-19 22:49 ` Jeff King
2025-09-19 22:50 ` [PATCH 5/6] treewide: pass strvecs around for setup_revisions_from_strvec() Jeff King
2025-09-19 22:51 ` [PATCH 6/6] revision: retain argv NULL invariant in setup_revisions() Jeff King
5 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2025-09-19 22:49 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
The previous commit introduced a wrapper to make using setup_revisions()
with a strvec easier and safer. It converted spots that were already
doing most of what the wrapper did.
Let's now convert spots where we were not setting up the
free_removed_argv_elements flag. As discussed in the previous commit,
this probably isn't fixing any bugs or leaks (since these sites wouldn't
trigger the re-shuffling of argv that causes them). This is mostly
future-proofing us against setup_revisions() becoming more aggressive
about its re-shuffling.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/describe.c | 3 ++-
http-push.c | 2 +-
submodule.c | 2 +-
3 files changed, 4 insertions(+), 3 deletions(-)
diff --git a/builtin/describe.c b/builtin/describe.c
index 9f4e26d7ff..ffaf8d9f0a 100644
--- a/builtin/describe.c
+++ b/builtin/describe.c
@@ -580,7 +580,8 @@ static void describe_blob(const struct object_id *oid, struct strbuf *dst)
NULL);
repo_init_revisions(the_repository, &revs, NULL);
- if (setup_revisions(args.nr, args.v, &revs, NULL) > 1)
+ setup_revisions_from_strvec(&args, &revs, NULL);
+ if (args.nr > 1)
BUG("setup_revisions could not handle all args?");
if (prepare_revision_walk(&revs))
diff --git a/http-push.c b/http-push.c
index 91a5465afb..4c43ba3bc7 100644
--- a/http-push.c
+++ b/http-push.c
@@ -1941,7 +1941,7 @@ int cmd_main(int argc, const char **argv)
strvec_pushf(&commit_argv, "^%s",
oid_to_hex(&ref->old_oid));
repo_init_revisions(the_repository, &revs, setup_git_directory());
- setup_revisions(commit_argv.nr, commit_argv.v, &revs, NULL);
+ setup_revisions_from_strvec(&commit_argv, &revs, NULL);
revs.edge_hint = 0; /* just in case */
/* Generate a list of objects that need to be pushed */
diff --git a/submodule.c b/submodule.c
index fff3c75570..35c55155f7 100644
--- a/submodule.c
+++ b/submodule.c
@@ -900,7 +900,7 @@ static void collect_changed_submodules(struct repository *r,
save_warning = warn_on_object_refname_ambiguity;
warn_on_object_refname_ambiguity = 0;
repo_init_revisions(r, &rev, NULL);
- setup_revisions(argv->nr, argv->v, &rev, &s_r_opt);
+ setup_revisions_from_strvec(argv, &rev, &s_r_opt);
warn_on_object_refname_ambiguity = save_warning;
if (prepare_revision_walk(&rev))
die(_("revision walk setup failed"));
--
2.51.0.568.g6b54b97edf
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 5/6] treewide: pass strvecs around for setup_revisions_from_strvec()
2025-09-19 22:33 ` [PATCH 0/6] fixing double-frees and leaks via setup_revisions() Jeff King
` (3 preceding siblings ...)
2025-09-19 22:49 ` [PATCH 4/6] treewide: use setup_revisions_from_strvec() when we have " Jeff King
@ 2025-09-19 22:50 ` Jeff King
2025-09-19 23:11 ` Jeff King
2025-09-19 22:51 ` [PATCH 6/6] revision: retain argv NULL invariant in setup_revisions() Jeff King
5 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2025-09-19 22:50 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
The previous commit converted callers of setup_revisions() with a strvec
to use the safer and easier _from_strvec() variant.
Let's now convert spots that don't directly have a strvec, but receive
an argc/argv pair that eventually comes from one. We'll instead pass the
strvec down to the point where we call setup_revisions().
That makes these functions slightly less flexible if they were to grow
other callers that don't use strvecs, but this rigidity is buying us
some safety. It is only safe to pass the free_removed_argv_elements
option to setup_revisions() if we know the elements of argv/argc are
allocated on the heap. That isn't communicated in the type system when
we are passed the bare elements. But if we get a strvec, we know that
the elements are allocated strings.
And at any rate, each of these modified functions has only a single
caller (that has a strvec), so the loss of flexibility is unlikely to
ever matter.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/pack-objects.c | 6 +++---
builtin/rebase.c | 3 +--
sequencer.c | 7 ++++---
sequencer.h | 4 ++--
shallow.c | 4 ++--
shallow.h | 4 ++--
upload-pack.c | 7 +++----
7 files changed, 17 insertions(+), 18 deletions(-)
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 1494afcf3d..5856b5f6bf 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4650,7 +4650,7 @@ static void get_object_list_path_walk(struct rev_info *revs)
die(_("failed to pack objects via path-walk"));
}
-static void get_object_list(struct rev_info *revs, int ac, const char **av)
+static void get_object_list(struct rev_info *revs, struct strvec *argv)
{
struct setup_revision_opt s_r_opt = {
.allow_exclude_promisor_objects = 1,
@@ -4660,7 +4660,7 @@ static void get_object_list(struct rev_info *revs, int ac, const char **av)
int save_warning;
save_commit_buffer = 0;
- setup_revisions(ac, av, revs, &s_r_opt);
+ setup_revisions_from_strvec(argv, revs, &s_r_opt);
/* make sure shallows are read */
is_repository_shallow(the_repository);
@@ -5229,7 +5229,7 @@ int cmd_pack_objects(int argc,
revs.include_check = is_not_in_promisor_pack;
revs.include_check_obj = is_not_in_promisor_pack_obj;
}
- get_object_list(&revs, rp.nr, rp.v);
+ get_object_list(&revs, &rp);
release_revisions(&revs);
}
cleanup_preferred_base();
diff --git a/builtin/rebase.c b/builtin/rebase.c
index 67c0352bf8..c468828189 100644
--- a/builtin/rebase.c
+++ b/builtin/rebase.c
@@ -299,8 +299,7 @@ static int do_interactive_rebase(struct rebase_options *opts, unsigned flags)
oid_to_hex(&opts->restrict_revision->object.oid));
ret = sequencer_make_script(the_repository, &todo_list.buf,
- make_script_args.nr, make_script_args.v,
- flags);
+ &make_script_args, flags);
if (ret) {
error(_("could not generate todo list"));
goto cleanup;
diff --git a/sequencer.c b/sequencer.c
index 9ae40a91b2..3d7909a399 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -6064,8 +6064,8 @@ static int make_script_with_merges(struct pretty_print_context *pp,
return 0;
}
-int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
- const char **argv, unsigned flags)
+int sequencer_make_script(struct repository *r, struct strbuf *out,
+ struct strvec *argv, unsigned flags)
{
char *format = NULL;
struct pretty_print_context pp = {0};
@@ -6106,7 +6106,8 @@ int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
pp.fmt = revs.commit_format;
pp.output_encoding = get_log_output_encoding();
- if (setup_revisions(argc, argv, &revs, NULL) > 1) {
+ setup_revisions_from_strvec(argv, &revs, NULL);
+ if (argv->nr > 1) {
ret = error(_("make_script: unhandled options"));
goto cleanup;
}
diff --git a/sequencer.h b/sequencer.h
index 304ba4b4d3..719684c8a9 100644
--- a/sequencer.h
+++ b/sequencer.h
@@ -186,8 +186,8 @@ int sequencer_remove_state(struct replay_opts *opts);
#define TODO_LIST_REAPPLY_CHERRY_PICKS (1U << 7)
#define TODO_LIST_WARN_SKIPPED_CHERRY_PICKS (1U << 8)
-int sequencer_make_script(struct repository *r, struct strbuf *out, int argc,
- const char **argv, unsigned flags);
+int sequencer_make_script(struct repository *r, struct strbuf *out,
+ struct strvec *argv, unsigned flags);
int complete_action(struct repository *r, struct replay_opts *opts, unsigned flags,
const char *shortrevisions, const char *onto_name,
diff --git a/shallow.c b/shallow.c
index ef3adb635f..d9cd4e219c 100644
--- a/shallow.c
+++ b/shallow.c
@@ -213,7 +213,7 @@ static void show_commit(struct commit *commit, void *data)
* are marked with shallow_flag. The list of border/shallow commits
* are also returned.
*/
-struct commit_list *get_shallow_commits_by_rev_list(int ac, const char **av,
+struct commit_list *get_shallow_commits_by_rev_list(struct strvec *argv,
int shallow_flag,
int not_shallow_flag)
{
@@ -232,7 +232,7 @@ struct commit_list *get_shallow_commits_by_rev_list(int ac, const char **av,
repo_init_revisions(the_repository, &revs, NULL);
save_commit_buffer = 0;
- setup_revisions(ac, av, &revs, NULL);
+ setup_revisions_from_strvec(argv, &revs, NULL);
if (prepare_revision_walk(&revs))
die("revision walk setup failed");
diff --git a/shallow.h b/shallow.h
index 9bfeade93e..59d54d48e7 100644
--- a/shallow.h
+++ b/shallow.h
@@ -36,8 +36,8 @@ void rollback_shallow_file(struct repository *r, struct shallow_lock *lk);
struct commit_list *get_shallow_commits(struct object_array *heads,
int depth, int shallow_flag, int not_shallow_flag);
-struct commit_list *get_shallow_commits_by_rev_list(
- int ac, const char **av, int shallow_flag, int not_shallow_flag);
+struct commit_list *get_shallow_commits_by_rev_list(struct strvec *argv,
+ int shallow_flag, int not_shallow_flag);
int write_shallow_commits(struct strbuf *out, int use_pack_protocol,
const struct oid_array *extra);
diff --git a/upload-pack.c b/upload-pack.c
index f78fabc1e1..1e87ae9559 100644
--- a/upload-pack.c
+++ b/upload-pack.c
@@ -913,13 +913,12 @@ static void deepen(struct upload_pack_data *data, int depth)
}
static void deepen_by_rev_list(struct upload_pack_data *data,
- int ac,
- const char **av)
+ struct strvec *argv)
{
struct commit_list *result;
disable_commit_graph(the_repository);
- result = get_shallow_commits_by_rev_list(ac, av, SHALLOW, NOT_SHALLOW);
+ result = get_shallow_commits_by_rev_list(argv, SHALLOW, NOT_SHALLOW);
send_shallow(data, result);
free_commit_list(result);
send_unshallow(data);
@@ -955,7 +954,7 @@ static int send_shallow_list(struct upload_pack_data *data)
struct object *o = data->want_obj.objects[i].item;
strvec_push(&av, oid_to_hex(&o->oid));
}
- deepen_by_rev_list(data, av.nr, av.v);
+ deepen_by_rev_list(data, &av);
strvec_clear(&av);
ret = 1;
} else {
--
2.51.0.568.g6b54b97edf
^ permalink raw reply related [flat|nested] 26+ messages in thread
* [PATCH 6/6] revision: retain argv NULL invariant in setup_revisions()
2025-09-19 22:33 ` [PATCH 0/6] fixing double-frees and leaks via setup_revisions() Jeff King
` (4 preceding siblings ...)
2025-09-19 22:50 ` [PATCH 5/6] treewide: pass strvecs around for setup_revisions_from_strvec() Jeff King
@ 2025-09-19 22:51 ` Jeff King
2025-09-19 23:07 ` Jeff King
5 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2025-09-19 22:51 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
In an argc/argv pair, the entry for argv[argc] is generally NULL. You
can iterate by counting up to argc, or by looking for the NULL entry in
argv.
When we pass such a pair to setup_revisions(), it shrinks argc to
account for the options we consumed and returns the result to the
caller. But it doesn't touch the entries after the reduced argc. So
argv[argc] will be left pointing at some arbitrary entry rather than
NULL.
This isn't the source of any known bugs, since all callers are aware of
the limitation and act accordingly. But it's a possible gotcha that may
be easy to miss.
Let's set the new argv[argc] to NULL, taking care to free it if the
caller asked us to do so.
It is tempting to do likewise for all of the entries afterwards, too, as
some of them may also need to be freed (e.g., if coming from a strvec).
But doing so isn't entirely trivial, as we munge argc in the function
(e.g., when we find "--" and move all of the entries after it into the
prune_data list). It would be possible with some light refactoring, but
it's probably not worth it. Nobody should ever look at them (they are
beyond the revised argc and past the NULL argv entry) outside of strvec
cleanup, and setup_revisions_from_strvec() already handles this case.
There's one other interesting gotcha: many callers which do not want to
provide arguments just pass 0/NULL for argc/argv. We need to check for
this case before assigning the final NULL.
Signed-off-by: Jeff King <peff@peff.net>
---
revision.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/revision.c b/revision.c
index f50f5d8ea2..806a1c4c24 100644
--- a/revision.c
+++ b/revision.c
@@ -3192,6 +3192,12 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
revs->show_notes_given = 1;
}
+ if (argv) {
+ if (opt && opt->free_removed_argv_elements)
+ free((char *)argv[left]);
+ argv[left] = NULL;
+ }
+
return left;
}
--
2.51.0.568.g6b54b97edf
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 6/6] revision: retain argv NULL invariant in setup_revisions()
2025-09-19 22:51 ` [PATCH 6/6] revision: retain argv NULL invariant in setup_revisions() Jeff King
@ 2025-09-19 23:07 ` Jeff King
0 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2025-09-19 23:07 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
On Fri, Sep 19, 2025 at 06:51:46PM -0400, Jeff King wrote:
> It is tempting to do likewise for all of the entries afterwards, too, as
> some of them may also need to be freed (e.g., if coming from a strvec).
> But doing so isn't entirely trivial, as we munge argc in the function
> (e.g., when we find "--" and move all of the entries after it into the
> prune_data list). It would be possible with some light refactoring, but
> it's probably not worth it. Nobody should ever look at them (they are
> beyond the revised argc and past the NULL argv entry) outside of strvec
> cleanup, and setup_revisions_from_strvec() already handles this case.
I _think_ that would probably look like this on top (with obvious
inspiration from your earlier patch), but I don't know if it is
worthwhile or not:
diff --git a/revision.c b/revision.c
index 806a1c4c24..96188ab4ad 100644
--- a/revision.c
+++ b/revision.c
@@ -3003,14 +3003,17 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
seen_dashdash = 0;
for (i = 1; i < argc; i++) {
const char *arg = argv[i];
+ int j;
if (strcmp(arg, "--"))
continue;
- if (opt && opt->free_removed_argv_elements)
- free((char *)argv[i]);
- argv[i] = NULL;
+ for (j = i; j < argc; j++) {
+ if (i != j)
+ strvec_push(&prune_data, argv[j]);
+ if (opt && opt->free_removed_argv_elements)
+ free((char *)argv[j]);
+ argv[j] = NULL;
+ }
argc = i;
- if (argv[i + 1])
- strvec_pushv(&prune_data, argv + i + 1);
seen_dashdash = 1;
break;
}
@@ -3192,10 +3195,10 @@ int setup_revisions(int argc, const char **argv, struct rev_info *revs, struct s
revs->show_notes_given = 1;
}
- if (argv) {
+ while (left < argc--) {
if (opt && opt->free_removed_argv_elements)
- free((char *)argv[left]);
- argv[left] = NULL;
+ free((char *)argv[argc]);
+ argv[argc] = NULL;
}
return left;
@@ -3205,19 +3208,14 @@ void setup_revisions_from_strvec(struct strvec *argv, struct rev_info *revs,
struct setup_revision_opt *opt)
{
struct setup_revision_opt fallback_opt;
- int ret;
if (!opt) {
memset(&fallback_opt, 0, sizeof(fallback_opt));
opt = &fallback_opt;
}
opt->free_removed_argv_elements = 1;
- ret = setup_revisions(argv->nr, argv->v, revs, opt);
-
- for (size_t i = ret; i < argv->nr; i++)
- free((char *)argv->v[i]);
- argv->nr = ret;
+ argv->nr = setup_revisions(argv->nr, argv->v, revs, opt);
}
static void release_revisions_cmdline(struct rev_cmdline_info *cmdline)
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 5/6] treewide: pass strvecs around for setup_revisions_from_strvec()
2025-09-19 22:50 ` [PATCH 5/6] treewide: pass strvecs around for setup_revisions_from_strvec() Jeff King
@ 2025-09-19 23:11 ` Jeff King
0 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2025-09-19 23:11 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
On Fri, Sep 19, 2025 at 06:50:48PM -0400, Jeff King wrote:
> diff --git a/shallow.h b/shallow.h
> index 9bfeade93e..59d54d48e7 100644
> --- a/shallow.h
> +++ b/shallow.h
> @@ -36,8 +36,8 @@ void rollback_shallow_file(struct repository *r, struct shallow_lock *lk);
>
> struct commit_list *get_shallow_commits(struct object_array *heads,
> int depth, int shallow_flag, int not_shallow_flag);
> -struct commit_list *get_shallow_commits_by_rev_list(
> - int ac, const char **av, int shallow_flag, int not_shallow_flag);
> +struct commit_list *get_shallow_commits_by_rev_list(struct strvec *argv,
> + int shallow_flag, int not_shallow_flag);
> int write_shallow_commits(struct strbuf *out, int use_pack_protocol,
> const struct oid_array *extra);
>
Sorry, I missed a hdr-check complaint here. It needs:
diff --git a/shallow.h b/shallow.h
index 59d54d48e7..ad591bd139 100644
--- a/shallow.h
+++ b/shallow.h
@@ -7,6 +7,7 @@
#include "strbuf.h"
struct oid_array;
+struct strvec;
void set_alternate_shallow_file(struct repository *r, const char *path, int override);
int register_shallow(struct repository *r, const struct object_id *oid);
on top.
-Peff
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 3/6] revision: add wrapper to setup_revisions() from a strvec
2025-09-19 22:48 ` [PATCH 3/6] revision: add wrapper to setup_revisions() from a strvec Jeff King
@ 2025-09-20 5:10 ` Eric Sunshine
2025-09-20 5:48 ` Jeff King
0 siblings, 1 reply; 26+ messages in thread
From: Eric Sunshine @ 2025-09-20 5:10 UTC (permalink / raw)
To: Jeff King
Cc: Junio C Hamano, Kristoffer Haugsbakk, Lauri Niskanen, git,
Patrick Steinhardt
On Fri, Sep 19, 2025 at 6:51 PM Jeff King <peff@peff.net> wrote:
> The setup_revisions() function was designed to take the argc/argv pair
> from the operating system. But we sometimes construct our own argv using
> a strvec and pass that in. There are a few gotchas that callers need to
> deal with here:
> [...]
> We'll start by converting all of the call-sites which the
> free_removed_argv_elements option. There should be no behavior change
> for them, except that their "shrunken" entries are cleaned up
> immediately, rather than waiting for a strvec_clear() call.
There is some grammatical problem with the first sentence of this
paragraph which makes it difficult to decipher.
> Signed-off-by: Jeff King <peff@peff.net>
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 3/6] revision: add wrapper to setup_revisions() from a strvec
2025-09-20 5:10 ` Eric Sunshine
@ 2025-09-20 5:48 ` Jeff King
0 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2025-09-20 5:48 UTC (permalink / raw)
To: Eric Sunshine
Cc: Junio C Hamano, Kristoffer Haugsbakk, Lauri Niskanen, git,
Patrick Steinhardt
On Sat, Sep 20, 2025 at 01:10:50AM -0400, Eric Sunshine wrote:
> On Fri, Sep 19, 2025 at 6:51 PM Jeff King <peff@peff.net> wrote:
> > The setup_revisions() function was designed to take the argc/argv pair
> > from the operating system. But we sometimes construct our own argv using
> > a strvec and pass that in. There are a few gotchas that callers need to
> > deal with here:
> > [...]
> > We'll start by converting all of the call-sites which the
> > free_removed_argv_elements option. There should be no behavior change
> > for them, except that their "shrunken" entries are cleaned up
> > immediately, rather than waiting for a strvec_clear() call.
>
> There is some grammatical problem with the first sentence of this
> paragraph which makes it difficult to decipher.
Urgh, sorry. Should be:
We'll start by converting all of the call-sites which use the
free_removed_argv_elements option.
-Peff
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings
2025-09-19 22:40 ` [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings Jeff King
@ 2025-09-22 15:45 ` Junio C Hamano
2025-09-22 19:05 ` Jeff King
0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2025-09-22 15:45 UTC (permalink / raw)
To: Jeff King; +Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
Jeff King <peff@peff.net> writes:
> In "git stash show", we do a first pass of parsing our command line
> options by splitting them into revision args and stash args. These are
> stored in strvecs, and we pass the revision args to setup_revisions().
>
> But setup_revisions() may modify the argv we pass it, causing us to leak
> some of the entries. In particular, if it sees a "--" string, that will
> be dropped from argv. This is the same as other cases addressed by
> f92dbdbc6a (revisions API: don't leak memory on argv elements that need
> free()-ing, 2022-08-02), and we should fix it the same way: by passing
> the free_removed_argv_elements option to setup_revisions().
>
> I've added a test here which fails when built with SANITIZE=leak because
> it calls "git stash show --". This by itself is not a very
> interesting invocation, because there is nothing after the "--", and
> thus the "--" is not really doing anything.
>
> But I think the current parsing in show_stash() is a little
> questionable. It splits the arguments into revision options and stash
> options solely based on the presence of a leading dash, with no regard
> to "--" at all. So:
>
> git stash show -- foo
>
> will take "foo" as a stash option before we even pass anything to
> setup_revisions(). And something like:
>
> git stash show -- 1
>
> will show stash@{1}. But I would expect anything after the "--" to be a
> pathspec. So in this example it would show only the part of the diff
> that touched "foo". And something like:
>
> git stash show -p 1 -- foo
>
> would treat "1" as a stash and "foo" as a pathspec.
>
> That may be something we want to fix, but I want to focus here on the
> leak-fixing without changing behavior. So this test is a little odd, but
> does what we want without locking us in to any particular behavior (we
> only care that "--" by itself does not change the output nor leak).
"git stash show" gives a "git diff --stat stash~ stash" (i.e. the
worktree relative to then-current-HEAD in diffstat format), and "git
stash show --" (no other arguments) gives us "git diff -p" for the
same, it seems; this is with or without your patch.
We may care "--" by itself does not change the output, but it has
already been giving different output without your patch.
> diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> index 0bb4648e36..1c9e589bbe 100755
> --- a/t/t3903-stash.sh
> +++ b/t/t3903-stash.sh
> @@ -1741,4 +1741,10 @@ test_expect_success 'submodules does not affect the branch recorded in stash mes
> )
> '
>
> +test_expect_success 'stash show handles --' '
> + git stash show >expect &&
> + git stash show -- >actual &&
> + test_cmp expect actual
> +'
> +
> test_done
In other words, this test should not pass if the stash stores
something reasonable. The only case is when both "git stash show"
and "git stash show -p" would have been silent.
The commit object pointed at by refs/stash before this test runs has
this curious thing.
tree c5f56db0c5c4e40b68ffda4bee76b301c9cc3406
parent a2a54b9cb0873c567fbab134c4b95020b9419f6c
parent 182836fa1adbc25d81a137d4e7489cbd0bec744a
parent f93620bcecd8173d3bd3ad4dff4e69e32ba6f278
author A U Thor <author@example.com> 1112912473 -0700
committer C O Mitter <committer@example.com> 1112912473 -0700
WIP on orphan2: a2a54b9 A
The trees of these three parents reveal why this test passes.
c5f56db0c5c4e40b68ffda4bee76b301c9cc3406
c5f56db0c5c4e40b68ffda4bee76b301c9cc3406
adc18b651de078499a4acf1454fadb65352ed961
In other words, "git diff refs/stash~$n refs/stash" for n == 1 & n
== 2 are empty and this stash entry exists only to record the
untracked cruft. Even "git show -c refs/stash" would stay silent
for a "merge" like this, and both "git stash show" and "git stash
show -p" would of course be empty.
I do not think we want to drop this test (we do want the "handles
without leaking" part of the test), but we should not expect the
output from these commands match.
Unless we are aspiring to introduce a breaking change, that is [*].
Perhaps create a new stash entry that does not show anything in this
test and drop that entry with test_when_finished before leaving it?
Or just run "git stash show --" without any check on its output,
possibly with a prereq that we are running under leak checker?
I dunno.
I only discovered this while merging this and another topic that
happen to touch the same t3903 into 'seen'.
Thanks.
[Footnote]
* I personally find the traditional behaviour nonsense and it may be
coming from the crappy command line parsing we have had forever, but
I am sure people who wrote
git stash show --
git stash show --end-of-options
out of "principle" in their scripts, and assumed that the patch
output is the norm for the command even though it should have been
giving diffstat, would be unhappy if we suddenly made them behave
exactly like "git stash show" (nothing else on the command line).
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings
2025-09-22 15:45 ` Junio C Hamano
@ 2025-09-22 19:05 ` Jeff King
2025-09-22 19:36 ` Junio C Hamano
0 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2025-09-22 19:05 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
On Mon, Sep 22, 2025 at 08:45:36AM -0700, Junio C Hamano wrote:
> "git stash show" gives a "git diff --stat stash~ stash" (i.e. the
> worktree relative to then-current-HEAD in diffstat format), and "git
> stash show --" (no other arguments) gives us "git diff -p" for the
> same, it seems; this is with or without your patch.
Oof, that is certainly unexpected. Ironically I had done all of the
manual testing with "-p --", because the point of this topic was looking
at how we passed arguments to setup_revisions(). And then I simplified
it when I added the test, because it seemed that the "-p" would just be
noise there.
So it is surprising that the test passes, but you explained that clearly
below.
> We may care "--" by itself does not change the output, but it has
> already been giving different output without your patch.
I think the difference is probably a bug, but one that is out of scope
for what this patch is trying to fix. So even if we wanted to fix it,
I'd prefer not to deal with it here.
> I do not think we want to drop this test (we do want the "handles
> without leaking" part of the test), but we should not expect the
> output from these commands match.
Yeah, I had originally written just:
test_expect_success 'stash show -- does not leak' '
git stash show --
'
but it felt funny, since the test is doing nothing in a build without
SANITIZE=leak. If we are OK with that funniness, I can switch back to
that.
> I only discovered this while merging this and another topic that
> happen to touch the same t3903 into 'seen'.
I'm glad you did. I would rather find out about it now while it is fresh
in my mind, then 3 years from now when we all wonder what the heck is
going on.
> * I personally find the traditional behaviour nonsense and it may be
> coming from the crappy command line parsing we have had forever, but
> I am sure people who wrote
>
> git stash show --
> git stash show --end-of-options
>
> out of "principle" in their scripts, and assumed that the patch
> output is the norm for the command even though it should have been
> giving diffstat, would be unhappy if we suddenly made them behave
> exactly like "git stash show" (nothing else on the command line).
Yeah, I agree with both points (that the current behavior is nonsense,
but people may accidentally have been relying on it). I wouldn't feel
_too_ bad about saying "you were relying on nonsense, and we have fixed
the bug" in this case. But I think it should be a separate topic (and
possibly one that makes "git stash show -- foo" do something sensible).
-Peff
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings
2025-09-22 19:05 ` Jeff King
@ 2025-09-22 19:36 ` Junio C Hamano
2025-09-22 20:25 ` Jeff King
0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2025-09-22 19:36 UTC (permalink / raw)
To: Jeff King; +Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
Jeff King <peff@peff.net> writes:
> Yeah, I had originally written just:
>
> test_expect_success 'stash show -- does not leak' '
> git stash show --
> '
>
> but it felt funny, since the test is doing nothing in a build without
> SANITIZE=leak. If we are OK with that funniness, I can switch back to
> that.
We have a prerequisite for that. Very nice vehicle for
documentation purposes, even though we do not care about a single
"stash show" invocation for correctness or performance reasons.
Perhaps I can squash the following in, unless you have other changes
in mind.
t/t3903-stash.sh | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git c/t/t3903-stash.sh w/t/t3903-stash.sh
index 7ebeb057d3..930c31e547 100755
--- c/t/t3903-stash.sh
+++ w/t/t3903-stash.sh
@@ -1741,10 +1741,8 @@ test_expect_success 'submodules does not affect the branch recorded in stash mes
)
'
-test_expect_success 'stash show handles --' '
- git stash show >expect &&
- git stash show -- >actual &&
- test_cmp expect actual
+test_expect_success SANITIZE_LEAK 'stash show handles -- without leaking' '
+ git stash show --
'
test_expect_success 'controlled error return on unrecognized option' '
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings
2025-09-22 19:36 ` Junio C Hamano
@ 2025-09-22 20:25 ` Jeff King
2025-09-22 21:26 ` Junio C Hamano
0 siblings, 1 reply; 26+ messages in thread
From: Jeff King @ 2025-09-22 20:25 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
On Mon, Sep 22, 2025 at 12:36:22PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
>
> > Yeah, I had originally written just:
> >
> > test_expect_success 'stash show -- does not leak' '
> > git stash show --
> > '
> >
> > but it felt funny, since the test is doing nothing in a build without
> > SANITIZE=leak. If we are OK with that funniness, I can switch back to
> > that.
>
> We have a prerequisite for that. Very nice vehicle for
> documentation purposes, even though we do not care about a single
> "stash show" invocation for correctness or performance reasons.
Ah, nice. It crossed my mind that we could try marking it as such, but I
forgot (or never knew) that we already had this prereq available. I
guess it was mainly used as "!SANITIZE_LEAK" before for suppressing
leaks.
> Perhaps I can squash the following in, unless you have other changes
> in mind.
Thanks, that looks perfect for the code change. But we probably need to
update the discussion of the test in the commit message. Here's what I
came up with (replacing patch 1):
-- >8 --
Subject: [PATCH] stash: tell setup_revisions() to free our allocated strings
In "git stash show", we do a first pass of parsing our command line
options by splitting them into revision args and stash args. These are
stored in strvecs, and we pass the revision args to setup_revisions().
But setup_revisions() may modify the argv we pass it, causing us to leak
some of the entries. In particular, if it sees a "--" string, that will
be dropped from argv. This is the same as other cases addressed by
f92dbdbc6a (revisions API: don't leak memory on argv elements that need
free()-ing, 2022-08-02), and we should fix it the same way: by passing
the free_removed_argv_elements option to setup_revisions().
The added test here is run only with SANITIZE=leak, without checking its
output, because the behavior of stash with "--" is a little odd:
1. Running "git stash show" will show --stat output. But running "git
stash show --" will show --patch.
2. I'd expect a non-option after "--" to be treated as a pathspec, so:
git stash show -p 1 -- foo
would look treat "1" as a stash (a synonym for stash@{1}) and
restrict the resulting diff to "foo". But it doesn't. We split the
revision/stash args without any regard to "--". So in the example
above both "1" and "foo" are stashes. Which is an error, but also:
git stash show -- foo
treats "foo" as a stash, not a pathspec.
These are both oddities that we may want to address (or may not, if we
want to retain historical quirks). But they are well outside the scope
of this patch. So for now we'll just let the tests confirm we aren't
leaking without otherwise expecting any behavior. If we later address
either of those points and end up with another test that covers "stash
show --", we can drop this leak-only test.
Signed-off-by: Jeff King <peff@peff.net>
---
builtin/stash.c | 3 ++-
t/t3903-stash.sh | 4 ++++
2 files changed, 6 insertions(+), 1 deletion(-)
diff --git a/builtin/stash.c b/builtin/stash.c
index f5ddee5c7f..e5ab3c4cf5 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -957,6 +957,7 @@ static void diff_include_untracked(const struct stash_info *info, struct diff_op
static int show_stash(int argc, const char **argv, const char *prefix,
struct repository *repo UNUSED)
{
+ struct setup_revision_opt opt = { .free_removed_argv_elements = 1 };
int i;
int ret = -1;
struct stash_info info = STASH_INFO_INIT;
@@ -1015,7 +1016,7 @@ static int show_stash(int argc, const char **argv, const char *prefix,
}
}
- argc = setup_revisions(revision_args.nr, revision_args.v, &rev, NULL);
+ argc = setup_revisions(revision_args.nr, revision_args.v, &rev, &opt);
if (argc > 1)
goto usage;
if (!rev.diffopt.output_format) {
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 0bb4648e36..daf96aa931 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -1741,4 +1741,8 @@ test_expect_success 'submodules does not affect the branch recorded in stash mes
)
'
+test_expect_success SANITIZE_LEAK 'stash show handles -- without leaking' '
+ git stash show --
+'
+
test_done
--
2.51.0.582.g88c6764cd5
^ permalink raw reply related [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings
2025-09-22 20:25 ` Jeff King
@ 2025-09-22 21:26 ` Junio C Hamano
2025-09-23 0:48 ` Jeff King
0 siblings, 1 reply; 26+ messages in thread
From: Junio C Hamano @ 2025-09-22 21:26 UTC (permalink / raw)
To: Jeff King; +Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
Jeff King <peff@peff.net> writes:
> On Mon, Sep 22, 2025 at 12:36:22PM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>>
>> > Yeah, I had originally written just:
>> >
>> > test_expect_success 'stash show -- does not leak' '
>> > git stash show --
>> > '
>> >
>> > but it felt funny, since the test is doing nothing in a build without
>> > SANITIZE=leak. If we are OK with that funniness, I can switch back to
>> > that.
>>
>> We have a prerequisite for that. Very nice vehicle for
>> documentation purposes, even though we do not care about a single
>> "stash show" invocation for correctness or performance reasons.
>
> Ah, nice. It crossed my mind that we could try marking it as such, but I
> forgot (or never knew) that we already had this prereq available. I
> guess it was mainly used as "!SANITIZE_LEAK" before for suppressing
> leaks.
>
>> Perhaps I can squash the following in, unless you have other changes
>> in mind.
>
> Thanks, that looks perfect for the code change. But we probably need to
> update the discussion of the test in the commit message. Here's what I
> came up with (replacing patch 1):
Thanks, will replace then.
> -- >8 --
> Subject: [PATCH] stash: tell setup_revisions() to free our allocated strings
>
> In "git stash show", we do a first pass of parsing our command line
> options by splitting them into revision args and stash args. These are
> stored in strvecs, and we pass the revision args to setup_revisions().
>
> But setup_revisions() may modify the argv we pass it, causing us to leak
> some of the entries. In particular, if it sees a "--" string, that will
> be dropped from argv. This is the same as other cases addressed by
> f92dbdbc6a (revisions API: don't leak memory on argv elements that need
> free()-ing, 2022-08-02), and we should fix it the same way: by passing
> the free_removed_argv_elements option to setup_revisions().
>
> The added test here is run only with SANITIZE=leak, without checking its
> output, because the behavior of stash with "--" is a little odd:
>
> 1. Running "git stash show" will show --stat output. But running "git
> stash show --" will show --patch.
>
> 2. I'd expect a non-option after "--" to be treated as a pathspec, so:
>
> git stash show -p 1 -- foo
>
> would look treat "1" as a stash (a synonym for stash@{1}) and
> restrict the resulting diff to "foo". But it doesn't. We split the
> revision/stash args without any regard to "--". So in the example
> above both "1" and "foo" are stashes. Which is an error, but also:
>
> git stash show -- foo
>
> treats "foo" as a stash, not a pathspec.
All good. I do regret that I didn't resist harder against the "1 is
the same as stash@{1}" or insisted that the parser be written more
robustly if we wanted to support such syntax, but that also is way
too late to change now.
> These are both oddities that we may want to address (or may not, if we
> want to retain historical quirks). But they are well outside the scope
> of this patch. So for now we'll just let the tests confirm we aren't
> leaking without otherwise expecting any behavior. If we later address
> either of those points and end up with another test that covers "stash
> show --", we can drop this leak-only test.
OK.
By the way this also solves the original "git stash show -p --invalid";
do we want to credit the bug reporter and mention that issue as well?
Thanks.
> Signed-off-by: Jeff King <peff@peff.net>
> ---
> builtin/stash.c | 3 ++-
> t/t3903-stash.sh | 4 ++++
> 2 files changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/builtin/stash.c b/builtin/stash.c
> index f5ddee5c7f..e5ab3c4cf5 100644
> --- a/builtin/stash.c
> +++ b/builtin/stash.c
> @@ -957,6 +957,7 @@ static void diff_include_untracked(const struct stash_info *info, struct diff_op
> static int show_stash(int argc, const char **argv, const char *prefix,
> struct repository *repo UNUSED)
> {
> + struct setup_revision_opt opt = { .free_removed_argv_elements = 1 };
> int i;
> int ret = -1;
> struct stash_info info = STASH_INFO_INIT;
> @@ -1015,7 +1016,7 @@ static int show_stash(int argc, const char **argv, const char *prefix,
> }
> }
>
> - argc = setup_revisions(revision_args.nr, revision_args.v, &rev, NULL);
> + argc = setup_revisions(revision_args.nr, revision_args.v, &rev, &opt);
> if (argc > 1)
> goto usage;
> if (!rev.diffopt.output_format) {
> diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
> index 0bb4648e36..daf96aa931 100755
> --- a/t/t3903-stash.sh
> +++ b/t/t3903-stash.sh
> @@ -1741,4 +1741,8 @@ test_expect_success 'submodules does not affect the branch recorded in stash mes
> )
> '
>
> +test_expect_success SANITIZE_LEAK 'stash show handles -- without leaking' '
> + git stash show --
> +'
> +
> test_done
^ permalink raw reply [flat|nested] 26+ messages in thread
* Re: [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings
2025-09-22 21:26 ` Junio C Hamano
@ 2025-09-23 0:48 ` Jeff King
0 siblings, 0 replies; 26+ messages in thread
From: Jeff King @ 2025-09-23 0:48 UTC (permalink / raw)
To: Junio C Hamano
Cc: Kristoffer Haugsbakk, Lauri Niskanen, git, Patrick Steinhardt
On Mon, Sep 22, 2025 at 02:26:44PM -0700, Junio C Hamano wrote:
> By the way this also solves the original "git stash show -p --invalid";
> do we want to credit the bug reporter and mention that issue as well?
That's not handled by this patch, but rather by patch 2 ("manage memory
ownership..."). I put this one first because until we start setting the
free_removed_argv_elements flag, that second patch won't do anything for
git-stash (and I don't think there was a way to trigger the problems
from the other callers which did set it).
That patch discusses the original issue, but yeah, it could probably
stand to have a:
Reported-by: Lauri Niskanen <ape@ape3000.com>
trailer tacked on.
-Peff
^ permalink raw reply [flat|nested] 26+ messages in thread
end of thread, other threads:[~2025-09-23 0:48 UTC | newest]
Thread overview: 26+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-09-19 10:18 [BUG] git stash show -p with invalid option aborts with double-free in show_stash() (strvec_clear) Lauri Niskanen
2025-09-19 13:11 ` Kristoffer Haugsbakk
2025-09-19 16:00 ` Junio C Hamano
2025-09-19 16:48 ` Jeff King
2025-09-19 17:13 ` Junio C Hamano
2025-09-19 16:58 ` Junio C Hamano
2025-09-19 17:20 ` Jeff King
2025-09-19 18:15 ` Junio C Hamano
2025-09-19 19:56 ` Jeff King
2025-09-19 22:33 ` [PATCH 0/6] fixing double-frees and leaks via setup_revisions() Jeff King
2025-09-19 22:40 ` [PATCH 1/6] stash: tell setup_revisions() to free our allocated strings Jeff King
2025-09-22 15:45 ` Junio C Hamano
2025-09-22 19:05 ` Jeff King
2025-09-22 19:36 ` Junio C Hamano
2025-09-22 20:25 ` Jeff King
2025-09-22 21:26 ` Junio C Hamano
2025-09-23 0:48 ` Jeff King
2025-09-19 22:45 ` [PATCH 2/6] revision: manage memory ownership of argv in setup_revisions() Jeff King
2025-09-19 22:48 ` [PATCH 3/6] revision: add wrapper to setup_revisions() from a strvec Jeff King
2025-09-20 5:10 ` Eric Sunshine
2025-09-20 5:48 ` Jeff King
2025-09-19 22:49 ` [PATCH 4/6] treewide: use setup_revisions_from_strvec() when we have " Jeff King
2025-09-19 22:50 ` [PATCH 5/6] treewide: pass strvecs around for setup_revisions_from_strvec() Jeff King
2025-09-19 23:11 ` Jeff King
2025-09-19 22:51 ` [PATCH 6/6] revision: retain argv NULL invariant in setup_revisions() Jeff King
2025-09-19 23:07 ` Jeff King
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).