* [BUG] rebase: can write reflog with uninit. `action` string
@ 2025-04-28 19:40 Kristoffer Haugsbakk
2025-04-29 9:22 ` Phillip Wood
0 siblings, 1 reply; 12+ messages in thread
From: Kristoffer Haugsbakk @ 2025-04-28 19:40 UTC (permalink / raw)
To: git; +Cc: phillip.wood123, code
From: code@khaugsbakk.name
We did `git rebase --rebase-merges` on a branch with merge commits. Including
back merges. The reflog after that showed some weird symbols for
certain merge commits (only merges):
e9c962f2ea0 HEAD@{8}: <binary>�: Merged in <branch> (pull request #4441)
Some merge commits were normal.
No backmerges were affected.
We have a main branch. This other branch was created from the main
branch. It had been kept up to date with backmerges. Then someone did
a rebase on it once it was supposed to go into the main branch soonish.
It looks like the string is uninit. The values are different each time.
§ Bisection
Bisects to d188a60d722 (sequencer: stop exporting GIT_REFLOG_ACTION,
2022-11-09).
§ Reproduction on latest code
Reproduced on `master`, on f65182a99e5 (The ninth batch, 2025-04-24).
§ Gdb
This is the backtrace when I get the apparently uninit. string:
```
#0 run_git_commit (defmsg=0x555555babe70 "<repo path>/MERGE_MSG", opts=0x7fffffffc070, flags=0) at sequencer.c:1158
#1 0x00005555558740d9 in do_merge (r=0x555555a67020 <the_repo>, commit=0x555555b75cb0,
arg=0x555555b2164d "<branch stuff> # Merged in <branch> { (pull request #4464)\nlabel branch-point-9\npick 4026b5ced849724bd3857283b6ad50c8609b6d33 only sh"..., arg_len=125, flags=0, check_todo=0x7fffffffb1e0, opts=0x7fffffffc070) at sequencer.c:4380
#2 0x0000555555876629 in pick_commits (r=0x555555a67020 <the_repo>, todo_list=0x7fffffffbf50, opts=0x7fffffffc070) at sequencer.c:5048
#3 0x0000555555877eeb in sequencer_continue (r=0x555555a67020 <the_repo>, opts=0x7fffffffc070) at sequencer.c:5480
#4 0x000055555563a491 in run_sequencer_rebase (opts=0x7fffffffc330) at builtin/rebase.c:369
#5 0x000055555563bc74 in run_specific_rebase (opts=0x7fffffffc330) at builtin/rebase.c:746
#6 0x000055555563fe2a in cmd_rebase (argc=0, argv=0x555555a73890, prefix=0x0, repo=0x555555a67020 <the_repo>) at builtin/rebase.c:1878
#7 0x0000555555574c0d in run_builtin (p=0x555555a34908 <commands+2280>, argc=2, argv=0x555555a73890, repo=0x555555a67020 <the_repo>) at git.c:480
#8 0x00005555555750ca in handle_builtin (args=0x7fffffffd8a0) at git.c:743
#9 0x000055555557538c in run_argv (args=0x7fffffffd8a0) at git.c:810
#10 0x00005555555759e2 in cmd_main (argc=2, argv=0x7fffffffda30) at git.c:950
#11 0x000055555569b0c3 in main (argc=5, argv=0x7fffffffda18) at common-main.c:9
```
§ No reproduction script
I was unable to reproduce with a simple repo. setup. I tried:
1. Creating a side branch which had a merge
2. The side branch conflicted with the other branch
3. Rebased with `--rebase-merges`
-----
Normal bugreport questionaire follows.
-----
Thank you for filling out a Git bug report!
Please answer the following questions to help us understand your issue.
What did you do before the bug happened? (Steps to reproduce your issue)
`git rebase --rebase-merges` on a branch with merge commits. Including
back merges.
What did you expect to happen? (Expected behavior)
Normal “action” string for the reflog like for example `continue`:
3f90f6ab14d (HEAD -> <branch>) HEAD@{1}: rebase (continue): Merged in <branch> (pull request #4507)
What happened instead? (Actual behavior)
The “action” (or whatever it is) string is arbitrary bytes. Like some
uninit. memory.
```
e9c962f2ea0 HEAD@{8}: <binary>�: Merged in <branch> (pull request #4441)
```
What's different between what you expected and what actually happened?
Apparently uninit. string.
Anything else you want to add:
Please review the rest of the bug report below.
You can delete any lines you don't wish to share.
[System Info]
git version:
git version 2.49.0.459.gf65182a99e5
cpu: x86_64
built from commit: f65182a99e545d2f2bc22e6c1c2da192133b16a3
sizeof-long: 8
sizeof-size_t: 8
shell-path: /bin/sh
libcurl: 7.81.0
OpenSSL: OpenSSL 3.0.2 15 Mar 2022
zlib: 1.2.11
SHA-1: SHA1_DC
SHA-256: SHA256_BLK
uname: Linux 6.8.0-58-generic #60~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 28 16:09:21 UTC 2 x86_64
compiler info: gnuc: 11.4
libc info: glibc: 2.35
$SHELL (typically, interactive shell): /bin/bash
[Enabled Hooks]
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-04-28 19:40 [BUG] rebase: can write reflog with uninit. `action` string Kristoffer Haugsbakk
@ 2025-04-29 9:22 ` Phillip Wood
2025-04-29 19:40 ` Kristoffer Haugsbakk
0 siblings, 1 reply; 12+ messages in thread
From: Phillip Wood @ 2025-04-29 9:22 UTC (permalink / raw)
To: Kristoffer Haugsbakk, git
Hi Kristoffer
On 28/04/2025 20:40, Kristoffer Haugsbakk wrote:
> From: code@khaugsbakk.name
>
> We did `git rebase --rebase-merges` on a branch with merge commits. Including
> back merges. The reflog after that showed some weird symbols for
> certain merge commits (only merges):
>
> e9c962f2ea0 HEAD@{8}: <binary>�: Merged in <branch> (pull request #4441)
Thanks for reporting this. I'm afraid I'm rather confused. In
run_git_commit() we set GIT_REFLOG_ACTION with
strvec_pushf(&cmd.env, GIT_REFLOG_ACTION "=%s", ctx->reflog_message);
ctx is allocated with calloc() so ctx->reflog_message is initially NULL.
At the start of pick_commits() we store the result of
sequencer_reflog_action() in ctx->reflog_message. That function returns
a heap allocation that is not freed until we call replay_opts_release()
at the end of the rebase. If we're doing a normal pick rather than a
merge we store the result of reflog_message() in ctx->reflog_message.
That function returns a string stored in a static strbuf. So when we
call do_merge() from pick_commits() ctx->reflog_message should point to
a valid string though that string will not hold the correct reflog
message for the merge because do_merge() does not call reflog_message().
> This is the backtrace when I get the apparently uninit. string:
>
> ```
> #0 run_git_commit (defmsg=0x555555babe70 "<repo path>/MERGE_MSG", opts=0x7fffffffc070, flags=0) at sequencer.c:1158
Thanks for the backtrace. It would be useful to know what's stored in
opts->ctx->reflog_message at this point if it's not too much trouble
please can you run "print *opts" and "print *opts->ctx" here.
Thanks
Phillip
> #1 0x00005555558740d9 in do_merge (r=0x555555a67020 <the_repo>, commit=0x555555b75cb0,
> arg=0x555555b2164d "<branch stuff> # Merged in <branch> { (pull request #4464)\nlabel branch-point-9\npick 4026b5ced849724bd3857283b6ad50c8609b6d33 only sh"..., arg_len=125, flags=0, check_todo=0x7fffffffb1e0, opts=0x7fffffffc070) at sequencer.c:4380
> #2 0x0000555555876629 in pick_commits (r=0x555555a67020 <the_repo>, todo_list=0x7fffffffbf50, opts=0x7fffffffc070) at sequencer.c:5048
> #3 0x0000555555877eeb in sequencer_continue (r=0x555555a67020 <the_repo>, opts=0x7fffffffc070) at sequencer.c:5480
> #4 0x000055555563a491 in run_sequencer_rebase (opts=0x7fffffffc330) at builtin/rebase.c:369
> #5 0x000055555563bc74 in run_specific_rebase (opts=0x7fffffffc330) at builtin/rebase.c:746
> #6 0x000055555563fe2a in cmd_rebase (argc=0, argv=0x555555a73890, prefix=0x0, repo=0x555555a67020 <the_repo>) at builtin/rebase.c:1878
> #7 0x0000555555574c0d in run_builtin (p=0x555555a34908 <commands+2280>, argc=2, argv=0x555555a73890, repo=0x555555a67020 <the_repo>) at git.c:480
> #8 0x00005555555750ca in handle_builtin (args=0x7fffffffd8a0) at git.c:743
> #9 0x000055555557538c in run_argv (args=0x7fffffffd8a0) at git.c:810
> #10 0x00005555555759e2 in cmd_main (argc=2, argv=0x7fffffffda30) at git.c:950
> #11 0x000055555569b0c3 in main (argc=5, argv=0x7fffffffda18) at common-main.c:9
> ```
>
> § No reproduction script
>
> I was unable to reproduce with a simple repo. setup. I tried:
>
> 1. Creating a side branch which had a merge
> 2. The side branch conflicted with the other branch
> 3. Rebased with `--rebase-merges`
>
> -----
>
> Normal bugreport questionaire follows.
>
> -----
>
> Thank you for filling out a Git bug report!
> Please answer the following questions to help us understand your issue.
>
> What did you do before the bug happened? (Steps to reproduce your issue)
>
> `git rebase --rebase-merges` on a branch with merge commits. Including
> back merges.
>
> What did you expect to happen? (Expected behavior)
>
> Normal “action” string for the reflog like for example `continue`:
>
> 3f90f6ab14d (HEAD -> <branch>) HEAD@{1}: rebase (continue): Merged in <branch> (pull request #4507)
>
> What happened instead? (Actual behavior)
>
> The “action” (or whatever it is) string is arbitrary bytes. Like some
> uninit. memory.
>
> ```
> e9c962f2ea0 HEAD@{8}: <binary>�: Merged in <branch> (pull request #4441)
> ```
>
> What's different between what you expected and what actually happened?
>
> Apparently uninit. string.
>
> Anything else you want to add:
>
> Please review the rest of the bug report below.
> You can delete any lines you don't wish to share.
>
>
> [System Info]
> git version:
> git version 2.49.0.459.gf65182a99e5
> cpu: x86_64
> built from commit: f65182a99e545d2f2bc22e6c1c2da192133b16a3
> sizeof-long: 8
> sizeof-size_t: 8
> shell-path: /bin/sh
> libcurl: 7.81.0
> OpenSSL: OpenSSL 3.0.2 15 Mar 2022
> zlib: 1.2.11
> SHA-1: SHA1_DC
> SHA-256: SHA256_BLK
> uname: Linux 6.8.0-58-generic #60~22.04.1-Ubuntu SMP PREEMPT_DYNAMIC Fri Mar 28 16:09:21 UTC 2 x86_64
> compiler info: gnuc: 11.4
> libc info: glibc: 2.35
> $SHELL (typically, interactive shell): /bin/bash
>
>
> [Enabled Hooks]
>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-04-29 9:22 ` Phillip Wood
@ 2025-04-29 19:40 ` Kristoffer Haugsbakk
2025-04-29 21:51 ` Jeff King
0 siblings, 1 reply; 12+ messages in thread
From: Kristoffer Haugsbakk @ 2025-04-29 19:40 UTC (permalink / raw)
To: Phillip Wood, git
Hi Phillip
On Tue, Apr 29, 2025, at 11:22, Phillip Wood wrote:
>> #0 run_git_commit (defmsg=0x555555babe70 "<repo path>/MERGE_MSG", opts=0x7fffffffc070, flags=0) at sequencer.c:1158
>
> Thanks for the backtrace. It would be useful to know what's stored in
> opts->ctx->reflog_message at this point if it's not too much trouble
> please can you run "print *opts" and "print *opts->ctx" here.
Today I ran on f65182a99e5 (The ninth batch, 2025-04-24) at
sequencer.c:1148. I was never able to reproduce this
`opts->ctx->reflog_message` having a weird value with GDB today. The
reflog was also fine.
Then I ran without GDB and I got the weird reflog that I expected.
So I don’t know what `*opts` or `*opts->ctx` looks like here. But I did
find just two minutes ago some old notes about `ctx->reflog_message`:
```
Thread 1 "git" hit Breakpoint 1, run_git_commit (defmsg=0x555555babe70 "<merge msg path>", opts=0x7fffffffc070, flags=0) at sequencer.c:1158
1158 strvec_pushf(&cmd.env, GIT_REFLOG_ACTION "=%s", ctx->reflog_message);
@(gdb) p ctx->reflog_message
$23 = 0x555555ba0d50 "\250y\267UUU"
@(gdb)
```
It’s line 1158 because of my debug code apparently.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-04-29 19:40 ` Kristoffer Haugsbakk
@ 2025-04-29 21:51 ` Jeff King
2025-04-30 15:17 ` Kristoffer Haugsbakk
2025-05-01 14:10 ` phillip.wood123
0 siblings, 2 replies; 12+ messages in thread
From: Jeff King @ 2025-04-29 21:51 UTC (permalink / raw)
To: Kristoffer Haugsbakk; +Cc: Phillip Wood, git
On Tue, Apr 29, 2025 at 09:40:13PM +0200, Kristoffer Haugsbakk wrote:
> On Tue, Apr 29, 2025, at 11:22, Phillip Wood wrote:
> >> #0 run_git_commit (defmsg=0x555555babe70 "<repo path>/MERGE_MSG", opts=0x7fffffffc070, flags=0) at sequencer.c:1158
> >
> > Thanks for the backtrace. It would be useful to know what's stored in
> > opts->ctx->reflog_message at this point if it's not too much trouble
> > please can you run "print *opts" and "print *opts->ctx" here.
>
> Today I ran on f65182a99e5 (The ninth batch, 2025-04-24) at
> sequencer.c:1148. I was never able to reproduce this
> `opts->ctx->reflog_message` having a weird value with GDB today. The
> reflog was also fine.
>
> Then I ran without GDB and I got the weird reflog that I expected.
Have you tried building with "make SANITIZE=address,undefined"?
This is a wild guess, but since ctx->reflog_message is pointing to a
static strbuf, it could be a use after free if the strbuf is reallocated
due to another call to reflog_message(), but we are still holding the
old pointer via ctx->reflog_message.
-Peff
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-04-29 21:51 ` Jeff King
@ 2025-04-30 15:17 ` Kristoffer Haugsbakk
2025-05-01 13:17 ` Jeff King
2025-05-01 14:10 ` phillip.wood123
1 sibling, 1 reply; 12+ messages in thread
From: Kristoffer Haugsbakk @ 2025-04-30 15:17 UTC (permalink / raw)
To: Jeff King; +Cc: Phillip Wood, git
On Tue, Apr 29, 2025, at 23:51, Jeff King wrote:
> On Tue, Apr 29, 2025 at 09:40:13PM +0200, Kristoffer Haugsbakk wrote:
>
>> On Tue, Apr 29, 2025, at 11:22, Phillip Wood wrote:
>> >> #0 run_git_commit (defmsg=0x555555babe70 "<repo path>/MERGE_MSG", opts=0x7fffffffc070, flags=0) at sequencer.c:1158
>> >
>> > Thanks for the backtrace. It would be useful to know what's stored in
>> > opts->ctx->reflog_message at this point if it's not too much trouble
>> > please can you run "print *opts" and "print *opts->ctx" here.
>>
>> Today I ran on f65182a99e5 (The ninth batch, 2025-04-24) at
>> sequencer.c:1148. I was never able to reproduce this
>> `opts->ctx->reflog_message` having a weird value with GDB today. The
>> reflog was also fine.
>>
>> Then I ran without GDB and I got the weird reflog that I expected.
>
> Have you tried building with "make SANITIZE=address,undefined"?
No I haven’t. Thank you. The following is with that `make`.
Still on f65182a99e5 (The ninth batch, 2025-04-24). I eventually[1]
got this:
[1] I run through 19 merge conflicts which I `--continue` (using rerere)
until the rebase is done
```
detached HEAD 5d96584c836] Merge branch '<branch>' into <something else>
Author: [author]
=================================================================
==87324==ERROR: AddressSanitizer: heap-use-after-free on address 0x60300001daa0 at pc 0x79371ca5df89 bp 0x7fff8e215a50 sp 0x7fff8e2151c8
READ of size 2 at 0x60300001daa0 thread T0
#0 0x79371ca5df88 in printf_common ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors_format.inc:553
#1 0x79371ca5fbd5 in __interceptor_vsnprintf ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:1668
#2 0x5702aa6bd851 in strbuf_vaddf /home/kristoffer/programming/git/strbuf.c:415
#3 0x5702aa6d2384 in strvec_pushf /home/kristoffer/programming/git/strvec.c:35
#4 0x5702aa629087 in run_git_commit /home/kristoffer/programming/git/sequencer.c:1148
#5 0x5702aa64c652 in do_merge /home/kristoffer/programming/git/sequencer.c:4363
#6 0x5702aa655714 in pick_commits /home/kristoffer/programming/git/sequencer.c:5029
#7 0x5702aa659a68 in sequencer_continue /home/kristoffer/programming/git/sequencer.c:5461
#8 0x5702a9d0f5ce in run_sequencer_rebase builtin/rebase.c:370
#9 0x5702a9d14cd9 in run_specific_rebase builtin/rebase.c:747
#10 0x5702a9d23798 in cmd_rebase builtin/rebase.c:1887
#11 0x5702a9a3c26c in run_builtin /home/kristoffer/programming/git/git.c:480
#12 0x5702a9a3d3d5 in handle_builtin /home/kristoffer/programming/git/git.c:744
#13 0x5702a9a3dc2c in run_argv /home/kristoffer/programming/git/git.c:811
#14 0x5702a9a3f17c in cmd_main /home/kristoffer/programming/git/git.c:951
#15 0x5702a9e77aa3 in main /home/kristoffer/programming/git/common-main.c:9
#16 0x79371be29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
#17 0x79371be29e3f in __libc_start_main_impl ../csu/libc-start.c:392
#18 0x5702a9a35384 in _start (/home/kristoffer/programming/git/git+0x12a0384)
0x60300001daa0 is located 0 bytes inside of 24-byte region [0x60300001daa0,0x60300001dab8)
freed by thread T0 here:
#0 0x79371cab4c38 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164
#1 0x5702aa7ecdca in xrealloc /home/kristoffer/programming/git/wrapper.c:140
#2 0x5702aa6b83b7 in strbuf_grow /home/kristoffer/programming/git/strbuf.c:114
#3 0x5702aa6bd8ca in strbuf_vaddf /home/kristoffer/programming/git/strbuf.c:420
#4 0x5702aa6476dc in reflog_message /home/kristoffer/programming/git/sequencer.c:3948
#5 0x5702aa648e42 in do_reset /home/kristoffer/programming/git/sequencer.c:4059
#6 0x5702aa65545e in pick_commits /home/kristoffer/programming/git/sequencer.c:5026
#7 0x5702aa659a68 in sequencer_continue /home/kristoffer/programming/git/sequencer.c:5461
#8 0x5702a9d0f5ce in run_sequencer_rebase builtin/rebase.c:370
#9 0x5702a9d14cd9 in run_specific_rebase builtin/rebase.c:747
#10 0x5702a9d23798 in cmd_rebase builtin/rebase.c:1887
#11 0x5702a9a3c26c in run_builtin /home/kristoffer/programming/git/git.c:480
#12 0x5702a9a3d3d5 in handle_builtin /home/kristoffer/programming/git/git.c:744
#13 0x5702a9a3dc2c in run_argv /home/kristoffer/programming/git/git.c:811
#14 0x5702a9a3f17c in cmd_main /home/kristoffer/programming/git/git.c:951
#15 0x5702a9e77aa3 in main /home/kristoffer/programming/git/common-main.c:9
#16 0x79371be29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
previously allocated by thread T0 here:
#0 0x79371cab4c38 in __interceptor_realloc ../../../../src/libsanitizer/asan/asan_malloc_linux.cpp:164
#1 0x5702aa7ecdca in xrealloc /home/kristoffer/programming/git/wrapper.c:140
#2 0x5702aa6b83b7 in strbuf_grow /home/kristoffer/programming/git/strbuf.c:114
#3 0x5702aa6bbeae in strbuf_add /home/kristoffer/programming/git/strbuf.c:313
#4 0x5702aa61da33 in strbuf_addstr /home/kristoffer/programming/git/strbuf.h:310
#5 0x5702aa64766a in reflog_message /home/kristoffer/programming/git/sequencer.c:3943
#6 0x5702aa659637 in sequencer_continue /home/kristoffer/programming/git/sequencer.c:5426
#7 0x5702a9d0f5ce in run_sequencer_rebase builtin/rebase.c:370
#8 0x5702a9d14cd9 in run_specific_rebase builtin/rebase.c:747
#9 0x5702a9d23798 in cmd_rebase builtin/rebase.c:1887
#10 0x5702a9a3c26c in run_builtin /home/kristoffer/programming/git/git.c:480
#11 0x5702a9a3d3d5 in handle_builtin /home/kristoffer/programming/git/git.c:744
#12 0x5702a9a3dc2c in run_argv /home/kristoffer/programming/git/git.c:811
#13 0x5702a9a3f17c in cmd_main /home/kristoffer/programming/git/git.c:951
#14 0x5702a9e77aa3 in main /home/kristoffer/programming/git/common-main.c:9
#15 0x79371be29d8f in __libc_start_call_main ../sysdeps/nptl/libc_start_call_main.h:58
SUMMARY: AddressSanitizer: heap-use-after-free ../../../../src/libsanitizer/sanitizer_common/sanitizer_common_interceptors_format.inc:553 in printf_common
Shadow bytes around the buggy address:
0x0c067fffbb00: fd fd fd fd fa fa fd fd fd fd fa fa fd fd fd fa
0x0c067fffbb10: fa fa fd fd fd fd fa fa fd fd fd fa fa fa fd fd
0x0c067fffbb20: fd fd fa fa fd fd fd fd fa fa fd fd fd fd fa fa
0x0c067fffbb30: fd fd fd fd fa fa fd fd fd fd fa fa fd fd fd fd
0x0c067fffbb40: fa fa fd fd fd fd fa fa fd fd fd fd fa fa fd fd
=>0x0c067fffbb50: fd fa fa fa[fd]fd fd fa fa fa fd fd fd fa fa fa
0x0c067fffbb60: fd fd fd fa fa fa fd fd fd fa fa fa fd fd fd fa
0x0c067fffbb70: fa fa fd fd fd fa fa fa fd fd fd fa fa fa fd fd
0x0c067fffbb80: fd fa fa fa fd fd fd fa fa fa fd fd fd fa fa fa
0x0c067fffbb90: fd fd fd fd fa fa fd fd fd fa fa fa fd fd fd fd
0x0c067fffbba0: fa fa fd fd fd fa fa fa fd fd fd fd fa fa fd fd
Shadow byte legend (one shadow byte represents 8 application bytes):
Addressable: 00
Partially addressable: 01 02 03 04 05 06 07
Heap left redzone: fa
Freed heap region: fd
Stack left redzone: f1
Stack mid redzone: f2
Stack right redzone: f3
Stack after return: f5
Stack use after scope: f8
Global redzone: f9
Global init order: f6
Poisoned by user: f7
Container overflow: fc
Array cookie: ac
Intra object redzone: bb
ASan internal: fe
Left alloca redzone: ca
Right alloca redzone: cb
Shadow gap: cc
==87324==ABORTING
```
> This is a wild guess, but since ctx->reflog_message is pointing to a
> static strbuf, it could be a use after free if the strbuf is reallocated
> due to another call to reflog_message(), but we are still holding the
> old pointer via ctx->reflog_message.
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-04-30 15:17 ` Kristoffer Haugsbakk
@ 2025-05-01 13:17 ` Jeff King
2025-05-01 14:36 ` phillip.wood123
2025-05-01 16:07 ` Junio C Hamano
0 siblings, 2 replies; 12+ messages in thread
From: Jeff King @ 2025-05-01 13:17 UTC (permalink / raw)
To: Kristoffer Haugsbakk; +Cc: Phillip Wood, git
On Wed, Apr 30, 2025 at 05:17:38PM +0200, Kristoffer Haugsbakk wrote:
> > Have you tried building with "make SANITIZE=address,undefined"?
>
> No I haven’t. Thank you. The following is with that `make`.
>
> Still on f65182a99e5 (The ninth batch, 2025-04-24). I eventually[1]
> got this:
>
> [1] I run through 19 merge conflicts which I `--continue` (using rerere)
> until the rebase is done
>
> ```
> detached HEAD 5d96584c836] Merge branch '<branch>' into <something else>
> Author: [author]
> =================================================================
> ==87324==ERROR: AddressSanitizer: heap-use-after-free on address 0x60300001daa0 at pc 0x79371ca5df89 bp 0x7fff8e215a50 sp 0x7fff8e2151c8
Makes sense. Presumably it triggers in your case but not others because
something about your particular reflog messages causes the strbuf to
reallocate (I guess one of them is a lot longer than the others).
But we should be able to trigger it reliably with this:
diff --git a/sequencer.c b/sequencer.c
index b5c4043757..43db0ce66b 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3939,7 +3939,7 @@ static const char *reflog_message(struct replay_opts *opts,
static struct strbuf buf = STRBUF_INIT;
va_start(ap, fmt);
- strbuf_reset(&buf);
+ strbuf_release(&buf); /* guarantees reallocation */
strbuf_addstr(&buf, sequencer_reflog_action(opts));
if (sub_action)
strbuf_addf(&buf, " (%s)", sub_action);
And indeed, building with that patch and SANITIZE=address seems to show
the problem reliably with the existing tests in t3430 (it might also
show up without ASan, but probably not as reliably).
Probably the smallest solution is for ctx->reflog_message to copy the
result and always own the memory (and then remember to free it, both at
cleanup and if it is ever overwritten).
But I think the way reflog_message() returns the "buf" member of a
static strbuf is kind of an anti-pattern, exactly because you can get
this kind of subtle re-use. It probably should just return a non-const
pointer, handing over memory ownership to the caller. That would require
adjusting its other callers, too.
So the "smallest" version is perhaps something like this, totally
untested except for confirming that t3430 no longer complains:
diff --git a/sequencer.c b/sequencer.c
index b5c4043757..07aa3b3731 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -228,7 +228,7 @@ struct replay_ctx {
* Stores the reflog message that will be used when creating a
* commit. Points to a static buffer and should not be free()'d.
*/
- const char *reflog_message;
+ char *reflog_message;
/*
* The number of completed fixup and squash commands in the
* current chain.
@@ -411,6 +411,7 @@ static void replay_ctx_release(struct replay_ctx *ctx)
{
strbuf_release(&ctx->current_fixups);
strbuf_release(&ctx->message);
+ free(ctx->reflog_message);
}
void replay_opts_release(struct replay_opts *opts)
@@ -3939,7 +3940,7 @@ static const char *reflog_message(struct replay_opts *opts,
static struct strbuf buf = STRBUF_INIT;
va_start(ap, fmt);
- strbuf_reset(&buf);
+ strbuf_release(&buf); /* guarantees realloaction */
strbuf_addstr(&buf, sequencer_reflog_action(opts));
if (sub_action)
strbuf_addf(&buf, " (%s)", sub_action);
@@ -4886,9 +4887,11 @@ static int pick_one_commit(struct repository *r,
int res;
struct todo_item *item = todo_list->items + todo_list->current;
const char *arg = todo_item_get_arg(todo_list, item);
- if (is_rebase_i(opts))
- ctx->reflog_message = reflog_message(
- opts, command_to_string(item->command), NULL);
+ if (is_rebase_i(opts)) {
+ free(ctx->reflog_message);
+ ctx->reflog_message = xstrdup(reflog_message(
+ opts, command_to_string(item->command), NULL));
+ }
res = do_pick_commit(r, item, opts, is_final_fixup(todo_list),
check_todo);
@@ -4947,7 +4950,8 @@ static int pick_commits(struct repository *r,
struct replay_ctx *ctx = opts->ctx;
int res = 0, reschedule = 0;
- ctx->reflog_message = sequencer_reflog_action(opts);
+ free(ctx->reflog_message);
+ ctx->reflog_message = xstrdup(sequencer_reflog_action(opts));
if (opts->allow_ff)
ASSERT(!(opts->signoff || opts->no_commit ||
opts->record_origin || should_edit(opts) ||
@@ -5423,7 +5427,8 @@ int sequencer_continue(struct repository *r, struct replay_opts *opts)
unlink(rebase_path_dropped());
}
- ctx->reflog_message = reflog_message(opts, "continue", NULL);
+ free(ctx->reflog_message);
+ ctx->reflog_message = xstrdup(reflog_message(opts, "continue", NULL));
if (commit_staged_changes(r, opts, &todo_list)) {
res = -1;
goto release_todo_list;
@@ -5475,7 +5480,8 @@ static int single_pick(struct repository *r,
TODO_PICK : TODO_REVERT;
item.commit = cmit;
- opts->ctx->reflog_message = sequencer_reflog_action(opts);
+ free(opts->ctx->reflog_message);
+ opts->ctx->reflog_message = xstrdup(sequencer_reflog_action(opts));
return do_pick_commit(r, &item, opts, 0, &check_todo);
}
I'm hoping your or Phillip can decide on the best fix from here.
-Peff
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-04-29 21:51 ` Jeff King
2025-04-30 15:17 ` Kristoffer Haugsbakk
@ 2025-05-01 14:10 ` phillip.wood123
1 sibling, 0 replies; 12+ messages in thread
From: phillip.wood123 @ 2025-05-01 14:10 UTC (permalink / raw)
To: Jeff King, Kristoffer Haugsbakk; +Cc: Phillip Wood, git
Hi Peff
On 29/04/2025 22:51, Jeff King wrote:
> On Tue, Apr 29, 2025 at 09:40:13PM +0200, Kristoffer Haugsbakk wrote:
>
>> On Tue, Apr 29, 2025, at 11:22, Phillip Wood wrote:
>>>> #0 run_git_commit (defmsg=0x555555babe70 "<repo path>/MERGE_MSG", opts=0x7fffffffc070, flags=0) at sequencer.c:1158
>>>
>>> Thanks for the backtrace. It would be useful to know what's stored in
>>> opts->ctx->reflog_message at this point if it's not too much trouble
>>> please can you run "print *opts" and "print *opts->ctx" here.
>>
>> Today I ran on f65182a99e5 (The ninth batch, 2025-04-24) at
>> sequencer.c:1148. I was never able to reproduce this
>> `opts->ctx->reflog_message` having a weird value with GDB today. The
>> reflog was also fine.
>>
>> Then I ran without GDB and I got the weird reflog that I expected.
>
> Have you tried building with "make SANITIZE=address,undefined"?
>
> This is a wild guess, but since ctx->reflog_message is pointing to a
> static strbuf, it could be a use after free if the strbuf is reallocated
> due to another call to reflog_message(), but we are still holding the
> old pointer via ctx->reflog_message.
Oh, nice insight. I'd forgotten we had callers of reflog_message() that
didn't store the result in ctx->reflog_message. One of those callers is
in do_reset() which due to the way the todo list gets constructed is
likely to be called just before do_merge().
Thanks
Phillip
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-05-01 13:17 ` Jeff King
@ 2025-05-01 14:36 ` phillip.wood123
2025-05-01 15:36 ` Jeff King
2025-05-01 16:07 ` Junio C Hamano
1 sibling, 1 reply; 12+ messages in thread
From: phillip.wood123 @ 2025-05-01 14:36 UTC (permalink / raw)
To: Jeff King, Kristoffer Haugsbakk; +Cc: Phillip Wood, git
Hi Peff and Kristoffer
On 01/05/2025 14:17, Jeff King wrote:
> On Wed, Apr 30, 2025 at 05:17:38PM +0200, Kristoffer Haugsbakk wrote:
>
> Probably the smallest solution is for ctx->reflog_message to copy the
> result and always own the memory (and then remember to free it, both at
> cleanup and if it is ever overwritten).
>
> But I think the way reflog_message() returns the "buf" member of a
> static strbuf is kind of an anti-pattern, exactly because you can get
> this kind of subtle re-use. It probably should just return a non-const
> pointer, handing over memory ownership to the caller. That would require
> adjusting its other callers, too.
Getting rid of the static buffer would certainly protect us from the
use-after-free. The bug here is that we're not calling reflog_message()
and storing the result in ctx->reflog_message() to create the correct
message in do_merge(). Looking at your patch, having to remember to copy
the string returned from reflog_message() is a bit of a pain. I wonder
if we could change ctx->reflog_message to be an strbuf and update
reflog_message() like so
diff --git a/sequencer.c b/sequencer.c
index ad0ab75c8d4..59d80ddf0cc 100644
--- a/sequencer.c
+++ b/sequencer.c
@@ -3941,11 +3941,10 @@ static const char
*sequencer_reflog_action(struct replay_opts *opts)
}
__attribute__((format (printf, 3, 4)))
-static const char *reflog_message(struct replay_opts *opts,
+static void reflog_message(struct replay_opts *opts,
const char *sub_action, const char *fmt, ...)
{
va_list ap;
- static struct strbuf buf = STRBUF_INIT;
+ struct strbuf *buf = &opts->ctx->reflog_message;
va_start(ap, fmt);
strbuf_reset(&buf);
@@ -3957,8 +3956,6 @@ static const char *reflog_message(struct
replay_opts *opts,
strbuf_vaddf(&buf, fmt, ap);
}
va_end(ap);
-
- return buf.buf;
}
static struct commit *lookup_label(struct repository *r, const char
*label,
All of the other callers should can then use use ctx->reflog_message.buf
where they were using the return value of reflog_message() before. That
would protect us from the use-after-free.
I'll try and put a proper patch together next week that removes the
static buffer and starts calling reflog_message() when we're merging.
Best Wishes
Phillip
> So the "smallest" version is perhaps something like this, totally
> untested except for confirming that t3430 no longer complains:
>
> diff --git a/sequencer.c b/sequencer.c
> index b5c4043757..07aa3b3731 100644
> --- a/sequencer.c
> +++ b/sequencer.c
> @@ -228,7 +228,7 @@ struct replay_ctx {
> * Stores the reflog message that will be used when creating a
> * commit. Points to a static buffer and should not be free()'d.
> */
> - const char *reflog_message;
> + char *reflog_message;
> /*
> * The number of completed fixup and squash commands in the
> * current chain.
> @@ -411,6 +411,7 @@ static void replay_ctx_release(struct replay_ctx *ctx)
> {
> strbuf_release(&ctx->current_fixups);
> strbuf_release(&ctx->message);
> + free(ctx->reflog_message);
> }
>
> void replay_opts_release(struct replay_opts *opts)
> @@ -3939,7 +3940,7 @@ static const char *reflog_message(struct replay_opts *opts,
> static struct strbuf buf = STRBUF_INIT;
>
> va_start(ap, fmt);
> - strbuf_reset(&buf);
> + strbuf_release(&buf); /* guarantees realloaction */
> strbuf_addstr(&buf, sequencer_reflog_action(opts));
> if (sub_action)
> strbuf_addf(&buf, " (%s)", sub_action);
> @@ -4886,9 +4887,11 @@ static int pick_one_commit(struct repository *r,
> int res;
> struct todo_item *item = todo_list->items + todo_list->current;
> const char *arg = todo_item_get_arg(todo_list, item);
> - if (is_rebase_i(opts))
> - ctx->reflog_message = reflog_message(
> - opts, command_to_string(item->command), NULL);
> + if (is_rebase_i(opts)) {
> + free(ctx->reflog_message);
> + ctx->reflog_message = xstrdup(reflog_message(
> + opts, command_to_string(item->command), NULL));
> + }
>
> res = do_pick_commit(r, item, opts, is_final_fixup(todo_list),
> check_todo);
> @@ -4947,7 +4950,8 @@ static int pick_commits(struct repository *r,
> struct replay_ctx *ctx = opts->ctx;
> int res = 0, reschedule = 0;
>
> - ctx->reflog_message = sequencer_reflog_action(opts);
> + free(ctx->reflog_message);
> + ctx->reflog_message = xstrdup(sequencer_reflog_action(opts));
> if (opts->allow_ff)
> ASSERT(!(opts->signoff || opts->no_commit ||
> opts->record_origin || should_edit(opts) ||
> @@ -5423,7 +5427,8 @@ int sequencer_continue(struct repository *r, struct replay_opts *opts)
> unlink(rebase_path_dropped());
> }
>
> - ctx->reflog_message = reflog_message(opts, "continue", NULL);
> + free(ctx->reflog_message);
> + ctx->reflog_message = xstrdup(reflog_message(opts, "continue", NULL));
> if (commit_staged_changes(r, opts, &todo_list)) {
> res = -1;
> goto release_todo_list;
> @@ -5475,7 +5480,8 @@ static int single_pick(struct repository *r,
> TODO_PICK : TODO_REVERT;
> item.commit = cmit;
>
> - opts->ctx->reflog_message = sequencer_reflog_action(opts);
> + free(opts->ctx->reflog_message);
> + opts->ctx->reflog_message = xstrdup(sequencer_reflog_action(opts));
> return do_pick_commit(r, &item, opts, 0, &check_todo);
> }
>
>
> I'm hoping your or Phillip can decide on the best fix from here.
>
> -Peff
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-05-01 14:36 ` phillip.wood123
@ 2025-05-01 15:36 ` Jeff King
0 siblings, 0 replies; 12+ messages in thread
From: Jeff King @ 2025-05-01 15:36 UTC (permalink / raw)
To: phillip.wood; +Cc: Kristoffer Haugsbakk, git
On Thu, May 01, 2025 at 03:36:13PM +0100, phillip.wood123@gmail.com wrote:
> > But I think the way reflog_message() returns the "buf" member of a
> > static strbuf is kind of an anti-pattern, exactly because you can get
> > this kind of subtle re-use. It probably should just return a non-const
> > pointer, handing over memory ownership to the caller. That would require
> > adjusting its other callers, too.
>
> Getting rid of the static buffer would certainly protect us from the
> use-after-free. The bug here is that we're not calling reflog_message() and
> storing the result in ctx->reflog_message() to create the correct message in
> do_merge(). Looking at your patch, having to remember to copy the string
> returned from reflog_message() is a bit of a pain. I wonder if we could
> change ctx->reflog_message to be an strbuf and update reflog_message() like
> so
Yeah, I agree that would solve the use-after-free. I just wasn't sure if
there would still be a logic bug, though. If we have a sequence like:
/* somebody sets the variable */
ctx->reflog_message = reflog_message(...);
...
/* some caller, possibly far away, uses the function again */
foo(reflog_message(...));
...
/* now the original caller wants to use what it stored */
printf("%s", ctx->reflog_message);
Right now that is a potential use-after-free because of reallocating the
static storage inside reflog_message(). If we teach reflog_message
to store things in ctx->reflog_message always, that use-after-free goes
away, but will the printf() above print the wrong thing?
I wasn't clear on who is setting the value or why. It could be that
doing so is actually _fixing_ a bug (because it should be printing what
comes from the invocation in the foo() call).
> I'll try and put a proper patch together next week that removes the static
> buffer and starts calling reflog_message() when we're merging.
Great, thanks.
-Peff
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-05-01 13:17 ` Jeff King
2025-05-01 14:36 ` phillip.wood123
@ 2025-05-01 16:07 ` Junio C Hamano
2025-05-01 16:38 ` Jeff King
1 sibling, 1 reply; 12+ messages in thread
From: Junio C Hamano @ 2025-05-01 16:07 UTC (permalink / raw)
To: Jeff King; +Cc: Kristoffer Haugsbakk, Phillip Wood, git
Jeff King <peff@peff.net> writes:
> Probably the smallest solution is for ctx->reflog_message to copy the
> result and always own the memory (and then remember to free it, both at
> cleanup and if it is ever overwritten).
>
> But I think the way reflog_message() returns the "buf" member of a
> static strbuf is kind of an anti-pattern, exactly because you can get
> this kind of subtle re-use. It probably should just return a non-const
> pointer, handing over memory ownership to the caller. That would require
> adjusting its other callers, too.
Yeah, yesterday I was looking at the same report and was thinking
that the memory ownership model here is somewhat screwed up. As
a few more allocations/deallocations should be dwarfed by the real
processing cost of replaying each commit, I think this "smallest"
solution is also the solution in the right direction.
> So the "smallest" version is perhaps something like this, totally
> untested except for confirming that t3430 no longer complains:
;-)
> va_start(ap, fmt);
> - strbuf_reset(&buf);
> + strbuf_release(&buf); /* guarantees realloaction */
I initially thought that this comment may have to be updated in the
production version, but because we have to freshly allocate for each
new message for ownership change, this comment still is correct.
The only difference between the "here is how to expose" and "this is
part of the smallest solution" is why we want to guarantee it.
> I'm hoping your or Phillip can decide on the best fix from here.
;-)
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-05-01 16:07 ` Junio C Hamano
@ 2025-05-01 16:38 ` Jeff King
2025-05-01 17:53 ` Junio C Hamano
0 siblings, 1 reply; 12+ messages in thread
From: Jeff King @ 2025-05-01 16:38 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Kristoffer Haugsbakk, Phillip Wood, git
On Thu, May 01, 2025 at 09:07:06AM -0700, Junio C Hamano wrote:
> > va_start(ap, fmt);
> > - strbuf_reset(&buf);
> > + strbuf_release(&buf); /* guarantees realloaction */
>
> I initially thought that this comment may have to be updated in the
> production version, but because we have to freshly allocate for each
> new message for ownership change, this comment still is correct.
> The only difference between the "here is how to expose" and "this is
> part of the smallest solution" is why we want to guarantee it.
This code change is just to stimulate the bug more readily. ;)
I think if we started to actually allocate here, we'd want to switch the
"return buf.buf" at the end to strbuf_detach().
-Peff
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [BUG] rebase: can write reflog with uninit. `action` string
2025-05-01 16:38 ` Jeff King
@ 2025-05-01 17:53 ` Junio C Hamano
0 siblings, 0 replies; 12+ messages in thread
From: Junio C Hamano @ 2025-05-01 17:53 UTC (permalink / raw)
To: Jeff King; +Cc: Kristoffer Haugsbakk, Phillip Wood, git
Jeff King <peff@peff.net> writes:
> On Thu, May 01, 2025 at 09:07:06AM -0700, Junio C Hamano wrote:
>
>> > va_start(ap, fmt);
>> > - strbuf_reset(&buf);
>> > + strbuf_release(&buf); /* guarantees realloaction */
>>
>> I initially thought that this comment may have to be updated in the
>> production version, but because we have to freshly allocate for each
>> new message for ownership change, this comment still is correct.
>> The only difference between the "here is how to expose" and "this is
>> part of the smallest solution" is why we want to guarantee it.
>
> This code change is just to stimulate the bug more readily. ;)
>
> I think if we started to actually allocate here, we'd want to switch the
> "return buf.buf" at the end to strbuf_detach().
That's sensible.
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2025-05-01 17:53 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-28 19:40 [BUG] rebase: can write reflog with uninit. `action` string Kristoffer Haugsbakk
2025-04-29 9:22 ` Phillip Wood
2025-04-29 19:40 ` Kristoffer Haugsbakk
2025-04-29 21:51 ` Jeff King
2025-04-30 15:17 ` Kristoffer Haugsbakk
2025-05-01 13:17 ` Jeff King
2025-05-01 14:36 ` phillip.wood123
2025-05-01 15:36 ` Jeff King
2025-05-01 16:07 ` Junio C Hamano
2025-05-01 16:38 ` Jeff King
2025-05-01 17:53 ` Junio C Hamano
2025-05-01 14:10 ` phillip.wood123
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).