* [PATCH 0/1] diffcore-break: prevent dangling pointer
@ 2026-02-11 4:11 Han Young
2026-02-11 4:11 ` [PATCH 1/1] " Han Young
2026-02-12 7:20 ` [PATCH v2 0/1] " Han Young
0 siblings, 2 replies; 11+ messages in thread
From: Han Young @ 2026-02-11 4:11 UTC (permalink / raw)
To: git; +Cc: gitster, Han Young
The diffcore_break function in diffcore-break.c forgets to set the queue
reference to NULL after freeing it. In a blobless cloned repository,
the queue could be accessed by prefetch and result in a segmentation fault.
This bug is only triggered if:
* the repository is partially cloned
* the diff operation triggers prefetch
* a diff is split into delete and create before prefetching
I've prepared a example repository that triggers this bug.
git clone git@github.com:hanyang-tony/dangle_sanitize.git --filter=blob:none
cd dangle_sanitize
# download the old version of the file
# to ensure the splited diff exists in local repository
git checkout HEAD~1 .iac
# reset the file so we have a diff
git reset --hard HEAD
# segmentation fault
git reset HEAD~1
Here is how to create the example repository:
mkdir example && cd example && git init
mkdir -p .iac/configs/devops
cat >.iac/configs/devops/config.yml <<EOL
whiteListInfo:
- target: TARGET1
emails:
- user01
- user02
- user03
- user04
- user05
- user06
- user07
- user08
- user09
- user10
- user11
- user12
- user13
- target: TARGET2
department:
- DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENT_01
- DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENT_02
- DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENT_03
- DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENT_04
- DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENT_05
- target: TARGET3
emails:
- user14
- target: TARGET4
department:
- DEPARTMENT_06
emails:
- user15
- target: TARGET5
department:
- DEPARTMENT_07
- target: TARGET6
department:
- DEPARTMENT_08
EOL
echo bar >> foo.c
git add -A && git commit -m init
echo baz >> foo.c
cat >.iac/configs/devops/config.yml <<EOL
whiteListInfo:
- target: TARGET1
emails:
- user01
- user02
- user03
- user04
- user05
- user06
- user07
- user08
- user09
- user10
- user11
- user12
- user13
- target: TARGET2
department:
- DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENTS_01
- DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENTS_02
- DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENTS_03
- DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENTS_04
- DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENT_DEPARTMENTS_05
- target: TARGET3
emails:
- user14
- target: TARGET4
department:
- DEPARTMENT_06
emails:
- user15
- target: TARGET5
department:
- DEPARTMENT_07
- target: TARGET6
department:
- DEPARTMENT_08
EOL
git add -A && git commit -m 1
After partially cloning the the example repository, fetch the old version of
.iac/configs/devops/config.yml. Trying to reset to the initial commit should
result in a segmentation fault.
Han Young (1):
diffcore-break: prevent dangling pointer
diffcore-break.c | 1 +
1 file changed, 1 insertion(+)
--
2.52.0
^ permalink raw reply [flat|nested] 11+ messages in thread* [PATCH 1/1] diffcore-break: prevent dangling pointer 2026-02-11 4:11 [PATCH 0/1] diffcore-break: prevent dangling pointer Han Young @ 2026-02-11 4:11 ` Han Young 2026-02-11 17:54 ` Junio C Hamano 2026-02-12 7:20 ` [PATCH v2 0/1] " Han Young 1 sibling, 1 reply; 11+ messages in thread From: Han Young @ 2026-02-11 4:11 UTC (permalink / raw) To: git; +Cc: gitster, Han Young After we have freed the file pair, we should set the queue reference to null. This prevents us from encountering a dangling pointer later on. --- diffcore-break.c | 1 + 1 file changed, 1 insertion(+) diff --git a/diffcore-break.c b/diffcore-break.c index c4c2173f30..9b11fe2fa0 100644 --- a/diffcore-break.c +++ b/diffcore-break.c @@ -222,6 +222,7 @@ void diffcore_break(struct repository *r, int break_score) free(p); /* not diff_free_filepair(), we are * reusing one and two here. */ + q->queue[i] = NULL; continue; } } -- 2.52.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH 1/1] diffcore-break: prevent dangling pointer 2026-02-11 4:11 ` [PATCH 1/1] " Han Young @ 2026-02-11 17:54 ` Junio C Hamano 0 siblings, 0 replies; 11+ messages in thread From: Junio C Hamano @ 2026-02-11 17:54 UTC (permalink / raw) To: Han Young; +Cc: git Han Young <hanyang.tony@bytedance.com> writes: > After we have freed the file pair, we should set the queue reference to null. > This prevents us from encountering a dangling pointer later on. > --- > diffcore-break.c | 1 + > 1 file changed, 1 insertion(+) Missing are sign-off and tests. This reminds me of 56d388e6 (diff: avoid segfault with freed entries, 2025-12-29). > > diff --git a/diffcore-break.c b/diffcore-break.c > index c4c2173f30..9b11fe2fa0 100644 > --- a/diffcore-break.c > +++ b/diffcore-break.c > @@ -222,6 +222,7 @@ void diffcore_break(struct repository *r, int break_score) > free(p); /* not diff_free_filepair(), we are > * reusing one and two here. > */ > + q->queue[i] = NULL; > continue; > } > } ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 0/1] diffcore-break: prevent dangling pointer 2026-02-11 4:11 [PATCH 0/1] diffcore-break: prevent dangling pointer Han Young 2026-02-11 4:11 ` [PATCH 1/1] " Han Young @ 2026-02-12 7:20 ` Han Young 2026-02-12 7:20 ` [PATCH v2 1/1] " Han Young 2026-02-24 6:13 ` [PATCH v3 0/1] diffcore-break: avoid segfault with freed entries Han Young 1 sibling, 2 replies; 11+ messages in thread From: Han Young @ 2026-02-12 7:20 UTC (permalink / raw) To: git; +Cc: gitster, Han Young This bug is difficult to trigger, git-diff won't trigger the bug because in git-diff, the prefetching happens before the break-rewrites. The test uses git reset to trigger it. Changes since v1: * Added the test Han Young (1): diffcore-break: prevent dangling pointer diffcore-break.c | 1 + t/t4067-diff-partial-clone.sh | 30 ++++++++++++++++++++++++++++++ 2 files changed, 31 insertions(+) -- 2.52.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v2 1/1] diffcore-break: prevent dangling pointer 2026-02-12 7:20 ` [PATCH v2 0/1] " Han Young @ 2026-02-12 7:20 ` Han Young 2026-02-12 18:58 ` Junio C Hamano 2026-02-24 6:13 ` [PATCH v3 0/1] diffcore-break: avoid segfault with freed entries Han Young 1 sibling, 1 reply; 11+ messages in thread From: Han Young @ 2026-02-12 7:20 UTC (permalink / raw) To: git; +Cc: gitster, Han Young After we have freed the file pair, we should set the queue reference to null. This prevents us from encountering a dangling pointer later on. The test uses git reset to trigger prefetching after break-rewrites have freed the file pair. Signed-off-by: Han Young <hanyang.tony@bytedance.com> --- diffcore-break.c | 1 + t/t4067-diff-partial-clone.sh | 30 ++++++++++++++++++++++++++++++ 2 files changed, 31 insertions(+) diff --git a/diffcore-break.c b/diffcore-break.c index c4c2173f30..9b11fe2fa0 100644 --- a/diffcore-break.c +++ b/diffcore-break.c @@ -222,6 +222,7 @@ void diffcore_break(struct repository *r, int break_score) free(p); /* not diff_free_filepair(), we are * reusing one and two here. */ + q->queue[i] = NULL; continue; } } diff --git a/t/t4067-diff-partial-clone.sh b/t/t4067-diff-partial-clone.sh index 72f25de449..a980cd30a0 100755 --- a/t/t4067-diff-partial-clone.sh +++ b/t/t4067-diff-partial-clone.sh @@ -132,6 +132,36 @@ test_expect_success 'diff with rename detection batches blobs' ' test_line_count = 1 done_lines ' +test_expect_success 'diff succeeds even if prefetch triggered by break-rewrites' ' + test_when_finished "rm -rf server client trace" && + + test_create_repo server && + echo xyz >server/foo && + mkdir server/bar && + test_seq -f "line %d" 1 100 >server/bar/baz && + git -C server add -A && + git -C server commit -m x && + + + echo xyzz >server/foo && + rm server/bar/baz && + test_seq -f "line %d" 90 190 >server/bar/baz && + git -C server add -A && + git -C server commit -m x && + + test_config -C server uploadpack.allowfilter 1 && + test_config -C server uploadpack.allowanysha1inwant 1 && + git clone --filter=blob:limit=0 "file://$(pwd)/server" client && + + # Fetch bar/baz without fetching foo. + git -C client checkout HEAD~1 bar && + # Ensure baz has diff + git -C client reset --hard HEAD && + + # reset's break-rewrites detection will trigger prefetch + git -C client reset HEAD~1 +' + test_expect_success 'diff succeeds even if entries are removed from queue' ' test_when_finished "rm -rf server client trace" && -- 2.52.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v2 1/1] diffcore-break: prevent dangling pointer 2026-02-12 7:20 ` [PATCH v2 1/1] " Han Young @ 2026-02-12 18:58 ` Junio C Hamano 2026-02-13 7:14 ` [External] " Han Young 0 siblings, 1 reply; 11+ messages in thread From: Junio C Hamano @ 2026-02-12 18:58 UTC (permalink / raw) To: Han Young; +Cc: git Han Young <hanyang.tony@bytedance.com> writes: > After we have freed the file pair, we should set the queue reference to null. > This prevents us from encountering a dangling pointer later on. I sense that "This prevents ... later on" needs further be clarified, since it is totally unclear what "later on" refers to. We are done with the old filepair, and have no reason to revisit the q->queue[] item ourselves, but somebody later attempts to use it. Who is it and why does it do so? That is a natural question readers of the above description would ask, isn't it? Side note: I am guessiong that this is similar to the problem fixed with recent 56d388e6 (diff: avoid segfault with freed entries, 2025-12-29), where pointers in diff_queue->queue[] that point at file pairs that have been freed were mistakenly used by diff_queued_diff_prefetch() to populate them. Is a prefetch happening in "git reset --mixed" that calls read_from_tree() which calls update_index_from_diff() via do_diff_cache()? > The test uses git reset to trigger prefetching after break-rewrites have freed > the file pair. > > Signed-off-by: Han Young <hanyang.tony@bytedance.com> > --- > diffcore-break.c | 1 + > t/t4067-diff-partial-clone.sh | 30 ++++++++++++++++++++++++++++++ > 2 files changed, 31 insertions(+) > > diff --git a/diffcore-break.c b/diffcore-break.c > index c4c2173f30..9b11fe2fa0 100644 > --- a/diffcore-break.c > +++ b/diffcore-break.c > @@ -222,6 +222,7 @@ void diffcore_break(struct repository *r, int break_score) > free(p); /* not diff_free_filepair(), we are > * reusing one and two here. > */ > + q->queue[i] = NULL; > continue; > } > } > diff --git a/t/t4067-diff-partial-clone.sh b/t/t4067-diff-partial-clone.sh > index 72f25de449..a980cd30a0 100755 > --- a/t/t4067-diff-partial-clone.sh > +++ b/t/t4067-diff-partial-clone.sh > @@ -132,6 +132,36 @@ test_expect_success 'diff with rename detection batches blobs' ' > test_line_count = 1 done_lines > ' > > +test_expect_success 'diff succeeds even if prefetch triggered by break-rewrites' ' > + test_when_finished "rm -rf server client trace" && > + > + test_create_repo server && > + echo xyz >server/foo && > + mkdir server/bar && > + test_seq -f "line %d" 1 100 >server/bar/baz && > + git -C server add -A && > + git -C server commit -m x && > + > + > + echo xyzz >server/foo && The blank line above does not have to be doubled, I think. So the first commit yas "xyz" in "foo", and 100 lines 1..100 in "bar/baz". > + rm server/bar/baz && We are overwriting it, so I am not sure why this "rm" is needed. Is it necessary to avoid reusing the same i-num for the file to avoid racily clean condition, or something? I find it unlikely because the length of the new contents ... > + test_seq -f "line %d" 90 190 >server/bar/baz && ... is different from the original. > + git -C server add -A && > + git -C server commit -m x && In any case, the second commit has "xyzz" in "foo", and different 100 lines 90..190 in "bar/baz". > + test_config -C server uploadpack.allowfilter 1 && > + test_config -C server uploadpack.allowanysha1inwant 1 && > + git clone --filter=blob:limit=0 "file://$(pwd)/server" client && And we grab both commits and their trees, but not any blob. > + # Fetch bar/baz without fetching foo. > + git -C client checkout HEAD~1 bar && And we move "bar/baz" back to the 1..100 (the tip of the history immediately after cloning had 90..190). > + # Ensure baz has diff > + git -C client reset --hard HEAD && I am not sure what the comment wants to say. Before this hard reset, we did have modification relative to HEAD in bar/baz; with a hard reset, we are ensuring that everything including bar/baz exactly match HEAD, aren't we? > + # reset's break-rewrites detection will trigger prefetch "reset's break-rewrites detection" -> "break-rewrites detction in reset" or something to avoid the "'"; otherwise you'd get error: bug in the test script: not 2 or 3 parameters to test-expect-success You rewrote this line as a part of the last-minute change before you ran the test for the last time, or something? Anyway, starting from the "everything clean wrt HEAD" state, we try to move to the HEAD~1. We already have both 1..100 and 90..190 versions of bar/baz (and HEAD: and HEAD~1: trees), and we have the contents for foo in HEAD, which is "xyzz", but we have never seen HEAD~1:foo so far. "reset --mixed HEAD~1" requires us to obtain it... > + git -C client reset HEAD~1 ... and cause us to run the prefetch to obtain "foo", but it runs do_diff_cache() and makes it notice bar/baz has changed too much? Your "do not leave q->queue[] dangling, as other people may still look at them" fix certainly is a good hygiene, but I have to wonder why we are doing break detection in this case in the first place. For the internal "Let's figure out which path have changed, so that we re-read only those changed paths" invocation of diff machinery, we should not be doing so. A break detection is to see if the change in the contents of a single path is a total rewrite, and regardless of the answer, the fact that the path was modified does not change, update_index_from_diff() would work on the path anyway. I also suspect that, if we are doing rename detection in this call to do_diff_cache(), it is a totally wasted effort. We may want to take a deeper look at it, possibly outside the theme of this more focused fix. > +' > + > test_expect_success 'diff succeeds even if entries are removed from queue' ' > test_when_finished "rm -rf server client trace" && By the way, I find it highly curious that with the following patch to revert the fix with a bit of extra output sprinkled to your tests, the problem does not reproduce reliably, which may indicate that your test may be flaky (i.e., timing dependent). Am I doing something bogus in the patch? * revert of the fix in diffcore-break.c is a deliberate attempt to reproduce the problem, and "echo/cd client/ls" are for inspection so you should not take it in your final version. * removal of double blank lines is a style fix I'd want you to take. * "reset's" fix is a shell script syntax fix I'd want you to take. Thanks. diff --git c/diffcore-break.c w/diffcore-break.c index 9b11fe2fa0..c4c2173f30 100644 --- c/diffcore-break.c +++ w/diffcore-break.c @@ -222,7 +222,6 @@ void diffcore_break(struct repository *r, int break_score) free(p); /* not diff_free_filepair(), we are * reusing one and two here. */ - q->queue[i] = NULL; continue; } } diff --git c/t/t4067-diff-partial-clone.sh w/t/t4067-diff-partial-clone.sh index a980cd30a0..b5a35ded99 100755 --- c/t/t4067-diff-partial-clone.sh +++ w/t/t4067-diff-partial-clone.sh @@ -142,7 +142,6 @@ test_expect_success 'diff succeeds even if prefetch triggered by break-rewrites' git -C server add -A && git -C server commit -m x && - echo xyzz >server/foo && rm server/bar/baz && test_seq -f "line %d" 90 190 >server/bar/baz && @@ -153,12 +152,22 @@ test_expect_success 'diff succeeds even if prefetch triggered by break-rewrites' test_config -C server uploadpack.allowanysha1inwant 1 && git clone --filter=blob:limit=0 "file://$(pwd)/server" client && +echo after clone && +(cd client && ls -l foo bar/baz) && + # Fetch bar/baz without fetching foo. git -C client checkout HEAD~1 bar && + +echo after checkout HEAD~1 bar && +(cd client && ls -l foo bar/baz) && + # Ensure baz has diff git -C client reset --hard HEAD && - # reset's break-rewrites detection will trigger prefetch +echo after reset --hard HEAD && +(cd client && ls -l foo bar/baz) && + + # break-rewrites detection in reset will trigger prefetch git -C client reset HEAD~1 ' ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [External] Re: [PATCH v2 1/1] diffcore-break: prevent dangling pointer 2026-02-12 18:58 ` Junio C Hamano @ 2026-02-13 7:14 ` Han Young 2026-02-13 17:16 ` Junio C Hamano 0 siblings, 1 reply; 11+ messages in thread From: Han Young @ 2026-02-13 7:14 UTC (permalink / raw) To: Junio C Hamano; +Cc: git On Fri, Feb 13, 2026 at 2:58 AM Junio C Hamano <gitster@pobox.com> wrote: > I sense that "This prevents ... later on" needs further be > clarified, since it is totally unclear what "later on" refers to. > We are done with the old filepair, and have no reason to revisit the > q->queue[] item ourselves, but somebody later attempts to use it. > Who is it and why does it do so? That is a natural question readers > of the above description would ask, isn't it? Sorry, I'll try to describe the problem thoroughly in version 3 of the patch. > > + echo xyzz >server/foo && > > The blank line above does not have to be doubled, I think. So the > first commit yas "xyz" in "foo", and 100 lines 1..100 in "bar/baz" Yes, I wasn't being careful, I will ensure there are no double blank lines. > > + rm server/bar/baz && > > We are overwriting it, so I am not sure why this "rm" is needed. Is > it necessary to avoid reusing the same i-num for the file to avoid > racily clean condition, or something? I find it unlikely because > the length of the new contents ... This is an artifact from before I found the test_seq helper function. I will remove it. > > + # Ensure baz has diff > > + git -C client reset --hard HEAD && > > I am not sure what the comment wants to say. Before this hard > reset, we did have modification relative to HEAD in bar/baz; with a > hard reset, we are ensuring that everything including bar/baz > exactly match HEAD, aren't we? This resets bar/baz to the HEAD's version. So that in the reset below, The `bar/baz` in the worktree is different from the `bar/baz` in HEAD~1. We rely on bar/baz to be broken into delete/create to trigger the use-after-free bug. I'll clarify the comment in v3. > > + # reset's break-rewrites detection will trigger prefetch > > "reset's break-rewrites detection" -> "break-rewrites detction in reset" > or something to avoid the "'"; otherwise you'd get > > error: bug in the test script: not 2 or 3 parameters to test-expect-success > > You rewrote this line as a part of the last-minute change before you > ran the test for the last time, or something? Sorry, I only added the comments after finish writing the test, and forgot to run the test again. > ... and cause us to run the prefetch to obtain "foo", but it runs > do_diff_cache() and makes it notice bar/baz has changed too much? > > Your "do not leave q->queue[] dangling, as other people may still > look at them" fix certainly is a good hygiene, but I have to wonder > why we are doing break detection in this case in the first place. > For the internal "Let's figure out which path have changed, so that > we re-read only those changed paths" invocation of diff machinery, > we should not be doing so. A break detection is to see if the > change in the contents of a single path is a total rewrite, and > regardless of the answer, the fact that the path was modified does > not change, update_index_from_diff() would work on the path anyway. > I also suspect that, if we are doing rename detection in this call > to do_diff_cache(), it is a totally wasted effort. We may want to > take a deeper look at it, possibly outside the theme of this more > focused fix. I'm not familiar with reset and diff machinery; I encountered this bug during a real world mixed reset. The segmentfault calling stack is cmd_reset -> read_from_tree -> diffcore_std -> diffcore_break It looks like rename detection is indeed pointless. > By the way, I find it highly curious that with the following patch > to revert the fix with a bit of extra output sprinkled to your > tests, the problem does not reproduce reliably, which may indicate > that your test may be flaky (i.e., timing dependent). Am I doing > something bogus in the patch? It seems the problem does not reproduce reliably with or without your patch. I suspect that could be due to the freed memory on some occasions isn't reused by system, thus the access later on doesn't trigger a segment fault. On my macOS system, the test passes around 5% of the time. However, if I set q->queue[i] to a bogus memory location like 0x1 causes a Git segment fault every time. Is there a better way to write tests for this kind of situation? Thanks. ^ permalink raw reply [flat|nested] 11+ messages in thread
* Re: [External] Re: [PATCH v2 1/1] diffcore-break: prevent dangling pointer 2026-02-13 7:14 ` [External] " Han Young @ 2026-02-13 17:16 ` Junio C Hamano 0 siblings, 0 replies; 11+ messages in thread From: Junio C Hamano @ 2026-02-13 17:16 UTC (permalink / raw) To: Han Young; +Cc: git Han Young <hanyang.tony@bytedance.com> writes: >> Your "do not leave q->queue[] dangling, as other people may still >> look at them" fix certainly is a good hygiene, but I have to wonder >> why we are doing break detection in this case in the first place. >> For the internal "Let's figure out which path have changed, so that >> we re-read only those changed paths" invocation of diff machinery, >> we should not be doing so. A break detection is to see if the >> change in the contents of a single path is a total rewrite, and >> regardless of the answer, the fact that the path was modified does >> not change, update_index_from_diff() would work on the path anyway. >> I also suspect that, if we are doing rename detection in this call >> to do_diff_cache(), it is a totally wasted effort. We may want to >> take a deeper look at it, possibly outside the theme of this more >> focused fix. > > I'm not familiar with reset and diff machinery; I encountered this bug > during a real world mixed reset. The segmentfault calling stack is > cmd_reset -> read_from_tree -> diffcore_std -> diffcore_break > It looks like rename detection is indeed pointless. > >> By the way, I find it highly curious that with the following patch >> to revert the fix with a bit of extra output sprinkled to your >> tests, the problem does not reproduce reliably, which may indicate >> that your test may be flaky (i.e., timing dependent). Am I doing >> something bogus in the patch? > > It seems the problem does not reproduce reliably with or without your > patch. I suspect that could be due to the freed memory on some > occasions isn't reused by system, thus the access later on doesn't > trigger a segment fault. On my macOS system, the test passes around > 5% of the time. However, if I set q->queue[i] to a bogus memory > location like 0x1 causes a Git segment fault every time. > Is there a better way to write tests for this kind of situation? If it is flaky because it depends on the way the system allocator happens to reuse or not reuse a piece of memory, as long as we do not get hit by false positives, it would be OK to leave it as-is if we do not find a good solution, because running tests under asan would catch such problems, I think. Thanks. ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3 0/1] diffcore-break: avoid segfault with freed entries 2026-02-12 7:20 ` [PATCH v2 0/1] " Han Young 2026-02-12 7:20 ` [PATCH v2 1/1] " Han Young @ 2026-02-24 6:13 ` Han Young 2026-02-24 6:13 ` [PATCH v3 1/1] " Han Young 2026-02-24 15:22 ` [PATCH v3 0/1] " Junio C Hamano 1 sibling, 2 replies; 11+ messages in thread From: Han Young @ 2026-02-24 6:13 UTC (permalink / raw) To: git; +Cc: gitster, Han Young Changes since v2: * fixed the ' in test, and removed unused lines * clarify the cause of segfault in commit message Han Young (1): diffcore-break: avoid segfault with freed entries diffcore-break.c | 1 + t/t4067-diff-partial-clone.sh | 31 +++++++++++++++++++++++++++++++ 2 files changed, 32 insertions(+) -- 2.52.0 ^ permalink raw reply [flat|nested] 11+ messages in thread
* [PATCH v3 1/1] diffcore-break: avoid segfault with freed entries 2026-02-24 6:13 ` [PATCH v3 0/1] diffcore-break: avoid segfault with freed entries Han Young @ 2026-02-24 6:13 ` Han Young 2026-02-24 15:22 ` [PATCH v3 0/1] " Junio C Hamano 1 sibling, 0 replies; 11+ messages in thread From: Han Young @ 2026-02-24 6:13 UTC (permalink / raw) To: git; +Cc: gitster, Han Young After we have freed the file pair, we should set the queue reference to null. When computing a diff in a partial clone, there is a chance that we could trigger a prefetch of missing objects when there are freed entries in the global diff queue due to break-rewrites detection. The segfault only occurs if an entry has been freed by break-rewrites and there is an entry to be prefetched. There is a new test in t4067 that trigger the segmentation fault that results in this case. The test explicitly fetch the necessary blobs to trigger the break rewrites, some blobs are left to be prefetched. The fix is to set the queue pointer to NULL after it is freed, the prefetch will skip NULL entries. Signed-off-by: Han Young <hanyang.tony@bytedance.com> --- diffcore-break.c | 1 + t/t4067-diff-partial-clone.sh | 31 +++++++++++++++++++++++++++++++ 2 files changed, 32 insertions(+) diff --git a/diffcore-break.c b/diffcore-break.c index c4c2173f30..9b11fe2fa0 100644 --- a/diffcore-break.c +++ b/diffcore-break.c @@ -222,6 +222,7 @@ void diffcore_break(struct repository *r, int break_score) free(p); /* not diff_free_filepair(), we are * reusing one and two here. */ + q->queue[i] = NULL; continue; } } diff --git a/t/t4067-diff-partial-clone.sh b/t/t4067-diff-partial-clone.sh index 72f25de449..30813109ac 100755 --- a/t/t4067-diff-partial-clone.sh +++ b/t/t4067-diff-partial-clone.sh @@ -132,6 +132,37 @@ test_expect_success 'diff with rename detection batches blobs' ' test_line_count = 1 done_lines ' +test_expect_success 'diff succeeds even if prefetch triggered by break-rewrites' ' + test_when_finished "rm -rf server client trace" && + + test_create_repo server && + echo xyz >server/foo && + mkdir server/bar && + test_seq -f "line %d" 1 100 >server/bar/baz && + git -C server add -A && + git -C server commit -m x && + + echo xyzz >server/foo && + test_seq -f "line %d" 90 190 >server/bar/baz && + git -C server add -A && + git -C server commit -m x && + + test_config -C server uploadpack.allowfilter 1 && + test_config -C server uploadpack.allowanysha1inwant 1 && + git clone --filter=blob:limit=0 "file://$(pwd)/server" client && + + # Fetch bar/baz without fetching foo. + # Foo will be lazily fetched during break rewrites detection. + git -C client checkout HEAD~1 bar && + + # Ensure baz in the working tree is different from baz in HEAD~1. + # We need baz to trigger break-rewrites detection. + git -C client reset --hard HEAD && + + # break-rewrites detction in reset. + git -C client reset HEAD~1 +' + test_expect_success 'diff succeeds even if entries are removed from queue' ' test_when_finished "rm -rf server client trace" && -- 2.52.0 ^ permalink raw reply related [flat|nested] 11+ messages in thread
* Re: [PATCH v3 0/1] diffcore-break: avoid segfault with freed entries 2026-02-24 6:13 ` [PATCH v3 0/1] diffcore-break: avoid segfault with freed entries Han Young 2026-02-24 6:13 ` [PATCH v3 1/1] " Han Young @ 2026-02-24 15:22 ` Junio C Hamano 1 sibling, 0 replies; 11+ messages in thread From: Junio C Hamano @ 2026-02-24 15:22 UTC (permalink / raw) To: Han Young; +Cc: git Han Young <hanyang.tony@bytedance.com> writes: > Changes since v2: > * fixed the ' in test, and removed unused lines > * clarify the cause of segfault in commit message > > Han Young (1): > diffcore-break: avoid segfault with freed entries > > diffcore-break.c | 1 + > t/t4067-diff-partial-clone.sh | 31 +++++++++++++++++++++++++++++++ > 2 files changed, 32 insertions(+) Thanks, will replace. ^ permalink raw reply [flat|nested] 11+ messages in thread
end of thread, other threads:[~2026-02-24 15:22 UTC | newest] Thread overview: 11+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2026-02-11 4:11 [PATCH 0/1] diffcore-break: prevent dangling pointer Han Young 2026-02-11 4:11 ` [PATCH 1/1] " Han Young 2026-02-11 17:54 ` Junio C Hamano 2026-02-12 7:20 ` [PATCH v2 0/1] " Han Young 2026-02-12 7:20 ` [PATCH v2 1/1] " Han Young 2026-02-12 18:58 ` Junio C Hamano 2026-02-13 7:14 ` [External] " Han Young 2026-02-13 17:16 ` Junio C Hamano 2026-02-24 6:13 ` [PATCH v3 0/1] diffcore-break: avoid segfault with freed entries Han Young 2026-02-24 6:13 ` [PATCH v3 1/1] " Han Young 2026-02-24 15:22 ` [PATCH v3 0/1] " Junio C Hamano
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox