Git development
 help / color / mirror / Atom feed
* [PATCH v2] history: streamline message preparation and plug file stream leak
From: Junio C Hamano @ 2026-06-29 16:08 UTC (permalink / raw)
  To: git; +Cc: Patrick Steinhardt, Johannes Schindelin
In-Reply-To: <xmqqecht8df1.fsf@gitster.g>

An early part of fill_commit_message() function uses write_file_buf()
to write out what was prepared in a strbuf, which is primarily meant
for use by callers that have their own message prepared fully and
called as the last thing to flush it to the destination file.

However, the function then opens a file stream in append mode to
further write into it.  It may have been understandable if this was
a later addition, but it seems it came from a single commit,
d205234c (builtin/history: implement "reword" subcommand,
2026-01-13), which is somewhat puzzling, but anyway...

Just open the file stream upfront for writing, write the message
the function has in the strbuf, and then keep writing whatever it
wants to write to the same open file stream.

And do not forget to close the stream.  We are about to pass the
resulting file to an external editor, and on some systems, notably
Windows, you are not supposed to keep a file open while expecting
another program to access it.

Diagnosed-by: Johannes Schindelin <Johannes.Schindelin@gmx.de>
Signed-off-by: Junio C Hamano <gitster@pobox.com>
---

 * Changes from v1 are two additional error checks to notice failure
   from fwrite() and fclose() to die.  Interdiff appears at the end.

 builtin/history.c | 15 ++++++++-------
 1 file changed, 8 insertions(+), 7 deletions(-)

diff --git a/builtin/history.c b/builtin/history.c
index 8dcb9a6046..365e81379b 100644
--- a/builtin/history.c
+++ b/builtin/history.c
@@ -41,11 +41,6 @@ static int fill_commit_message(struct repository *repo,
 		  " empty message aborts the commit.\n");
 	struct wt_status s;
 
-	strbuf_addstr(out, default_message);
-	strbuf_addch(out, '\n');
-	strbuf_commented_addf(out, comment_line_str, hint, action, comment_line_str);
-	write_file_buf(path, out->buf, out->len);
-
 	wt_status_prepare(repo, &s);
 	FREE_AND_NULL(s.branch);
 	s.ahead_behind_flags = AHEAD_BEHIND_QUICK;
@@ -57,14 +52,22 @@ static int fill_commit_message(struct repository *repo,
 	s.whence = FROM_COMMIT;
 	s.committable = 1;
 
-	s.fp = fopen(git_path_commit_editmsg(), "a");
+	s.fp = fopen(path, "w");
 	if (!s.fp)
-		return error_errno(_("could not open '%s'"), git_path_commit_editmsg());
+		return error_errno(_("could not open '%s'"), path);
+
+	strbuf_addstr(out, default_message);
+	strbuf_addch(out, '\n');
+	strbuf_commented_addf(out, comment_line_str, hint, action, comment_line_str);
+	if (fwrite(out->buf, 1, out->len, s.fp) != out->len)
+		die_errno(_("could not write to '%s'"), path);
 
 	wt_status_collect_changes_trees(&s, old_tree, new_tree);
 	wt_status_print(&s);
 	wt_status_collect_free_buffers(&s);
 	string_list_clear_func(&s.change, change_data_free);
+	if (fclose(s.fp))
+		die_errno(_("could not write to '%s'"), path);
 
 	strbuf_reset(out);
 	if (launch_editor(path, out, NULL)) {

Interdiff against v1:
  diff --git a/builtin/history.c b/builtin/history.c
  index f17ec049c0..365e81379b 100644
  --- a/builtin/history.c
  +++ b/builtin/history.c
  @@ -59,13 +59,15 @@ static int fill_commit_message(struct repository *repo,
   	strbuf_addstr(out, default_message);
   	strbuf_addch(out, '\n');
   	strbuf_commented_addf(out, comment_line_str, hint, action, comment_line_str);
  -	fwrite(out->buf, 1, out->len, s.fp);
  +	if (fwrite(out->buf, 1, out->len, s.fp) != out->len)
  +		die_errno(_("could not write to '%s'"), path);
   
   	wt_status_collect_changes_trees(&s, old_tree, new_tree);
   	wt_status_print(&s);
   	wt_status_collect_free_buffers(&s);
   	string_list_clear_func(&s.change, change_data_free);
  -	fclose(s.fp);
  +	if (fclose(s.fp))
  +		die_errno(_("could not write to '%s'"), path);
   
   	strbuf_reset(out);
   	if (launch_editor(path, out, NULL)) {
-- 
2.55.0-180-gf61bfe2e0b



^ permalink raw reply related

* Re: [PATCH] history: streamline message preparation and plug file stream leak
From: Patrick Steinhardt @ 2026-06-29 16:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Johannes Schindelin
In-Reply-To: <xmqq33y5z82l.fsf@gitster.g>

On Mon, Jun 29, 2026 at 08:21:06AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> +	strbuf_addstr(out, default_message);
> >> +	strbuf_addch(out, '\n');
> >> +	strbuf_commented_addf(out, comment_line_str, hint, action, comment_line_str);
> >> +	fwrite(out->buf, 1, out->len, s.fp);
> >>  
> >>  	wt_status_collect_changes_trees(&s, old_tree, new_tree);
> >>  	wt_status_print(&s);
> >>  	wt_status_collect_free_buffers(&s);
> >>  	string_list_clear_func(&s.change, change_data_free);
> >> +	fclose(s.fp);
> >
> > This is fixing the leaked file descriptor.
> >
> > One thing I wonder though is that we don't perform any error checking on
> > the file in the new version. Previously, we would have died in case
> > `write_file_buf()` failed. But now we just `fwrite()` without error
> > checking. I don't think that "wt-status.c" does error checking either,
> > so we might end up with a partially-written file without us noticing.
> 
> Yes, the fwrite() should be protected with an error checking and
> die() the same way as the code before.  Will send a v2.
> 
> But isn't the end result the same between preimage and postimage?
> If the stuff appended by wt_status_* are still written without error
> checking, we would leave a partially-written file that has the
> default_messages and the commented hint/action but not necessarily
> whatever we wanted to add with wt_status().

At least it would only be the status information that's missing in that
case, the commit message itself would be retained (or we'd die if it
wasn't written). So we didn't have the potential to loose information
that is intended to end up in the final commit.

Patrick

^ permalink raw reply

* Re: [PATCH v5 0/4] history: add squash subcommand to fold a range
From: Phillip Wood @ 2026-06-29 15:51 UTC (permalink / raw)
  To: Patrick Steinhardt, phillip.wood
  Cc: Harald Nordgren via GitGitGadget, git, Harald Nordgren
In-Reply-To: <akIQLM6xZTHBudWT@pks.im>

Hi Harald and Patrick

On 29/06/2026 07:26, Patrick Steinhardt wrote:
> On Fri, Jun 26, 2026 at 09:52:57AM +0100, Phillip Wood wrote:
>> Hi Harald
>>
>> On 24/06/2026 22:54, Harald Nordgren via GitGitGadget wrote:
>>> Adds git history squash <revision-range> to fold a range of commits.
>>
>> It would be helpful to give a bit more detail here about the command so that
>> the reader has an overview of what is actually being implemented.
>>
>>   - what does it do with fixup!, squash! and amend! commits? Can it use
>>     the message from amend! commits to reword the commit?
>>   - can the user reword the commit message?
> 
> Good things to document/discuss.

Thanks, I was disappointed that these questions were not addressed in 
the cover letter for v6 which contains no more detail than this one.

We should make rewording the commit as easy as possible to encourage 
users to create useful commit histories. I think that means having some 
support for fixup! style commits. We could just do what rebase does and 
comment out the fixup! style commit subjects and the commit messages 
replaced by amend! commits but I think we have the opportunity to do 
something nicer (I find the commented-out messages annoying). We could 
have a comment saying "this is the combination of the following commits" 
followed by a list of the subject lines and then in the template message 
we'd simply omit the useless fixup! subjects when the commit body is 
empty and also omit the original commit messages that have been replaced 
by an amend! commit.

So instead of

     # This is the combination of 4 commits
     # This is the first commit message
     Base subject

     Base body

     # This is the second commit message
     # Another subject

     # Another body

     # This is the third commit message
     # fixup! Base subject

     # This is the fourth commit message
     # amend! Another subject
     A better subject

     A better body

We'd have

     # This is the combination of 4 commits
     # 123 Base subject
     # 456 Another subject
     # 789 fixup! Base subject
     # abc amend! Another subject

     Base Subject

     Base Body

     Another subject

     Another Body

Possibly with a comment before each message saying where it came from.

It would be good to error out if the user tries squash a fixup! style 
commit and range does not contain its target commit.

In the long run we should provide a way to squash an arbitrary list of 
commits rather than just a range.

> 
>>   - what happens if a merge commit inside the range has a parent outside
>>     the range?
> 
> Yeah, I agree that we should punt on merge commits for now. They are a
> can of worms, and I'm not sure that we should just squash them. I would
> at least like the user to ask a flag that tells us that it's fine
> squashing those.

There does seem to be some support for merges in this patch series which 
I think behaves pretty sensibly. If we have

           C - D
          /     \
   - A - B - E - F - G

Then squashing A..G should be fine because the parents of F are in the 
range and it looks like we support that. Squashing should B..G without 
--ancestry-path should be safe as well because B ends up as the parent 
of the squashed commit but we don't have a way to disable 
--ancestry-path (and maybe we don't want to add one). Squashing F^@..G 
might be useful to fixup a merge (though perhaps amending F rather than 
creating G is a simpler way to fix a broken merge). Squashing E..G does 
not make sense because the range does not include one of the merge parents.

>>   - what happens to branches that point to commits inside the range?
> 
> Yeah, this should be documented indeed.
> 
>> I had a quick play and found that it accepts ranges that containing a single
>> commit (e.g. @^!) where there is nothing to squash. It also accepts ranges
>> that are not ancestors of HEAD (e.g. checkout master and run "git history
>> squash --dry-run origin/seen^2^!") without printing an error message. Only
>> accepting a single argument is quite limiting as one cannot say
>>
>> 	git history squash ^:/base :/tip

We should sanitize what the user passes though - we do not want to 
accept arbitrary rev-list options. Off the top of my head "--left-only" 
and "--right-only" would allow the use of "A...B" and allowing "--not" 
seems reasonable.

> Note that it is intentional that you can rewrite branches that are not
> currently checked out, and the other subcommands work the same. So I'd
> argue this should also be the case for "squash".

Ah, thanks for clarifying that - it does not seem to check that there is 
a branch to be rewritten though if I do

     ./git checkout origin/master
     ./git history squash --dry-run origin/seen^2^^..origin/seen^2

there is no error message and it exits 0. If I create a branch pointing 
at origin/seen^2 then it does print a sensible ref-update.

Thanks

Phillip

^ permalink raw reply

* Re: receive-pack hangs on zero-object push into promisor-shaped repository
From: Patrick Steinhardt @ 2026-06-29 15:30 UTC (permalink / raw)
  To: Wei Hu; +Cc: git
In-Reply-To: <CACLXMtCSzW9BY7idqB1yGa87MeG0Y2FN5Ho2hRXuPJ_qswE27Q@mail.gmail.com>

On Mon, Jun 29, 2026 at 03:10:42PM +0800, Wei Hu wrote:
> Hello,
> 
> I found a reproducible hang in `git receive-pack` when pushing a ref update
> that sends zero objects into a repository that has promisor remote
> configuration and `.promisor` pack sidecar files.
> 
> The same zero-object ref update returns normally when the receiving
> repository is a normal non-bare repository or a bare repository. It
> also returns normally if I remove either the promisor remote config or
> the `.promisor` sidecar files from the receiving repository.
> 
> Check the attached script to reproduce the bug.

Thanks for your report! I was able to reduce your test case to the
following minimal reproducer:

diff --git a/t/t5616-partial-clone.sh b/t/t5616-partial-clone.sh
index 1c2805acca..2850e78e49 100755
--- a/t/t5616-partial-clone.sh
+++ b/t/t5616-partial-clone.sh
@@ -723,6 +723,28 @@ test_expect_success 'after fetching descendants of non-promisor commits, gc work
 	git -C partial gc --prune=now
 '
 
+test_expect_success 'zero-object push does not hang' '
+	rm -rf src dst &&
+
+	git init src &&
+	test_commit -C src initial &&
+
+	git init --bare dst &&
+	git -C src push "$(pwd)/dst" main &&
+	git -C dst config set remote.origin.promisor true &&
+	git -C dst maintenance run &&
+	for pack in dst/objects/pack/*.pack
+	do
+		>"${pack%.pack}.promisor" || return 1
+	done &&
+
+	# Push the already-existing commit with a new branch name, which
+	# results in zero objects being written. This used to hang in the past.
+	git -C src push "$(pwd)/dst" main:topic &&
+	git -C src rev-parse main >expect &&
+	git -C dst rev-parse topic >actual &&
+	test_cmp expect actual
+'
 
 . "$TEST_DIRECTORY"/lib-httpd.sh
 start_httpd

As it turns out, the bug itself was fixed already via d9982e8290
(connected: close err_fd in promisor fast-path, 2026-05-15), and that
fix is going to be part of Git 2.55 (which is due today).

That commit didn't add a test for this scenario though, even though the
commit message points out that there's been multiple regressions in this
area already. It's probably worth it to add the above test to our test
suite. Is this something you'd like to do? Otherwise I'm happy to send a
patch.

Thanks!

Patrick

^ permalink raw reply related

* Re: [PATCH] history: streamline message preparation and plug file stream leak
From: Junio C Hamano @ 2026-06-29 15:21 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git, Johannes Schindelin
In-Reply-To: <akIRrmD4Tqp-Gi9d@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

>> +	strbuf_addstr(out, default_message);
>> +	strbuf_addch(out, '\n');
>> +	strbuf_commented_addf(out, comment_line_str, hint, action, comment_line_str);
>> +	fwrite(out->buf, 1, out->len, s.fp);
>>  
>>  	wt_status_collect_changes_trees(&s, old_tree, new_tree);
>>  	wt_status_print(&s);
>>  	wt_status_collect_free_buffers(&s);
>>  	string_list_clear_func(&s.change, change_data_free);
>> +	fclose(s.fp);
>
> This is fixing the leaked file descriptor.
>
> One thing I wonder though is that we don't perform any error checking on
> the file in the new version. Previously, we would have died in case
> `write_file_buf()` failed. But now we just `fwrite()` without error
> checking. I don't think that "wt-status.c" does error checking either,
> so we might end up with a partially-written file without us noticing.

Yes, the fwrite() should be protected with an error checking and
die() the same way as the code before.  Will send a v2.

But isn't the end result the same between preimage and postimage?
If the stuff appended by wt_status_* are still written without error
checking, we would leave a partially-written file that has the
default_messages and the commented hint/action but not necessarily
whatever we wanted to add with wt_status().

^ permalink raw reply

* Re: [PATCH v6 4/4] history: re-edit a squash with every message
From: Junio C Hamano @ 2026-06-29 14:49 UTC (permalink / raw)
  To: Harald Nordgren; +Cc: Harald Nordgren via GitGitGadget, git
In-Reply-To: <CAHwyqnXXFz4z_ULUq7Oqu0ykwpLJyEyW-uoF2bKfoYZQAjrNdQ@mail.gmail.com>

Harald Nordgren <haraldnordgren@gmail.com> writes:

>> I doubt it would make practical difference, but one thing I notice
>> is that unlike "git rebase -i", this one does not intersperse
>> markers like "# This is the 1st commit message" in between the
>> messages taken from the squashed commits, so it is not exactly
>> "mirroring".
>
> I wouldn't mind extracting that logic from 'rebase -i' to show it
> here. It would be nice to have.

If we can share more code (not necessarily the exact existing
code---after cleaning it up if needed is perfectly fine and may even
be better) across codebaes that would be excellent.  Thanks.

^ permalink raw reply

* Re: [PATCH v4 1/1] environment: move excludes_file into repo_config_values
From: Junio C Hamano @ 2026-06-29 14:47 UTC (permalink / raw)
  To: Christian Couder
  Cc: Tian Yuchen, git, cirnovskyv, szeder.dev, Ayush Chandekar,
	Olamide Caleb Bello
In-Reply-To: <CAP8UFD3Z0M_1NEXGcAxNZKpRUQiSkHZLTEvNNYushKA_PoPgjA@mail.gmail.com>

Christian Couder <christian.couder@gmail.com> writes:

> I agree that the best end state would be to have no `if (!repo ||
> !repo->initialized)` check, but we shouldn't have to get there right
> away. I think it's fine to proceed in several steps:
>
> 1) `if (!repo || !repo->initialized) return NULL;` allows us to remove
> the global variable and use getters which will help us in the next
> step.
>
> 2) `if (!repo || !repo->initialized) return BUG("repo must be an
> initialized repository");` now we want to find and fix callers
> (including tests) that haven't properly initialized things.
>
> 3) No `if (!repo || !repo->initialized)` check, as we are sure that
> all the callers that didn't properly initialized things have been
> found and fixed.
>
> So I think 1) is fine for now as long as we properly explain in the
> commit messages and in code comments (maybe using NEEDSWORK comments)
> that we know there is more work to do on this in the future.

I am OK with the progressive improvements, but if "wean us away from
global variables" topic stops at step 1 I would have to say that is
a failed experiment.  Not doing (2) means you made the system bug-to-bug
compatible from the old world where these things weren't in repo-config
but were still globals, which is code churn for nothing good to show
in the end result.  We need to get to (2) before declaring victory.

But of course, we need to start somewhere.  (1) with in-code
comments sprinkled to such places that say that we'd need to revisit
would be a good place to start.

Thanks.

^ permalink raw reply

* Re: [PATCH 2/3] t5551: put many-tags case into its own repo
From: Junio C Hamano @ 2026-06-29 14:42 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Montalbo, Patrick Steinhardt, git
In-Reply-To: <20260629003434.GA1228461@coredump.intra.peff.net>

Jeff King <peff@peff.net> writes:

> On Sun, Jun 28, 2026 at 02:44:32PM -0700, Junio C Hamano wrote:
>
>> Jeff King <peff@peff.net> writes:
>> 
>> > diff --git a/t/t5551-http-fetch-smart.sh b/t/t5551-http-fetch-smart.sh
>> > index e236e526f0..cd851f24b8 100755
>> > --- a/t/t5551-http-fetch-smart.sh
>> > +++ b/t/t5551-http-fetch-smart.sh
>> > @@ -397,15 +397,16 @@ create_tags () {
>> >  }
>> >  
>> >  test_expect_success 'create 2,000 tags in the repo' '
>> > +	git init "$HTTPD_DOCUMENT_ROOT_PATH/many-tags.git" &&
>> >  	(
>> > -		cd "$HTTPD_DOCUMENT_ROOT_PATH/repo.git" &&
>> > +		cd "$HTTPD_DOCUMENT_ROOT_PATH/many-tags.git" &&
>> >  		create_tags 1 2000
>> >  	)
>> >  '
>> 
>> While all the other repositories used in this tests are bare
>> repositories, this new one is a non-bare repository.
>> 
>> It shouldn't make any difference, but since I noticed it...
>
> Ah, yeah. It should work either way, but it is slightly confusing for it
> to be non-bare. I'll wait to re-send (though if nothing else comes up,
> it may be simpler for you to just amend on your side).

OK.  It seems both Patrick and you are in favor of using only [1/3]
& [2/3] but dropping [3/3]?  If that is the concensus I can just
tweak this one and apply before 2.55 final.

Thanks.

^ permalink raw reply

* Re: [PATCH 0/3] fixing expensive http test timeouts
From: Junio C Hamano @ 2026-06-29 14:39 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: Jeff King, Michael Montalbo, git
In-Reply-To: <akIfsaVMB_S6kfJQ@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> By the way, the only reason why we at GitLab haven't been feeling the
> pain is that we only enable GIT_TEST_LONG for GitHub. So I was wondering
> whether we want to have something like the below patch on top.

If we can afford the cycles, it would be good to have similarly
larger coverage on two different platforms (compared to leaving one
of them not doing as much as the other when we know it).  On the
other hand, if we cannot cover _everything_ in one platform, it may
be a better use of the resources to have the other platform things
that are not covered already.  I see that among different pipeline
sources, we are doing TEST_LONG for pull requests to any branch, and
pushes only to "cast in stone" branches.  If there are other
branches that deserve to be tested with TEST_LONG upon other events
that the existing GitHub Actions CI does not trigger, it may be good
to have GitLab CI cover them, perhaps?





>
> Patrick
>
> diff --git a/ci/lib.sh b/ci/lib.sh
> index b939110a6e..57801586aa 100755
> --- a/ci/lib.sh
> +++ b/ci/lib.sh
> @@ -215,6 +215,14 @@ then
>  	test macos != "$CI_OS_NAME" || CI_OS_NAME=osx
>  	CI_REPO_SLUG="$GITHUB_REPOSITORY"
>  	CI_JOB_ID="$GITHUB_RUN_ID"
> +
> +	case "$GITHUB_EVENT_NAME" in
> +	pull_request)
> +		CI_EVENT=pull_request;;
> +	push)
> +		CI_EVENT=push;;
> +	esac
> +
>  	CC="${CC_PACKAGE:-${CC:-gcc}}"
>  	DONT_SKIP_TAGS=t
>  	handle_failed_tests () {
> @@ -239,6 +247,13 @@ then
>  	CI_BRANCH="$CI_COMMIT_REF_NAME"
>  	CI_COMMIT="$CI_COMMIT_SHA"
>  
> +	case "$CI_PIPELINE_SOURCE" in
> +	merge_request_event)
> +		CI_EVENT=pull_request;;
> +	push)
> +		CI_EVENT=push;;
> +	esac
> +
>  	case "$OS,$CI_JOB_IMAGE" in
>  	Windows_NT,*)
>  		CI_OS_NAME=windows
> @@ -319,7 +334,7 @@ export SKIP_DASHED_BUILT_INS=YesPlease
>  # enable "expensive" tests for PR events.
>  # In order to catch bugs introduced at integration time by mismerges,
>  # enable the long tests for pushes to the integration branches as well.
> -case "$GITHUB_EVENT_NAME,$CI_BRANCH" in
> +case "$CI_EVENT,$CI_BRANCH" in
>  pull_request,*|push,*next*|push,*master*|push,*main*|push,*maint*)
>  	export GIT_TEST_LONG=YesPlease
>  	;;

^ permalink raw reply

* Re: Security Vulnerability in Git 2.54.0/OpenSSL 3.5.6 Status
From: Johannes Schindelin @ 2026-06-29 13:57 UTC (permalink / raw)
  To: Person, Tim; +Cc: git@vger.kernel.org
In-Reply-To: <SN4P221MB0713994458A94BFCB51F7AC494EA2@SN4P221MB0713.NAMP221.PROD.OUTLOOK.COM>

Hi Tim,

On Sat, 27 Jun 2026, Person, Tim wrote:

> I am writing to determine when Git plans to release an update installer
> to patch the security vulnerability in Git 2.54.0 because of the
> included OpenSSL executable. This vulnerability is rated "Critical" in
> the CVE (https://www.cve.org/CVERecord?id=CVE-2026-34182). An updated
> version of the OpenSSL.exe fixing this problem has been available since
> 06/12/2026. I am just wondering if/when you plan to address this major
> security issue.

OpenSSL.exe is not part of the critical path of Git for Windows. It is
merely included as a curiosity for historical reasons. The critical CVE
you mentioned does not affect anything in Git itself. Therefore, I did not
even consider making an out-of-band release of Git for Windows merely for
that OpenSSL v3.5.7 update.

The next Git for Windows release (v2.55.0, likely due later today, may
slip to tomorrow) will include OpenSSL v3.5.7.

Ciao,
Johannes

^ permalink raw reply

* Re: [PATCH v6 4/4] history: re-edit a squash with every message
From: Harald Nordgren @ 2026-06-29 13:49 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Harald Nordgren via GitGitGadget, git
In-Reply-To: <xmqqwlvhzyhz.fsf@gitster.g>

> I doubt it would make practical difference, but one thing I notice
> is that unlike "git rebase -i", this one does not intersperse
> markers like "# This is the 1st commit message" in between the
> messages taken from the squashed commits, so it is not exactly
> "mirroring".

I wouldn't mind extracting that logic from 'rebase -i' to show it
here. It would be nice to have.


Harald

^ permalink raw reply

* [PATCH 2/2] commit-reach: guard !FIND_ALL early exit with generation ordering check
From: Kristofer Karlsson via GitGitGadget @ 2026-06-29 13:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Derrick Stolee, Kristofer Karlsson,
	Kristofer Karlsson
In-Reply-To: <pull.2162.git.1782739162.gitgitgadget@gmail.com>

From: Kristofer Karlsson <krka@spotify.com>

When paint_down_to_common() falls back to commit-date ordering (for
v1 commit graphs without corrected commit dates), the !FIND_ALL early
exit incorrectly fires.  The exit assumes the queue is generation-
ordered, so the first RESULT commit found must be the shallowest.
With date ordering this is not guaranteed: a closer merge base with
a lower committer date (clock skew) may still be in the queue behind
deeper commits.

Add a gen_ordered flag that is cleared when the date fallback fires,
and require it for the early exit.

Update the test from the previous commit to test_expect_success.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
 commit-reach.c        | 10 +++++++---
 t/t6600-test-reach.sh |  2 +-
 2 files changed, 8 insertions(+), 4 deletions(-)

diff --git a/commit-reach.c b/commit-reach.c
index 5df471a313..708798a39b 100644
--- a/commit-reach.c
+++ b/commit-reach.c
@@ -108,11 +108,14 @@ static int paint_down_to_common(struct repository *r,
 		{ compare_commits_by_gen_then_commit_date }
 	};
 	int i;
+	int gen_ordered = 1;
 	timestamp_t last_gen = GENERATION_NUMBER_INFINITY;
 	struct commit_list **tail = result;
 
-	if (!min_generation && !corrected_commit_dates_enabled(r))
+	if (!min_generation && !corrected_commit_dates_enabled(r)) {
 		queue.pq.compare = compare_commits_by_commit_date;
+		gen_ordered = 0;
+	}
 
 	one->object.flags |= PARENT1;
 	if (!n) {
@@ -147,11 +150,12 @@ static int paint_down_to_common(struct repository *r,
 				commit->object.flags |= RESULT;
 				tail = commit_list_append(commit, tail);
 				/*
-				 * The queue is generation-ordered; no
-				 * remaining common ancestor can be a
+				 * When the queue is generation-ordered,
+				 * no remaining common ancestor can be a
 				 * descendant of this one.
 				 */
 				if (!(mb_flags & MERGE_BASE_FIND_ALL) &&
+				    gen_ordered &&
 				    generation < GENERATION_NUMBER_INFINITY)
 					break;
 			}
diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index 1090104220..0ff41381ff 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -1003,7 +1003,7 @@ test_expect_success 'merge-base without --all is one of --all results' '
 	grep -F -f single all
 '
 
-test_expect_failure 'merge-base without --all, clock skew, v1 commit-graph' '
+test_expect_success 'merge-base without --all, clock skew, v1 commit-graph' '
 	git rev-parse skew-M2 >expect &&
 	merge_base_all_modes skew-P1 skew-P2
 '
-- 
gitgitgadget

^ permalink raw reply related

* [PATCH 1/2] t6600: add test for merge-base early exit with clock skew
From: Kristofer Karlsson via GitGitGadget @ 2026-06-29 13:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Derrick Stolee, Kristofer Karlsson,
	Kristofer Karlsson
In-Reply-To: <pull.2162.git.1782739162.gitgitgadget@gmail.com>

From: Kristofer Karlsson <krka@spotify.com>

Add a topology where the correct merge base (M2) has a lower
committer date than its ancestor (M1) due to clock skew.  With a
v1 commit graph (topological levels only, no corrected commit
dates), paint_down_to_common() falls back to commit-date ordering.
In that mode, M1 pops before M2, acquires both paint sides, and
the !FIND_ALL early exit fires -- returning the wrong merge base.

Mark the test as test_expect_failure to document the bug; the next
commit will fix it.

Signed-off-by: Kristofer Karlsson <krka@spotify.com>
---
 t/t6600-test-reach.sh | 41 +++++++++++++++++++++++++++++++++++++++++
 1 file changed, 41 insertions(+)

diff --git a/t/t6600-test-reach.sh b/t/t6600-test-reach.sh
index b5b314e570..1090104220 100755
--- a/t/t6600-test-reach.sh
+++ b/t/t6600-test-reach.sh
@@ -49,6 +49,42 @@ test_expect_success 'setup' '
 			git tag -a -m "$x-$i" tag-$x-$i commit-$x-$i || return 1
 		done
 	done &&
+	# Build a topology with clock skew to test the !FIND_ALL early
+	# exit in paint_down_to_common().  M2 is the correct merge base
+	# of P1 and P2, but its ancestor M1 has a higher committer date
+	# due to clock skew.  With date-only ordering (v1 commit graph
+	# without corrected commit dates), M1 pops from the queue first,
+	# gets both paint sides, and the early exit fires before M2 is
+	# ever visited.
+	#
+	#        P1     P2          @7000
+	#        |     /  \
+	#        A    B    D        @6000
+	#       / \   |    |
+	#      |  M2--+    |        @2000 (correct merge base)
+	#       \ |        |
+	#        M1--------+        @5000 (clock skew: date > M2)
+	#        |
+	#       root                @1000
+	#
+	git checkout --orphan skew-orphan &&
+	skew_tree=$(git mktree </dev/null) &&
+	skew_commit () {
+		GIT_COMMITTER_DATE="@$1 +0000" GIT_AUTHOR_DATE="@$1 +0000" \
+			git commit-tree -m "$2" "$skew_tree" $3 $4 $5 $6
+	} &&
+	skew_root=$(skew_commit 1000 root) &&
+	skew_M1=$(skew_commit 5000 M1 -p "$skew_root") &&
+	skew_M2=$(skew_commit 2000 M2 -p "$skew_M1") &&
+	skew_A=$(skew_commit 6000 A -p "$skew_M1" -p "$skew_M2") &&
+	skew_B=$(skew_commit 6000 B -p "$skew_M2") &&
+	skew_D=$(skew_commit 6000 D -p "$skew_M1") &&
+	skew_P1=$(skew_commit 7000 P1 -p "$skew_A") &&
+	skew_P2=$(skew_commit 7000 P2 -p "$skew_B" -p "$skew_D") &&
+	git branch -f skew-P1 "$skew_P1" &&
+	git branch -f skew-P2 "$skew_P2" &&
+	git tag skew-M2 "$skew_M2" &&
+
 	git commit-graph write --reachable &&
 	mv .git/objects/info/commit-graph commit-graph-full &&
 	chmod u+w commit-graph-full &&
@@ -967,4 +1003,9 @@ test_expect_success 'merge-base without --all is one of --all results' '
 	grep -F -f single all
 '
 
+test_expect_failure 'merge-base without --all, clock skew, v1 commit-graph' '
+	git rev-parse skew-M2 >expect &&
+	merge_base_all_modes skew-P1 skew-P2
+'
+
 test_done
-- 
gitgitgadget


^ permalink raw reply related

* [PATCH 0/2] commit-reach: fix !FIND_ALL early exit with v1 commit graph
From: Kristofer Karlsson via GitGitGadget @ 2026-06-29 13:19 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Derrick Stolee, Kristofer Karlsson

Fixes a bug introduced by 93e5b1680e (commit-reach: early exit
paint_down_to_common for single merge-base, 2025-04-10) where git merge-base
can return the wrong result.

The bug requires all of the following to trigger:

 1. A v1 commit graph (topological levels only, no corrected commit dates).
    Generation v2 with corrected commit dates has been the default since
    2021, so only repos that have not rewritten their commit graph in over
    four years would be affected.
 2. git merge-base without --all (the common case, but --all is unaffected
    because it disables the early exit).
 3. A topology with clock skew: the correct merge base has a lower committer
    date than one of its ancestors that is also a common ancestor. With date
    ordering, the deeper ancestor pops first and the early exit fires before
    the correct result is found.

This two-patch series:

 1. Adds a test demonstrating the bug (clock-skew topology where the correct
    merge base has a lower date than its ancestor)
 2. Fixes it by tracking whether the queue is generation-ordered and gating
    the early exit on that flag

Kristofer Karlsson (2):
  t6600: add test for merge-base early exit with clock skew
  commit-reach: guard !FIND_ALL early exit with generation ordering
    check

 commit-reach.c        | 10 +++++++---
 t/t6600-test-reach.sh | 41 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+), 3 deletions(-)


base-commit: 9aa172cd1f113276d360d4e48937dc95ef46b780
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-2162%2Fspkrka%2Ffind-all-fix-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-2162/spkrka/find-all-fix-v1
Pull-Request: https://github.com/gitgitgadget/git/pull/2162
-- 
gitgitgadget

^ permalink raw reply

* Re: [PATCH v4 0/8] commit-reach: terminate merge-base walk when one side is exhausted
From: Kristofer Karlsson @ 2026-06-29 12:59 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Kristofer Karlsson via GitGitGadget, git, Elijah Newren
In-Reply-To: <5ef694a3-9164-4ab4-8835-136439f6d267@gmail.com>

On Mon, 29 Jun 2026 at 14:40, Derrick Stolee <stolee@gmail.com> wrote:
>
> I agree with your reasoning, data-backed discovery, and the course of
> action to fix this. I'm happy that you're able to close the loop on
> this long-standing performance issue even with v1 generation numbers.

Sounds good, then I can continue with the approach of removing some code
(even though it will likely be a net addition in the end).

> > Do you see any cases I might be missing where removing the fallback
> > could cause problems?
> I don't see any other concerns here. You're right that if we were to
> have a different mode that changes the priority-queue ordering, then
> the side-exhaustion optimization cannot be trusted, but you will
> remove this possibility.
>
> It _may_ be worth mentioning this with a comment when initializing
> the queue order for the paint_queue, because the use of the queue
> requires topological ordering.

Yes my plan is to rewrite v5 in a few ways:
- update original documentation to note that infinite -> finite
   generation does not always hold
- add a test (or more than one) for this problem
- don't introduce the bug at any point
- add a commit to replace the disabled optimization with
   removal of the commit-date based ordering (+ doc update)

Thanks for helping with this,
Kristofer

^ permalink raw reply

* Re: [PATCH v4 0/8] commit-reach: terminate merge-base walk when one side is exhausted
From: Derrick Stolee @ 2026-06-29 12:40 UTC (permalink / raw)
  To: Kristofer Karlsson
  Cc: Kristofer Karlsson via GitGitGadget, git, Elijah Newren
In-Reply-To: <CAL71e4O8gTLm4WUcPF-ZbOYTuEzuNSVh0Qjf8ys1w4LVF9Hi8Q@mail.gmail.com>

On 6/29/2026 8:11 AM, Kristofer Karlsson wrote:
> On Sun, 28 Jun 2026 at 17:16, Derrick Stolee <stolee@gmail.com> wrote:
>>
>> I reviewed the v3 discussion, the range-diff, and reread patch 8. I think
>> that this version is good to go.
> 
> Thanks for all your reviews and feedback. However, I found one more
> problem that needs to be resolved before this is good to go.
> 
> paint_down_to_common() has this fallback:
> 
>     if (!min_generation && !corrected_commit_dates_enabled(r))
>         queue.pq.compare = compare_commits_by_commit_date;
...> I traced the history of this fallback. 
...> The problem that 091f4cf3 addresses looks closely related to what
> side-exhaustion solves: the walk goes deep into a subgraph where
> only one paint side has presence. With side-exhaustion, the walk
> terminates as soon as one paint side is exhausted from the queue,
> so the deep walk never happens regardless of queue ordering.
> 
> I benchmarked "git merge-base --all v4.8 v4.9" on the Linux kernel
> (the same case from 091f4cf3) with three configurations:
> 
>                     master (--all)    side-exhaust (--all, gen ordering)
>   no graph:           3212 ms            3268 ms
>   v1 graph:            188 ms              17 ms
>   v2 graph:            227 ms              17 ms
> 
> With side-exhaustion, the v1 case no longer shows a regression
> compared to the date fallback -- if anything, it is slightly faster
> since the walk terminates earlier. This suggests that the workaround
> from 091f4cf3 may no longer be needed when side-exhaustion is
> present.

Thanks for digging into the history of this fallback and catching it
during review!

> If that reasoning holds, the fix for v5 would be to remove the date
> fallback entirely, always using compare_commits_by_gen_then_commit_date.
> This would:
> 
>  1. Fix the bug (finite generation always means generation-ordered
>     queue).
>  2. Remove corrected_commit_dates_enabled() which has no other
>     callers.

I agree with your reasoning, data-backed discovery, and the course of
action to fix this. I'm happy that you're able to close the loop on
this long-standing performance issue even with v1 generation numbers.
> Do you see any cases I might be missing where removing the fallback
> could cause problems?
I don't see any other concerns here. You're right that if we were to
have a different mode that changes the priority-queue ordering, then
the side-exhaustion optimization cannot be trusted, but you will
remove this possibility.

It _may_ be worth mentioning this with a comment when initializing
the queue order for the paint_queue, because the use of the queue
requires topological ordering.

Thanks,
-Stolee


^ permalink raw reply

* Re: [PATCH v4 0/8] commit-reach: terminate merge-base walk when one side is exhausted
From: Kristofer Karlsson @ 2026-06-29 12:11 UTC (permalink / raw)
  To: Derrick Stolee; +Cc: Kristofer Karlsson via GitGitGadget, git, Elijah Newren
In-Reply-To: <48bfdb11-2624-4aa6-8fbd-d3f894c33bcc@gmail.com>

On Sun, 28 Jun 2026 at 17:16, Derrick Stolee <stolee@gmail.com> wrote:
>
> I reviewed the v3 discussion, the range-diff, and reread patch 8. I think
> that this version is good to go.

Thanks for all your reviews and feedback. However, I found one more
problem that needs to be resolved before this is good to go.

paint_down_to_common() has this fallback:

    if (!min_generation && !corrected_commit_dates_enabled(r))
        queue.pq.compare = compare_commits_by_commit_date;

When this fires, the queue uses commit-date ordering instead of
generation ordering. The side-exhaustion optimization and my older
patch for !FIND_ALL early exit both check for reaching the finite
generation, but with date ordering, that check is wrong --
a commit can have a finite topo level (it is in a v1 commit graph)
while the queue is not ordered by generation. This unfortunately
means there is a regression for the !FIND_ALL optimization that
I should fix before 2.55 is final. I will send a small patch for
that separately: add tests that demonstrate the problem, and disable
the !FIND_ALL early exit when generation ordering is not active.

I traced the history of this fallback. The queue was switched from
date ordering to generation ordering in 3afc679b (2018-05). Then in
091f4cf3 (2018-08) you added the date fallback after finding that v1
topo levels caused "git merge-base v4.8 v4.9" on the Linux kernel to
walk 636k commits instead of 167k -- a side branch with a low topo
level stayed in the queue behind a long chain, preventing early STALE
propagation. Later, 8d00d7c3 (2021-01) tightened the fallback to
only fire without corrected commit dates, since v2 does not have the
regression.

The problem that 091f4cf3 addresses looks closely related to what
side-exhaustion solves: the walk goes deep into a subgraph where
only one paint side has presence. With side-exhaustion, the walk
terminates as soon as one paint side is exhausted from the queue,
so the deep walk never happens regardless of queue ordering.

I benchmarked "git merge-base --all v4.8 v4.9" on the Linux kernel
(the same case from 091f4cf3) with three configurations:

                    master (--all)    side-exhaust (--all, gen ordering)
  no graph:           3212 ms            3268 ms
  v1 graph:            188 ms              17 ms
  v2 graph:            227 ms              17 ms

With side-exhaustion, the v1 case no longer shows a regression
compared to the date fallback -- if anything, it is slightly faster
since the walk terminates earlier. This suggests that the workaround
from 091f4cf3 may no longer be needed when side-exhaustion is
present.

It is also worth noting that commitGraph.generationVersion has
defaulted to 2 since 2021, so the v1 fallback path is rarely
exercised in practice. Any commit-graph rewrite produces v2 data,
and only repos that have not rewritten their commit graph in over
four years would still have v1-only data.

If that reasoning holds, the fix for v5 would be to remove the date
fallback entirely, always using compare_commits_by_gen_then_commit_date.
This would:

 1. Fix the bug (finite generation always means generation-ordered
    queue).
 2. Remove corrected_commit_dates_enabled() which has no other
    callers.

The alternative would be to keep the fallback and disable the
optimizations that depend on ordering (via a flag like
paint_state.gen_ordered).

Do you see any cases I might be missing where removing the fallback
could cause problems?

Thanks,
Kristofer

^ permalink raw reply

* Re: [GSoC Blog] Week 5 : Improve Disk Space Recovery for Partial Clones
From: Siddharth Shrimali @ 2026-06-29 11:49 UTC (permalink / raw)
  To: git; +Cc: Christian Couder, Siddharth Asthana
In-Reply-To: <CAGWgyh_WJ2mAgbJ2agp9UQm8iyR=eq0xWjdYT59CC9fZTnAbzA@mail.gmail.com>

Hello everyone,

My latest blog post, covering week 5, is now live:
https://siddharth.shrimali.info/#post/7

Please feel free to review my work and share your feedback.
Always open to discussions! :)

Regards,
Siddharth Shrimali

^ permalink raw reply

* Re: [PATCH v4 6/8] commit-reach: remove unused nonstale_queue dedup wrappers
From: Kristofer Karlsson @ 2026-06-29 10:09 UTC (permalink / raw)
  To: SZEDER Gábor
  Cc: Kristofer Karlsson via GitGitGadget, git, Derrick Stolee,
	Elijah Newren
In-Reply-To: <akIBvWT7nIWntCNT@szeder.dev>

On Mon, 29 Jun 2026 at 07:25, SZEDER Gábor <szeder.dev@gmail.com> wrote:
>
> On Sun, Jun 28, 2026 at 12:25:44PM +0000, Kristofer Karlsson via GitGitGadget wrote:
> > From: Kristofer Karlsson <krka@spotify.com>
> >
> > nonstale_queue_put_dedup() and nonstale_queue_get_dedup() became
> > unused after the previous commit. The core nonstale_queue functions
> > remain in use by ahead_behind().
>
> Please squash this patch into the previous one.  Since the last
> callers of these static functions went away in that commit, it can't
> be built with DEVELOPER=1:
>
>   commit-reach.c:91:23: warning: ‘nonstale_queue_get_dedup’ defined but not used [-Wunused-function]
>      91 | static struct commit *nonstale_queue_get_dedup(struct nonstale_queue *queue)
>         |                       ^~~~~~~~~~~~~~~~~~~~~~~~
>   commit-reach.c:82:13: warning: ‘nonstale_queue_put_dedup’ defined but not used [-Wunused-function]
>      82 | static void nonstale_queue_put_dedup(struct nonstale_queue *queue,
>         |             ^~~~~~~~~~~~~~~~~~~~~~~~
>

Thanks, will squash for v5! It's unfortunate that this means the commit itself
becomes less clean, but I don't have any other good solution
-- and having each commit compile cleanly is more important.

- Kristofer

^ permalink raw reply

* [PATCH v2 12/12] reftable/table: fix OOB read on truncated table
From: Patrick Steinhardt @ 2026-06-29  9:02 UTC (permalink / raw)
  To: git; +Cc: oxsignal, Christian Couder
In-Reply-To: <20260629-pks-reftable-hardening-v2-0-b0228e7d908d@pks.im>

When opening a table we compute the size of its data section by
subtracting the footer size from the file size. We do not verify that
the file is actually large enough to contain both the header and the
footer though. With a truncated table the subtraction can thus
underflow, causing us to read the footer out of bounds:

  SUMMARY: AddressSanitizer: heap-buffer-overflow (/home/pks/Development/git/build/t/unit-tests+0x2479a4) in __asan_memcpy
  Shadow bytes around the buggy address:
    0x7ccff6e0de80: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd
    0x7ccff6e0df00: fd fd fd fd fd fd fd fd fd fa fa fa fa fa fa fa
    0x7ccff6e0df80: fa fa fd fd fd fd fd fd fd fd fd fd fd fd fd fd
    0x7ccff6e0e000: fd fd fd fd fa fa fa fa fa fa fa fa fd fd fd fd
    0x7ccff6e0e080: fd fd fd fd fd fd fd fd fd fd fd fd fd fd fa fa
  =>0x7ccff6e0e100: fa fa fa fa fa[fa]00 00 00 00 00 00 00 00 00 00
    0x7ccff6e0e180: 00 00 00 00 00 00 00 04 fa fa fa fa fa fa fa fa
    0x7ccff6e0e200: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
    0x7ccff6e0e280: 00 00 fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x7ccff6e0e300: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
    0x7ccff6e0e380: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  Shadow byte legend (one shadow byte represents 8 application bytes):
    Addressable:           00
    Partially addressable: 01 02 03 04 05 06 07
    Heap left redzone:       fa
    Freed heap region:       fd
    Stack left redzone:      f1
    Stack mid redzone:       f2
    Stack right redzone:     f3
    Stack after return:      f5
    Stack use after scope:   f8
    Global redzone:          f9
    Global init order:       f6
    Poisoned by user:        f7
    Container overflow:      fc
    Array cookie:            ac
    Intra object redzone:    bb
    ASan internal:           fe
    Left alloca redzone:     ca
    Right alloca redzone:    cb
  ==1500371==ABORTING

Verify that the file is large enough to contain both the header and the
footer before computing the table size.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/table.c                |  5 +++++
 t/unit-tests/u-reftable-table.c | 28 ++++++++++++++++++++++++++++
 2 files changed, 33 insertions(+)

diff --git a/reftable/table.c b/reftable/table.c
index f4bc86a29d..b4d3f9e211 100644
--- a/reftable/table.c
+++ b/reftable/table.c
@@ -562,6 +562,11 @@ int reftable_table_new(struct reftable_table **out,
 		goto done;
 	}
 
+	if (file_size < header_size(t->version) + footer_size(t->version)) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
 	t->size = file_size - footer_size(t->version);
 	t->source = *source;
 	t->name = reftable_strdup(name);
diff --git a/t/unit-tests/u-reftable-table.c b/t/unit-tests/u-reftable-table.c
index c7dca45e70..28b0ef5258 100644
--- a/t/unit-tests/u-reftable-table.c
+++ b/t/unit-tests/u-reftable-table.c
@@ -262,3 +262,31 @@ void test_reftable_table__seek_invalid_log_offset(void)
 	reftable_table_decref(table);
 	reftable_buf_release(&buf);
 }
+
+void test_reftable_table__new_with_truncated_table(void)
+{
+	struct reftable_ref_record refs[] = {
+		{
+			.refname = (char *) "refs/heads/main",
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = { 42 },
+		},
+	};
+	struct reftable_block_source source = { 0 };
+	struct reftable_table *table;
+	struct reftable_buf buf = REFTABLE_BUF_INIT;
+
+	cl_reftable_write_to_buf(&buf, refs, ARRAY_SIZE(refs), NULL, 0, NULL);
+
+	/*
+	 * Truncate the table so that it is large enough to read the header, but
+	 * too small to also contain the footer.
+	 */
+	buf.len = footer_size(1) - 1;
+	block_source_from_buf(&source, &buf);
+
+	cl_assert_equal_i(reftable_table_new(&table, &source, "name"),
+			  REFTABLE_FORMAT_ERROR);
+
+	reftable_buf_release(&buf);
+}

-- 
2.55.0.rc2.803.g1fd1e6609c.dirty


^ permalink raw reply related

* [PATCH v2 11/12] reftable/table: fix NULL pointer access when seeking to bogus offsets
From: Patrick Steinhardt @ 2026-06-29  9:02 UTC (permalink / raw)
  To: git; +Cc: oxsignal, Christian Couder
In-Reply-To: <20260629-pks-reftable-hardening-v2-0-b0228e7d908d@pks.im>

When seeking an iterator to an arbitrary offset we may return a positive
value in case the offset points beyond the block. This makes sense when
iterating through multiple blocks of the same section, as the positive
value indicates to us that we're at the end of the table.

But when the offset originates from a section or index offset it is
supposed to point at a valid block, so an out-of-bounds value means that
the table is corrupt. Treating it as a normal end-of-iteration causes us
to silently report an empty section instead of surfacing the corruption,
and we are left with a partially-initialized block. This may later on
cause a NULL pointer exception:

  ==1486841==ERROR: AddressSanitizer: SEGV on unknown address 0x000000000000 (pc 0x55555598e02c bp 0x7fffffff4eb0 sp 0x7fffffff4e70 T0)
  ==1486841==The signal is caused by a READ memory access.
  ==1486841==Hint: address points to the zero page.
      #0 0x55555598e02c in reftable_block_type ./git/build/../reftable/block.c:392:9
      #1 0x55555598ee6e in block_iter_seek_key ./git/build/../reftable/block.c:536:35
      #2 0x5555559ae553 in table_iter_seek_linear ./git/build/../reftable/table.c:344:8
      #3 0x5555559adbff in table_iter_seek ./git/build/../reftable/table.c:450:9
      #4 0x5555559ada9c in table_iter_seek_void ./git/build/../reftable/table.c:460:9
      #5 0x555555992872 in reftable_iterator_seek_log_at ./git/build/../reftable/iter.c:281:9
      #6 0x555555992953 in reftable_iterator_seek_log ./git/build/../reftable/iter.c:287:9
      #7 0x55555583aa78 in test_reftable_table__seek_invalid_log_offset ./git/build/../t/unit-tests/u-reftable-table.c:257:20
      #8 0x5555557f684e in clar_run_test ./git/build/../t/unit-tests/clar/clar.c:335:3
      #9 0x5555557f2e69 in clar_run_suite ./git/build/../t/unit-tests/clar/clar.c:431:3
      #10 0x5555557f2882 in clar_test_run ./git/build/../t/unit-tests/clar/clar.c:636:4
      #11 0x5555557f375f in clar_test ./git/build/../t/unit-tests/clar/clar.c:687:11
      #12 0x5555557fa49d in cmd_main ./git/build/../t/unit-tests/unit-test.c:62:8
      #13 0x55555584cffa in main ./git/build/../common-main.c:9:11
      #14 0x7ffff7a2b284 in __libc_start_call_main (/nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libc.so.6+0x2b284) (BuildId: 8ae0b698f2d4e727f569f64bb166e08ae30bd077)
      #15 0x7ffff7a2b337 in __libc_start_main@GLIBC_2.2.5 (/nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libc.so.6+0x2b337) (BuildId: 8ae0b698f2d4e727f569f64bb166e08ae30bd077)
      #16 0x555555694c24 in _start (./git/build/t/unit-tests+0x140c24)

  ==1486841==Register values:
  rax = 0x0000000000000000  rbx = 0x00007fffffff4ec0  rcx = 0x0000000000000000  rdx = 0x00007cfff6e2bd58
  rdi = 0x00007cfff6e2bd58  rsi = 0x00007bfff5da1020  rbp = 0x00007fffffff4eb0  rsp = 0x00007fffffff4e70
   r8 = 0x0000000000000000   r9 = 0x0000000000000002  r10 = 0x0000000000000000  r11 = 0x0000000000000017
  r12 = 0x00007fffffff5908  r13 = 0x0000000000000001  r14 = 0x00007ffff7ffd000  r15 = 0x0000555556056e90
  AddressSanitizer can not provide additional info.
  SUMMARY: AddressSanitizer: SEGV ./git/build/../reftable/block.c:392:9 in reftable_block_type
  ==1486841==ABORTING

Fix this by returning a proper error in `table_iter_seek_to()` when the
offset ranges beyond the block.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/table.c                |  2 ++
 t/unit-tests/u-reftable-table.c | 63 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 65 insertions(+)

diff --git a/reftable/table.c b/reftable/table.c
index 56362df0ed..f4bc86a29d 100644
--- a/reftable/table.c
+++ b/reftable/table.c
@@ -242,6 +242,8 @@ static int table_iter_seek_to(struct table_iter *ti, uint64_t off, uint8_t typ)
 	int err;
 
 	err = table_init_block(ti->table, &ti->block, off, typ);
+	if (err > 0)
+		return REFTABLE_FORMAT_ERROR;
 	if (err != 0)
 		return err;
 
diff --git a/t/unit-tests/u-reftable-table.c b/t/unit-tests/u-reftable-table.c
index 14fae8b199..c7dca45e70 100644
--- a/t/unit-tests/u-reftable-table.c
+++ b/t/unit-tests/u-reftable-table.c
@@ -1,8 +1,11 @@
 #include "unit-test.h"
 #include "lib-reftable.h"
+#include "reftable/basics.h"
+#include "reftable/block.h"
 #include "reftable/blocksource.h"
 #include "reftable/constants.h"
 #include "reftable/iter.h"
+#include "reftable/reftable-error.h"
 #include "reftable/table.h"
 #include "strbuf.h"
 
@@ -199,3 +202,63 @@ void test_reftable_table__block_iterator(void)
 	reftable_buf_release(&buf);
 	reftable_free(records);
 }
+
+void test_reftable_table__seek_invalid_log_offset(void)
+{
+	struct reftable_ref_record refs[] = {
+		{
+			.refname = (char *) "refs/heads/main",
+			.value_type = REFTABLE_REF_VAL1,
+			.value.val1 = { 42 },
+		},
+	};
+	struct reftable_log_record logs[] = {
+		{
+			.refname = (char *) "refs/heads/main",
+			.update_index = 1,
+			.value_type = REFTABLE_LOG_UPDATE,
+			.value.update = {
+				.name = (char *) "user",
+				.email = (char *) "user@example.com",
+				.message = (char *) "message\n",
+			},
+		},
+	};
+	struct reftable_block_source source = { 0 };
+	struct reftable_log_record log = { 0 };
+	struct reftable_iterator it = { 0 };
+	struct reftable_table *table;
+	struct reftable_buf buf = REFTABLE_BUF_INIT;
+	size_t fsize = footer_size(1);
+	uint8_t *footer;
+
+	cl_reftable_write_to_buf(&buf, refs, ARRAY_SIZE(refs),
+				 logs, ARRAY_SIZE(logs), NULL);
+
+	/*
+	 * Corrupt the log section offset stored in the footer so that it points
+	 * past the end of the table. The footer is checksummed, so we also have
+	 * to recompute and rewrite the CRC.
+	 */
+	footer = (uint8_t *) buf.buf + buf.len - fsize;
+	reftable_put_be64(footer + header_size(1) + 24, UINT64_MAX);
+	reftable_put_be32(footer + fsize - 4, crc32(0, footer, fsize - 4));
+
+	block_source_from_buf(&source, &buf);
+	cl_must_pass(reftable_table_new(&table, &source, "name"));
+
+	/*
+	 * Seeking the log iterator must not crash even though the log section
+	 * offset is bogus. As the offset points past the end of the table we
+	 * know that the table is corrupt, so the seek must report a format
+	 * error instead of pretending that the section is empty.
+	 */
+	reftable_table_init_log_iterator(table, &it);
+	cl_assert_equal_i(reftable_iterator_seek_log(&it, ""),
+			  REFTABLE_FORMAT_ERROR);
+
+	reftable_log_record_release(&log);
+	reftable_iterator_destroy(&it);
+	reftable_table_decref(table);
+	reftable_buf_release(&buf);
+}

-- 
2.55.0.rc2.803.g1fd1e6609c.dirty


^ permalink raw reply related

* [PATCH v2 10/12] reftable/block: fix OOB read with bogus restart offset
From: Patrick Steinhardt @ 2026-06-29  9:02 UTC (permalink / raw)
  To: git; +Cc: oxsignal, Christian Couder
In-Reply-To: <20260629-pks-reftable-hardening-v2-0-b0228e7d908d@pks.im>

Restart points encode records in a given block that do not use prefix
compression and that can thus immediately be seeked to. These offsets
are encoded in the restart table, where each offset needs to point at
one of the records of the block. We do not verify this though, so a
bogus restart offset may cause an out-of-bounds read:

  ==1472280==ERROR: AddressSanitizer: SEGV on unknown address 0x7d8ff7de5f7f (pc 0x55555599502b bp 0x7fffffff4df0 sp 0x7fffffff4d40 T0)
  ==1472280==The signal is caused by a READ memory access.
      #0 0x55555599502b in get_var_int ./git/build/../reftable/record.c:30:6
      #1 0x555555995c2a in reftable_decode_keylen ./git/build/../reftable/record.c:177:6
      #2 0x55555598e85c in restart_needle_less ./git/build/../reftable/block.c:455:6
      #3 0x55555598895f in binsearch ./git/build/../reftable/basics.c:175:9
      #4 0x55555598e189 in block_iter_seek_key ./git/build/../reftable/block.c:543:6
      #5 0x555555814aee in test_reftable_block__corrupt_restart_offset ./git/build/../t/unit-tests/u-reftable-block.c:636:20
      #6 0x5555557f684e in clar_run_test ./git/build/../t/unit-tests/clar/clar.c:335:3
      #7 0x5555557f2e69 in clar_run_suite ./git/build/../t/unit-tests/clar/clar.c:431:3
      #8 0x5555557f2882 in clar_test_run ./git/build/../t/unit-tests/clar/clar.c:636:4
      #9 0x5555557f375f in clar_test ./git/build/../t/unit-tests/clar/clar.c:687:11
      #10 0x5555557fa49d in cmd_main ./git/build/../t/unit-tests/unit-test.c:62:8
      #11 0x55555584c25a in main ./git/build/../common-main.c:9:11
      #12 0x7ffff7a2b284 in __libc_start_call_main (/nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libc.so.6+0x2b284) (BuildId: 8ae0b698f2d4e727f569f64bb166e08ae30bd077)
      #13 0x7ffff7a2b337 in __libc_start_main@GLIBC_2.2.5 (/nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libc.so.6+0x2b337) (BuildId: 8ae0b698f2d4e727f569f64bb166e08ae30bd077)
      #14 0x555555694c24 in _start (./git/build/t/unit-tests+0x140c24)

  ==1472280==Register values:
  rax = 0x00007d8ff7de5f7f  rbx = 0x00007fffffff4e00  rcx = 0x00007d8ff7de5f80  rdx = 0x00007bfff5b6af60
  rdi = 0x00007bfff5b6af40  rsi = 0x00007bfff592dfa0  rbp = 0x00007fffffff4df0  rsp = 0x00007fffffff4d40
   r8 = 0x00000000ff00002b   r9 = 0x00007d8ff7de5f7f  r10 = 0x00000f7ffeb25bf0  r11 = 0xf3f30000f1f1f1f1
  r12 = 0x00007fffffff58f8  r13 = 0x0000000000000001  r14 = 0x00007ffff7ffd000  r15 = 0x0000555556055fd0
  AddressSanitizer can not provide additional info.
  SUMMARY: AddressSanitizer: SEGV ./git/build/../reftable/record.c:30:6 in get_var_int

Guard against such restart offsets and signal an error to the caller via
`args.error`.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/block.c                |  9 +++++++++
 t/unit-tests/u-reftable-block.c | 39 +++++++++++++++++++++++++++++++++++++++
 2 files changed, 48 insertions(+)

diff --git a/reftable/block.c b/reftable/block.c
index 89efce8751..1fa81405d2 100644
--- a/reftable/block.c
+++ b/reftable/block.c
@@ -440,6 +440,15 @@ static int restart_needle_less(size_t idx, void *_args)
 	uint8_t extra;
 	int n;
 
+	/*
+	 * The restart offset must point to a record, which is stored before
+	 * the restart table. Verify that this is the case.
+	 */
+	if (off >= args->block->restart_off) {
+		args->error = 1;
+		return -1;
+	}
+
 	/*
 	 * Records at restart points are stored without prefix compression, so
 	 * there is no need to fully decode the record key here. This removes
diff --git a/t/unit-tests/u-reftable-block.c b/t/unit-tests/u-reftable-block.c
index a0fbfb247f..b8bb9a23e4 100644
--- a/t/unit-tests/u-reftable-block.c
+++ b/t/unit-tests/u-reftable-block.c
@@ -563,3 +563,42 @@ void test_reftable_block__corrupt_restart_count(void)
 	reftable_block_release(&block);
 	reftable_buf_release(&data);
 }
+
+void test_reftable_block__corrupt_restart_offset(void)
+{
+	struct reftable_block_source source = { 0 };
+	struct reftable_record rec = {
+		.type = REFTABLE_BLOCK_TYPE_REF,
+		.u.ref = {
+			.value_type = REFTABLE_REF_VAL1,
+			.refname = (char *) "refs/heads/main",
+		},
+	};
+	struct reftable_block block = { 0 };
+	struct block_iter it = BLOCK_ITER_INIT;
+	struct reftable_buf want = REFTABLE_BUF_INIT;
+	struct reftable_buf data = REFTABLE_BUF_INIT;
+
+	cl_reftable_write_block(&data, REFTABLE_BLOCK_TYPE_REF, &rec, 1);
+
+	block_source_from_buf(&source, &data);
+	cl_must_pass(reftable_block_init(&block, &source, 0, 0, data.len,
+					 REFTABLE_HASH_SIZE_SHA1, REFTABLE_BLOCK_TYPE_REF));
+
+	/*
+	 * Corrupt the first restart offset, stored as a big-endian 24-bit
+	 * integer at the start of the restart table, to point past the end of
+	 * the records section. Seeking such a block must fail gracefully.
+	 */
+	reftable_put_be24((uint8_t *) block.block_data.data + block.restart_off,
+			  0xffffff);
+
+	block_iter_init(&it, &block);
+	cl_must_pass(reftable_buf_addstr(&want, "refs/heads/main"));
+	cl_assert_equal_i(block_iter_seek_key(&it, &want), REFTABLE_FORMAT_ERROR);
+
+	reftable_buf_release(&want);
+	block_iter_close(&it);
+	reftable_block_release(&block);
+	reftable_buf_release(&data);
+}

-- 
2.55.0.rc2.803.g1fd1e6609c.dirty


^ permalink raw reply related

* [PATCH v2 09/12] reftable/block: fix use of uninitialized memory when binsearch fails
From: Patrick Steinhardt @ 2026-06-29  9:02 UTC (permalink / raw)
  To: git; +Cc: oxsignal, Christian Couder
In-Reply-To: <20260629-pks-reftable-hardening-v2-0-b0228e7d908d@pks.im>

When doing the binary search through our restart offsets we may hit an
error in case `restart_needle_less()` fails to decode the record at the
given offset. While we correctly detect this case and error out, it will
cause us to call `reftable_record_release()` on the yet-uninitialized
record.

Fix this by initializing the record earlier.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/block.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/reftable/block.c b/reftable/block.c
index 4d285aefd7..89efce8751 100644
--- a/reftable/block.c
+++ b/reftable/block.c
@@ -517,6 +517,10 @@ int block_iter_seek_key(struct block_iter *it, struct reftable_buf *want)
 	int err = 0;
 	size_t i;
 
+	err = reftable_record_init(&rec, reftable_block_type(it->block));
+	if (err < 0)
+		goto done;
+
 	/*
 	 * Perform a binary search over the block's restart points, which
 	 * avoids doing a linear scan over the whole block. Like this, we
@@ -558,10 +562,6 @@ int block_iter_seek_key(struct block_iter *it, struct reftable_buf *want)
 	else
 		it->next_off = it->block->header_off + 4;
 
-	err = reftable_record_init(&rec, reftable_block_type(it->block));
-	if (err < 0)
-		goto done;
-
 	/*
 	 * We're looking for the last entry less than the wanted key so that
 	 * the next call to `block_reader_next()` would yield the wanted

-- 
2.55.0.rc2.803.g1fd1e6609c.dirty


^ permalink raw reply related

* [PATCH v2 08/12] reftable/block: fix OOB read with bogus restart count
From: Patrick Steinhardt @ 2026-06-29  9:02 UTC (permalink / raw)
  To: git; +Cc: oxsignal, Christian Couder
In-Reply-To: <20260629-pks-reftable-hardening-v2-0-b0228e7d908d@pks.im>

The restart count is stored in the last two bytes of a block. We use it
without verification to compute the offset of the restart table. With a
bogus restart count that is large enough this computation underflows,
and the subsequent reads via the restart table access out-of-bounds
memory:

  ==129439==ERROR: AddressSanitizer: SEGV on unknown address 0x7d90f6dcd0ad (pc 0x55555598ce89 bp 0x7fffffff4ed0 sp 0x7fffffff4e80 T0)
  ==129439==The signal is caused by a READ memory access.
      #0 0x55555598ce89 in reftable_get_be24 ./git/build/../reftable/basics.h:125:9
      #1 0x55555598eabf in block_restart_offset ./git/build/../reftable/block.c:407:9
      #2 0x55555598e5d5 in restart_needle_less ./git/build/../reftable/block.c:431:17
      #3 0x5555559887e2 in binsearch ./git/build/../reftable/basics.c:165:13
      #4 0x55555598dfec in block_iter_seek_key ./git/build/../reftable/block.c:529:6
      #5 0x555555814517 in test_reftable_block__corrupt_restart_count ./git/build/../t/unit-tests/u-reftable-block.c:593:15
      #6 0x5555557f684e in clar_run_test ./git/build/../t/unit-tests/clar/clar.c:335:3
      #7 0x5555557f2e69 in clar_run_suite ./git/build/../t/unit-tests/clar/clar.c:431:3
      #8 0x5555557f2882 in clar_test_run ./git/build/../t/unit-tests/clar/clar.c:636:4
      #9 0x5555557f375f in clar_test ./git/build/../t/unit-tests/clar/clar.c:687:11
      #10 0x5555557fa49d in cmd_main ./git/build/../t/unit-tests/unit-test.c:62:8
      #11 0x55555584c12a in main ./git/build/../common-main.c:9:11
      #12 0x7ffff7a2b284 in __libc_start_call_main (/nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libc.so.6+0x2b284) (BuildId: 8ae0b698f2d4e727f569f64bb166e08ae30bd077)
      #13 0x7ffff7a2b337 in __libc_start_main@GLIBC_2.2.5 (/nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libc.so.6+0x2b337) (BuildId: 8ae0b698f2d4e727f569f64bb166e08ae30bd077)
      #14 0x555555694c24 in _start (./git/build/t/unit-tests+0x140c24)

  ==129439==Register values:
  rax = 0x00007d90f6dcd0ad  rbx = 0x00007fffffff4f20  rcx = 0xf2f2f2f8f2f2f2f8  rdx = 0x0000000000000000
  rdi = 0x00007d90f6dcd0ad  rsi = 0x0000000000007fff  rbp = 0x00007fffffff4ed0  rsp = 0x00007fffffff4e80
   r8 = 0x0000000000000000   r9 = 0x0000000000000000  r10 = 0x0000000000000000  r11 = 0x0000000000000017
  r12 = 0x00007fffffff58e8  r13 = 0x0000000000000001  r14 = 0x00007ffff7ffd000  r15 = 0x00005555560550b0
  AddressSanitizer can not provide additional info.
  SUMMARY: AddressSanitizer: SEGV ./git/build/../reftable/basics.h:125:9 in reftable_get_be24

Verify that the restart table actually fits into the block.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/block.c                |  4 ++++
 t/unit-tests/u-reftable-block.c | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 37 insertions(+)

diff --git a/reftable/block.c b/reftable/block.c
index 4d6b11c2e7..4d285aefd7 100644
--- a/reftable/block.c
+++ b/reftable/block.c
@@ -351,6 +351,10 @@ int reftable_block_init(struct reftable_block *block,
 
 	restart_count = reftable_get_be16(block->block_data.data + block_size - 2);
 	restart_off = block_size - 2 - 3 * restart_count;
+	if (restart_off < header_size + 4 || restart_off > block_size - 2) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
 
 	block->block_type = block_type;
 	block->hash_size = hash_size;
diff --git a/t/unit-tests/u-reftable-block.c b/t/unit-tests/u-reftable-block.c
index 43b9d5fb59..a0fbfb247f 100644
--- a/t/unit-tests/u-reftable-block.c
+++ b/t/unit-tests/u-reftable-block.c
@@ -530,3 +530,36 @@ void test_reftable_block__corrupt_block_size(void)
 	reftable_block_release(&block);
 	reftable_buf_release(&data);
 }
+
+void test_reftable_block__corrupt_restart_count(void)
+{
+	struct reftable_block_source source = { 0 };
+	struct reftable_record rec = {
+		.type = REFTABLE_BLOCK_TYPE_REF,
+		.u.ref = {
+			.value_type = REFTABLE_REF_VAL1,
+			.refname = (char *) "refs/heads/main",
+		},
+	};
+	struct reftable_block block = { 0 };
+	struct reftable_buf data = REFTABLE_BUF_INIT;
+	int block_size;
+
+	block_size = cl_reftable_write_block(&data, REFTABLE_BLOCK_TYPE_REF, &rec, 1);
+
+	/*
+	 * Corrupt the restart count to claim a bogus number of restart points.
+	 * Note that this would only cause us to perform an out-of-bounds
+	 * access when seeking into the block, but we want to refuse such a
+	 * block outright.
+	 */
+	reftable_put_be16((uint8_t *) data.buf + block_size - 2, 0xffff);
+
+	block_source_from_buf(&source, &data);
+	cl_assert_equal_i(reftable_block_init(&block, &source, 0, 0, data.len,
+					      REFTABLE_HASH_SIZE_SHA1, REFTABLE_BLOCK_TYPE_REF),
+			  REFTABLE_FORMAT_ERROR);
+
+	reftable_block_release(&block);
+	reftable_buf_release(&data);
+}

-- 
2.55.0.rc2.803.g1fd1e6609c.dirty


^ permalink raw reply related

* [PATCH v2 07/12] reftable/block: fix OOB read with bogus block size
From: Patrick Steinhardt @ 2026-06-29  9:02 UTC (permalink / raw)
  To: git; +Cc: oxsignal, Christian Couder
In-Reply-To: <20260629-pks-reftable-hardening-v2-0-b0228e7d908d@pks.im>

The block size is read from the block header, which is untrusted data.
We use it without verification to access the restart count at the end of
the block as well as to compute the restart table offset. With a bogus
block size that exceeds the data we have actually read this can lead to
an out-of-bounds read:

  ==1458284==ERROR: AddressSanitizer: SEGV on unknown address 0x7d8ff7de4b7d (pc 0x55555598c339 bp 0x7fffffff4ef0 sp 0x7fffffff4eb0 T0)
  ==1458284==The signal is caused by a READ memory access.
      #0 0x55555598c339 in reftable_get_be16 ./build/../reftable/basics.h:118:9
      #1 0x55555598bee2 in reftable_block_init ./build/../reftable/block.c:344:18
      #2 0x555555813e0e in test_reftable_block__corrupt_block_size ./build/../t/unit-tests/u-reftable-block.c:540:8
      #3 0x5555557f684e in clar_run_test ./build/../t/unit-tests/clar/clar.c:335:3
      #4 0x5555557f2e69 in clar_run_suite ./build/../t/unit-tests/clar/clar.c:431:3
      #5 0x5555557f2882 in clar_test_run ./build/../t/unit-tests/clar/clar.c:636:4
      #6 0x5555557f375f in clar_test ./build/../t/unit-tests/clar/clar.c:687:11
      #7 0x5555557fa49d in cmd_main ./build/../t/unit-tests/unit-test.c:62:8
      #8 0x55555584b55a in main ./build/../common-main.c:9:11
      #9 0x7ffff7a2b284 in __libc_start_call_main (/nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libc.so.6+0x2b284) (BuildId: 8ae0b698f2d4e727f569f64bb166e08ae30bd077)
      #10 0x7ffff7a2b337 in __libc_start_main@GLIBC_2.2.5 (/nix/store/57iz36553175g3178pvxjij8z5rcsd4n-glibc-2.42-61/lib/libc.so.6+0x2b337) (BuildId: 8ae0b698f2d4e727f569f64bb166e08ae30bd077)
      #11 0x555555694c24 in _start (./build/t/unit-tests+0x140c24)

  ==1458284==Register values:
  rax = 0x00007d8ff7de4b7d  rbx = 0x00007fffffff4f00  rcx = 0x0000000000000006  rdx = 0x0000000000000010
  rdi = 0x00007d8ff7de4b7d  rsi = 0x00007bfff5cf0420  rbp = 0x00007fffffff4ef0  rsp = 0x00007fffffff4eb0
   r8 = 0x00000f807eb960b8   r9 = 0x0000000000000001  r10 = 0x00007bfff5cf05e7  r11 = 0x000000000000000f
  r12 = 0x00007fffffff58f8  r13 = 0x0000000000000001  r14 = 0x0000555555ee8160  r15 = 0x0000000000000000
  AddressSanitizer can not provide additional info.

Verify that the claimed block size fits into the block data before using
it.

Signed-off-by: Patrick Steinhardt <ps@pks.im>
---
 reftable/block.c                |  9 +++++++++
 t/unit-tests/u-reftable-block.c | 33 +++++++++++++++++++++++++++++++++
 2 files changed, 42 insertions(+)

diff --git a/reftable/block.c b/reftable/block.c
index b86cb9ec5a..4d6b11c2e7 100644
--- a/reftable/block.c
+++ b/reftable/block.c
@@ -340,6 +340,15 @@ int reftable_block_init(struct reftable_block *block,
 		full_block_size = block_size;
 	}
 
+	/*
+	 * Ensure that we have sufficient data available now to satisfy the
+	 * claimed block size.
+	 */
+	if (block_size > block->block_data.len) {
+		err = REFTABLE_FORMAT_ERROR;
+		goto done;
+	}
+
 	restart_count = reftable_get_be16(block->block_data.data + block_size - 2);
 	restart_off = block_size - 2 - 3 * restart_count;
 
diff --git a/t/unit-tests/u-reftable-block.c b/t/unit-tests/u-reftable-block.c
index 088162483e..43b9d5fb59 100644
--- a/t/unit-tests/u-reftable-block.c
+++ b/t/unit-tests/u-reftable-block.c
@@ -497,3 +497,36 @@ void test_reftable_block__corrupt_log_block_size(void)
 	reftable_block_release(&block);
 	reftable_buf_release(&data);
 }
+
+void test_reftable_block__corrupt_block_size(void)
+{
+	struct reftable_block_source source = { 0 };
+	struct reftable_record rec = {
+		.type = REFTABLE_BLOCK_TYPE_REF,
+		.u.ref = {
+			.value_type = REFTABLE_REF_VAL1,
+			.refname = (char *) "refs/heads/main",
+		},
+	};
+	struct reftable_block block = { 0 };
+	struct reftable_buf data = REFTABLE_BUF_INIT;
+
+	cl_reftable_write_block(&data, REFTABLE_BLOCK_TYPE_REF, &rec, 1);
+
+	/*
+	 * The block size is stored as a big-endian 24-bit integer right after
+	 * the one-byte block type at the start of the block. Corrupt it to
+	 * claim a size that is larger than the data we actually have. Reading
+	 * the restart count and restart table relative to such a bogus block
+	 * size must not access out-of-bounds memory.
+	 */
+	reftable_put_be24((uint8_t *) data.buf + 1, 0xffffff);
+
+	block_source_from_buf(&source, &data);
+	cl_assert_equal_i(reftable_block_init(&block, &source, 0, 0, data.len,
+					      REFTABLE_HASH_SIZE_SHA1, REFTABLE_BLOCK_TYPE_REF),
+			  REFTABLE_FORMAT_ERROR);
+
+	reftable_block_release(&block);
+	reftable_buf_release(&data);
+}

-- 
2.55.0.rc2.803.g1fd1e6609c.dirty


^ permalink raw reply related


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox