Git development

Git development
 help / color / mirror / Atom feed

* Re: [PATCH] builtin/history: unuse the commit buffer after use
From: Jeff King @ 2026-06-30  6:44 UTC (permalink / raw)
  To: Kaartic Sivaraam; +Cc: Patrick Steinhardt, Git mailing list
In-Reply-To: <20260630055026.GE2495216@coredump.intra.peff.net>

On Tue, Jun 30, 2026 at 01:50:26AM -0400, Jeff King wrote:

> t4014 also fails, but with a unique leak. I don't think it's the same
> thing; it looks like we may leak the slab from init_topo_walk().

Fixed over in
https://lore.kernel.org/git/20260630063944.GA3733670@coredump.intra.peff.net/.

I think it can be handled independently of your fix here. You might also
run afoul of the prove TAP-output thing fixed in that thread.

-Peff

^ permalink raw reply

* Re: [PATCH v5 0/4] history: add squash subcommand to fold a range
From: Harald Nordgren @ 2026-06-30  7:19 UTC (permalink / raw)
  To: Matt Hunter
  Cc: phillip.wood, Patrick Steinhardt,
	Harald Nordgren via GitGitGadget, git
In-Reply-To: <DJM1N17VMUM5.3V5Y6YMFLIFQJ@lfurio.us>

> This is probably a larger question, since (according to the man page) it
> affects the other 'git history' commands as well.  When I run
> 'git history ...' and discover that I made a mistake after inspecting
> the results, is there a fool-proof way to undo the change and return to
> the previous state?  My first thought was to run 'git reset --hard ...',
> but the default behavior of --update-refs (moving other branches) can
> make this more complicated.

This is a larger question: But I would love to have a reflog that is
more human-centered. When e.g. rebasing a series with N commits, it's
very tricky in the reflog to find what was the state before that.

I feel like branch switching is given too much space in the reflog,
since it's not a destructive action, I don't care about it.

And when handling multiple commits in on go (squashing, rebasing), I
would love to see a visual hierarchy (with indentation for sub-steps)
instead of treating each action as equally important when it isn't.

Harald

^ permalink raw reply

* Re: [PATCH 1/2] odb/source: generalize `reprepare()` callback
From: Toon Claes @ 2026-06-30  8:18 UTC (permalink / raw)
  To: Patrick Steinhardt; +Cc: git
In-Reply-To: <akINy-hP5EPD4Y4e@pks.im>

Patrick Steinhardt <ps@pks.im> writes:

> On Fri, Jun 26, 2026 at 02:10:32PM +0200, Toon Claes wrote:
>> > diff --git a/builtin/grep.c b/builtin/grep.c
>> > index 8080d1bf5e..7361bf071e 100644
>> > --- a/builtin/grep.c
>> > +++ b/builtin/grep.c
>> > @@ -1361,10 +1360,8 @@ int cmd_grep(int argc,
>> >  			struct odb_source *source;
>> >  
>> >  			odb_prepare_alternates(the_repository->objects);
>> > -			for (source = the_repository->objects->sources; source; source = source->next) {
>> > -				struct odb_source_files *files = odb_source_files_downcast(source);
>> > -				odb_source_packed_prepare(files->packed);
>> > +			for (source = the_repository->objects->sources; source; source = source->next)
>> > +				odb_source_prepare(source, 0);
>> 
>> So you're downcasting inside the implementation by the backends itself.
>> That makes sense, but would it be worth to say something about that in
>> the commit message?
>
> Hm. Would that provide much value? I'm probably quite a bit biased here,
> but I think that it's implicit that the backends have to eventually cast
> the generic structure to their own backend.
>
> So I wouldn't really know how to clarify this. Did you have anything
> specific in mind?

Ah, I'm sorry, I misread that. I thought you changed the vtable function
to do the downcasting, but you're simply changing from calling a 
`*_packed_*()` to the generic variant that goes through the vtable.

Anyhow, not worth mentioning in the commit message.

>> > diff --git a/odb/source-packed.c b/odb/source-packed.c
>> > index 42c28fba0e..fa5a072488 100644
>> > --- a/odb/source-packed.c
>> > +++ b/odb/source-packed.c
>> > @@ -15,7 +15,7 @@ static int find_pack_entry(struct odb_source_packed *store,
>> >  {
>> >  	struct packfile_list_entry *l;
>> >  
>> > -	odb_source_packed_prepare(store);
>> > +	odb_source_prepare(&store->base, 0);
>> 
>> Why are you not using ODB_PREPARE_FLUSH_CACHES here? It used to do
>> before?
>
> Because this was calling `odb_source_packed_prepare()` before, not
> `odb_source_reprepare()`. So this was calling the non-flushing
> variant.

Again, confusion on my end.

>> >  	if (store->midx && fill_midx_entry(store->midx, oid, e))
>> >  		return 1;
>> >  
>> > @@ -47,7 +47,7 @@ static int odb_source_packed_read_object_info(struct odb_source *source,
>> >  	 * been added since the last time we have prepared the packfile store.
>> >  	 */
>> >  	if (flags & OBJECT_INFO_SECOND_READ)
>> > -		odb_source_reprepare(source);
>> > +		odb_source_prepare(source, ODB_PREPARE_FLUSH_CACHES);
>> 
>> I think the new code is correct, but why wasn't `packed` used here in
>> the past? The old odb_source_reprepare() expected a downcasted, didn't
>> it?
>
> No, `odb_source_reprepare()` is the generic variant. The naming schema
> is typically:
>
>   - `odb_source_frobnicate()` for the generic variants, which receive a
>     `struct odb_source` as input.
>
>   - `odb_source_<type>_frobnitcate()` for their backend-specific
>     implementations, which cast down the generic `struct odb_source` to
>     their backend-specific struct.

Yeah, I understand things better now. Thanks for clarifying.

-- 
Cheers,
Toon

^ permalink raw reply

* Re: What's cooking in git.git (Jun 2026, #10)
From: Toon Claes @ 2026-06-30  8:20 UTC (permalink / raw)
  To: Junio C Hamano, git
In-Reply-To: <xmqq5x36dtyf.fsf@gitster.g>

Junio C Hamano <gitster@pobox.com> writes:

> * ps/odb-generalize-prepare (2026-06-22) 3 commits
>  - odb: introduce `odb_prepare()`
>  - odb/source: generalize `reprepare()` callback
>  - Merge branch 'ps/odb-source-packed' into ps/odb-generalize-prepare
>  (this branch uses ps/odb-source-packed.)
>
>  The `reprepare()` callback for object database sources has been
>  generalized into a `prepare()` callback with an optional flush cache
>  flag, and a new `odb_prepare()` wrapper has been introduced to
>  allow pre-opening object database sources.
>
>  Needs review.
>  source: <20260622-b4-pks-odb-generalize-prepare-v1-0-d2a5c5d13144@pks.im>

I did have some questions/remarks, but Patrick answered them, and with
those answers I'm happy about this series.

-- 
Cheers,
Toon

^ permalink raw reply

* Re: [PATCH 2/2] odb: introduce `odb_prepare()`
From: Patrick Steinhardt @ 2026-06-30  8:31 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Toon Claes, git
In-Reply-To: <xmqqa4sdt3e6.fsf@gitster.g>

On Mon, Jun 29, 2026 at 02:58:41PM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> According to my grep results are there 17 callsites for odb_reprepare(),
> >> then I agree it makes sense to create this wrapper.
> >
> > Yeah, I was a bit torn myself whether or not to keep the wrapper. I
> > eventually decided to just keep it because it reduces churn, and it's a
> > trivial wrapper anyway.
> 
> That sounds OK.  Are we all happy with the current shape of the
> topic?  I myself did not find anything iffy in these two patches.

Based on Toon's reply [1] it seems like this series is ready to go.
Thanks!

Patrick

[1]: <87ik704f1j.fsf@emacs.iotcl.com>

^ permalink raw reply

* Re: [PATCH 6/6] builtin/receive-pack: stage incoming objects via ODB transactions
From: Patrick Steinhardt @ 2026-06-30  8:45 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git
In-Reply-To: <akLLB_J-pvJ7iR7c@denethor>

On Mon, Jun 29, 2026 at 03:25:18PM -0500, Justin Tobler wrote:
> On 26/06/24 01:26PM, Patrick Steinhardt wrote:
> > On Tue, Jun 23, 2026 at 11:19:20PM -0500, Justin Tobler wrote:
> > > Objects received by git-receive-pack(1) are quarantined in a temporary
> > > "incoming" directory and migrated into the object database prior to the
> > > reference updates. The quarantine is currently managed through
> > > `tmp_objdir` directly. In a pluggable ODB future, how exactly an object
> > > gets written to a transaction may vary for a given ODB source. Refactor
> > > git-receive-pack(1) to use the ODB transaction interfaces to manage the
> > > object staging area in a more agnostic manner accordingly.
> > > 
> > > Note that the temporary directory created for git-receive-pack(1) is
> > > eagerly created and uses a different prefix name. This behavior is
> > 
> > A different prefix name compared to what?
> 
> Currently the temporary directories created for ODB transactions all use
> the prefix "bulk-fsync". The temp dir created by git-receive-pack(1) is
> expected to have the prefix "incoming".
> 
> > > special cased in the "files" backend by having `odb_transaction_begin()`
> > > callers that require this behavior provide an `ODB_TRANSACTION_RECEIVE`
> > > flag.
> > 
> > Okay. I guess this is to retain existing behaviour where the temporary
> > directory is created lazily everywhere else. Makes me wonder whether we
> > should eventually change this to just unconditionally create the
> > directory in all cases so that we can drop this new flag.
> 
> It would be nice to not have to have a flag here, but if we want to also
> keep the existing temp dir prefixes, we would also need to keep the
> flags.

Fair. Makes me wonder whether we really need to keep the exact same
naming for this temporary directory. This is so deep into internals that
I'm not sure whether we really need to treat this as part of our
interface. I'm rather inclined to say it's not necessary.

In any case, I think it's fine to defer that discussion and keep this
as-is for now. But we might keep it in the back of our minds and maybe
simplify this in a subsequent patch series.

> > > diff --git a/builtin/receive-pack.c b/builtin/receive-pack.c
> > > index 19eb6a1b61..ee8e03e2ab 100644
> > > --- a/builtin/receive-pack.c
> > > +++ b/builtin/receive-pack.c
> > > @@ -2326,7 +2323,8 @@ static void push_header_arg(struct strvec *args, struct pack_header *hdr)
> > >  		     ntohl(hdr->hdr_version), ntohl(hdr->hdr_entries));
> > >  }
> > >  
> > > -static const char *unpack(int err_fd, struct shallow_info *si)
> > > +static const char *unpack(int err_fd, struct shallow_info *si,
> > > +			  struct odb_transaction *transaction)
> > >  {
> > >  	struct pack_header hdr;
> > >  	const char *hdr_err;
> > 
> > It feels a bit weird that we sometimes pass the transaction as
> > parameter, whereas othertimes we access it via `the_repository`.
> 
> That's fair. I was trying to avoid the churn of wiring to all its
> callsites, but it's probably best to be consistent. Maybe it would be
> fine to just create a transaction global like we do for the reference
> transaction?

My first kneejerk reaction was "no", but then I noticed that the global
variable you're talking about is local to "builtin/receive-pack.c". So
that might be an okayish solution.

Thanks!

Patrick

^ permalink raw reply

* Re: [PATCH 2/6] object-file: propagate files transaction errors
From: Patrick Steinhardt @ 2026-06-30  8:45 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git
In-Reply-To: <akK1roQJknYstX0u@denethor>

On Mon, Jun 29, 2026 at 01:58:54PM -0500, Justin Tobler wrote:
> On 26/06/24 01:26PM, Patrick Steinhardt wrote:
> > On Tue, Jun 23, 2026 at 11:19:16PM -0500, Justin Tobler wrote:
> > > diff --git a/object-file.c b/object-file.c
> > > index a3eb8d71dd..18c2df75fb 100644
> > > --- a/object-file.c
> > > +++ b/object-file.c
> > > @@ -499,7 +499,7 @@ struct odb_transaction_files {
> > >  	struct transaction_packfile packfile;
> > >  };
> > >  
> > > -static void odb_transaction_files_prepare(struct odb_transaction *base)
> > > +static int odb_transaction_files_prepare(struct odb_transaction *base)
> > >  {
> > >  	struct odb_transaction_files *transaction =
> > >  		container_of_or_null(base, struct odb_transaction_files, base);
> > 
> > By the way, is there any reason why those functions are still hosted in
> > "object-file.c" instead of in "odb/source-files.c"? I should probably
> > know, but I forgot.
> 
> There are currently a couple spots in the "files" object write path in
> "object-file.c" that still reach into some of these transaction function
> that are not part of the generic ODB transaction interface. I'm hoping
> in a future followup series to detangle this a bit and be able to get
> all the "files" ODB transaction related code moved into
> "odb/source-files.c".

Makes sense. I also revisited that code a couple days ago, and the
answer is "it's messy right now". Hopefully this will become easier to
detangle as we make progress on pluggifying transactions.

> > > @@ -511,11 +511,15 @@ static void odb_transaction_files_prepare(struct odb_transaction *base)
> > >  	 * added at the time they call odb_transaction_files_begin.
> > >  	 */
> > >  	if (!transaction || transaction->objdir)
> > > -		return;
> > > +		return 0;
> > >  
> > >  	transaction->objdir = tmp_objdir_create(base->source->odb->repo, "bulk-fsync");
> > > -	if (transaction->objdir)
> > > -		tmp_objdir_replace_primary_odb(transaction->objdir, 0);
> > > +	if (!transaction->objdir)
> > > +		return -1;
> > 
> > Huh. So previously we just didn't handle this error at all and just
> > continued to tag along? Did that result in anything sensible or was this
> > just YOLOing it?
> 
> Good question. Previously if there was an error, we wouldn't end up
> creating any tmpdir and would instead continue to use the primary ODB to
> write objects in. This change would make it a hard error if we fail to
> create the temp dir. This matches the behavior that git-receive-pack(1)
> expects, but I didn't consider that the existing callers could
> transparently handle there being no temp dir.
> 
> I suspect we may want existing ODB transaction users to continue being
> resilient in the same manner. In the next version, I'll maintain the
> same behavior.

Honestly I'd say that the change is a good one. I cannot think of a
single reason to just blindly not create the transaction and proceed.
But it certainly is something that should be documented as part of the
commit message.

> > > @@ -1670,27 +1678,34 @@ int read_loose_object(struct repository *repo,
> > >  	return ret;
> > >  }
> > >  
> > > -static void odb_transaction_files_commit(struct odb_transaction *base)
> > > +static int odb_transaction_files_commit(struct odb_transaction *base)
> > >  {
> > >  	struct odb_transaction_files *transaction =
> > >  		container_of(base, struct odb_transaction_files, base);
> > >  
> > > -	flush_loose_object_transaction(transaction);
> > > +	if (flush_loose_object_transaction(transaction))
> > > +		return -1;
> > >  	flush_packfile_transaction(transaction);
> > > +
> > > +	return 0;
> > >  }
> > >  
> > > -struct odb_transaction *odb_transaction_files_begin(struct odb_source *source)
> > > +int odb_transaction_files_begin(struct odb_source *source,
> > > +				struct odb_transaction **out)
> > >  {
> > >  	struct odb_transaction_files *transaction;
> > >  	struct object_database *odb = source->odb;
> > >  
> > > -	if (odb->transaction)
> > > -		return NULL;
> > > +	if (odb->transaction) {
> > > +		*out = NULL;
> > > +		return 0;
> > > +	}
> > >  
> > >  	transaction = xcalloc(1, sizeof(*transaction));
> > >  	transaction->base.source = source;
> > >  	transaction->base.commit = odb_transaction_files_commit;
> > >  	transaction->base.write_object_stream = odb_transaction_files_write_object_stream;
> > > +	*out = &transaction->base;
> > >  
> > > -	return &transaction->base;
> > > +	return 0;
> > >  }
> > 
> > It's still somewhat fishy that we have this ODB-level transaction, but
> > that's a preexisting issue and thus outside the scope of this patch
> > series. Ideally though, it would be possible for there to be multiple
> > transactions, and it would be the caller's responsibility for juggling
> > these transactions. Just as it happens with reference transactions.
> 
> I completely agree. One of the current problems preventing this is that
> only a single instance of tmp_objdir is allowed and stored globally.
> This is done to keep atexit() cleanup simple.
> 
> My plan is to eventually convert all existing tmp_objdir callsites to
> use ODB transactions which should hopefully allow us to remove the need
> for a separate tmp_objdir system. At that point, we can also work to
> change how temp dir cleanup is handled at exit.

Great.

> Another problem is that there are also a couple of ODB transaction
> callsites that require to know if there is already a pending transaction
> for the repository and the transaction has not been wired down to these
> callsites. My hope is that this can be addressed though as we expand ODB
> transaction usage for object writes.

Yeah, agreed. Making the use of transactions explicit feels sensible to
me.

Patrick

^ permalink raw reply

* Re: [PATCH 2/6] object-file: propagate files transaction errors
From: Patrick Steinhardt @ 2026-06-30  8:45 UTC (permalink / raw)
  To: Justin Tobler; +Cc: git
In-Reply-To: <akLBFaTfBEq8vHUr@denethor>

On Mon, Jun 29, 2026 at 02:04:08PM -0500, Justin Tobler wrote:
> On 26/06/29 01:58PM, Justin Tobler wrote:
> > On 26/06/24 01:26PM, Patrick Steinhardt wrote:
> > > On Tue, Jun 23, 2026 at 11:19:16PM -0500, Justin Tobler wrote:
> > > > @@ -511,11 +511,15 @@ static void odb_transaction_files_prepare(struct odb_transaction *base)
> > > >  	 * added at the time they call odb_transaction_files_begin.
> > > >  	 */
> > > >  	if (!transaction || transaction->objdir)
> > > > -		return;
> > > > +		return 0;
> > > >  
> > > >  	transaction->objdir = tmp_objdir_create(base->source->odb->repo, "bulk-fsync");
> > > > -	if (transaction->objdir)
> > > > -		tmp_objdir_replace_primary_odb(transaction->objdir, 0);
> > > > +	if (!transaction->objdir)
> > > > +		return -1;
> > > 
> > > Huh. So previously we just didn't handle this error at all and just
> > > continued to tag along? Did that result in anything sensible or was this
> > > just YOLOing it?
> > 
> > Good question. Previously if there was an error, we wouldn't end up
> > creating any tmpdir and would instead continue to use the primary ODB to
> > write objects in. This change would make it a hard error if we fail to
> > create the temp dir. This matches the behavior that git-receive-pack(1)
> > expects, but I didn't consider that the existing callers could
> > transparently handle there being no temp dir.
> > 
> > I suspect we may want existing ODB transaction users to continue being
> > resilient in the same manner. In the next version, I'll maintain the
> > same behavior.
> 
> I think I got a bit ahead of myself. The existing callers of
> odb_transaction_files_prepare() still continue to ignore this error. So
> the behavior already does remain the same here.

Oh, well, okay. I think this behaviour is plain bad -- if the caller
wants to have a transaction, then we should bail in case we cannot
create one. But this doesn't need to be fixed in this patch series.

Patrick

^ permalink raw reply

* Re: [PATCH 0/3] fixing expensive http test timeouts
From: Patrick Steinhardt @ 2026-06-30  9:05 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Jeff King, Michael Montalbo, git
In-Reply-To: <xmqqik71xqtc.fsf@gitster.g>

On Mon, Jun 29, 2026 at 09:19:11AM -0700, Junio C Hamano wrote:
> Patrick Steinhardt <ps@pks.im> writes:
> 
> >> pushes only to "cast in stone" branches.  If there are other
> >> branches that deserve to be tested with TEST_LONG upon other events
> >> that the existing GitHub Actions CI does not trigger, it may be good
> >> to have GitLab CI cover them, perhaps?
> >
> > I'm a bit hesitant to do such a split, mostly because the canonical
> > source of truth that the project typically uses is GitHub's CI. So I
> > want us at GitLab to be able to catch the same issues that GitHub would
> > flag. And if GitLab's CI stopped detecting everything that GitHub does,
> > then the result would likely be that we often create merge requests on
> > both platforms, which would only result in more wasted resources.
> 
> I didn't suggest splitting them into two circles that overlap but
> each with area only it covers, though.  GitLab's coverage can be
> superset to GitHub's and that would satify what I suggested.

Fair.

> FWIW, I do not consider GitHub's CI "the canonical source" at all.
> It is a very handy service to use to check how well we are doing,
> but from time to time it has its own hiccups ;-).

Well, GitLab of course has its own share of hiccups, like for example
the Chocolatey issues we've been facing.

> What can we do to make the visibility of GitLab's CI more prominent?
> 
> I know where the CI jobs that are triggered when I push out the
> integration branches are found at GitHub's website[*], but I do not
> think I know the corresponding one at GitLab, for example, and I
> think that is a shame.

The pipelines of the official mirror can be found at [1]. We might for
example add something like the below patch to our README.md to make it
more discoverable.

Patrick

[1]: https://gitlab.com/git-scm/git/-/pipelines

diff --git a/README.md b/README.md
index d87bca1b8c..9ad77fdf7e 100644
--- a/README.md
+++ b/README.md
@@ -1,4 +1,5 @@
-[![Build status](https://github.com/git/git/workflows/CI/badge.svg)](https://github.com/git/git/actions?query=branch%3Amaster+event%3Apush)
+[![GitHub build status](https://github.com/git/git/workflows/CI/badge.svg)](https://github.com/git/git/actions?query=branch%3Amaster+event%3Apush)
+[![GitLab build status](https://gitlab.com/git-scm/git/badges/master/pipeline.svg)](https://gitlab.com/git-scm/git/-/pipelines?ref=master)
 
 Git - fast, scalable, distributed revision control system
 =========================================================

^ permalink raw reply related

* Re: [PATCH 3/3] t5551: pack refs after creating many tags
From: Patrick Steinhardt @ 2026-06-30  9:05 UTC (permalink / raw)
  To: Jeff King; +Cc: Michael Montalbo, git, Junio C Hamano
In-Reply-To: <20260629203527.GA1895313@coredump.intra.peff.net>

On Mon, Jun 29, 2026 at 04:35:27PM -0400, Jeff King wrote:
> There's one other thing you might find interesting. While poking at the
> timings here the other day, I noticed that reftable is very eager to
> stat the tables.list file. Try this:
> 
>   git init --ref-format=reftable
>   blob=$(echo foo | git hash-object -w --stdin)
>   seq -f "create refs/tags/foo-%g $blob" 2000 |
>   strace -c git update-ref --stdin
> 
> We make 2000 fstat, which strace claims takes 85% of the time. I suspect
> this is over-emphasized because strace inherently makes syscalls slow,
> but running with perf also highlights it as a non-trivial cost.

Yeah, this rings a bell. If I remember correctly, this is because we
call `refs_resolve_ref_unsafe()` to verify whether the target already
exists. And as that function is generic, it wasn't easy to optimize it
by reusing the already-loaded reftable stack.

> It has been a long time since I've thought about reftable internals, but
> it feels like we ought to be able to take the lock and then trust that
> the stack has not been manipulated.

Yeah, that would certainly be an option to explore.

> It may not be worth digging into too much, though. I can make 50,000
> refs in 150ms on my system, which is probably good enough (especially
> compared to the files backend).

True, it's going to be much better compared to the "files" backend. But
that isn't enough reason to not optimize it even further -- doubly so if
it actually takes 85% of the time. Sounds like a low-hanging fruit to me
that can result in a significant speedup.

I'll probably not get to it anytime soon, but I'll create an issue to
keep track of it.

Patrick

^ permalink raw reply

page:              | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox