Re: [PATCH] lockfile: add PID file for debugging stale locks

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Taylor Blau <me@ttaylorr.com>
To: Jeff King <peff@peff.net>
Cc: Paulo Casaretto via GitGitGadget <gitgitgadget@gmail.com>,
	git@vger.kernel.org, Paulo Casaretto <pcasaretto@gmail.com>
Subject: Re: [PATCH] lockfile: add PID file for debugging stale locks
Date: Wed, 3 Dec 2025 18:19:46 -0500	[thread overview]
Message-ID: <aTDFks3RW57Ytwvq@nand.local> (raw)
In-Reply-To: <20251203211610.GA64204@coredump.intra.peff.net>

On Wed, Dec 03, 2025 at 04:16:10PM -0500, Jeff King wrote:
> On Tue, Dec 02, 2025 at 03:07:27PM +0000, Paulo Casaretto via GitGitGadget wrote:
>
> > The .lock.pid file is created when a lock is acquired (if enabled), and
> > automatically cleaned up when the lock is released (via commit or
> > rollback). The file is registered as a tempfile so it gets cleaned up
> > by signal and atexit handlers if the process terminates abnormally.
>
> I'm sympathetic to the goal of this series, and the implementation looks
> cleanly done. But I wonder if there might be some system-level side
> effects that make these .pid files awkward.
>
> Temporarily having an extra .git/index.lock.pid file is probably not a
> big deal. But for other namespaces, like refs, we're colliding with
> names that have other meanings. So if we want to update refs/heads/foo,
> for example, we'll create refs/heads/foo.lock now. And after your patch,
> also refs/heads/foo.lock.pid.
>
> The ".lock" suffix is special, in that we disallow it in a refname and
> know to skip it when iterating over loose refs. But for the ".pid"
> variant, we run the risk of colliding with a real branch named
> "foo.lock.pid", both for reading and writing.

Good point. I don't have a strong opinion on whether or not we should
use an append-only log of which PIDs grabbed which lockfiles when versus
tracking them on a per-lock basis. But I wonder if this would be
mitigated by either:

 - Keeping the ".lock" suffix as-is, so that holding a lockfile at path
   "$GIT_DIR/index.lock" would create "$GIT_DIR/index-pid.lock" or
   something similar.

 - Introducing a new reference name constraint that treats ".lock.pid"
   as a reserved in a manner identical to how we currently treat
   ".lock".

Between the two, I vastly prefer the former, but see below for more on
why.

> But we can see the writes in the opposite order, which I think can also
> lead to data loss. Something like:
>
>   - process A wants to write branch "foo", so it holds
>     refs/heads/foo.lock and now also the matching foo.lock.pid
>
>   - process B wants to write branch "foo.lock.pid", so it holds
>     refs/heads/foo.lock.pid.lock (and the matching pid)

Changing the naming scheme as above would cause us to hold
"foo.pid.lock" in addition to "foo.lock". That would allow process B
here to write branch "foo.lock.pid" (as is the case today). But if the
scenario were instead "process B wants to write branch foo.pid.lock", it
would fail immediately since the ".lock" suffix is reserved.

> I think both could be mitigated if we disallowed ".lock.pid" as a suffix
> in refnames, but that is a big user-facing change.

Yeah, I don't think that we should change the refname constraints here,
especially in a world where reftable deployments are more common. In
that world I think we should err on the side of removing constraints,
not adding them ;-).

> So I dunno what that means for your patch. I notice that the user has to
> enable the feature manually. But it feels more like it should be
> selective based on which subsystem is using the lockfile (so refs would
> never want it, but other lockfiles/tempfiles might).

Yeah, I think that something similar to the "which files do we fsync()
and how?" configuration we have today would be a nice complement here.

(As an aside, I wonder if that interface, too, could be slightly
improved. Right now we have a comma-separated list of values in the
"core.fsync" configuration for listing different "components", and then
a global core.fsyncMethod to either issue a fsync(), or a pagecache
writeback, or writeout-only flushes in batches. It might be nice to have
something like:

  [fsync "loose-object"]
    method = fsync
  [fsync "packfile"]
    method = writeout

, so the analog here would be something like:

  [lockfile "refs"]
    pidfile = false
  [lockfile "index"]
    pidfile = true

or similar. That could also be represented as core.lockfile=index,
omitting "refs" to avoid tracking it. It may be that people don't really
care to ever use different fsync methods for different fsync-able
components, so perhaps the analogy doesn't hold up perfectly.)

Thanks,
Taylor

next prev parent reply	other threads:[~2025-12-03 23:19 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-02 15:07 [PATCH] lockfile: add PID file for debugging stale locks Paulo Casaretto via GitGitGadget
2025-12-02 22:29 ` D. Ben Knoble
2025-12-03 19:48 ` Torsten Bögershausen
2025-12-03 21:16 ` Jeff King
2025-12-03 22:21   ` Junio C Hamano
2025-12-03 22:32     ` Jeff King
2025-12-03 23:19   ` Taylor Blau [this message]
2025-12-05 11:03     ` Patrick Steinhardt
2025-12-05 18:46     ` Jeff King
2025-12-03 23:39 ` Taylor Blau
2025-12-17 18:59 ` [PATCH v2] " Paulo Casaretto via GitGitGadget
2025-12-18  0:32   ` Junio C Hamano
2025-12-18  0:47   ` Junio C Hamano
2025-12-18  1:33     ` Junio C Hamano
2025-12-18  3:38   ` Ben Knoble
2025-12-18  8:07   ` Patrick Steinhardt
2025-12-24 12:24   ` [PATCH v3] " Paulo Casaretto via GitGitGadget
2025-12-25  0:01     ` Junio C Hamano
2025-12-27  7:50     ` Jeff King

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aTDFks3RW57Ytwvq@nand.local \
    --to=me@ttaylorr.com \
    --cc=git@vger.kernel.org \
    --cc=gitgitgadget@gmail.com \
    --cc=pcasaretto@gmail.com \
    --cc=peff@peff.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).