git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, "Karthik Nayak" <karthik.188@gmail.com>,
	"Justin Tobler" <jltobler@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Toon Claes" <toon@iotcl.com>,
	"Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com>,
	"Ben Knoble" <ben.knoble@gmail.com>
Subject: Re: [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD
Date: Mon, 4 Aug 2025 09:38:29 +0200	[thread overview]
Message-ID: <aJBjdWv9qUeBL25-@pks.im> (raw)
In-Reply-To: <20250802111128.GC1180347@coredump.intra.peff.net>

On Sat, Aug 02, 2025 at 07:11:28AM -0400, Jeff King wrote:
> On Tue, Jul 29, 2025 at 10:55:25AM +0200, Patrick Steinhardt wrote:
> 
> > Unfortunately, this change only helps with the second race. We cannot
> > reliably plug the first race without locking the HEAD reference at the
> > start of the transaction. Locking HEAD unconditionally would effectively
> > serialize all writes though, and that doesn't seem like an option. Also,
> > double checking its value at the end of the transaction is not an option
> > either, as its target may have flip-flopped during the transaction.
> 
> I agree we should not always take a lock on HEAD, since most refs would
> not need it. But I wonder if we could do better by examining HEAD, then
> taking a lock when we think we'll need it, and then re-checking the
> value of HEAD. That is still racy, though (somebody could have pointed
> HEAD at us between the two checks). Fundamentally the files backend is
> not atomic across the whole namespace, and we are trying to update two
> refs. So I think there will always be some race.

We do that though. When queueing the log-only update for HEAD we don't
lock immediately, but we lock once we process that log-only update. And
that's where we now do the check whether HEAD has changed meanwhile,
which should be race-free given that it did change indeed.

> It does make me wonder if this race-fix is even worth it, then. We are
> catching the case where somebody moves HEAD away from the ref we are
> updating while we are updating it. But without atomicity, do we even
> know which happened first? That is, would it be incorrect to update
> HEAD anyway? I guess the outcome is observable because their movement of
> HEAD generated a reflog entry, and thus the entries would be out of
> order. So maybe that is worth it.

So with the above I think that we are race-free for the case where HEAD
has been modified after we have decided to write a reflog entry for it.
We aren't though in the case where we _haven't_ decided to write a
reflog entry, as HEAD might have been adjusted to point to one of the
updated refs meanwhile as you rightfully point out.

But I think that's an acceptable tradeoff. I'd rather write no reflog
entry than an incorrect one.

> Anyway, I had two questions about the code:
> 
> > @@ -2600,7 +2607,36 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
> >  
> >  	update->backend_data = lock;
> >  
> > -	if (update->type & REF_ISSYMREF) {
> > +	if (update->flags & REF_LOG_VIA_SPLIT) {
> > +		struct ref_lock *parent_lock;
> > +
> > +		if (!update->parent_update)
> > +			BUG("split update without a parent");
> > +
> > +		parent_lock = update->parent_update->backend_data;
> > +
> > +		/*
> > +		 * Check that "HEAD" didn't racily change since we have looked
> > +		 * it up. If it did we must refuse to write the reflog entry.
> > +		 *
> > +		 * Note that this does not catch all races: if "HEAD" was
> > +		 * racily changed to point to one of the refs part of the
> > +		 * transaction then we would miss writing the split reflog
> > +		 * entry for "HEAD".
> > +		 */
> > +		if (!(update->type & REF_ISSYMREF) ||
> > +		    strcmp(update->parent_update->refname, referent.buf)) {
> > +			strbuf_addstr(err, "HEAD has been racily updated");
> > +			ret = REF_TRANSACTION_ERROR_GENERIC;
> > +			goto out;
> > +		}
> 
> One, what happens with a multi-level ref (e.g., HEAD points to
> refs/heads/foo which points to refs/heads/bar)?
> 
> We've resolved HEAD to get referent.buf. Do we get "foo" or "bar" here?
> If "bar", then a write through "foo" will complain. But if we get "foo",
> then theoretically a write through "bar" will complain.
> 
> I _think_ we are OK, though. Constructing it like this:
> 
>   git init
>   git commit --allow-empty -m whatever
> 
>   git symbolic-ref refs/heads/foo refs/heads/bar
>   git symbolic-ref HEAD refs/heads/foo
>   git update-ref refs/heads/foo main
> 
> triggers the check and shows that our referent from lock_raw_ref() is
> the first level (i.e., "foo"). Which is good.
> 
> If we swap out "foo" for "bar" in the update-ref call, then we'd get a
> mismatch. But in that case we do not figure out that HEAD needs be
> written at all! That is, we only do a single level of look-back to
> decide whether to write HEAD at all. So as long as we keep doing so, we
> are OK.

Yeah, the whole code is best-effort only and doesn't work for all kinds
of edge cases. The mere fact that we only handle this case for HEAD is
already a limitation -- in theory we should do it for all symbolic refs.

> > +		if (!(update->type & REF_ISSYMREF) ||
> > +		    strcmp(update->parent_update->refname, referent.buf)) {
> > +			strbuf_addstr(err, "HEAD has been racily updated");
> > +			ret = REF_TRANSACTION_ERROR_GENERIC;
> > +			goto out;
> > +		}
> 
> And two, is an error the right thing here? The user asked us to update
> "foo", and we saw that HEAD pointed to it. So we decided to update
> HEAD's reflog, too. And when it came time to do so under lock, we found
> that HEAD did not point to "foo" any more.
> 
> Shouldn't we quietly drop the HEAD reflog update, rather than forcing
> the whole transaction to fail? The user never asked us to update HEAD at
> all. It was something we opportunistically decided to do, and now we
> find out that it is not appropriate to do so.

That's something I wasn't quite sure about, either. Honestly, the reason
I shied away from it is that it needs a bit more munging for an edge
case that is hard to test reliably. But I guess we can do something like
the below patch to skip writing the reflog message instea.

Patrick

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 905555365b..851b1b33f4 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -643,14 +643,16 @@ int parse_loose_ref_contents(const struct git_hash_algo *algop,
 	return 0;
 }
 
-static void unlock_ref(struct ref_lock *lock)
+static int unlock_ref(struct ref_lock *lock)
 {
 	lock->count--;
 	if (!lock->count) {
 		rollback_lock_file(&lock->lk);
 		free(lock->ref_name);
 		free(lock);
+		return 1;
 	}
+	return 0;
 }
 
 /*
@@ -2557,6 +2559,9 @@ struct files_transaction_backend_data {
  *   the referent to transaction.
  * - If it is an update of head_ref, add a corresponding REF_LOG_ONLY
  *   update of HEAD.
+ *
+ * Returns 0 on success, 1 in case the update needs to be dropped or a negative
+ * error code otherwise.
  */
 static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *refs,
 						      struct ref_update *update,
@@ -2617,7 +2622,8 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 
 		/*
 		 * Check that "HEAD" didn't racily change since we have looked
-		 * it up. If it did we must refuse to write the reflog entry.
+		 * it up. If it did we remove the reflog-only updateg from the
+		 * transaction again.
 		 *
 		 * Note that this does not catch all races: if "HEAD" was
 		 * racily changed to point to one of the refs part of the
@@ -2626,8 +2632,16 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 		 */
 		if (!(update->type & REF_ISSYMREF) ||
 		    strcmp(update->parent_update->refname, referent.buf)) {
-			strbuf_addstr(err, "HEAD has been racily updated");
-			ret = REF_TRANSACTION_ERROR_GENERIC;
+			if (unlock_ref(lock))
+				strmap_remove(&backend_data->ref_locks,
+					      update->refname, 0);
+
+			memmove(transaction->updates + update_idx,
+				transaction->updates + update_idx + 1,
+				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
+			transaction->nr--;
+
+			ret = 1;
 			goto out;
 		}
 
@@ -2896,6 +2910,8 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 					  head_ref, &refnames_to_check,
 					  err);
 		if (ret) {
+			if (ret > 0)
+				continue;
 			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
 				strbuf_reset(err);
 				ret = 0;


  reply	other threads:[~2025-08-04  7:38 UTC|newest]

Thread overview: 114+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
2025-07-22 11:20 ` [PATCH 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-07-22 22:04   ` Junio C Hamano
2025-07-22 11:20 ` [PATCH 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-07-23 18:14   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24 16:45       ` Junio C Hamano
2025-07-22 11:20 ` [PATCH 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-07-23 18:25   ` Justin Tobler
2025-07-24  8:36   ` Karthik Nayak
2025-07-24 12:55   ` Toon Claes
2025-07-22 11:20 ` [PATCH 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-07-23 19:00   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24 12:54   ` Toon Claes
2025-07-25  5:36     ` Patrick Steinhardt
2025-07-24 16:20   ` SZEDER Gábor
2025-07-24 21:10     ` Junio C Hamano
2025-07-25  5:36       ` Patrick Steinhardt
2025-07-25 14:35         ` Junio C Hamano
2025-07-22 11:20 ` [PATCH 5/8] ident: fix type of string length parameter Patrick Steinhardt
2025-07-22 11:20 ` [PATCH 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-07-23 19:41   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24  9:41   ` Karthik Nayak
2025-07-24 12:56   ` Toon Claes
2025-07-22 11:20 ` [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-07-23 20:31   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24 10:21   ` Karthik Nayak
2025-07-24 11:35     ` Patrick Steinhardt
2025-07-22 11:20 ` [PATCH 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-07-22 22:09   ` Junio C Hamano
2025-07-23  4:04     ` Patrick Steinhardt
2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-07-28 15:33     ` Kristoffer Haugsbakk
2025-07-28 18:49       ` Junio C Hamano
2025-07-28 20:39         ` Karthik Nayak
2025-07-28 20:59           ` Junio C Hamano
2025-07-30  7:55             ` Karthik Nayak
2025-07-29  0:25       ` Ben Knoble
2025-07-29  6:14         ` Kristoffer Haugsbakk
2025-07-29  6:51         ` Patrick Steinhardt
2025-07-29 15:00           ` Junio C Hamano
2025-07-30  5:33             ` Patrick Steinhardt
2025-07-30 10:33               ` Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 5/8] ident: fix type of string length parameter Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-07-25 11:36     ` Jeff King
2025-07-28 14:43       ` Patrick Steinhardt
2025-07-29  7:14         ` Jeff King
2025-07-29  7:54           ` Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-01 11:38     ` Toon Claes
2025-08-04  7:37       ` Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-07-29 16:07     ` Junio C Hamano
2025-08-01 11:37     ` Toon Claes
2025-08-04  7:38       ` Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-07-29 16:16     ` Junio C Hamano
2025-08-01 11:55     ` Toon Claes
2025-08-02 11:11     ` Jeff King
2025-08-04  7:38       ` Patrick Steinhardt [this message]
2025-08-04 14:47         ` Jeff King
2025-07-29  8:55   ` [PATCH v3 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-08-04 15:38     ` Jeff King
2025-08-04  9:46   ` [PATCH v4 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-08-05 17:04     ` Jean-Noël AVILA
2025-08-05 21:47       ` Junio C Hamano
2025-08-06  5:53         ` Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-08-05 18:47   ` [PATCH v5 0/9] refs: fix migration of reflog entries Jeff King
2025-08-06  5:53     ` Patrick Steinhardt
2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aJBjdWv9qUeBL25-@pks.im \
    --to=ps@pks.im \
    --cc=ben.knoble@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jltobler@gmail.com \
    --cc=karthik.188@gmail.com \
    --cc=kristofferhaugsbakk@fastmail.com \
    --cc=peff@peff.net \
    --cc=szeder.dev@gmail.com \
    --cc=toon@iotcl.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).