All of lore.kernel.org
 help / color / mirror / Atom feed
From: Patrick Steinhardt <ps@pks.im>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, "Karthik Nayak" <karthik.188@gmail.com>,
	"Justin Tobler" <jltobler@gmail.com>,
	"Junio C Hamano" <gitster@pobox.com>,
	"SZEDER Gábor" <szeder.dev@gmail.com>,
	"Toon Claes" <toon@iotcl.com>,
	"Kristoffer Haugsbakk" <kristofferhaugsbakk@fastmail.com>,
	"Ben Knoble" <ben.knoble@gmail.com>
Subject: Re: [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD
Date: Mon, 4 Aug 2025 09:38:29 +0200	[thread overview]
Message-ID: <aJBjdWv9qUeBL25-@pks.im> (raw)
In-Reply-To: <20250802111128.GC1180347@coredump.intra.peff.net>

On Sat, Aug 02, 2025 at 07:11:28AM -0400, Jeff King wrote:
> On Tue, Jul 29, 2025 at 10:55:25AM +0200, Patrick Steinhardt wrote:
> 
> > Unfortunately, this change only helps with the second race. We cannot
> > reliably plug the first race without locking the HEAD reference at the
> > start of the transaction. Locking HEAD unconditionally would effectively
> > serialize all writes though, and that doesn't seem like an option. Also,
> > double checking its value at the end of the transaction is not an option
> > either, as its target may have flip-flopped during the transaction.
> 
> I agree we should not always take a lock on HEAD, since most refs would
> not need it. But I wonder if we could do better by examining HEAD, then
> taking a lock when we think we'll need it, and then re-checking the
> value of HEAD. That is still racy, though (somebody could have pointed
> HEAD at us between the two checks). Fundamentally the files backend is
> not atomic across the whole namespace, and we are trying to update two
> refs. So I think there will always be some race.

We do that though. When queueing the log-only update for HEAD we don't
lock immediately, but we lock once we process that log-only update. And
that's where we now do the check whether HEAD has changed meanwhile,
which should be race-free given that it did change indeed.

> It does make me wonder if this race-fix is even worth it, then. We are
> catching the case where somebody moves HEAD away from the ref we are
> updating while we are updating it. But without atomicity, do we even
> know which happened first? That is, would it be incorrect to update
> HEAD anyway? I guess the outcome is observable because their movement of
> HEAD generated a reflog entry, and thus the entries would be out of
> order. So maybe that is worth it.

So with the above I think that we are race-free for the case where HEAD
has been modified after we have decided to write a reflog entry for it.
We aren't though in the case where we _haven't_ decided to write a
reflog entry, as HEAD might have been adjusted to point to one of the
updated refs meanwhile as you rightfully point out.

But I think that's an acceptable tradeoff. I'd rather write no reflog
entry than an incorrect one.

> Anyway, I had two questions about the code:
> 
> > @@ -2600,7 +2607,36 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
> >  
> >  	update->backend_data = lock;
> >  
> > -	if (update->type & REF_ISSYMREF) {
> > +	if (update->flags & REF_LOG_VIA_SPLIT) {
> > +		struct ref_lock *parent_lock;
> > +
> > +		if (!update->parent_update)
> > +			BUG("split update without a parent");
> > +
> > +		parent_lock = update->parent_update->backend_data;
> > +
> > +		/*
> > +		 * Check that "HEAD" didn't racily change since we have looked
> > +		 * it up. If it did we must refuse to write the reflog entry.
> > +		 *
> > +		 * Note that this does not catch all races: if "HEAD" was
> > +		 * racily changed to point to one of the refs part of the
> > +		 * transaction then we would miss writing the split reflog
> > +		 * entry for "HEAD".
> > +		 */
> > +		if (!(update->type & REF_ISSYMREF) ||
> > +		    strcmp(update->parent_update->refname, referent.buf)) {
> > +			strbuf_addstr(err, "HEAD has been racily updated");
> > +			ret = REF_TRANSACTION_ERROR_GENERIC;
> > +			goto out;
> > +		}
> 
> One, what happens with a multi-level ref (e.g., HEAD points to
> refs/heads/foo which points to refs/heads/bar)?
> 
> We've resolved HEAD to get referent.buf. Do we get "foo" or "bar" here?
> If "bar", then a write through "foo" will complain. But if we get "foo",
> then theoretically a write through "bar" will complain.
> 
> I _think_ we are OK, though. Constructing it like this:
> 
>   git init
>   git commit --allow-empty -m whatever
> 
>   git symbolic-ref refs/heads/foo refs/heads/bar
>   git symbolic-ref HEAD refs/heads/foo
>   git update-ref refs/heads/foo main
> 
> triggers the check and shows that our referent from lock_raw_ref() is
> the first level (i.e., "foo"). Which is good.
> 
> If we swap out "foo" for "bar" in the update-ref call, then we'd get a
> mismatch. But in that case we do not figure out that HEAD needs be
> written at all! That is, we only do a single level of look-back to
> decide whether to write HEAD at all. So as long as we keep doing so, we
> are OK.

Yeah, the whole code is best-effort only and doesn't work for all kinds
of edge cases. The mere fact that we only handle this case for HEAD is
already a limitation -- in theory we should do it for all symbolic refs.

> > +		if (!(update->type & REF_ISSYMREF) ||
> > +		    strcmp(update->parent_update->refname, referent.buf)) {
> > +			strbuf_addstr(err, "HEAD has been racily updated");
> > +			ret = REF_TRANSACTION_ERROR_GENERIC;
> > +			goto out;
> > +		}
> 
> And two, is an error the right thing here? The user asked us to update
> "foo", and we saw that HEAD pointed to it. So we decided to update
> HEAD's reflog, too. And when it came time to do so under lock, we found
> that HEAD did not point to "foo" any more.
> 
> Shouldn't we quietly drop the HEAD reflog update, rather than forcing
> the whole transaction to fail? The user never asked us to update HEAD at
> all. It was something we opportunistically decided to do, and now we
> find out that it is not appropriate to do so.

That's something I wasn't quite sure about, either. Honestly, the reason
I shied away from it is that it needs a bit more munging for an edge
case that is hard to test reliably. But I guess we can do something like
the below patch to skip writing the reflog message instea.

Patrick

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 905555365b..851b1b33f4 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -643,14 +643,16 @@ int parse_loose_ref_contents(const struct git_hash_algo *algop,
 	return 0;
 }
 
-static void unlock_ref(struct ref_lock *lock)
+static int unlock_ref(struct ref_lock *lock)
 {
 	lock->count--;
 	if (!lock->count) {
 		rollback_lock_file(&lock->lk);
 		free(lock->ref_name);
 		free(lock);
+		return 1;
 	}
+	return 0;
 }
 
 /*
@@ -2557,6 +2559,9 @@ struct files_transaction_backend_data {
  *   the referent to transaction.
  * - If it is an update of head_ref, add a corresponding REF_LOG_ONLY
  *   update of HEAD.
+ *
+ * Returns 0 on success, 1 in case the update needs to be dropped or a negative
+ * error code otherwise.
  */
 static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *refs,
 						      struct ref_update *update,
@@ -2617,7 +2622,8 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 
 		/*
 		 * Check that "HEAD" didn't racily change since we have looked
-		 * it up. If it did we must refuse to write the reflog entry.
+		 * it up. If it did we remove the reflog-only updateg from the
+		 * transaction again.
 		 *
 		 * Note that this does not catch all races: if "HEAD" was
 		 * racily changed to point to one of the refs part of the
@@ -2626,8 +2632,16 @@ static enum ref_transaction_error lock_ref_for_update(struct files_ref_store *re
 		 */
 		if (!(update->type & REF_ISSYMREF) ||
 		    strcmp(update->parent_update->refname, referent.buf)) {
-			strbuf_addstr(err, "HEAD has been racily updated");
-			ret = REF_TRANSACTION_ERROR_GENERIC;
+			if (unlock_ref(lock))
+				strmap_remove(&backend_data->ref_locks,
+					      update->refname, 0);
+
+			memmove(transaction->updates + update_idx,
+				transaction->updates + update_idx + 1,
+				(transaction->nr - update_idx - 1) * sizeof(*transaction->updates));
+			transaction->nr--;
+
+			ret = 1;
 			goto out;
 		}
 
@@ -2896,6 +2910,8 @@ static int files_transaction_prepare(struct ref_store *ref_store,
 					  head_ref, &refnames_to_check,
 					  err);
 		if (ret) {
+			if (ret > 0)
+				continue;
 			if (ref_transaction_maybe_set_rejected(transaction, i, ret)) {
 				strbuf_reset(err);
 				ret = 0;


  reply	other threads:[~2025-08-04  7:38 UTC|newest]

Thread overview: 114+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-22 11:20 [PATCH 0/8] refs: fix migration of reflog entries Patrick Steinhardt
2025-07-22 11:20 ` [PATCH 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-07-22 22:04   ` Junio C Hamano
2025-07-22 11:20 ` [PATCH 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-07-23 18:14   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24 16:45       ` Junio C Hamano
2025-07-22 11:20 ` [PATCH 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-07-23 18:25   ` Justin Tobler
2025-07-24  8:36   ` Karthik Nayak
2025-07-24 12:55   ` Toon Claes
2025-07-22 11:20 ` [PATCH 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-07-23 19:00   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24 12:54   ` Toon Claes
2025-07-25  5:36     ` Patrick Steinhardt
2025-07-24 16:20   ` SZEDER Gábor
2025-07-24 21:10     ` Junio C Hamano
2025-07-25  5:36       ` Patrick Steinhardt
2025-07-25 14:35         ` Junio C Hamano
2025-07-22 11:20 ` [PATCH 5/8] ident: fix type of string length parameter Patrick Steinhardt
2025-07-22 11:20 ` [PATCH 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-07-23 19:41   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24  9:41   ` Karthik Nayak
2025-07-24 12:56   ` Toon Claes
2025-07-22 11:20 ` [PATCH 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-07-23 20:31   ` Justin Tobler
2025-07-24  7:42     ` Patrick Steinhardt
2025-07-24 10:21   ` Karthik Nayak
2025-07-24 11:35     ` Patrick Steinhardt
2025-07-22 11:20 ` [PATCH 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-07-22 22:09   ` Junio C Hamano
2025-07-23  4:04     ` Patrick Steinhardt
2025-07-25  6:58 ` [PATCH v2 0/8] refs: fix migration of reflog entries Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 1/8] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 2/8] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 3/8] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 4/8] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-07-28 15:33     ` Kristoffer Haugsbakk
2025-07-28 18:49       ` Junio C Hamano
2025-07-28 20:39         ` Karthik Nayak
2025-07-28 20:59           ` Junio C Hamano
2025-07-30  7:55             ` Karthik Nayak
2025-07-29  0:25       ` Ben Knoble
2025-07-29  6:14         ` Kristoffer Haugsbakk
2025-07-29  6:51         ` Patrick Steinhardt
2025-07-29 15:00           ` Junio C Hamano
2025-07-30  5:33             ` Patrick Steinhardt
2025-07-30 10:33               ` Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 5/8] ident: fix type of string length parameter Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 6/8] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 7/8] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-07-25 11:36     ` Jeff King
2025-07-28 14:43       ` Patrick Steinhardt
2025-07-29  7:14         ` Jeff King
2025-07-29  7:54           ` Patrick Steinhardt
2025-07-25  6:58   ` [PATCH v2 8/8] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-07-29  8:55 ` [PATCH v3 0/9] refs: fix migration of reflog entries Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-01 11:38     ` Toon Claes
2025-08-04  7:37       ` Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-07-29 16:07     ` Junio C Hamano
2025-08-01 11:37     ` Toon Claes
2025-08-04  7:38       ` Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-07-29 16:16     ` Junio C Hamano
2025-08-01 11:55     ` Toon Claes
2025-08-02 11:11     ` Jeff King
2025-08-04  7:38       ` Patrick Steinhardt [this message]
2025-08-04 14:47         ` Jeff King
2025-07-29  8:55   ` [PATCH v3 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-07-29  8:55   ` [PATCH v3 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-08-04  9:46 ` [PATCH v4 0/9] refs: fix migration of reflog entries Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-08-04 15:38     ` Jeff King
2025-08-04  9:46   ` [PATCH v4 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-08-04  9:46   ` [PATCH v4 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-08-05 15:11 ` [PATCH v5 0/9] refs: fix migration of reflog entries Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-08-05 17:04     ` Jean-Noël AVILA
2025-08-05 21:47       ` Junio C Hamano
2025-08-06  5:53         ` Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-08-05 15:11   ` [PATCH v5 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt
2025-08-05 18:47   ` [PATCH v5 0/9] refs: fix migration of reflog entries Jeff King
2025-08-06  5:53     ` Patrick Steinhardt
2025-08-06  5:54 ` [PATCH v6 " Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 1/9] Documentation/git-reflog: convert to use synopsis type Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 2/9] builtin/reflog: improve grouping of subcommands Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 3/9] refs: export `ref_transaction_update_reflog()` Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 4/9] builtin/reflog: implement subcommand to write new entries Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 5/9] ident: fix type of string length parameter Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 6/9] refs: fix identity for migrated reflogs Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 7/9] refs/files: detect race when generating reflog entry for HEAD Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 8/9] refs: stop unsetting REF_HAVE_OLD for log-only updates Patrick Steinhardt
2025-08-06  5:54   ` [PATCH v6 9/9] refs: fix invalid old object IDs when migrating reflogs Patrick Steinhardt

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aJBjdWv9qUeBL25-@pks.im \
    --to=ps@pks.im \
    --cc=ben.knoble@gmail.com \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=jltobler@gmail.com \
    --cc=karthik.188@gmail.com \
    --cc=kristofferhaugsbakk@fastmail.com \
    --cc=peff@peff.net \
    --cc=szeder.dev@gmail.com \
    --cc=toon@iotcl.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.