[PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: git@vger.kernel.org, Michael Haggerty <mhagger@alum.mit.edu>
Subject: [PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock
Date: Tue, 12 Jan 2016 16:44:39 -0500	[thread overview]
Message-ID: <20160112214439.GB2841@sigill.intra.peff.net> (raw)
In-Reply-To: <20160112214318.GA2527@sigill.intra.peff.net>

Our basic strategy for taking a ref lock is:

  1. Create $ref.lock to take the lock

  2. Read the ref again while holding the lock (during which
     time we know that nobody else can be updating it).

  3. Compare the value we read to the expected "old_sha1"

The value we read in step (2) is returned to the caller via
the lock->old_oid field, who may use it for other purposes
(such as writing a reflog).

If we have no "old_sha1" (i.e., we are unconditionally
taking the lock), then we obviously must omit step 3. But we
_also_ omit step 2. This seems like a nice optimization, but
it means that the caller sees only whatever was left in
lock->old_oid from previous calls to resolve_ref_unsafe(),
which happened outside of the lock.

We can demonstrate this race pretty easily. Imagine you have
three commits, $one, $two, and $three. One script just flips
between $one and $two, without providing an old-sha1:

  while true; do
    git update-ref -m one refs/heads/foo $one
    git update-ref -m two refs/heads/foo $two
  done

Meanwhile, another script tries to set the value to $three,
also not using an old-sha1:

  while true; do
    git update-ref -m three refs/heads/foo $three
  done

If these run simultaneously, we'll see a lot of lock
contention, but each of the writes will succeed some of the
time. The reflog may record movements between any of the
three refs, but we would expect it to provide a consistent
log: the "from" field of each log entry should be the same
as the "two" field of the previous one.

But if we check this:

  perl -alne '
    print "mismatch on line $."
            if defined $last && $F[0] ne $last;
    $last = $F[1];
  ' .git/logs/refs/heads/foo

we'll see many mismatches. Why?

Because sometimes, in the time between lock_ref_sha1_basic
filling lock->old_oid via resolve_ref_unsafe() and it taking
the lock, there may be a complete write by another process.
And the "from" field in our reflog entry will be wrong, and
will refer to an older value.

This is probably quite rare in practice. It requires writers
which do not provide an old-sha1 value, and it is a very
quick race. However, it is easy to fix: we simply perform
step (2), the read-under-lock, whether we have an old-sha1
or not. Then the value we hand back to the caller is always
atomic.

Signed-off-by: Jeff King <peff@peff.net>
---
 refs/files-backend.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)

diff --git a/refs/files-backend.c b/refs/files-backend.c
index 180c837..69c3ecf 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -1840,12 +1840,17 @@ static int verify_lock(struct ref_lock *lock,
 	if (read_ref_full(lock->ref_name,
 			  mustexist ? RESOLVE_REF_READING : 0,
 			  lock->old_oid.hash, NULL)) {
-		int save_errno = errno;
-		strbuf_addf(err, "can't verify ref %s", lock->ref_name);
-		errno = save_errno;
-		return -1;
+		if (old_sha1) {
+			int save_errno = errno;
+			strbuf_addf(err, "can't verify ref %s", lock->ref_name);
+			errno = save_errno;
+			return -1;
+		} else {
+			hashclr(lock->old_oid.hash);
+			return 0;
+		}
 	}
-	if (hashcmp(lock->old_oid.hash, old_sha1)) {
+	if (old_sha1 && hashcmp(lock->old_oid.hash, old_sha1)) {
 		strbuf_addf(err, "ref %s is at %s but expected %s",
 			    lock->ref_name,
 			    sha1_to_hex(lock->old_oid.hash),
@@ -1985,7 +1990,7 @@ static struct ref_lock *lock_ref_sha1_basic(const char *refname,
 			goto error_return;
 		}
 	}
-	if (old_sha1 && verify_lock(lock, old_sha1, mustexist, err)) {
+	if (verify_lock(lock, old_sha1, mustexist, err)) {
 		last_errno = errno;
 		goto error_return;
 	}
-- 
2.7.0.368.g04bc9ee

next prev parent reply	other threads:[~2016-01-12 21:44 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-01-11 15:46 [PATCH 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King
2016-01-11 15:49 ` [PATCH 1/2] checkout,clone: check return value of create_symref Jeff King
2016-01-12  4:09   ` Michael Haggerty
2016-01-12  9:49     ` Jeff King
2016-01-11 15:52 ` [PATCH 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King
2016-01-11 18:28   ` Junio C Hamano
2016-01-12  4:55   ` Michael Haggerty
2016-01-12  9:52     ` Jeff King
2016-01-12 18:11       ` Junio C Hamano
2016-01-12  9:56 ` [PATCH v2 0/2] fix corner case with lock_ref_sha1_basic and REF_NODEREF Jeff King
2016-01-12  9:57   ` [PATCH v2 1/2] checkout,clone: check return value of create_symref Jeff King
2016-01-12  9:58   ` [PATCH v2 2/2] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King
2016-01-12 13:26     ` Jeff King
2016-01-12 13:55       ` [PATCH v3 " Jeff King
2016-01-12 19:41         ` Junio C Hamano
2016-01-12 20:22           ` Jeff King
2016-01-12 20:42             ` Jeff King
2016-01-12 21:43               ` [PATCH v4 0/3] fix corner cases with lock_ref_sha1_basic Jeff King
2016-01-12 21:44                 ` [PATCH 1/3] checkout,clone: check return value of create_symref Jeff King
2016-01-12 21:44                 ` Jeff King [this message]
2016-01-13  1:25                   ` [PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock Eric Sunshine
2016-01-13 11:38                     ` Jeff King
2016-01-12 21:45                 ` [PATCH 3/3] lock_ref_sha1_basic: handle REF_NODEREF with invalid refs Jeff King

find likely ancestor, descendant, or conflicting patches for this message:
( dfblob:180c837 dfblob:69c3ecf )
 OR (
bs:"[PATCH 2/3] lock_ref_sha1_basic: always fill old_oid while holding lock" )
	(help)

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160112214439.GB2841@sigill.intra.peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).