Re: [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Michael Haggerty <mhagger@alum.mit.edu>,
	Stefan Beller <sbeller@google.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time
Date: Tue, 28 Apr 2015 00:36:53 -0400	[thread overview]
Message-ID: <20150428043653.GC24580@peff.net> (raw)
In-Reply-To: <xmqqoamcypy4.fsf@gitster.dls.corp.google.com>

On Sat, Apr 25, 2015 at 12:21:07PM -0700, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > I am not too worried about "push --atomic", as we can just add a few
> > words to Release Notes and documentation saying "this is still an
> > experimental broken code that is unusable; don't use the feature in
> > production".
> >
> > I however am more worried about the other one "update-ref --stdin";
> > the change will be pure regression for those who want to do many
> > updates and do not care if the update is atomic, no?
> 
> I should have refrained from touching the keyboard so late at night
> X-<.  This regression was done long time ago (even in v2.1.0 I see
> that ref_transaction_commit() tries to grab all locks at once).
> 
> So it is only "push --atomic".
> 
> The choice is between (1) shipping "push --atomic" that is known to
> be broken, (2) applying your five-patch series which may (a) fix
> both "push --atomic" and "update-ref --stdin", or (b) break other
> transaction users including "update-ref -stdin" in unexpected ways.

I'm not sure "--atomic" is foolproof even with these patches.

Certainly there is the obvious problem that the filesystem is not atomic
(so a power loss halfway through committing the locks will result in a
non-atomic push). But there are also subtle locking issues related to
D/F conflicts.

For instance, if we do (and I take no credit for this discovery; Michael
showed it to me last week):

  {
    echo "create refs/heads/branch $sha1"
    echo "create refs/heads/foo/bar $sha1" &&
    echo "create refs/heads/foo $sha1"
  } | git update-ref --stdin

we do not notice the conflict between "foo" and "foo/bar" until the
commit stage. Fortunately our ref-ordering rules mean we will always try
to write "foo" before "foo/bar", which will fail because "foo" must be a
directory (because we have "foo/bar.lock"). But because "branch" comes
alphabetically first, we will have written that ref already, committing
part of the request.

You can work around this with code to check for D/F conflicts in a
single transaction, but then you are subject to races with other
processes. E.g., one process writes "branch" and "foo", the other is
simultaneously writing "foo/bar". There is no lock contention, but the
write of "foo" may fail at the commit stage.

So I think it is OK to restrict the definition of "atomic" here to the
common case of making sure that we can lock and check the $old_sha1 for
each of the refs. Under that definition, "push --atomic" is not broken
as-is, but there is room for improving its robustness.  And I have no
problem with trying to make our storage format as robust as possible in
these cases, but I think we need to admit that the filesystem ref
storage is never going to be 100% transactional.

I know that sounds a bit like moving the goalposts to say "eh, this
feature is not broken, your definition of broken is just wrong". But my
point is that shipping "push --atomic" as-is is still a useful step
forward, and we can continue to iterate and improve on the concept in
future releases.

-Peff

next prev parent reply	other threads:[~2015-04-28  4:37 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-24 11:35 [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Michael Haggerty
2015-04-24 11:35 ` [PATCH 1/5] write_ref_to_lockfile(): new function, extracted from write_ref_sha1() Michael Haggerty
2015-04-24 11:35 ` [PATCH 2/5] commit_ref_update(): " Michael Haggerty
2015-04-24 11:35 ` [PATCH 3/5] write_ref_sha1(): inline function at callers Michael Haggerty
2015-04-24 11:35 ` [PATCH 4/5] ref_transaction_commit(): remove the local flags variables Michael Haggerty
2015-04-24 17:30   ` Jeff King
2015-04-24 21:15     ` Michael Haggerty
2015-04-24 21:19       ` Jeff King
2015-04-24 21:51   ` Eric Sunshine
2015-04-24 22:22     ` Michael Haggerty
2015-04-24 11:35 ` [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time Michael Haggerty
2015-04-25  6:08   ` Michael Haggerty
2015-04-25  6:57     ` Junio C Hamano
2015-04-25 19:21       ` Junio C Hamano
2015-04-28  4:36         ` Jeff King [this message]
2015-04-24 17:26 ` [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Jeff King
2015-04-24 19:13   ` Stefan Beller
2015-04-25 19:27   ` Junio C Hamano
2015-04-25  4:28 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150428043653.GC24580@peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.