Re: [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time

git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Jeff King <peff@peff.net>
To: Junio C Hamano <gitster@pobox.com>
Cc: Michael Haggerty <mhagger@alum.mit.edu>,
	Stefan Beller <sbeller@google.com>,
	git@vger.kernel.org
Subject: Re: [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time
Date: Tue, 28 Apr 2015 00:36:53 -0400	[thread overview]
Message-ID: <20150428043653.GC24580@peff.net> (raw)
In-Reply-To: <xmqqoamcypy4.fsf@gitster.dls.corp.google.com>

On Sat, Apr 25, 2015 at 12:21:07PM -0700, Junio C Hamano wrote:

> Junio C Hamano <gitster@pobox.com> writes:
> 
> > I am not too worried about "push --atomic", as we can just add a few
> > words to Release Notes and documentation saying "this is still an
> > experimental broken code that is unusable; don't use the feature in
> > production".
> >
> > I however am more worried about the other one "update-ref --stdin";
> > the change will be pure regression for those who want to do many
> > updates and do not care if the update is atomic, no?
> 
> I should have refrained from touching the keyboard so late at night
> X-<.  This regression was done long time ago (even in v2.1.0 I see
> that ref_transaction_commit() tries to grab all locks at once).
> 
> So it is only "push --atomic".
> 
> The choice is between (1) shipping "push --atomic" that is known to
> be broken, (2) applying your five-patch series which may (a) fix
> both "push --atomic" and "update-ref --stdin", or (b) break other
> transaction users including "update-ref -stdin" in unexpected ways.

I'm not sure "--atomic" is foolproof even with these patches.

Certainly there is the obvious problem that the filesystem is not atomic
(so a power loss halfway through committing the locks will result in a
non-atomic push). But there are also subtle locking issues related to
D/F conflicts.

For instance, if we do (and I take no credit for this discovery; Michael
showed it to me last week):

  {
    echo "create refs/heads/branch $sha1"
    echo "create refs/heads/foo/bar $sha1" &&
    echo "create refs/heads/foo $sha1"
  } | git update-ref --stdin

we do not notice the conflict between "foo" and "foo/bar" until the
commit stage. Fortunately our ref-ordering rules mean we will always try
to write "foo" before "foo/bar", which will fail because "foo" must be a
directory (because we have "foo/bar.lock"). But because "branch" comes
alphabetically first, we will have written that ref already, committing
part of the request.

You can work around this with code to check for D/F conflicts in a
single transaction, but then you are subject to races with other
processes. E.g., one process writes "branch" and "foo", the other is
simultaneously writing "foo/bar". There is no lock contention, but the
write of "foo" may fail at the commit stage.

So I think it is OK to restrict the definition of "atomic" here to the
common case of making sure that we can lock and check the $old_sha1 for
each of the refs. Under that definition, "push --atomic" is not broken
as-is, but there is room for improving its robustness.  And I have no
problem with trying to make our storage format as robust as possible in
these cases, but I think we need to admit that the filesystem ref
storage is never going to be 100% transactional.

I know that sounds a bit like moving the goalposts to say "eh, this
feature is not broken, your definition of broken is just wrong". But my
point is that shipping "push --atomic" as-is is still a useful step
forward, and we can continue to iterate and improve on the concept in
future releases.

-Peff

next prev parent reply	other threads:[~2015-04-28  4:37 UTC|newest]

Thread overview: 19+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-04-24 11:35 [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Michael Haggerty
2015-04-24 11:35 ` [PATCH 1/5] write_ref_to_lockfile(): new function, extracted from write_ref_sha1() Michael Haggerty
2015-04-24 11:35 ` [PATCH 2/5] commit_ref_update(): " Michael Haggerty
2015-04-24 11:35 ` [PATCH 3/5] write_ref_sha1(): inline function at callers Michael Haggerty
2015-04-24 11:35 ` [PATCH 4/5] ref_transaction_commit(): remove the local flags variables Michael Haggerty
2015-04-24 17:30   ` Jeff King
2015-04-24 21:15     ` Michael Haggerty
2015-04-24 21:19       ` Jeff King
2015-04-24 21:51   ` Eric Sunshine
2015-04-24 22:22     ` Michael Haggerty
2015-04-24 11:35 ` [PATCH 5/5] ref_transaction_commit(): only keep one lockfile open at a time Michael Haggerty
2015-04-25  6:08   ` Michael Haggerty
2015-04-25  6:57     ` Junio C Hamano
2015-04-25 19:21       ` Junio C Hamano
2015-04-28  4:36         ` Jeff King [this message]
2015-04-24 17:26 ` [PATCH 0/5] Avoid file descriptor exhaustion in ref_transaction_commit() Jeff King
2015-04-24 19:13   ` Stefan Beller
2015-04-25 19:27   ` Junio C Hamano
2015-04-25  4:28 ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20150428043653.GC24580@peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=gitster@pobox.com \
    --cc=mhagger@alum.mit.edu \
    --cc=sbeller@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).