All of lore.kernel.org
 help / color / mirror / Atom feed
From: Martin Fick <mfick@codeaurora.org>
To: "Pyeron, Jason J CTR (US)" <jason.j.pyeron.ctr@mail.mil>
Cc: "git@vger.kernel.org" <git@vger.kernel.org>
Subject: Re: Lockless Refs?  (Was [PATCH] refs: do not use cached refs in repack_without_ref)
Date: Fri, 4 Jan 2013 11:01:02 -0700	[thread overview]
Message-ID: <201301041101.02756.mfick@codeaurora.org> (raw)
In-Reply-To: <871B6C10EBEFE342A772D1159D1320853A011469@umechphj.easf.csd.disa.mil>

On Friday, January 04, 2013 10:52:43 am Pyeron, Jason J 
CTR (US) wrote:
> > From: Martin Fick
> > Sent: Thursday, January 03, 2013 6:53 PM
> > 
> > Any thoughts on this idea?  Is it flawed?  I am 
trying
> > to write it up in a more formal generalized manner 
and
> > was hoping to get at least one "it seems sane" 
before
> > I do.
> 
> If you are assuming that atomic renames, etc. are
> available, then you should identify a test case and a
> degrade operation path when it is not available.

Thanks, sound reasonable.  Where you thinking a runtime 
test case that would be run before every transaction?  I 
was anticipating a per repo config option called 
something like "core.locks = recoverable" that would be 
needed to turn them on?  I was thinking that this was 
something that server sites could test in advance on 
their repos and then enable it for them.  Maybe a git-
lock tool with a --test-recoverable option?

-Martin


> > 
> > On Monday, December 31, 2012 03:30:53 am Martin Fick 
wrote:
> > > On Thursday, December 27, 2012 04:11:51 pm Martin
> > > Fick
> > 
> > wrote:
> > > > It concerns me that git uses any locking at all,
> > > > even for refs since it has the potential to 
leave
> > > > around stale locks.
> > > > ...
> > > > [a previous not so great attempt to fix this]
> > > > ...
> > > 
> > > I may have finally figured out a working loose ref
> > > update mechanism which I think can avoid stale
> > > locks. Unfortunately it requires atomic directory
> > > renames and universally unique identifiers 
(uuids). 
> > > These may be no-go criteria?  But I figure it is
> > > worth at least exploring this idea because of the
> > > potential benefits?
> > > 
> > > The general approach is to setup a transaction and
> > > either commit or abort it.  A transaction can be
> > > setup by renaming an appropriately setup directory
> > > to the "ref.lock" name.  If the rename succeeds, 
the
> > > transaction is begun.  Any actor can abort the
> > > transaction (up until it is committed) by simply
> > > deleting the "ref.lock" directory, so it is not at
> > > risk of going stale.  However, once the actor who
> > > sets up the transaction commits it, deleting the
> > > "ref.lock" directory simply aids in cleaning it up
> > > for the next transaction (instead of aborting it).
> > > 
> > > One important piece of the transaction is the use 
of
> > > uuids. The uuids provide a mechanism to tie the
> > > atomic commit pieces to the transactions and thus 
to
> > > prevent long sleeping process from inadvertently
> > > performing actions which could be out of date when
> > > they wake finally up.  In each case, the atomic
> > > commit piece is the renaming of a file.   For the
> > > create and update pieces, a file is renamed from 
the
> > > "ref.lock" dir to the "ref" file resulting in an
> > > update to the sha for the ref. However, in the
> > > delete case, the "ref" file is instead renamed to
> > > end up in the "ref.lock" directory resulting in a
> > > delete of the ref.  This scheme does not affect 
the
> > > way refs are read today,
> > > 
> > > To prepare for a transaction, an actor first
> > > generates a uuid (an exercise I will delay for 
now).
> > >  Next, a tmp directory named after the uuid is
> > > generated in the parent directory for the ref to 
be
> > > updated, perhaps something like:  ".lock_uuid". In
> > > this directory is places either a file or a
> > > directory named after the uuid, something like:
> > > ".lock_uuid/,uuid".  In the case of a create or an
> > > update, the new sha is written to this file.  In 
the
> > > case of a delete, it is a directory.
> > > 
> > > Once the tmp directory is setup, the initiating 
actor
> > > attempts to start the transaction by renaming the 
tmp
> > > directory to "ref.lock".  If the rename fails, the
> > > update fails. If the rename succeeds, the actor 
can
> > > then attempt to commit the transaction (before
> > > another actor aborts it).
> > > 
> > > In the case of a create, the actor verifies that
> > > "ref" does not currently exist, and then renames 
the
> > > now named "ref.lock/uuid" file to "ref". On 
success,
> > > the ref was created.
> > > 
> > > In the case of an update, the actor verifies that
> > > "ref" currently contains the old sha, and then 
also
> > > renames the now named "ref.lock/uuid" file to 
"ref".
> > > On success, the ref was updated.
> > > 
> > > In the case of a delete, the actor may verify that
> > > "ref" currently contains the sha to "prune" if it
> > > needs to, and then renames the "ref" file to
> > > "ref.lock/uuid/delete". On success, the ref was
> > > deleted.
> > > 
> > > Whether successful or not, the actor may now 
simply
> > > delete the "ref.lock" directory, clearing the way
> > > for a new transaction.  Any other actor may delete
> > > this directory at any time also, likely either on
> > > conflict (if they are attempting to initiate a
> > > transaction), or after a grace period just to
> > > cleanup the FS.  Any actor may also safely cleanup
> > > the tmp directories, preferably also after a grace
> > > period.
> > > 
> > > One neat part about this scheme is that I believe 
it
> > > would be backwards compatible with the current
> > > locking mechanism since the transaction directory
> > > will simply appear to be a lock to older clients. 
> > > And the old lock file should continue to lock out
> > > these newer transactions.
> > > 
> > > Due to this backwards compatibility, I believe 
that
> > > this could be incrementally employed today without
> > > affecting very much.  It could be deployed in 
place
> > > of any updates which only hold ref.locks to update
> > > the loose ref.  So for example I think it could
> > > replace step 4a below from Michael Haggerty's
> > > description of today's loose ref pruning during
> > > 
> > > ref packing:
> > > > * Pack references:
> > > ...
> > > 
> > > > 4. prune_refs(): for each ref in the 
ref_to_prune
> > > > list,
> > > > 
> > > > call  prune_ref():
> > > >     a. Lock the reference using lock_ref_sha1(),
> > > >     verifying that the recorded SHA1 is still
> > > >     valid.  If it is, unlink the loose reference
> > > >     file then free the lock; otherwise leave the
> > > >     loose reference file untouched.
> > > 
> > > I think it would also therefore be able to replace
> > > the loose ref locking in Michael's new ref-packing
> > > scheme as well as the locking in Michael's new ref
> > > deletion scheme (again steps
> > > 
> > > 4):
> > > > * Delete reference foo:
> > > ...
> > > 
> > > >   4. Delete loose ref for "foo":
> > > >      a. Acquire the lock
> > > >      $GIT_DIR/refs/heads/foo.lock
> > > >      
> > > >      b. Unlink $GIT_DIR/refs/heads/foo if it is
> > > >      unchanged.
> > > >  
> > > >  If it is changed, leave it untouched.  If it is
> > > >  deleted,
> > > > 
> > > > that is OK too.
> > > > 
> > > >      c. Release lock 
$GIT_DIR/refs/heads/foo.lock
> > > 
> > > ...
> > > 
> > > > * Pack references:
> > > ...
> > > 
> > > >   4. prune_refs(): for each ref in the 
ref_to_prune
> > > >   list,
> > > > 
> > > > call prune_ref():
> > > >      a. Lock the loose reference using
> > > >      lock_ref_sha1(),
> > > > 
> > > > verifying that the recorded SHA1 is still valid
> > > > 
> > > >      b. If it is, unlink the loose reference 
file
> > > > 
> > > > (otherwise, leave it untouched)
> > > > 
> > > >      c. Release the lock on the loose reference
> > > 
> > > To be honest, I suspect I missed something obvious
> > > because this seems almost too simple to work.  I 
am
> > > ashamed that it took me so long to come up with 
(of
> > > course, I will be even more ashamed :( when it is
> > > shown to be flawed!) This scheme also feels
> > > extensible. if there are no obvious flaws in it, I
> > > will try to post solutions for ref packing and for
> > > multiple repository/ref transactions also soon.
> > > 
> > > I welcome any comments/criticisms,
> > > 
> > > -Martin
> > > --
> > > To unsubscribe from this list: send the line
> > > "unsubscribe git" in the body of a message to
> > > majordomo@vger.kernel.org More majordomo info at
> > > http://vger.kernel.org/majordomo-info.html
> > 
> > --
> > To unsubscribe from this list: send the line
> > "unsubscribe git" in the body of a message to
> > majordomo@vger.kernel.org More majordomo info at 
> > http://vger.kernel.org/majordomo-info.html

-- 
The Qualcomm Innovation Center, Inc. is a member of Code 
Aurora Forum, hosted by The Linux Foundation

  reply	other threads:[~2013-01-04 18:01 UTC|newest]

Thread overview: 23+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2012-12-21  8:04 [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
2012-12-26  8:24 ` Michael Haggerty
2012-12-27 23:11   ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Martin Fick
2012-12-28 14:50     ` Martin Fick
2012-12-28 17:15       ` Lockless Refs? Junio C Hamano
2012-12-29  8:16         ` Jeff King
2012-12-29 21:15           ` Martin Fick
2012-12-29  8:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 21:29         ` Martin Fick
2012-12-28 16:58     ` Lockless Refs? Junio C Hamano
2012-12-29  1:07       ` Martin Fick
2012-12-29  8:10     ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2012-12-29 22:18       ` Martin Fick
2012-12-30 17:03         ` Martin Fick
2012-12-31 10:30     ` Martin Fick
2013-01-03 23:52       ` Martin Fick
2013-01-04 17:52         ` Pyeron, Jason J CTR (US)
2013-01-04 18:01           ` Martin Fick [this message]
2013-01-04 21:28         ` Lockless Refs? Junio C Hamano
2013-01-05 16:12       ` Lockless Refs? (Was [PATCH] refs: do not use cached refs in repack_without_ref) Jeff King
2013-01-22  4:31         ` Drew Northup
2012-12-29  7:16   ` [PATCH] refs: do not use cached refs in repack_without_ref Jeff King
     [not found]     ` <201301071109.12086.mfick@codeaurora.org>
2013-01-07 18:14       ` Martin Fick

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=201301041101.02756.mfick@codeaurora.org \
    --to=mfick@codeaurora.org \
    --cc=git@vger.kernel.org \
    --cc=jason.j.pyeron.ctr@mail.mil \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.