Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Junio C Hamano <gitster@pobox.com>
To: Duy Nguyen <pclouds@gmail.com>
Cc: Ramsay Jones <ramsay@ramsay1.demon.co.uk>,
	David Turner <dturner@twopensource.com>,
	Git Mailing List <git@vger.kernel.org>,
	David Turner <dturner@twitter.com>
Subject: Re: [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit
Date: Tue, 15 Jul 2014 09:45:44 -0700	[thread overview]
Message-ID: <xmqqbnsqwoxz.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <20140715102314.GA8273@lanh> (Duy Nguyen's message of "Tue, 15 Jul 2014 17:23:14 +0700")

Duy Nguyen <pclouds@gmail.com> writes:

> On Mon, Jul 14, 2014 at 11:38:06PM -0700, Junio C Hamano wrote:
>> On Mon, Jul 14, 2014 at 7:15 PM, Duy Nguyen <pclouds@gmail.com> wrote:
>> >
>> > It makes me wonder if a cleaner way of rebuilding cache-treei in this
>> > case is from git-add--interactive.perl, or by simply spawn "git
>> > update-index --rebuild-cache-tree" after running
>> > git-add--interactive.perl.
>> 
>> We could check if the cache-tree has fully been populated by
>> "add -i" and limit the rebuilding by "git commit -p" main process,
>> but if "add -i" did not do so, there is no reason why "git commit -p"
>> should not do so, without relying on the implementation of "add -i"
>> to do so.
>> 
>> At least to me, what you suggested sounds nothing more than
>> a cop-out; instead of lifting the limitation of the API by enhancing
>> it with reopen-lock-file, punting to shift the burden elsewhere. I
>> am not quite sure why that is cleaner, but perhaps I am missing
>> something.
>
> Reopen-lock-file to me does not sound like an enhancement, at least in
> the index update context. We have always updated the index by writing
> to a new file, then rename.

Time to step back and think, I think.

What is the real point of "writing into *.lock and renaming"?  It
serves two purposes: (1) everybody adheres to that convention---if
we managed to take the lock "index.lock", nobody else will compete
and interfere with us until we rename it to "index". (2) in the
meantime, others can still read the old contents in the original
"index".

With "take lock", "write to a temporary", "commit by rename or
rollback by delete", we can have the usual transactional semantics.
While we are working on it, nobody is allowed to muck with the same
file, because everybody pays attention to *.lock.  People will not
see what we are doing until we release the lock because we are not
writing into the primary file.  And people can see what we did in
its entirety once we are done because we close and rename to commit
our changes atomically.

Think what CLOSE_LOCK adds to that and you would appreciate its
usefulness and at the same time realize its limitation.  By allowing
us to flush what we wrote to the disk without releasing the lock, we
can give others (e.g. subprograms we spawn) controlled access to the
new contents we prepared before we commit the result to the outside
world.  The access is controlled because we are in control when we
tell these others to peek into or update the temporary file "*.lock".

The original implementaiton of CLOSE_LOCK is limited in that you can
do so only once; you take a lock, you do your thing, you close, you
let (one or more) others see it, and you commit (or rollback).  You
cannot do another of your thing once you close with the original
implementation because there was no way to reopen.

What do you gain by your proposal to lock "index.lock" file?  We
know we already have "index.lock", so nobody should be competing on
mucking with its contents with us and we gain nothing by writing
into index.lock.lock and renaming it to index.lock.  We are in total
control of the lifetime of index.lock, when we spawn subprograms on
it to let them write to it, when we ourselves write to it, when we
spawn subprograms on it to let them read from it, all under the lock
we have on the "index", i.e. "index.lock".

The only thing use of another temporary file (that does not have to
be a lock on "index.lock", by the way, because we have total control
over when and who updates the file while we hold the "index.lock")
would give you is that it allows you to make the success of the "do
another of your thing" step optional.  While holding the lock, you
close and let "add -i" work on it, and after it returns, instead of
reopening, you write into yet another "index.lock.lock", expecting
that it might fail and when it fails you can roll it back, leaving
the contents "add -i" left in "index.lock" intact.  If you do not
use the extra temporary file, you start from "index.lock" left by
"add -i", write the updated index into "index.lock" and if you fail
to write, you have to roll back the entire "index"---you lose the
option to use the index left by "add -i" without repopulated
cache-tree.  But in the index update context, I do not think such a
complexity is not necessary.  If something fails, we should fail and
roll back the entire "index".

> Now if the new write_locked_index() blows
> up after reopen_lock_file(), we have no way but die() because
> index_lock.filename can no longer be trusted.

If that is the case, lock.filename[] is getting clobbered by
somebody too early, isn't it?  I think Michael's mh/lockfile topic
(dropped from my tree due to getting stale without rerolling) used a
separate flag to indicate the validity without mucking with the
filename member, and we may want to do something like that as the
right solution to that problem.

next prev parent reply	other threads:[~2014-07-15 16:45 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-07-12  4:44 [PATCH v8 1/4] cache-tree: Create/update cache-tree on checkout David Turner
2014-07-12  4:44 ` [PATCH v8 2/4] test-dump-cache-tree: invalid trees are not errors David Turner
2014-07-12  4:44 ` [PATCH v8 3/4] cache-tree: subdirectory tests David Turner
2014-07-12  4:44 ` [PATCH v8 4/4] cache-tree: Write updated cache-tree after commit David Turner
2014-07-13  5:09   ` Duy Nguyen
2014-07-14 15:54     ` Junio C Hamano
2014-07-14 17:33       ` Ramsay Jones
2014-07-14 17:51         ` Junio C Hamano
2014-07-14 18:41           ` Ramsay Jones
2014-07-14 22:16             ` Junio C Hamano
2014-07-14 22:32               ` David Turner
2014-07-15  2:15               ` Duy Nguyen
2014-07-15  6:38                 ` Junio C Hamano
2014-07-15 10:23                   ` Duy Nguyen
2014-07-15 16:45                     ` Junio C Hamano [this message]
2014-07-16 10:18                       ` Duy Nguyen
2014-07-16 17:33                         ` Junio C Hamano
2014-07-14 17:43       ` Junio C Hamano
2014-07-14 20:19         ` [PATCH v2] lockfile: allow reopening a closed but still locked file Junio C Hamano
2014-08-31 12:07 ` [PATCH v8 1/4] cache-tree: Create/update cache-tree on checkou John Keeping
2014-09-01 20:49   ` David Turner
2014-09-01 22:13     ` John Keeping
2014-09-02 20:52   ` Junio C Hamano
2014-09-02 21:10     ` Junio C Hamano
2014-09-02 21:24       ` [PATCH] cache-tree: propagate invalidation up when punting Junio C Hamano
2014-09-02 22:39         ` [PATCH v2] cache-tree: do not try to use an invalidated subtree info to build a tree Junio C Hamano
2014-09-03  2:56           ` David Turner
2014-09-03 12:02           ` Eric Sunshine

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=xmqqbnsqwoxz.fsf@gitster.dls.corp.google.com \
    --to=gitster@pobox.com \
    --cc=dturner@twitter.com \
    --cc=dturner@twopensource.com \
    --cc=git@vger.kernel.org \
    --cc=pclouds@gmail.com \
    --cc=ramsay@ramsay1.demon.co.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.