git.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Jeff King <peff@peff.net>
To: Michael Haggerty <mhagger@alum.mit.edu>
Cc: git@vger.kernel.org
Subject: Re: [PATCH 13/16] prune: keep objects reachable from recent objects
Date: Wed, 8 Oct 2014 03:19:34 -0400	[thread overview]
Message-ID: <20141008071934.GA25250@peff.net> (raw)
In-Reply-To: <543414CC.9020104@alum.mit.edu>

On Tue, Oct 07, 2014 at 06:29:00PM +0200, Michael Haggerty wrote:

> I haven't read all of the old code, but if I understand correctly this
> is your new algorithm:
> 
> 1. Walk from all references etc., marking reachable objects.
> 
> 2. Iterate over *all* objects, in no particular order, skipping the
>    objects that are already known to be reachable. Use any unreachable
>    object that has a recent mtime as a tip for a second traversal that
>    marks all of its references as "to-keep".
> 
> 3. Iterate over any objects that are not marked "to-keep". (I assume
>    that this iteration is in no particular order.) For each object:
> 
>    * [Presumably] verify that its mtime is still "old"
>    * If so, prune the object

Yes, that's more or less accurate. The iteration is in readdir() order
on the filesystem.

We do verify that the mtime is still "old" in the final iteration, but
that is mostly because the existing check was left in. In theory any
recent objects would have been caught in step 2 and marked as "to-keep"
already. Anything we find in step 3 would have to have been racily
created or freshened.

> I see some problems with this.
> 
> * The one that you mentioned in your cover letter, namely that prune's
>   final mtime check is not atomic with the object deletion. I agree
>   that this race is so much shorter than the others that we can accept
>   a solution that doesn't eliminate it, so let's forget about this one.

Right, I don't see an easy way around this.

> * If the final mtime check fails, then the object is recognized as new
>   and not pruned. But does that prevent its referents from being pruned?
> 
>   * When this situation is encountered, you would have to start another
>     object traversal starting at the "renewed" object to mark its
>     referents "to-keep". I don't see that you do this. Another, much
>     less attractive alternative would be to abort the prune operation
>     if this situation arises. But even if you do one of these...
> 
>   * ...assuming the iteration in step 3 is in no defined order, a
>     referent might *already* have been pruned before you notice the
>     "renewed" object.
> 
> So although your changes are a big improvement, it seems to me that they
> still leave a race with a window approximately as long as the time it
> takes to scan and prune the unreachable objects.

Correct. There is a delay between marking objects and deleting them.
This goes for both the existing reachability checks and the new "recent
reachability" check.

As noted above, if we see a fresh object in the final check, then we
know that it was newly freshened (or created). We could then do an
additional traversal with it as the tip, to get a slightly more accurate
view of the world.

The obvious problem as you note is that we may already have deleted its
referents. But let's leave that aside for a moment.

But even if we fix that problem, I don't think traversing again can
eliminate the race. Another process may be freshening or creating
objects after we have processed them (or their containing directories).
So you can catch _some_ cases by re-traversing, but there will always be
cases where we delete an object that has just now become recent (or
reachable, for that matter, if somebody updates the refs).

The obvious solution is to have some atomic view of the object and ref
namespace (i.e., a global write lock that says "I'm pruning, nobody
write"). But that sucks for obvious reasons. I feel like there must be
some more clever solution, but it eludes me. Surely this is something
database people solved 30 years ago. :)

> I think that the only way to get rid of that race is to delete objects
> in parent-to-child order; in other words, *only prune an object after
> all objects that refer to it have been pruned*. This could be done by
> maintaining reference counts of the to-be-pruned objects and only
> deleting an object once its reference count is zero.

Yes, if you are going to traverse again, you would want to delete in
parent-child order. I'm not convinced that traversing again is worth it;
it's trying to shorten the window, but it can't eliminate it. And my
goal here was never to eliminate the race. It was to keep races to
"simultaneous reference and prune is a problem", and not "ongoing
unbounded operations and simultaneous prune are a problem". And I do not
claim to eliminate the possibility of referents going missing; only to
try to close some obvious and easy holes where it happens.

> Let's take the simpler case first. Suppose I run the following command
> between steps 1 and 3:
> 
>     git update-ref refs/heads/newbranch $COMMIT
> 
> , where $COMMIT is a previously-unreachable object. This doesn't affect
> the mtime of $COMMIT, does it? So how does prune know that it shouldn't
> delete $COMMIT?
> 
> -> So ISTM that updating a reference (or any other traversal starting
> point, like the index) must freshen the mtime of any object newly
> referred to.

_If_ the deletion of the object and the checking of its mtime were
atomic, that would be useful to do. But it's not. Before my patch, you
have one way of "saving" the object (and its referents): making it
reachable from a ref. After my patch, you have the additional option of
updating its mtime.

But why bother with the mtime? You can just make it reachable by
updating the ref. Both are racy, but we cannot help that, so one is as
good as the other.

> A more complicated case: suppose I create a new $COMMIT referring to an
> old $TREE during step 2, *after* prune has scanned the directory that
> now contains $COMMIT. (I.e., the scan in step 2 never notices $COMMIT.)
> Then I create a new reference pointing at $COMMIT. (I.e., the scan in
> step 1 never noticed that the reference exists.) None of this affects
> the mtime of $TREE, does it? So how does prune know that it mustn't
> prune $TREE?

It doesn't, and you are screwed. :)

You could freshen the referents here, but you are still racy. Just as
you might miss the creation of $COMMIT, you might miss the freshening of
$TREE and delete it.

Making the mtimes race-free requires an atomic check-timestamp-and-delete.
And without that, I'm not sure that shortening the race from 50 system
calls to 3 system calls is worth the additional complexity. If we had
such an atomic operation, even on only a subset of systems, it might
be worth it. But I do not know of any filesystem or system call that can
do so.

> I hope I understood that all correctly...

I think your analysis is all correct. The open question is whether it's
worth trying to shrink the last bits of raciness or not (or even whether
there is a clever way of eliminating them that I haven't considered).

-Peff

  reply	other threads:[~2014-10-08  7:19 UTC|newest]

Thread overview: 58+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
2014-10-03 20:21 ` [PATCH 01/16] foreach_alt_odb: propagate return value from callback Jeff King
2014-10-03 22:55   ` René Scharfe
2014-10-04  0:31     ` Jeff King
2014-10-03 20:22 ` [PATCH 02/16] isxdigit: cast input to unsigned char Jeff King
2014-10-03 20:22 ` [PATCH 03/16] object_array: factor out slopbuf-freeing logic Jeff King
2014-10-07 11:25   ` Michael Haggerty
2014-10-08  7:36     ` Jeff King
2014-10-08  8:40       ` Michael Haggerty
2014-10-08  8:55         ` Jeff King
2014-10-03 20:22 ` [PATCH 04/16] object_array: add a "clear" function Jeff King
2014-10-03 20:23 ` [PATCH 05/16] clean up name allocation in prepare_revision_walk Jeff King
2014-10-03 20:24 ` [PATCH 06/16] reachable: clear pending array after walking it Jeff King
2014-10-03 20:25 ` [PATCH 07/16] t5304: use test_path_is_* instead of "test -f" Jeff King
2014-10-03 22:12   ` Junio C Hamano
2014-10-03 20:27 ` [PATCH 08/16] t5304: use helper to report failure of "test foo = bar" Jeff King
2014-10-03 22:17   ` Junio C Hamano
2014-10-04  0:13     ` Jeff King
2014-10-07 13:21   ` Michael Haggerty
2014-10-07 17:29     ` Junio C Hamano
2014-10-07 20:18       ` Jeff King
2014-10-07 20:35         ` Junio C Hamano
2014-10-07 21:29           ` Jeff King
2014-10-07 21:53             ` Junio C Hamano
2014-10-07 22:17               ` Michael Haggerty
2014-10-08  1:13                 ` Jeff King
2014-10-08 16:58                   ` Junio C Hamano
2014-10-07 21:16         ` Junio C Hamano
2014-10-03 20:29 ` [PATCH 09/16] prune: factor out loose-object directory traversal Jeff King
2014-10-03 22:19   ` Junio C Hamano
2014-10-04  0:24     ` Jeff King
2014-10-07 14:07   ` Michael Haggerty
2014-10-08  7:33     ` Jeff King
2014-10-03 20:31 ` [PATCH 10/16] count-objects: do not use xsize_t when counting object size Jeff King
2014-10-03 20:31 ` [PATCH 11/16] count-objects: use for_each_loose_file_in_objdir Jeff King
2014-10-03 20:32 ` [PATCH 12/16] sha1_file: add for_each iterators for loose and packed objects Jeff King
2014-10-05  8:15   ` René Scharfe
2014-10-05 10:47     ` Ramsay Jones
2014-10-03 20:39 ` [PATCH 13/16] prune: keep objects reachable from recent objects Jeff King
2014-10-03 21:47   ` Junio C Hamano
2014-10-04  0:09     ` Jeff King
2014-10-04  0:30     ` Jeff King
2014-10-04  3:04       ` Junio C Hamano
2014-10-07 16:29   ` Michael Haggerty
2014-10-08  7:19     ` Jeff King [this message]
2014-10-08 10:37       ` Michael Haggerty
2014-10-03 20:39 ` [PATCH 14/16] pack-objects: refactor unpack-unreachable expiration check Jeff King
2014-10-03 20:40 ` [PATCH 15/16] pack-objects: match prune logic for discarding objects Jeff King
2014-10-03 20:41 ` [PATCH 16/16] write_sha1_file: freshen existing objects Jeff King
2014-10-03 21:29   ` Junio C Hamano
2014-10-04  0:01     ` Jeff King
2014-10-05  9:12   ` René Scharfe
2014-10-03 22:20 ` [PATCH 0/16] make prune mtime-checking more careful Junio C Hamano
2014-10-04 22:22 ` Junio C Hamano
2014-10-05  9:19   ` René Scharfe
2014-10-06  1:42   ` Jeff King
2014-10-08  8:31     ` Jeff King
2014-10-08 17:03       ` Junio C Hamano

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20141008071934.GA25250@peff.net \
    --to=peff@peff.net \
    --cc=git@vger.kernel.org \
    --cc=mhagger@alum.mit.edu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).