From: Junio C Hamano <gitster@pobox.com>
To: Jeff King <peff@peff.net>
Cc: git@vger.kernel.org, Michael Haggerty <mhagger@alum.mit.edu>
Subject: Re: [PATCH 13/16] prune: keep objects reachable from recent objects
Date: Fri, 03 Oct 2014 14:47:57 -0700 [thread overview]
Message-ID: <xmqq1tqolt9u.fsf@gitster.dls.corp.google.com> (raw)
In-Reply-To: <20141003203931.GM16293@peff.net> (Jeff King's message of "Fri, 3 Oct 2014 16:39:31 -0400")
Jeff King <peff@peff.net> writes:
> Instead, this patch pushes the extra work onto prune, which
> runs less frequently (and has to look at the whole object
> graph anyway). It creates a new category of objects: objects
> which are not recent, but which are reachable from a recent
> object. We do not prune these objects, just like the
> reachable and recent ones.
>
> This lets us avoid the recursive check above, because if we
> have an object, even if it is unreachable, we should have
> its referent:
>
> - if we are creating new objects, then we cannot create
> the parent object without having the child
>
> - and if we are pruning objects, will not prune the child
> if we are keeping the parent
Sorry but this part is beyond a simple panda brain.
I can understand this
If we have an object, even if it is unreachable, we
should have its referent.
as a description of the desired behaviour. If we have a tree that
is unreachable, we must make sure that we do not discard blobs that
are reachable from that tree, or we would end up corrupting our
repository if we ever allow that tree to become reachable from our
refs later.
But how does that connect to these two bullet points?
> - if we are creating new objects, then we cannot create
> the parent object without having the child
We cannot create the parent (e.g. "tree") without having the child
(e.g. "blob that is referred to by the tree we are creating").
So this bullet point is repeating the same thing?
> - and if we are pruning objects, will not prune the child
> if we are keeping the parent
We will not prune "blob" that are reachable from a "tree" that we
are not yet ready to prune. So this again is repeating the same
thing?
But these are "this is how we want our system to behave". And if we
assume our system behaves like so, then prune would be safe.
But it is unclear how that behaviour is realized. Puzzled...
... goes and thinks ...
With this patch applied, the system will not prune unreachable old
objects that are reachable from a recent object (the recent object
itself may or may not be reachable but that does not make any
difference). And that is sufficient to ensure the integrity of the
repository even if you allow new objects to be created reusing any
of these unreachable objects that are left behind by prune, because
the reachability check done during prune (with this patch applied)
makes sure any object left in the repository can safely be used as a
starting point of connectivity traversal.
Ok, I think I got it now, but then do we still need to utime(2) the
loose object files for unreachable objects that are referenced by
a recent object (which is done in a later patch), or is that purely
an optimization for the next round of gc where you would have more
recent objects (i.e. you do not have to traverse to find out an old
one is reachable from a new one, as there will be fewer old ones)?
next prev parent reply other threads:[~2014-10-03 21:48 UTC|newest]
Thread overview: 58+ messages / expand[flat|nested] mbox.gz Atom feed top
2014-10-03 20:20 [PATCH 0/16] make prune mtime-checking more careful Jeff King
2014-10-03 20:21 ` [PATCH 01/16] foreach_alt_odb: propagate return value from callback Jeff King
2014-10-03 22:55 ` René Scharfe
2014-10-04 0:31 ` Jeff King
2014-10-03 20:22 ` [PATCH 02/16] isxdigit: cast input to unsigned char Jeff King
2014-10-03 20:22 ` [PATCH 03/16] object_array: factor out slopbuf-freeing logic Jeff King
2014-10-07 11:25 ` Michael Haggerty
2014-10-08 7:36 ` Jeff King
2014-10-08 8:40 ` Michael Haggerty
2014-10-08 8:55 ` Jeff King
2014-10-03 20:22 ` [PATCH 04/16] object_array: add a "clear" function Jeff King
2014-10-03 20:23 ` [PATCH 05/16] clean up name allocation in prepare_revision_walk Jeff King
2014-10-03 20:24 ` [PATCH 06/16] reachable: clear pending array after walking it Jeff King
2014-10-03 20:25 ` [PATCH 07/16] t5304: use test_path_is_* instead of "test -f" Jeff King
2014-10-03 22:12 ` Junio C Hamano
2014-10-03 20:27 ` [PATCH 08/16] t5304: use helper to report failure of "test foo = bar" Jeff King
2014-10-03 22:17 ` Junio C Hamano
2014-10-04 0:13 ` Jeff King
2014-10-07 13:21 ` Michael Haggerty
2014-10-07 17:29 ` Junio C Hamano
2014-10-07 20:18 ` Jeff King
2014-10-07 20:35 ` Junio C Hamano
2014-10-07 21:29 ` Jeff King
2014-10-07 21:53 ` Junio C Hamano
2014-10-07 22:17 ` Michael Haggerty
2014-10-08 1:13 ` Jeff King
2014-10-08 16:58 ` Junio C Hamano
2014-10-07 21:16 ` Junio C Hamano
2014-10-03 20:29 ` [PATCH 09/16] prune: factor out loose-object directory traversal Jeff King
2014-10-03 22:19 ` Junio C Hamano
2014-10-04 0:24 ` Jeff King
2014-10-07 14:07 ` Michael Haggerty
2014-10-08 7:33 ` Jeff King
2014-10-03 20:31 ` [PATCH 10/16] count-objects: do not use xsize_t when counting object size Jeff King
2014-10-03 20:31 ` [PATCH 11/16] count-objects: use for_each_loose_file_in_objdir Jeff King
2014-10-03 20:32 ` [PATCH 12/16] sha1_file: add for_each iterators for loose and packed objects Jeff King
2014-10-05 8:15 ` René Scharfe
2014-10-05 10:47 ` Ramsay Jones
2014-10-03 20:39 ` [PATCH 13/16] prune: keep objects reachable from recent objects Jeff King
2014-10-03 21:47 ` Junio C Hamano [this message]
2014-10-04 0:09 ` Jeff King
2014-10-04 0:30 ` Jeff King
2014-10-04 3:04 ` Junio C Hamano
2014-10-07 16:29 ` Michael Haggerty
2014-10-08 7:19 ` Jeff King
2014-10-08 10:37 ` Michael Haggerty
2014-10-03 20:39 ` [PATCH 14/16] pack-objects: refactor unpack-unreachable expiration check Jeff King
2014-10-03 20:40 ` [PATCH 15/16] pack-objects: match prune logic for discarding objects Jeff King
2014-10-03 20:41 ` [PATCH 16/16] write_sha1_file: freshen existing objects Jeff King
2014-10-03 21:29 ` Junio C Hamano
2014-10-04 0:01 ` Jeff King
2014-10-05 9:12 ` René Scharfe
2014-10-03 22:20 ` [PATCH 0/16] make prune mtime-checking more careful Junio C Hamano
2014-10-04 22:22 ` Junio C Hamano
2014-10-05 9:19 ` René Scharfe
2014-10-06 1:42 ` Jeff King
2014-10-08 8:31 ` Jeff King
2014-10-08 17:03 ` Junio C Hamano
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=xmqq1tqolt9u.fsf@gitster.dls.corp.google.com \
--to=gitster@pobox.com \
--cc=git@vger.kernel.org \
--cc=mhagger@alum.mit.edu \
--cc=peff@peff.net \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.