From: Michael Haggerty <mhagger@alum.mit.edu>
To: Junio C Hamano <gitster@pobox.com>
Cc: Jeff King <peff@peff.net>, Duy Nguyen <pclouds@gmail.com>,
Johannes Sixt <j6t@kdbg.org>,
Git Mailing List <git@vger.kernel.org>,
David Turner <dturner@twopensource.com>
Subject: Re: git gc and worktrees
Date: Thu, 2 Jun 2016 06:08:06 +0200
Message-ID: <574FB126.4090805@alum.mit.edu>
In-Reply-To: <xmqqmvn4y9zq.fsf@gitster.mtv.corp.google.com>
On 06/01/2016 09:39 PM, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
>> I argue that the fundamental concept in terms of the implementation
>> should be the individual physical reference stores, and these should be
>> compounded together to form the logical reference collections and the
>> sets of reachability roots that are interesting at the UI level.
>
> That is very good in principle. How does that principle translate
> to the current setup (with possible enhancement with pluggable ref
> backends) and multiple worktrees? Let me try thinking it through
> aloud.
>
> * Without pluggable ref backend or worktrees, we start from two
> "physical reference stores"; packed-refs file lists refs that
> will be covered (overridden) by loose refs in .git/refs/.
> Symbolic refs always being in loose falls out as a natural
> consequence that packed-refs file does not record symrefs.
>
> * Throw in multiple worktrees to the mix. How? Do we consider
> selected refs/ hierarchies (like refs/bisect/*) as separate
> physical store (even though it might be backed by the files in
> the same .git/refs/ filesystem hierarchy) and represent the
> "logical" view as an overlay across the traditional two types of
> physical reference stores? That is:
>
> - loose refs in .git/HEAD, .git/refs/{bisect,...} for
> per-worktree part form one physical store. If a ref is found
> here, that is what we use as a part of the logical view.
>
> - loose refs in .git/refs/{branches,tags,notes,...} for common
> part form one physical store. For a ref that is not found
> above but is found here becomes a part of the logical view.
>
> - packed refs in .git/packed-refs is another physical store. For
> a ref that is not found in the above two but is found here
> becomes a part of the logical view.
I think I would represent the logical store of a worktree repo as
follows. First, I would implement a cached_ref_store that introduces a
layer of caching around another ref_store. Then
    def get_files_ref_store(dir) {
        loose = create_cached_ref_store(get_loose_ref_store(dir))
        packed = create_cached_ref_store(get_packed_ref_store(dir))
        return create_files_ref_store(loose, packed)
    }
    common_ref_store = get_files_ref_store(common_dir)
    /*
     * I think we only allow loose refs in worktrees; otherwise
     * this could be an overlay_ref_store too. Actually, we might
     * want to omit the caching here.
     */
    local_ref_store = create_cached_ref_store(
            get_loose_ref_store(git_dir))
    logical_ref_store = create_worktree_ref_store(
            local_ref_store, common_ref_store)
Where worktree_ref_store does something like
    if (is_per_worktree_ref(refname))
        lookup in local_ref_store
    else
        lookup in common_ref_store
for reading, and uses a merge_ref_iterator with a select function that
does something similar for iterating.
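
As an illustration only, the read path described above might look like the
following Python sketch. It is not code from git: DictRefStore, the lookup()
and iterate() method names, and the simplified is_per_worktree_ref() rule
(only HEAD and refs/bisect/) are all assumptions made for this example, and
heapq.merge stands in for the merge_ref_iterator with its select function.

```python
import heapq
from typing import Dict, Iterator, Optional, Tuple


class DictRefStore:
    """Toy in-memory ref store (hypothetical): refname -> object id."""

    def __init__(self, refs: Optional[Dict[str, str]] = None) -> None:
        self.refs = dict(refs or {})

    def lookup(self, refname: str) -> Optional[str]:
        return self.refs.get(refname)

    def iterate(self) -> Iterator[Tuple[str, str]]:
        # Yield entries sorted by refname so that streams can be merged.
        return iter(sorted(self.refs.items()))


def is_per_worktree_ref(refname: str) -> bool:
    # Simplified assumption for this sketch, not git's complete rule.
    return refname == "HEAD" or refname.startswith("refs/bisect/")


class WorktreeRefStore:
    """Routes each refname to the local or the common store."""

    def __init__(self, local: DictRefStore, common: DictRefStore) -> None:
        self.local = local
        self.common = common

    def lookup(self, refname: str) -> Optional[str]:
        store = self.local if is_per_worktree_ref(refname) else self.common
        return store.lookup(refname)

    def iterate(self) -> Iterator[Tuple[str, str]]:
        # The "select" step: keep an entry only if it comes from the store
        # that owns its name, then merge the two sorted streams.
        local = (e for e in self.local.iterate() if is_per_worktree_ref(e[0]))
        common = (e for e in self.common.iterate()
                  if not is_per_worktree_ref(e[0]))
        return heapq.merge(local, common)
```

In this sketch a branch that somehow ends up in the local store, or a HEAD
left in the common store, is simply invisible through the logical store.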
The files_ref_store would do lookups by looking first in the
loose_ref_store then in the packed_ref_store, would use an
overlay_ref_iterator for iteration, and would know to do all writes in
the loose_ref_store (except for deletes, which also have to go to
packed_ref_store). It would have a special "pack-refs" operation,
specific to files_ref_store, that shuffles references between its two
backends.
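
The behaviour described for files_ref_store can be sketched in Python as
follows. This is a toy model under stated assumptions, not git's
implementation: the two backends are plain dicts, the method names are
hypothetical, the overlay is computed eagerly instead of with a streaming
overlay_ref_iterator, and pack_refs() moves every loose ref, which is cruder
than what the real pack-refs does.

```python
from typing import Dict, Iterator, Optional, Tuple


class FilesRefStore:
    """Loose refs overlaid on packed refs (hypothetical toy model)."""

    def __init__(self, loose: Dict[str, str], packed: Dict[str, str]) -> None:
        self.loose = loose
        self.packed = packed

    def lookup(self, refname: str) -> Optional[str]:
        # Look in the loose store first, then fall back to the packed store.
        if refname in self.loose:
            return self.loose[refname]
        return self.packed.get(refname)

    def iterate(self) -> Iterator[Tuple[str, str]]:
        # Overlay semantics: a loose entry hides a packed entry of the
        # same name.
        merged = dict(self.packed)
        merged.update(self.loose)
        return iter(sorted(merged.items()))

    def update(self, refname: str, oid: str) -> None:
        # All ordinary writes go to the loose store.
        self.loose[refname] = oid

    def delete(self, refname: str) -> None:
        # A delete must reach both stores; otherwise the packed value
        # would reappear once the loose one is gone.
        self.loose.pop(refname, None)
        self.packed.pop(refname, None)

    def pack_refs(self) -> None:
        # Shuffle references from the loose backend into the packed one.
        self.packed.update(self.loose)
        self.loose.clear()
```

The delete() method shows why deletes are the exception to "all writes go to
loose": removing only the loose copy would expose a stale packed value.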
Writing to a worktree_ref_store is a bit trickier, because we want to
allow ref_transactions to span worktree and common refs (though we
probably need to give up atomicity for any such transaction). The
worktree_ref_transaction_commit() method has to split the main
transaction into two sub-transactions, one for each of its component
ref_stores. I planned for this when designing split_under_lock and think
it is possible, though I admit I haven't implemented it yet.
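
To make the splitting step concrete, here is a small Python sketch. It is an
assumption-laden illustration, not the unimplemented design itself and not
split_under_lock: stores are plain dicts, an update is a (refname, new value)
pair with None meaning delete, the per-worktree rule is simplified to HEAD and
refs/bisect/, and the order of the two sub-commits is chosen arbitrarily.

```python
from typing import Dict, List, Optional, Tuple

# (refname, new value); a value of None means "delete this ref".
Update = Tuple[str, Optional[str]]


def is_per_worktree_ref(refname: str) -> bool:
    # Simplified assumption for this sketch, not git's complete rule.
    return refname == "HEAD" or refname.startswith("refs/bisect/")


def split_transaction(updates: List[Update]) -> Tuple[List[Update], List[Update]]:
    """Partition updates into (per-worktree, common) sub-transactions."""
    local: List[Update] = []
    common: List[Update] = []
    for update in updates:
        target = local if is_per_worktree_ref(update[0]) else common
        target.append(update)
    return local, common


def apply_updates(store: Dict[str, str], updates: List[Update]) -> None:
    """Stand-in for committing one sub-transaction to one ref store."""
    for refname, value in updates:
        if value is None:
            store.pop(refname, None)
        else:
            store[refname] = value


def worktree_transaction_commit(local_store: Dict[str, str],
                                common_store: Dict[str, str],
                                updates: List[Update]) -> None:
    local_updates, common_updates = split_transaction(updates)
    # Two independent commits: if the second one failed, the first would
    # already be applied, so the combined transaction is not atomic.
    apply_updates(common_store, common_updates)
    apply_updates(local_store, local_updates)
```

A transaction that touches only one kind of ref yields one empty
sub-transaction, so it keeps whatever atomicity the underlying store offers.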
One nice thing about this design is that you can skip the
worktree_ref_store layer and its overhead entirely for repositories that
are not linked. The decision can be made once, at instantiation time,
rather than every time a reference is looked up. See the pseudocode below.
> Up to this point, I am all for your "separate physical stores are
> composited to give a logical view". I can see how multi-worktree
> world view fits within that framework.
>
> * With pluggable ref backend, we may gain yet another "physical
> reference store" possibility, e.g. one backed by lmdb. If it
> supports symrefs, a repository may use lmdb backed reference store
> without the traditional two.
>
> But it is unclear how it would interact with the multi-worktree
> world order.
Since you could plug-and-play different ref_stores in the above scheme,
I don't see any problem here.
    def get_logical_ref_store() {
        local_ref_store = get_local_ref_store(git_dir)
        if (is_linked_repo) {
            common_ref_store = get_ref_store(common_dir)
            return worktree_ref_store(local_ref_store,
                                      common_ref_store)
        } else {
            return local_ref_store;
        }
    }
get_ref_store() would read the git config to decide which ref store to
use for the specified repository; that store might itself be an
lmdb_ref_store or an overlay_ref_store(loose_ref_store, packed_ref_store).
Michael