From: David Turner <dturner@twopensource.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Michael Haggerty <mhagger@alum.mit.edu>,
git@vger.kernel.org, chriscool@tuxfamily.org, pclouds@gmail.com
Subject: Re: [PATCH v3 2/4] path: optimize common dir checking
Date: Fri, 14 Aug 2015 16:04:04 -0400 [thread overview]
Message-ID: <1439582644.8855.89.camel@twopensource.com> (raw)
In-Reply-To: <xmqqtws1iyxn.fsf@gitster.dls.corp.google.com>
On Fri, 2015-08-14 at 10:04 -0700, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
> > Let's take a step back.
> >
> > We have always had a ton of code that uses `git_path()` and friends to
> > convert abstract things into filesystem paths. Let's take the
> > reference-handling code as an example:
> > ...
> > This seems crazy to me. It is the *reference* code that should know
> > whether a particular reference should be stored under `$GIT_DIR` or
> > `$GIT_COMMON_DIR`, or indeed whether it should be stored in a database.
>
> It is more like:
>
> 1. The system as a whole should decide if HEAD and refs/heads/
> should be per workspace or shared across a repository (and we say
> the former should be per workspace, the latter should be shared).
>
> 2. The reference code should decide which ref-backend is used to
> store refs.
>
> 3. And any ref-backend should follow the decision made by the
> system as a whole in 1.
>
> I'd imagine that David's ref-backend code inherited from Ronnie
> would still accept the string "refs/heads/master" from the rest of
> the system (i.e. callers that call into the ref API) to mean "the
> ref that represents the 'master' branch", and uses that as the key
> to decide "ok, that is shared across workspaces" to honor the
> system-wide decision made in 1. The outside callers wouldn't pass
> the result of calling git_path("refs/heads/master") into the ref
> API, which may expand to "$somewhere_else/refs/heads/master" when
> run in a secondary workspace to point at the common location.
>
> I'd also imagine that the workspace API would give ways for the
> implementation of the reference API to ask these questions:
>
> - which workspace am I operating for? where is the "common" thing?
> how would I identify this workspace among the ones that share the
> same "common" thing?
>
> - is this ref (or ref-like thing) supposed to be in common or per
> workspace?
>
> I agree with you that there needs an intermediate level, between
> what Duy and David's git_path() does (i.e. which gives the final
> result of deciding where in the filesystem the thing should go) and
> your two stupid functions would do (i.e. knowing which kind the
> thing is, give you the final location in the filesystem). That is,
> to let the caller know if the thing is supposed to be shared or per
> workspace in the first place.
>
> With that intermediate level function, a database-based ref-backend
> could make ('common', ref/heads/master) as the key for the current
> value of the master branch and (workspace-name, HEAD) as the key for
> the current value of the HEAD for a given workspace.
>
> > We should have two *stupid* functions, `git_workspace_path()` and
> > `git_common_path()`, and have the *callers* decide which one to call.
>
> So I think we should have *three* functions:
>
> - git_workspace_name(void) returns some name that uniquely
> identifies the current workspace among the workspaces linked to
> the same repository.
Random side note: the present workspace path name component is not
acceptable for this if alternate ref backends use a single db for
storage across all workspaces. That's because you might create a
workspace at foo, then manually rm -r it, and then create a new one also
named foo. The database wouldn't know about this series of events, and
would then have stale per-workspace refs for foo.
That said, with my lmdb backend, I've been falling back to the files
backend for per-workspace refs. This also means I don't have to worry
about expiring per-workspace refs when a workspace is removed.
I could change this, but IIRC, there are a fair number of things that
care about the existence of a file called HEAD, so the fallback was
easier. (That is, the other way was a giant hassle).
> - is_common_thing(const char *) takes a path (that is relative to
> $GIT_DIR where the thing would have lived at in a pre-workspace
> world), and tells if it is a common thing or a per-workspace
> thing.
>
> - git_path() can stay the external interface and can be thought of:
>
> git_path(const char *path)
> {
> if (is_common_thing(path))
> return git_common_path(path);
> return git_workspace_path(git_workspace_name(), path);
> }
>
> if you think in terms of your two helpers.
>
> But I do not think that git_common_path() and git_workspace_path()
> need to be called from any other place in the system, and that is
> the reason why I did not say we should have four functions (or five,
> counting git_path() itself).
I wrote an email arguing for Michael's position on this, and by the time
I was done writing it, I had come around to more-or-less Junio's
position.
My argument for keeping git_path as the external interface is this: in
the multiple worktree world, $GIT_DIR is effectively an overlay
filesystem. It's not a complete overlay API: e.g. we don't have a
git_path_opendir function that special-cases refs/ to add in
refs/worktree. But it handles the common cases.
It is true that even if we had a complete overlay API, it would not be
sufficient to hide all of the complexity of per-worktree refs (the files
ref backend still needs to know not to pack per-worktree refs). But
that is equally true if we have the refs code call
git_common_path/git_worktree_path. So we're not successfully hiding
details by using git_common_path/git_worktree_path in refs.c.
For this patch series, I don't think we need to change anything (except
that I realized that I forgot to add logs/refs/worktree to refs, and
people will probably find some issues once they start reviewing the
details of my code). In the present code, adjust_git_path already
handles the git_common_path vs git_workspace_path thing; it just does it
in a slightly less elegant way than Junio's proposal. Implementing
Junio's proposal would not affect this series; it would just be an
additional patch on top (or beforehand).
There is one case where the refs code will probably need to directly
call git_workspace_path: when we fix git prune to handle detached HEADs
in workspaces. It could just set the workspace and then call git_path,
but that is less elegant. So I think when we fix that (which should
probably wait on for_each_worktree), we can implement Junio's proposal.
next prev parent reply other threads:[~2015-08-14 20:04 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-12 21:57 [PATCH v3 1/4] refs: clean up common_list David Turner
2015-08-12 21:57 ` [PATCH v3 2/4] path: optimize common dir checking David Turner
2015-08-12 22:48 ` Junio C Hamano
2015-08-13 9:05 ` Michael Haggerty
2015-08-14 17:04 ` Junio C Hamano
2015-08-14 20:04 ` David Turner [this message]
2015-08-14 20:27 ` Junio C Hamano
2015-08-14 20:54 ` David Turner
2015-08-15 18:20 ` Michael Haggerty
2015-08-15 18:12 ` Michael Haggerty
2015-08-17 15:55 ` Junio C Hamano
2015-08-15 7:59 ` Duy Nguyen
2015-08-16 5:04 ` David Turner
2015-08-16 12:20 ` Duy Nguyen
2015-08-12 21:57 ` [PATCH v3 3/4] refs: make refs/worktree/* per-worktree David Turner
2015-08-13 17:15 ` Eric Sunshine
2015-08-13 17:41 ` David Turner
2015-08-13 20:16 ` Michael Haggerty
2015-08-13 20:32 ` David Turner
2015-08-14 8:18 ` Michael Haggerty
2015-08-14 17:10 ` Junio C Hamano
2015-08-15 8:04 ` Duy Nguyen
2015-08-12 21:57 ` [PATCH v3 4/4] bisect: make bisection refs per-worktree David Turner
2015-08-15 7:44 ` [PATCH v3 1/4] refs: clean up common_list Duy Nguyen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1439582644.8855.89.camel@twopensource.com \
--to=dturner@twopensource.com \
--cc=chriscool@tuxfamily.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=mhagger@alum.mit.edu \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).