From: David Turner <dturner@twopensource.com>
To: Junio C Hamano <gitster@pobox.com>
Cc: Michael Haggerty <mhagger@alum.mit.edu>,
git@vger.kernel.org, chriscool@tuxfamily.org, pclouds@gmail.com
Subject: Re: [PATCH v3 2/4] path: optimize common dir checking
Date: Fri, 14 Aug 2015 16:04:04 -0400 [thread overview]
Message-ID: <1439582644.8855.89.camel@twopensource.com> (raw)
In-Reply-To: <xmqqtws1iyxn.fsf@gitster.dls.corp.google.com>
On Fri, 2015-08-14 at 10:04 -0700, Junio C Hamano wrote:
> Michael Haggerty <mhagger@alum.mit.edu> writes:
>
> > Let's take a step back.
> >
> > We have always had a ton of code that uses `git_path()` and friends to
> > convert abstract things into filesystem paths. Let's take the
> > reference-handling code as an example:
> > ...
> > This seems crazy to me. It is the *reference* code that should know
> > whether a particular reference should be stored under `$GIT_DIR` or
> > `$GIT_COMMON_DIR`, or indeed whether it should be stored in a database.
>
> It is more like:
>
> 1. The system as a whole should decide if HEAD and refs/heads/
> should be per workspace or shared across a repository (and we say
> the former should be per workspace, the latter should be shared).
>
> 2. The reference code should decide which ref-backend is used to
> store refs.
>
> 3. And any ref-backend should follow the decision made by the
> system as a whole in 1.
>
> I'd imagine that David's ref-backend code inherited from Ronnie
> would still accept the string "refs/heads/master" from the rest of
> the system (i.e. callers that call into the ref API) to mean "the
> ref that represents the 'master' branch", and uses that as the key
> to decide "ok, that is shared across workspaces" to honor the
> system-wide decision made in 1. The outside callers wouldn't pass
> the result of calling git_path("refs/heads/master") into the ref
> API, which may expand to "$somewhere_else/refs/heads/master" when
> run in a secondary workspace to point at the common location.
>
> I'd also imagine that the workspace API would give ways for the
> implementation of the reference API to ask these questions:
>
> - which workspace am I operating for? where is the "common" thing?
> how would I identify this workspace among the ones that share the
> same "common" thing?
>
> - is this ref (or ref-like thing) supposed to be in common or per
> workspace?
>
> I agree with you that there needs an intermediate level, between
> what Duy and David's git_path() does (i.e. which gives the final
> result of deciding where in the filesystem the thing should go) and
> your two stupid functions would do (i.e. knowing which kind the
> thing is, give you the final location in the filesystem). That is,
> to let the caller know if the thing is supposed to be shared or per
> workspace in the first place.
>
> With that intermediate level function, a database-based ref-backend
> could make ('common', ref/heads/master) as the key for the current
> value of the master branch and (workspace-name, HEAD) as the key for
> the current value of the HEAD for a given workspace.
>
> > We should have two *stupid* functions, `git_workspace_path()` and
> > `git_common_path()`, and have the *callers* decide which one to call.
>
> So I think we should have *three* functions:
>
> - git_workspace_name(void) returns some name that uniquely
> identifies the current workspace among the workspaces linked to
> the same repository.
Random side note: the present workspace path name component is not
acceptable for this if alternate ref backends use a single db for
storage across all workspaces. That's because you might create a
workspace at foo, then manually rm -r it, and then create a new one also
named foo. The database wouldn't know about this series of events, and
would then have stale per-workspace refs for foo.
That said, with my lmdb backend, I've been falling back to the files
backend for per-workspace refs. This also means I don't have to worry
about expiring per-workspace refs when a workspace is removed.
I could change this, but IIRC, there are a fair number of things that
care about the existence of a file called HEAD, so the fallback was
easier. (That is, the other way was a giant hassle).
> - is_common_thing(const char *) takes a path (that is relative to
> $GIT_DIR where the thing would have lived at in a pre-workspace
> world), and tells if it is a common thing or a per-workspace
> thing.
>
> - git_path() can stay the external interface and can be thought of:
>
> git_path(const char *path)
> {
> if (is_common_thing(path))
> return git_common_path(path);
> return git_workspace_path(git_workspace_name(), path);
> }
>
> if you think in terms of your two helpers.
>
> But I do not think that git_common_path() and git_workspace_path()
> need to be called from any other place in the system, and that is
> the reason why I did not say we should have four functions (or five,
> counting git_path() itself).
I wrote an email arguing for Michael's position on this, and by the time
I was done writing it, I had come around to more-or-less Junio's
position.
My argument for keeping git_path as the external interface is this: in
the multiple worktree world, $GIT_DIR is effectively an overlay
filesystem. It's not a complete overlay API: e.g. we don't have a
git_path_opendir function that special-cases refs/ to add in
refs/worktree. But it handles the common cases.
It is true that even if we had a complete overlay API, it would not be
sufficient to hide all of the complexity of per-worktree refs (the files
ref backend still needs to know not to pack per-worktree refs). But
that is equally true if we have the refs code call
git_common_path/git_worktree_path. So we're not successfully hiding
details by using git_common_path/git_worktree_path in refs.c.
For this patch series, I don't think we need to change anything (except
that I realized that I forgot to add logs/refs/worktree to refs, and
people will probably find some issues once they start reviewing the
details of my code). In the present code, adjust_git_path already
handles the git_common_path vs git_workspace_path thing; it just does it
in a slightly less elegant way than Junio's proposal. Implementing
Junio's proposal would not affect this series; it would just be an
additional patch on top (or beforehand).
There is one case where the refs code will probably need to directly
call git_workspace_path: when we fix git prune to handle detached HEADs
in workspaces. It could just set the workspace and then call git_path,
but that is less elegant. So I think when we fix that (which should
probably wait on for_each_worktree), we can implement Junio's proposal.
next prev parent reply other threads:[~2015-08-14 20:04 UTC|newest]
Thread overview: 24+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-08-12 21:57 [PATCH v3 1/4] refs: clean up common_list David Turner
2015-08-12 21:57 ` [PATCH v3 2/4] path: optimize common dir checking David Turner
2015-08-12 22:48 ` Junio C Hamano
2015-08-13 9:05 ` Michael Haggerty
2015-08-14 17:04 ` Junio C Hamano
2015-08-14 20:04 ` David Turner [this message]
2015-08-14 20:27 ` Junio C Hamano
2015-08-14 20:54 ` David Turner
2015-08-15 18:20 ` Michael Haggerty
2015-08-15 18:12 ` Michael Haggerty
2015-08-17 15:55 ` Junio C Hamano
2015-08-15 7:59 ` Duy Nguyen
2015-08-16 5:04 ` David Turner
2015-08-16 12:20 ` Duy Nguyen
2015-08-12 21:57 ` [PATCH v3 3/4] refs: make refs/worktree/* per-worktree David Turner
2015-08-13 17:15 ` Eric Sunshine
2015-08-13 17:41 ` David Turner
2015-08-13 20:16 ` Michael Haggerty
2015-08-13 20:32 ` David Turner
2015-08-14 8:18 ` Michael Haggerty
2015-08-14 17:10 ` Junio C Hamano
2015-08-15 8:04 ` Duy Nguyen
2015-08-12 21:57 ` [PATCH v3 4/4] bisect: make bisection refs per-worktree David Turner
2015-08-15 7:44 ` [PATCH v3 1/4] refs: clean up common_list Duy Nguyen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1439582644.8855.89.camel@twopensource.com \
--to=dturner@twopensource.com \
--cc=chriscool@tuxfamily.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
--cc=mhagger@alum.mit.edu \
--cc=pclouds@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.