git.vger.kernel.org archive mirror
* [PATCH v2 00/21] Support multiple worktrees
  2013-12-11 14:15 [PATCH/POC 0/7] " Nguyễn Thái Ngọc Duy
@ 2013-12-14 10:54 ` Nguyễn Thái Ngọc Duy
  2013-12-15  2:29   ` Duy Nguyen
  0 siblings, 1 reply; 9+ messages in thread
From: Nguyễn Thái Ngọc Duy @ 2013-12-14 10:54 UTC (permalink / raw)
  To: git; +Cc: Jonathan Niedier, Junio C Hamano,
	Nguyễn Thái Ngọc Duy

The UI and behavior are taking shape now. On the UI side, you do

  git checkout --to /somewhere -b newbranch origin/master

which will create a worktree-only repo at /somewhere. "git prune --repos"
could be used to remove cruft in .git/repos.

On the behavior side, you should be able to do everything in
/somewhere just like in a normal repository. If a ref is updated (from
any repository) that also happens to be your HEAD, it will be
detached. "git rev-list --all" is also taught to take repos/.../HEAD
into account.

The structure of repos/XXX is documented in 17/21. Known issues:

 - naming ($GIT_SUPER_DIR, the name of the shared repo and the
   dependent one, the reuse of "gitdir" in .git files)

 - gc --auto support, support for manually pruning .git/repos

 - should probably support the new .git format in enter_repo() so that
   we can push to it

 - not sure if we need UI for deleting repositories created with
   checkout --to, or just "rm -r" and let "gc --auto" clean
   things up. The thing about "rm -r(f)" is that if .git happens to be
   a real repo, the user is screwed so I don't really like to
   encourage doing it.

 - more tests


Nguyễn Thái Ngọc Duy (21):
  path.c: avoid PATH_MAX as buffer size from get_pathname()
  path.c: rename vsnpath() to git_vsnpath()
  path.c: move git_path() closer to similar functions git_pathdup()
  Make git_path() aware of file relocation in $GIT_DIR
  reflog: use avoid constructing .lock path with git_path
  fast-import: use git_path() for accessing .git dir instead of get_git_dir()
  Add new environment variable $GIT_SUPER_DIR
  setup.c: refactor path manipulation out of read_gitfile()
  setup.c: add split-repo support to .git files
  setup.c: add split-repo support to is_git_directory()
  setup.c: reduce cleanup sites in setup_explicit_git_dir()
  environment.c: support super .git file specified by $GIT_DIR
  setup: support $GIT_SUPER_DIR as well as super .git files
  checkout: support checking out into a new working directory
  checkout: clean up half-prepared directories in --to mode
  setup.c: keep track of the .git file location if read
  prune: strategies for split repositories
  refs: adjust reflog path for repos/<id>/HEAD
  refs: detach split repos' HEAD when the linked ref is updated/deleted
  refs.c: refactor do_head_ref(... to do_one_ref("HEAD", ...
  revision: include repos/../HEAD in --all

 Documentation/config.txt               |   3 +-
 Documentation/git-checkout.txt         |   6 +
 Documentation/git-prune.txt            |   4 +
 Documentation/git.txt                  |   8 ++
 Documentation/gitrepository-layout.txt |  30 ++++
 builtin/checkout.c                     | 173 ++++++++++++++++++++++
 builtin/prune.c                        |  65 +++++++++
 builtin/reflog.c                       |   2 +-
 builtin/rev-parse.c                    |   6 +
 cache.h                                |   5 +
 environment.c                          |  37 ++++-
 fast-import.c                          |   5 +-
 path.c                                 | 140 ++++++++++++++----
 refs.c                                 |  88 ++++++++++--
 refs.h                                 |   1 +
 revision.c                             |   1 +
 setup.c                                | 253 ++++++++++++++++++++++++---------
 t/t0060-path-utils.sh                  | 133 +++++++++++++++++
 t/t1501-worktree.sh                    |  52 +++++++
 t/t1510-repo-setup.sh                  |   1 +
 test-path-utils.c                      |   7 +
 trace.c                                |   1 +
 22 files changed, 904 insertions(+), 117 deletions(-)

-- 
1.8.5.1.77.g42c48fa


* Re: [PATCH v2 00/21] Support multiple worktrees
  2013-12-14 10:54 ` [PATCH v2 00/21] " Nguyễn Thái Ngọc Duy
@ 2013-12-15  2:29   ` Duy Nguyen
  0 siblings, 0 replies; 9+ messages in thread
From: Duy Nguyen @ 2013-12-15  2:29 UTC (permalink / raw)
  To: Git Mailing List
  Cc: Jonathan Niedier, Junio C Hamano,
	Nguyễn Thái Ngọc Duy

On Sat, Dec 14, 2013 at 5:54 PM, Nguyễn Thái Ngọc Duy <pclouds@gmail.com> wrote:
> Known issues

Scripts that expand "$GIT_DIR/objects" and are not aware of the new
env variable. I introduced "test-path-utils --git-path" to test
git_path(). I could move it to "git rev-parse --git-path" for use in
scripts, but there'll be more changes. git-new-workdir's symlink
approach shines here.
-- 
Duy


* Re: [PATCH v2 00/21] Support multiple worktrees
@ 2013-12-19 14:12 Duy Nguyen
  2013-12-20 20:32 ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Duy Nguyen @ 2013-12-19 14:12 UTC (permalink / raw)
  To: Git Mailing List

I've got a better version [1] that fixes everything I can think of
(there's still some room for improvements). I'm going to use it a bit
longer before reposting again. But here's its basic design without
going down to code

New .git file format includes two lines:
-- 8< --
gitid: <id>
gitdir: <path>
-- 8< --

Which would set $GIT_COMMON_DIR to <path> and $GIT_DIR to
<path>/repos/<id>. Repository split is the same as before, worktree
stuff in $GIT_DIR, the rest in $GIT_COMMON_DIR. This .git file format
takes precedence over core.worktree but can still be overridden with
$GIT_WORK_TREE. The main interface to create a new worktree is "git
checkout --to".
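
To make the mapping concrete, here is a minimal shell sketch of how the
two lines could be turned into the two directories. parse_gitfile is a
made-up name for illustration (the real parsing belongs in setup.c), and
<path> is used exactly as written, with no relative path resolution.

```shell
# Sketch only: derive $GIT_COMMON_DIR and $GIT_DIR from a two-line .git file.
# parse_gitfile is a hypothetical helper, not something git ships.
parse_gitfile() (
	gitfile=$1
	id=
	path=
	# Read line by line; also accept a last line without a trailing newline.
	while IFS= read -r line || [ -n "$line" ]; do
		case $line in
		"gitid: "*)  id=${line#gitid: } ;;
		"gitdir: "*) path=${line#gitdir: } ;;
		esac
	done <"$gitfile"
	# Both lines are needed for the split format; a plain "gitdir:"-only
	# file is rejected here and left to the existing code path.
	[ -n "$id" ] && [ -n "$path" ] || exit 1
	printf 'GIT_COMMON_DIR=%s\n' "$path"
	printf 'GIT_DIR=%s\n' "$path/repos/$id"
)
```

Running it as parse_gitfile /somewhere/.git prints the two assignments,
or fails if either line is missing.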

"repos" belongs to $GIT_COMMON_DIR (i.e. shared across all checkouts).
The new worktrees (which I call "linked checkouts") can also access
HEAD of the original worktree via a virtual path "main/HEAD". This
makes it possible for a linked checkout to detach HEAD of the main
one.

There are three entries in repos/<id>. "gitdir" should point to the
.git file that points back here. Every time a linked checkout is
used, git should update the mtime of this "gitdir" file to help pruning.
It should update the file content too if the repo is moved. "link" is
a hardlink to the .git file, if supported, again for pruning support.
"locked", if it exists, means no automatic pruning (e.g. the linked
checkout is on a portable device).
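
As a rough illustration of how these entries could drive pruning, here
is a shell sketch. The rule it applies is an assumption for the sake of
the example, not a settled design: an entry is reported as stale when it
has no "locked" file and the .git file named in "gitdir" is gone. The
mtime and hardlink checks are left out, and list_prunable is a
hypothetical name.

```shell
# Sketch only: print the ids under <common_dir>/repos that look stale.
# Assumed rule: not "locked", and the .git file recorded in "gitdir"
# no longer exists. mtime and "link" are ignored in this sketch.
list_prunable() (
	common_dir=$1
	for entry in "$common_dir"/repos/*/; do
		[ -d "$entry" ] || continue
		entry=${entry%/}
		# "locked" means never prune automatically.
		if [ -e "$entry/locked" ]; then
			continue
		fi
		gitfile=
		if [ -f "$entry/gitdir" ]; then
			IFS= read -r gitfile <"$entry/gitdir" || :
		fi
		# Still reachable: the linked checkout's .git file exists.
		if [ -n "$gitfile" ] && [ -e "$gitfile" ]; then
			continue
		fi
		printf '%s\n' "${entry##*/}"
	done
)
```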

The interesting thing is support for third party scripts (or hooks,
maybe) so that they could work with both old and new git versions
without some sort of git version/feature detection. Of course old git
versions will only work with ordinary worktrees. To that end, "git
rev-parse --git-dir" behavior could be changed by two environment
variables. $GIT_ONE_PATH makes 'rev-parse --git-dir' return the .git
_file_ in this case, which makes it much easier to pass the repo's
checkout view around with "git --git-dir=...". $GIT_COMMON_DIR_PATH
makes 'rev-parse --git-dir' return $GIT_COMMON_DIR if it's from a
linked checkout, or $GIT_DIR otherwise. This makes 'rev-parse
--git-dir' fall back safely when running with old git versions. The
last patch in [1] that updates git-completion.bash demonstrates
how it's used.

[1] https://github.com/pclouds/git.git checkout-new-worktree
-- 
Duy


* Re: [PATCH v2 00/21] Support multiple worktrees
  2013-12-19 14:12 [PATCH v2 00/21] Support multiple worktrees Duy Nguyen
@ 2013-12-20 20:32 ` Junio C Hamano
  2013-12-21  2:00   ` Duy Nguyen
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2013-12-20 20:32 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

Duy Nguyen <pclouds@gmail.com> writes:

> I've got a better version [1] that fixes everything I can think of
> (there's still some room for improvements). I'm going to use it a bit
> longer before reposting again. But here's its basic design without
> going down to code
>
> New .git file format includes two lines:
> -- 8< --
> gitid: <id>
> gitdir: <path>
> -- 8< --
>
> Which would set $GIT_COMMON_DIR to <path> and $GIT_DIR to
> <path>/repos/<id>. Repository split is the same as before, worktree
> stuff in $GIT_DIR, the rest in $GIT_COMMON_DIR. This .git file format
> takes precedence over core.worktree but can still be overridden with
> $GIT_WORK_TREE. The main interface to create new worktree is "git
> checkout --to".
>
> "repos" belongs to $GIT_COMMON_DIR (i.e. shared across all checkouts).
> The new worktrees (which I call "linked checkouts") can also access
> HEAD of the original worktree via a virtual path "main/HEAD". This
> makes it possible for a linked checkout to detach HEAD of the main
> one.

I am not happy with the choice of "main/HEAD" that would squat on a
good name for remote-tracking branch (i.e. s/origin/main/), though.
$GIT_DIR/COMMON_HEAD perhaps?

> The interesting thing is support for third party scripts (or hooks,
> maybe) so that they could work with both old and new git versions
> without some sort of git version/feature detection. Of course old git
> versions will only work with ordinary worktrees. To that end, "git
> rev-parse --git-dir" behavior could be changed by two environment
> variables. $GIT_ONE_PATH makes 'rev-parse --git-dir' return the .git
> _file_ in this case, which makes it much easier to pass the repo's
> checkout view around with "git --git-dir=...". $GIT_COMMON_DIR_PATH
> makes 'rev-parse --git-dir' return $GIT_COMMON_DIR if it's from a
> linked checkout, or $GIT_DIR otherwise.

I do not understand why you need to go such a route.

Existing scripts that work only in a real repository will only know
"git rev-parse --git-dir" as the way to get the real GIT_DIR and
would not care about the "common" thing.  Scripts updated to work
well with the "common" thing need to be aware of the "common" thing
anyway, so adding "git rev-parse --common-git-dir" or somesuch that
only these updated scripts know would be sufficient, no?


* Re: [PATCH v2 00/21] Support multiple worktrees
  2013-12-20 20:32 ` Junio C Hamano
@ 2013-12-21  2:00   ` Duy Nguyen
  2013-12-22  6:38     ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Duy Nguyen @ 2013-12-21  2:00 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Sat, Dec 21, 2013 at 3:32 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> I've got a better version [1] that fixes everything I can think of
>> (there's still some room for improvements). I'm going to use it a bit
>> longer before reposting again. But here's its basic design without
>> going down to code
>>
>> New .git file format includes two lines:
>> -- 8< --
>> gitid: <id>
>> gitdir: <path>
>> -- 8< --
>>
>> Which would set $GIT_COMMON_DIR to <path> and $GIT_DIR to
>> <path>/repos/<id>. Repository split is the same as before, worktree
>> stuff in $GIT_DIR, the rest in $GIT_COMMON_DIR. This .git file format
>> takes precedence over core.worktree but can still be overridden with
>> $GIT_WORK_TREE. The main interface to create new worktree is "git
>> checkout --to".
>>
>> "repos" belongs to $GIT_COMMON_DIR (i.e. shared across all checkouts).
>> The new worktrees (which I call "linked checkouts") can also access
>> HEAD of the original worktree via a virtual path "main/HEAD". This
>> makes it possible for a linked checkout to detach HEAD of the main
>> one.
>
> I am not happy with the choice of "main/HEAD" that would squat on a
> good name for remote-tracking branch (i.e. s/origin/main/), though.
> $GIT_DIR/COMMON_HEAD perhaps?

It's not just about HEAD. Anything worktree-specific of the main
checkout can be accessed this way, e.g. main/index,
main/FETCH_HEAD.... and it's not exactly "common" because it's
worktree info. Maybe 1ST_ as the prefix (e.g. 1ST_HEAD, 1ST_index...)
?

>> The interesting thing is support for third party scripts (or hooks,
>> maybe) so that they could work with both old and new git versions
>> without some sort of git version/feature detection. Of course old git
>> versions will only work with ordinary worktrees. To that end, "git
>> rev-parse --git-dir" behavior could be changed by two environment
>> variables. $GIT_ONE_PATH makes 'rev-parse --git-dir' return the .git
>> _file_ in this case, which makes it much easier to pass the repo's
>> checkout view around with "git --git-dir=...". $GIT_COMMON_DIR_PATH
>> makes 'rev-parse --git-dir' return $GIT_COMMON_DIR if it's from a
>> linked checkout, or $GIT_DIR otherwise.
>
> I do not understand why you need to go such a route.
>
> Existing scripts that work only in a real repository will only know
> "git rev-parse --git-dir" as the way to get the real GIT_DIR and
> would not care about the "common" thing.  Scripts updated to work
> well with the "common" thing need to be aware of the "common" thing
> anyway, so adding "git rev-parse --common-git-dir" or somesuch that
> only these updated scripts know would be sufficient, no?

It simplifies the changes. If a new script is to work with both old
and new git versions, it may have to write

DIR=`git rev-parse --git-common-dir 2>/dev/null || git rev-parse --git-dir`

The env way makes it

DIR=`GIT_COMMON_DIR_PATH=1 git rev-parse --git-dir`
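
One caveat with the first form, as far as I know: rev-parse echoes
options it does not recognize and still exits successfully, so on an old
git the "||" branch may never run and DIR ends up holding the option
name itself. A script that wants the explicit option would need
something like the sketch below. common_or_git_dir is a hypothetical
helper, and it assumes the option ends up being called
--git-common-dir.

```shell
# Sketch only. $1 is the output of "git rev-parse --git-common-dir",
# $2 is the output of "git rev-parse --git-dir". Prints the one to use.
common_or_git_dir() {
	case $1 in
	""|--git-common-dir)
		# Empty (the command failed) or echoed back verbatim
		# (the option is unknown to this git): use --git-dir.
		printf '%s\n' "$2" ;;
	*)
		printf '%s\n' "$1" ;;
	esac
}

# Usage inside a repository:
# DIR=$(common_or_git_dir "$(git rev-parse --git-common-dir 2>/dev/null)" \
#	"$(git rev-parse --git-dir)")
```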
-- 
Duy


* Re: [PATCH v2 00/21] Support multiple worktrees
  2013-12-21  2:00   ` Duy Nguyen
@ 2013-12-22  6:38     ` Junio C Hamano
  2013-12-22  8:44       ` Duy Nguyen
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2013-12-22  6:38 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

Duy Nguyen <pclouds@gmail.com> writes:

>> I am not happy with the choice of "main/HEAD" that would squat on a
>> good name for remote-tracking branch (i.e. s/origin/main/), though.
>> $GIT_DIR/COMMON_HEAD perhaps?
>
> It's not just about HEAD. Anything worktree-specific of the main
> checkout can be accessed this way, e.g. main/index,
> main/FETCH_HEAD.... and it's not exactly "common" because it's
> worktree info. Maybe 1ST_ as the prefix (e.g. 1ST_HEAD, 1ST_index...)
> ?

Do we even need to expose them as ref-like things as a part of the
external API/UI in the first place?  For end-user scripts that want
to operate in a real or borrowing worktree, there should be no
reason to touch this "other" repository directly.  Things like "if
one of the worktrees tries to check out a branch that is already
checked out elsewhere, error out" policy may need to consult the
other worktrees via $GIT_COMMON_DIR mechanism but at that level we
have all the control without contaminating end-user facing ref
namespace in a way main/FETCH_HEAD... do.  You said

    This makes it possible for a linked checkout to detach HEAD of
    the main one.

but I think that is misguided---it just makes it easier to confuse
users, if done automatically and without any policy knob. It instead
should error out, while saying which worktree has the branch in
question checked out. After all, checking out a branch that is
checked out in another worktree (not the common one) needs the same
caution, so "main/HEAD" is not the only special one.

And if your updated "git checkout 'frotz'" gives a clear report of
which worktree has the branch 'frotz' checked out, the user can go
there and detach the HEAD in that worktree with

	git -C $the_other_one checkout HEAD^0

if he chooses to.
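
To illustrate the kind of report I mean, here is a rough shell sketch.
It assumes the layout described in this thread (the original worktree's
HEAD directly in the common directory, each borrowing worktree's HEAD
under repos/<id>/), and who_has_branch is a made-up name.

```shell
# Sketch only: print every checkout whose HEAD is a symref to
# refs/heads/<branch>. "main" stands for the original worktree, any
# other name is an id under repos/. The layout is an assumption.
who_has_branch() (
	common_dir=$1
	want="ref: refs/heads/$2"
	for head in "$common_dir/HEAD" "$common_dir"/repos/*/HEAD; do
		[ -f "$head" ] || continue
		IFS= read -r line <"$head" || :
		# A detached HEAD holds an object name and never matches.
		[ "$line" = "$want" ] || continue
		case $head in
		"$common_dir/HEAD")
			printf 'main\n' ;;
		*)
			id=${head%/HEAD}
			printf '%s\n' "${id##*/}" ;;
		esac
	done
)
```

"git checkout frotz" could then refuse whenever this prints anything,
and name the worktrees it printed in the error message.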


* Re: [PATCH v2 00/21] Support multiple worktrees
  2013-12-22  6:38     ` Junio C Hamano
@ 2013-12-22  8:44       ` Duy Nguyen
  2013-12-26 17:12         ` Junio C Hamano
  0 siblings, 1 reply; 9+ messages in thread
From: Duy Nguyen @ 2013-12-22  8:44 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Sun, Dec 22, 2013 at 1:38 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>>> I am not happy with the choice of "main/HEAD" that would squat on a
>>> good name for remote-tracking branch (i.e. s/origin/main/), though.
>>> $GIT_DIR/COMMON_HEAD perhaps?
>>
>> It's not just about HEAD. Anything worktree-specific of the main
>> checkout can be accessed this way, e.g. main/index,
>> main/FETCH_HEAD.... and it's not exactly "common" because it's
>> worktree info. Maybe 1ST_ as the prefix (e.g. 1ST_HEAD, 1ST_index...)
>> ?
>
> Do we even need to expose them as ref-like things as a part of the
> external API/UI in the first place?  For end-user scripts that want
> to operate in a real or borrowing worktree, there should be no
> reason to touch this "other" repository directly.  Things like "if
> one of the worktrees tries to check out a branch that is already
> checked out elsewhere, error out" policy may need to consult the
> other worktrees via $GIT_COMMON_DIR mechanism but at that level we
> have all the control without contaminating end-user facing ref
> namespace in a way main/FETCH_HEAD... do.

No, external API/UI is just extra bonus. We need to (or should) do so
in order to handle $GIT_COMMON_DIR/HEAD exactly like how we do normal
refs. Given any ref, git_path(ref) gives the path to that ref,
git_path("logs/%s", ref) gives the path of its reflog. By mapping
special names to real paths behind git_path(), we can feed
$GIT_COMMON_DIR/HEAD (under special names) to refs.c and it'll handle
them correctly without any changes for special cases.
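
In shell terms the mapping would look roughly like the sketch below.
map_git_path is only a stand-in for what git_path() does internally, and
the list of per-worktree names is an assumption picked for illustration.

```shell
# Sketch only: resolve a path relative to the repository for a linked
# checkout. $1 is $GIT_DIR, $2 is $GIT_COMMON_DIR, $3 the relative path.
# The per-worktree list below is illustrative, not exhaustive.
map_git_path() (
	git_dir=$1
	common_dir=$2
	rel=$3
	case $rel in
	main/*)
		# Virtual name: worktree files of the main checkout.
		printf '%s\n' "$common_dir/${rel#main/}" ;;
	logs/main/*)
		# Reflog of a virtual name, e.g. git_path("logs/%s", ref).
		printf '%s\n' "$common_dir/logs/${rel#logs/main/}" ;;
	HEAD|index|FETCH_HEAD|logs/HEAD)
		# Worktree stuff stays in $GIT_DIR.
		printf '%s\n' "$git_dir/$rel" ;;
	*)
		# Everything else is shared.
		printf '%s\n' "$common_dir/$rel" ;;
	esac
)
```

With this, "main/HEAD" and "logs/main/HEAD" land on the main checkout's
HEAD and its reflog, while plain "HEAD" stays private to the linked
checkout.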

> You said
>
>     This makes it possible for a linked checkout to detach HEAD of
>     the main one.
>
> but I think that is misguided---it just makes it easier to confuse
> users, if done automatically and without any policy knob. It instead
> should error out, while saying which worktree has the branch in
> question checked out. After all, checking out a branch that is
> checked out in another worktree (not the common one) needs the same
> caution, so "main/HEAD" is not the only special one.
>
> And if your updated "git checkout 'frotz'" gives a clear report of
> which worktree has the branch 'frotz' checked out, the user can go
> there and detach the HEAD in that worktree with
>
>         git -C $the_other_one checkout HEAD^0
>
> if he chooses to.

Jonathan mentioned the "checkout on a portable device" case, which
would make the above a bit unnatural as you just can't "cd" there (git
update-ref still works).

I don't see any problems with checking out a branch multiple times. I
may want to try modifying something in the branch that will be thrown
away in the end. It's when the user updates a branch that we should
either error+reject or detach other checkouts. I guess it's up to the
user to decide which way they want. The error+reject way may make the
user hunt through dead checkouts waiting to be pruned. But we can
start with error+reject then add an option to auto-detach.
-- 
Duy


* Re: [PATCH v2 00/21] Support multiple worktrees
  2013-12-22  8:44       ` Duy Nguyen
@ 2013-12-26 17:12         ` Junio C Hamano
  2013-12-28  2:46           ` Duy Nguyen
  0 siblings, 1 reply; 9+ messages in thread
From: Junio C Hamano @ 2013-12-26 17:12 UTC (permalink / raw)
  To: Duy Nguyen; +Cc: Git Mailing List

Duy Nguyen <pclouds@gmail.com> writes:

> On Sun, Dec 22, 2013 at 1:38 PM, Junio C Hamano <gitster@pobox.com> wrote:
>
>> Do we even need to expose them as ref-like things as a part of the
>> external API/UI in the first place?  For end-user scripts that want
>> to operate in a real or borrowing worktree, there should be no
>> reason to touch this "other" repository directly.  Things like "if
>> one of the worktrees tries to check out a branch that is already
>> checked out elsewhere, error out" policy may need to consult the
>> other worktrees via $GIT_COMMON_DIR mechanism but at that level we
>> have all the control without contaminating end-user facing ref
>> namespace in a way main/FETCH_HEAD... do.
>
> No, external API/UI is just extra bonus. We need to (or should) do so
> in order to handle $GIT_COMMON_DIR/HEAD exactly like how we do normal
> refs.

And that is what I consider a confusion-trigger, not a nice bonus.

I do not think it is particularly a good idea to contaminate
end-user namespace for this kind of "peek another repository"
special operation.

In order to handle your "multiple worktrees manipulating the same
branch" case sanely, you need to be aware of not just the real
repository your worktree is borrowing from, but also _all_ the other
worktrees that borrow from that same real repository, so a single
"main" virtual namespace will not cut it. You will be dealing with
an unbounded set of HEADs, one for each such worktree.

Can't we do this by adding a "with this real repository" layer,
e.g. "resolve HEAD wrt that repository", somewhat similar to how we
peek into submodule namespaces?


* Re: [PATCH v2 00/21] Support multiple worktrees
  2013-12-26 17:12         ` Junio C Hamano
@ 2013-12-28  2:46           ` Duy Nguyen
  0 siblings, 0 replies; 9+ messages in thread
From: Duy Nguyen @ 2013-12-28  2:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: Git Mailing List

On Fri, Dec 27, 2013 at 12:12 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Duy Nguyen <pclouds@gmail.com> writes:
>
>> On Sun, Dec 22, 2013 at 1:38 PM, Junio C Hamano <gitster@pobox.com> wrote:
>>
>>> Do we even need to expose them as ref-like things as a part of the
>>> external API/UI in the first place?  For end-user scripts that want
>>> to operate in a real or borrowing worktree, there should be no
>>> reason to touch this "other" repository directly.  Things like "if
>>> one of the worktrees tries to check out a branch that is already
>>> checked out elsewhere, error out" policy may need to consult the
>>> other worktrees via $GIT_COMMON_DIR mechanism but at that level we
>>> have all the control without contaminating end-user facing ref
>>> namespace in a way main/FETCH_HEAD... do.
>>
>> No, external API/UI is just extra bonus. We need to (or should) do so
>> in order to handle $GIT_COMMON_DIR/HEAD exactly like how we do normal
>> refs.
>
> And that is what I consider a confusion-trigger, not a nice bonus.
>
> I do not think it is particularly a good idea to contaminate
> end-user namespace for this kind of "peek another repository"
> special operation.
>
> In order to handle your "multiple worktrees manipulating the same
> branch" case sanely, you need to be aware of not just the real
> repository your worktree is borrowing from, but also _all_ the other
> worktrees that borrow from that same real repository, so a single
> "main" virtual namespace will not cut it. You will be dealing with
> an unbounded set of HEADs, one for each such worktree.

Yes. My problem is, while all secondary worktrees are in
$GIT_DIR/repos and their HEADs can be accessed there with
"repos/xxx/HEAD", the first worktree's HEAD can't be accessed this way
because "HEAD" in a linked checkout means repos/<my worktree>/HEAD.

> Can't we do this by adding a "with this real repository" layer,
> e.g. "resolve HEAD wrt that repository", somewhat similar to how we
> peek into submodule namespaces?

Hmm.. never thought of it like a "submodule". Thanks for the idea.
-- 
Duy

