* Relative submodule URLs vs. clone URL DWIMming
@ 2008-08-27 12:00 Johan Herland
2008-08-28 14:50 ` Mark Levedahl
0 siblings, 1 reply; 8+ messages in thread
From: Johan Herland @ 2008-08-27 12:00 UTC (permalink / raw)
To: git
Hi,
I'm observing a couple of bugs/issues when using relative submodule
URLs:
Say, I have a repo "foo" with a couple of submodules "bar", "baz", and
the following .gitmodules:
[submodule "bar"]
path = bar
url = ./bar
[submodule "baz"]
path = baz
url = ./baz
Now, what I want to do is to clone this repo and initialize the
submodules using the following commands:
git clone <URL> <DIR>
cd <DIR>
git submodule update --init
I will demonstrate that whether or not the "git submodule update --init"
succeeds depends on the value of <URL>:
Scenarios:
#1: <URL> is a non-bare local repo: "/home/user/foo"
builtin-clone.c:get_repo_path() will DWIM this to "/home/user/foo/.git",
and store this as the origin URL. Relative submodule URLs will be
resolved against this, and end up looking
like "/home/user/foo/.git/bar".
Result: FAIL
#2: <URL> is a non-bare remote repo: "file:///home/user/foo"
(For the purposes of this discussion, a file:// URL is equivalent to a
real remote URL) builtin-clone.c:get_repo_path() fails to DWIM it, and
we fall back to using it verbatim as the origin URL. Relative submodule
URLs will be resolved against this, and end up
like "file:///home/user/foo/bar".
Result: Success
Note: When using bare repos below, I assume that the repos are laid out
in the following manner:
/home/user/foo.git
/home/user/foo.git/bar.git
/home/user/foo.git/baz.git
#3: <URL> is a bare local repo (indirectly): "/home/user/foo"
builtin-clone.c:get_repo_path() will DWIM it to "/home/user/foo.git",
and store this as the origin URL. Relative submodule URLs will be
resolved against this, and end up looking
like "/home/user/foo.git/bar", which will work (because the same
DWIMming is applied when cloning submodules).
Result: Success
#4: <URL> is a bare local repo (directly): "/home/user/foo.git"
This ends up looking exactly like scenario #3.
Result: Success
#5: <URL> is a bare remote repo (indirectly): "file:///home/user/foo"
As in scenario #2, builtin-clone.c:get_repo_path() fails to DWIM it, and
the origin URL ends up as a verbatim copy, but the initial clone still
works because of similar DWIMing at the remote/transport level.
Relative submodule URLs will be resolved against the origin URL,
however, and end up looking like "file:///home/user/foo/bar", which
will not work, since - on the submodule clone - remote/transport level
will only try to DWIM the last path component
(i.e. "file:///home/user/foo/bar.git", and
not "file:///home/user/foo.git/bar.git").
Result: FAIL
#6: <URL> is a bare remote repo (directly): "file:///home/user/foo.git"
The initial clone goes off without a hitch, and the submodule URLs end
up looking like "file:///home/user/foo.git/bar", which is subsequently
DWIMmed "correctly" to "file:///home/user/foo.git/bar".
Result: Success
I'd like to fix this, but I'm not sure whether the fix belongs in
builtin-clone.c (i.e. making sure the origin URL is always "correct"
wrt. resolving relative submodule URLs), or in git-submodule.sh (i.e.
adding smarts when resolving relative submodule URLs against the
super-repo's origin URL).
...Johan
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Relative submodule URLs vs. clone URL DWIMming
2008-08-27 12:00 Relative submodule URLs vs. clone URL DWIMming Johan Herland
@ 2008-08-28 14:50 ` Mark Levedahl
2008-08-28 23:01 ` Johan Herland
0 siblings, 1 reply; 8+ messages in thread
From: Mark Levedahl @ 2008-08-28 14:50 UTC (permalink / raw)
To: Johan Herland; +Cc: git
Johan Herland wrote:
>
> I'd like to fix this, but I'm not sure whether the fix belongs in
> builtin-clone.c (i.e. making sure the origin URL is always "correct"
> wrt. resolving relative submodule URLs), or in git-submodule.sh (i.e.
> adding smarts when resolving relative submodule URLs against the
> super-repo's origin URL).
>
>
> ...Johan
>
>
I think the right approach is to start with clone and make it record the
real url it is using, regardless of what was input. The problem with
doing this in submodule is that in effect this replicates the search
logic clone would use, and furthermore could lead to nasty surprises by
grabbing the wrong submodule in an extreme case of having two
identically named repositories in different locations on a server.
I was about to create a patch for submodule to always remove trailing
"/.git" before resolving, but in fact a user could put the submodule
.git into the superproject's .git, in a non-bare repository, and then
gitlink that in the checked out submodule. So, it may also be good to
define and enforce rules on how relative url naming can be used for this
purpose. So far, I have only used it for bare repositories using the
"../<path> form keeping the submodules out of the superproject.
Mark
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Relative submodule URLs vs. clone URL DWIMming
2008-08-28 14:50 ` Mark Levedahl
@ 2008-08-28 23:01 ` Johan Herland
2008-08-30 22:27 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Johan Herland @ 2008-08-28 23:01 UTC (permalink / raw)
To: Mark Levedahl; +Cc: git
On Thursday 28 August 2008, Mark Levedahl wrote:
> Johan Herland wrote:
> > I'd like to fix this, but I'm not sure whether the fix belongs in
> > builtin-clone.c (i.e. making sure the origin URL is always "correct"
> > wrt. resolving relative submodule URLs), or in git-submodule.sh (i.e.
> > adding smarts when resolving relative submodule URLs against the
> > super-repo's origin URL).
>
> I think the right approach is to start with clone and make it record the
> real url it is using, regardless of what was input. The problem with
> doing this in submodule is that in effect this replicates the search
> logic clone would use, and furthermore could lead to nasty surprises by
> grabbing the wrong submodule in an extreme case of having two
> identically named repositories in different locations on a server.
Thanks for your input. I looked into builtin-clone, and how to make it write
the real origin URL for remote repos (it already does this today for local
repos). But AFAICS, there's no way to get this information from the
transport layer. I assume that the actual repo location is resolved on the
remote side, and simply not communicated back to the local side.
Also, after thinking some more about this, I'm unsure of the practical
implications of this. Let's assume we can actually make "git clone" write
the real URL in all cases (i.e. typically "repo.git" for bare repos
and "repo/.git" for non-bare repos):
If we simply resolve submodule URLs against the _repo_ (i.e. the real origin
URL) and not the work tree (if any), we get results that are coupled to
whether we use bare or non-bare repos: Take, for example, your use
of "../<path>" to make submodules live outside the (bare) superproject. If I
now create a non-bare clone of this, I must move the submodule repos _into_
my work tree, so that the submodule repos are available, if someone tries to
clone from me.
Additionally, when making super/sub repos available for others, I must make
a mental effort to interpret the relative submodule URLs in .gitmodules
against - NOT the work tree in which .gitmodules is located - but rather
against the .git directory which is (most often) one directory level further
down from the .gitmodules file.
Maybe the easiest solution is to add a new config directive called something
like "core.submodules.baseUrl". When set, relative submodule URLs would be
resolved against this instead of the origin URL. When unset, the origin URL
is used as a fallback. Hmm?
Still, it would be nice to have intuitive and consistent behaviour by
default, though...
> I was about to create a patch for submodule to always remove trailing
> "/.git" before resolving, but in fact a user could put the submodule
> .git into the superproject's .git, in a non-bare repository, and then
> gitlink that in the checked out submodule. So, it may also be good to
> define and enforce rules on how relative url naming can be used for this
> purpose. So far, I have only used it for bare repositories using the
> "../<path> form keeping the submodules out of the superproject.
Yes, duplication of such logic is probably a bad idea. Would it be better to
consolidate all this URL DWIMming in one place which is #included from
builtin-clone, git-submodule, git-fetch, and other commands that need this
functionality?
In any case, defining the rules for how relative submodule URLs are supposed
to work would certainly make a good step in the right direction.
Have fun! :)
...Johan
PS: Here are the beginning of a test case for codifying the rules of
relative submodule URL behaviour. Needless to say, the last testcase
currently fails:
diff --git a/t/t7403-submodule-relative.sh b/t/t7403-submodule-relative.sh
new file mode 100755
index 0000000..9c12248
--- /dev/null
+++ b/t/t7403-submodule-relative.sh
@@ -0,0 +1,60 @@
+#!/bin/sh
+#
+# Copyright (c) 2008 Johan Herland
+#
+
+test_description='Testing repos with relative submodule URLs
+
+This test tries to verify the sanity of working with relative submodule URLs.
+'
+
+. ./test-lib.sh
+
+TRASH_DIR=$(pwd)
+
+#
+# Test setup:
+# - create the following super-/sub-repository hierarchy:
+# /
+# /file
+# /a submodule/
+# /a submodule/file
+#
+test_expect_success 'setup' '
+
+ echo file > file &&
+ git add file &&
+ test_tick &&
+ git commit -m initial &&
+ test_tick &&
+ git clone . "a submodule" &&
+ cat >.gitmodules <<EOF &&
+[submodule "a submodule"]
+ path = "a submodule"
+ url = "./a submodule"
+EOF
+ git add "a submodule" .gitmodules &&
+ git commit -m "a submodule"
+
+'
+
+test_expect_success 'Cloning repo and updating submodules using file:// URL' '
+
+ git clone "file://$TRASH_DIR" clone &&
+ cd clone &&
+ git submodule update --init &&
+ test -d "a submodule/.git"
+
+'
+
+test_expect_success 'Cloning repo and updating submodules using local path' '
+
+ rm -rf clone &&
+ git clone "$TRASH_DIR" clone &&
+ cd clone &&
+ git submodule update --init &&
+ test -d "a submodule/.git"
+
+'
+
+test_done
--
Johan Herland, <johan@herland.net>
www.herland.net
^ permalink raw reply related [flat|nested] 8+ messages in thread
* Re: Relative submodule URLs vs. clone URL DWIMming
2008-08-28 23:01 ` Johan Herland
@ 2008-08-30 22:27 ` Junio C Hamano
2008-08-30 23:23 ` Johan Herland
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2008-08-30 22:27 UTC (permalink / raw)
To: Johan Herland; +Cc: Mark Levedahl, git
Johan Herland <johan@herland.net> writes:
> ... But AFAICS, there's no way to get this information from the
> transport layer. I assume that the actual repo location is resolved on the
> remote side, and simply not communicated back to the local side.
Correct.
> If we simply resolve submodule URLs against the _repo_ (i.e. the real origin
> URL) and not the work tree (if any), we get results that are coupled to
> whether we use bare or non-bare repos: Take, for example, your use
> of "../<path>" to make submodules live outside the (bare) superproject. If I
> now create a non-bare clone of this, I must move the submodule repos _into_
> my work tree, so that the submodule repos are available, if someone tries to
> clone from me.
I personally feel that cases that involve cloning from non-bare
repositories (and in addition, DWIMmed repositories), with or without
nested submodules, are not worth supporting.
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Relative submodule URLs vs. clone URL DWIMming
2008-08-30 22:27 ` Junio C Hamano
@ 2008-08-30 23:23 ` Johan Herland
2008-08-30 23:45 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Johan Herland @ 2008-08-30 23:23 UTC (permalink / raw)
To: git; +Cc: Junio C Hamano, Mark Levedahl
On Sunday 31 August 2008, Junio C Hamano wrote:
> Johan Herland <johan@herland.net> writes:
> I personally feel that cases that involve cloning from non-bare
> repositories (and in addition, DWIMmed repositories), with or without
> nested submodules, are not worth supporting.
Ok. I guess that's fair. Such a decision should definitely be documented
appropriately in the submodule documentation, though.
I guess this also means you don't plan to do anything about the difference
in origin URLs produced by "git clone" for the following two cases:
$ git clone /repo/foo bar
$ grep -B1 url bar/.git/config
[remote "origin"]
url = /repo/foo/.git
vs.
$ git clone file:///repo/foo bar
$ grep -B1 url bar/.git/config
[remote "origin"]
file:///home/johan/git/foo
Hmm?
...Johan
--
Johan Herland, <johan@herland.net>
www.herland.net
^ permalink raw reply [flat|nested] 8+ messages in thread
* Re: Relative submodule URLs vs. clone URL DWIMming
2008-08-30 23:23 ` Johan Herland
@ 2008-08-30 23:45 ` Junio C Hamano
2008-09-01 19:07 ` [PATCH] Bring local clone's origin URL in line with that of a remote clone Johan Herland
0 siblings, 1 reply; 8+ messages in thread
From: Junio C Hamano @ 2008-08-30 23:45 UTC (permalink / raw)
To: Johan Herland; +Cc: git, Mark Levedahl
Johan Herland <johan@herland.net> writes:
> $ git clone /repo/foo bar
> $ grep -B1 url bar/.git/config
> [remote "origin"]
> url = /repo/foo/.git
>
> vs.
>
> $ git clone file:///repo/foo bar
> $ grep -B1 url bar/.git/config
> [remote "origin"]
> file:///home/johan/git/foo
>
> Hmm?
If you mean "the latter lacks 'url =' and is broken", and if that is what
happens, then it needs to be fixed. But otherwise, I do not personally
find these differences interesting.
But that does not mean I'd veto if somebody else cares deeply enough and
comes up with a clean solution.
^ permalink raw reply [flat|nested] 8+ messages in thread
* [PATCH] Bring local clone's origin URL in line with that of a remote clone
2008-08-30 23:45 ` Junio C Hamano
@ 2008-09-01 19:07 ` Johan Herland
2008-09-02 7:23 ` Junio C Hamano
0 siblings, 1 reply; 8+ messages in thread
From: Johan Herland @ 2008-09-01 19:07 UTC (permalink / raw)
To: Junio C Hamano; +Cc: git, Mark Levedahl
On a local clone, "git clone" would use the fully DWIMmed path as the origin
URL in the resulting repo. This was slightly inconsistent with the case of a
remote clone where the _given_ URL was used as the origin URL (because the
DWIMming was done remotely, and was therefore not available to "git clone").
This behaviour caused problems when cloning a local non-bare repo with
relative submodule URLs, because these submodule URLs would then be resolved
against the DWIMmed URL (e.g. "/repo/.git") instead of the given URL (e.g.
"/repo").
This patch teaches "git clone" to use the _given_ URL - instead of the
DWIMmed path - as the origin URL. This causes relative submodule URLs to be
resolved correctly, as long the _given_ URL indicates the correct directory
against which the submodule URLs should be resolved.
The patch also updates a testcase that contained the old-style origin URLs.
Signed-off-by: Johan Herland <johan@herland.net>
---
On Sunday 31 August 2008, Junio C Hamano wrote:
> Johan Herland <johan@herland.net> writes:
> > $ git clone /repo/foo bar
> > $ grep -B1 url bar/.git/config
> > [remote "origin"]
> > url = /repo/foo/.git
> >
> > vs.
> >
> > $ git clone file:///repo/foo bar
> > $ grep -B1 url bar/.git/config
> > [remote "origin"]
> > file:///home/johan/git/foo
> >
> > Hmm?
>
> If you mean "the latter lacks 'url =' and is broken", and if that is what
> happens, then it needs to be fixed. But otherwise, I do not personally
> find these differences interesting.
Sorry, my fault (cut-n-paste error)
> But that does not mean I'd veto if somebody else cares deeply enough and
> comes up with a clean solution.
Is this clean enough? It passes selftests on my box, and although it
doesn't resolve all the issues I raised in this thread, it _does_ resolve
the one issue that I first bumped into, and that triggered this thread in
the first place. I expect this one issue is also the first that most other
people would bump into when working locally with relative submodule URLs.
Have fun! :)
...Johan
builtin-clone.c | 2 +-
t/t5505-remote.sh | 6 +++---
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/builtin-clone.c b/builtin-clone.c
index c0e3086..f44ecea 100644
--- a/builtin-clone.c
+++ b/builtin-clone.c
@@ -387,7 +387,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
path = get_repo_path(repo_name, &is_bundle);
if (path)
- repo = path;
+ repo = xstrdup(make_nonrelative_path(repo_name));
else if (!strchr(repo_name, ':'))
repo = xstrdup(make_absolute_path(repo_name));
else
diff --git a/t/t5505-remote.sh b/t/t5505-remote.sh
index be9ee93..c449663 100755
--- a/t/t5505-remote.sh
+++ b/t/t5505-remote.sh
@@ -109,7 +109,7 @@ test_expect_success 'remove remote' '
cat > test/expect << EOF
* remote origin
- URL: $(pwd)/one/.git
+ URL: $(pwd)/one
Remote branch merged with 'git pull' while on branch master
master
New remote branch (next fetch will store in remotes/origin)
@@ -140,7 +140,7 @@ test_expect_success 'show' '
cat > test/expect << EOF
* remote origin
- URL: $(pwd)/one/.git
+ URL: $(pwd)/one
Remote branch merged with 'git pull' while on branch master
master
Tracked remote branches
@@ -169,7 +169,7 @@ test_expect_success 'prune' '
cat > test/expect << EOF
Pruning origin
-URL: $(pwd)/one/.git
+URL: $(pwd)/one
* [would prune] origin/side2
EOF
--
1.6.0.96.g2fad1
^ permalink raw reply related [flat|nested] 8+ messages in thread
end of thread, other threads:[~2008-09-02 7:24 UTC | newest]
Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2008-08-27 12:00 Relative submodule URLs vs. clone URL DWIMming Johan Herland
2008-08-28 14:50 ` Mark Levedahl
2008-08-28 23:01 ` Johan Herland
2008-08-30 22:27 ` Junio C Hamano
2008-08-30 23:23 ` Johan Herland
2008-08-30 23:45 ` Junio C Hamano
2008-09-01 19:07 ` [PATCH] Bring local clone's origin URL in line with that of a remote clone Johan Herland
2008-09-02 7:23 ` Junio C Hamano
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).