git.vger.kernel.org archive mirror
* Location-agnostic submodules
@ 2012-04-27 14:37 Pierre Thierry
  2012-04-30 20:39 ` Phil Hord
  0 siblings, 1 reply; 8+ messages in thread
From: Pierre Thierry @ 2012-04-27 14:37 UTC
  To: git

I just discovered the workings of the submodule command, and as I have
grown to like the fact that a repository is not unique with Git, and
specifically that it has no unique or central location, I'm bothered
by how submodule works.

Would there be any major issue in having (1) submodule to be able to
clone the submodules from the super repository when they are available
there and (2) having zero, one or many addresses for each submodule,
used as hints (obviously not used when (1) is satisfied)?

When the repository is not bare, the submodules would be found at
their place in the tree, nothing difficult here. When the repository
is bare, there could be a tree with the bare repositories of the
submodules.

This could be done by a new subcommand, that would take a remote as an
optional argument, its default being origin, as usual:

$ git submodule clone origin


As I see it, adding this 'clone' subcommand for the case where the
repository is not bare couldn't add any compatibility issue, so if I'm
right on this point, I'd like to try and implement this soon.

Curiously,
Pierre
-- 
pierre@nothos.net
OpenPGP 0xD9D50D8A


* Re: Location-agnostic submodules
  2012-04-27 14:37 Location-agnostic submodules Pierre Thierry
@ 2012-04-30 20:39 ` Phil Hord
  2012-04-30 22:02   ` Pierre Thierry
  0 siblings, 1 reply; 8+ messages in thread
From: Phil Hord @ 2012-04-30 20:39 UTC
  To: Pierre Thierry; +Cc: git

On Fri, Apr 27, 2012 at 10:37 AM, Pierre Thierry <pierre@nothos.net> wrote:
> I just discovered the workings of the submodule command, and as I have
> grown to like the fact that a repository is not unique with Git, and
> specifically that it has no unique or central location, I'm bothered
> by how submodule works.
>
> Would there be any major issue in having (1) submodule to be able to
> clone the submodules from the super repository when they are available
> there and (2) having zero, one or many addresses for each submodule,
> used as hints (obviously not used when (1) is satisfied)?

Maybe something like this:
    [submodule "foo"]
        path = foo-mod
        url = ../foo ../foo-alternate https://someplace.com/git/foo.git https://kernel.org/git/foo


I think the problem now will be that you have an indeterminate source
URL for your submodule.  As long as all of your alternate locations are
the same it is probably not a problem.  But if one of them lags behind
the others by a day or even an hour, then you may have gitlinks in
your superproject which have not made it into the lagging mirror yet.
And this will cause problems.

Moreover, each time you clone the repository you may get different
results.  This would be confusing.

But aside from these administrative issues, I think this could work.
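
As an illustration only, the "try several addresses" idea can be sketched
in shell as below. The helper name clone_first_available is hypothetical
and nothing like it exists in git-submodule; the sketch simply walks a
list of candidate URLs and keeps the first one that can be cloned. It does
not address the lagging-mirror concern above: the first reachable source
may still lack the commit the superproject points at.

```shell
# Hypothetical helper, not part of git: try each candidate URL in order
# and keep the first clone that succeeds.
# Usage: clone_first_available <dest> <url>...
clone_first_available() {
    dest=$1
    shift
    for url in "$@"; do
        if git clone --quiet "$url" "$dest" 2>/dev/null; then
            printf '%s\n' "$url"   # report which source was used
            return 0
        fi
    done
    return 1                       # no candidate could be cloned
}
```

With the example URLs from the config snippet above, a call might look
like: clone_first_available foo-mod https://someplace.com/git/foo.git
https://kernel.org/git/foo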

> When the repository is not bare, the submodules would be found at
> their place in the tree, nothing difficult here. When the repository
> is bare, there could be a tree with the bare repositories of the
> submodules.

I think this could work even if the repository were bare; these days,
submodule repositories are stored in .git/modules/* anyway.  So if it
were possible to craft a bare repository with this structure in place,
then even "bare" repos could support embedded submodules like this.

I think this amplifies the relative URL problem.  That problem exists
anyway, but this maybe gives it more ways to fail.

> This could be done by a new subcommand, that would take a remote as an
> optional argument, its default being origin, as usual:
>
> $ git submodule clone origin
>
> As I see it, adding this 'clone' subcommand for the case where the
> repository is not bare couldn't add any compatibility issue, so if I'm
> right on this point, I'd like to try and implement this soon.

I don't think there is any need for a new 'clone' command since the
clone porcelain already understands submodules.  Maybe a new switch is
needed to control the remote to use, but this switch is needed for
more cases than just clone.

   git remote add shared http://elsewhere.com/project.git
   git submodule init --remote=shared foo

This could also solve some existing ambiguities with relative paths
and with 'git-submodule sync', for example.

At first you worried me, but now I am starting to like this idea more and more.

Phil


* Re: Location-agnostic submodules
  2012-04-30 20:39 ` Phil Hord
@ 2012-04-30 22:02   ` Pierre Thierry
  2012-05-01 15:16     ` Phil Hord
  0 siblings, 1 reply; 8+ messages in thread
From: Pierre Thierry @ 2012-04-30 22:02 UTC
  To: git

Scribit Phil Hord dies 30/04/2012 hora 16:39:
> Maybe something like this:
>     [submodule "foo"]
>         path = foo-mod
>         url = ../foo ../foo-alternate https://someplace.com/git/foo.git https://kernel.org/git/foo

<rant>That is typically the kind of occasion when I wish every config
file were sexprs...</rant>

> But if one of them lags behind the others by a day or even an hour,
> then you may have gitlinks in your superproject which have not made
> it into the lagging mirror yet.  And this will cause problems.

I see your point, but what if your only repository is lagging behind?
I.e. in what way is it worse than now?

> Moreover, each time you clone the repository you may get different
> results.  This would be confusing.

Again, I fail to see the difference with the current state. If the
commit is specified, you will always get the same results, now or with
my suggested addition.

> I don't think there is any need for a new 'clone' command since the
> clone porcelain already understands submodules.

What do you mean? When I clone a repo with submodules, they are not
cloned as well.

-- 
pierre@nothos.net
OpenPGP 0xD9D50D8A


* Re: Location-agnostic submodules
  2012-04-30 22:02   ` Pierre Thierry
@ 2012-05-01 15:16     ` Phil Hord
  2012-05-01 17:19       ` Philip Oakley
  0 siblings, 1 reply; 8+ messages in thread
From: Phil Hord @ 2012-05-01 15:16 UTC
  To: Pierre Thierry; +Cc: git, Jens Lehmann

Adding Jens Lehmann, in case he hasn't noticed this thread yet.

On Mon, Apr 30, 2012 at 6:02 PM, Pierre Thierry <pierre@nothos.net> wrote:
> Scribit Phil Hord dies 30/04/2012 hora 16:39:
>> Maybe something like this:
>>     [submodule "foo"]
>>         path = foo-mod
>>         url = ../foo ../foo-alternate https://someplace.com/git/foo.git https://kernel.org/git/foo
>
> <rant>That is typically the kind of occasion when I wish every config
> file were sexprs...</rant>

Interesting.  But at least it's not yaml.  :-)

>> But if one of them lags behind the others by a day or even an hour,
>> then you may have gitlinks in your superproject which have not made
>> it into the lagging mirror yet.  And this will cause problems.
>
> I see your point, but what if your only repository is lagging behind?
> I.e. in what way is it worse than now?

I actually do not think it is very much worse than now.  But the
specific way it fails in this case is as follows:

Suppose I have mirrors A and B, each containing a superproject and its
submodule.

  A:super:master => A:sub:master
  B:super:master => B:sub:master

A and B are coherent, meaning their superproject gitlinks point to
commits which do exist in the submodule repositories.

Now, I push new commits to A:super and A:sub, giving this:
  A:super:new' => A:sub:new
  B:super:master => B:sub:master

Now, A and B are both internally coherent, but I have a problem if I
try to do this:
  A:super:master' => B:sub:new

This is because the sub:new commit has not made it to B yet, perhaps
intentionally or perhaps because of latency.

This problem still can occur without your change, so I do not think it
is a fatal flaw.  It is just a scenario to consider.
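
The coherence condition described above (the superproject's gitlink must
name a commit that exists in the submodule repository at hand) can be
checked with existing plumbing. The following is a minimal sketch, not an
existing git command: the function name gitlink_is_present is
hypothetical, and it assumes the submodule is already checked out at its
path with its own repository.

```shell
# Hypothetical helper: succeed only if the commit recorded for <path> in
# the superproject's HEAD exists in the submodule checked out at <path>.
# Usage: gitlink_is_present <superproject-dir> <submodule-path>
gitlink_is_present() {
    super=$1
    path=$2
    # A gitlink shows up in ls-tree as: 160000 commit <sha1><TAB><path>
    sha=$(cd "$super" && git ls-tree HEAD "$path" |
        awk '$2 == "commit" { print $3 }')
    [ -n "$sha" ] || return 2    # no gitlink recorded at that path
    # cat-file -e exits non-zero when the object is missing.
    (cd "$super/$path" && git cat-file -e "$sha^{commit}" 2>/dev/null)
}
```

In the A/B scenario, running this against a superproject fetched from A
and a submodule fetched from B would fail until sub:new reaches B.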

>> Moreover, each time you clone the repository you may get different
>> results.  This would be confusing.
>
> Again, I fail to see the difference with the current state. If the
> commit is specified, you will always get the same results, now or with
> my suggested addition.

The existing implementation has some flaws, and I think the multiple
URLs option may expose the flaws further.  Again, not a fatal flaw to
your idea; just something to keep in mind.

Something else to keep in mind:  What you are proposing amounts to a
feature which identifies mirror repositories to use for submodules
when the primary remote repo cannot be reached.  Superprojects have no
such feature.  Why not?

Meanwhile, I really like the other feature you started off with, where
the "embedded" submodule repos could be used as the primary origin.

>> I don't think there is any need for a new 'clone' command since the
>> clone porcelain already understands submodules.
>
> What do you mean? When I clone a repo with submodules, they are not
> cloned as well.

Since git v1.6.5 or so, clone has known the --recursive option.

  git clone --recursive superproject

Since about v1.7.3, fetch and pull also know how to recurse and can do
so by default.

Phil


* Re: Location-agnostic submodules
  2012-05-01 15:16     ` Phil Hord
@ 2012-05-01 17:19       ` Philip Oakley
  2012-05-01 17:57         ` Junio C Hamano
  0 siblings, 1 reply; 8+ messages in thread
From: Philip Oakley @ 2012-05-01 17:19 UTC
  To: Phil Hord, Pierre Thierry; +Cc: git, Jens Lehmann

From: "Phil Hord" <phil.hord@gmail.com> Sent: Tuesday, May 01, 2012 4:16 PM
> Adding Jens Lehmann, in case he hasn't noticed this thread yet.
>
> On Mon, Apr 30, 2012 at 6:02 PM, Pierre Thierry <pierre@nothos.net> wrote:
>> Scribit Phil Hord dies 30/04/2012 hora 16:39:
>>> Maybe something like this:
>>> [submodule "foo"]
>>> path = foo-mod
>>> url = ../foo ../foo-alternate https://someplace.com/git/foo.git https://kernel.org/git/foo
>>
>> <rant>That is typically the kind of occasion when I wish every config
>> file were sexprs...</rant>
>
> Interesting.  But at least it's not yaml.  :-)
>
>>> But if one of them lags behind the others by a day or even an hour,
>>> then you may have gitlinks in your superproject which have not made
>>> it into the lagging mirror yet. And this will cause problems.
>>
>> I see your point, but what if your only repository is lagging behind?
>> I.e. in what way is it worse than now?
>
> I actually do not think it is very much worse than now.  But the
> specific way it fails in this case is as follows:
>
> Suppose I have mirrors A and B, each containing a superproject and its
> submodule.
>
>  A:super:master => A:sub:master
>  B:super:master => B:sub:master
>
> A and B are coherent, meaning their superproject gitlinks point to
> commits which do exist in the submodule repositories.
>
> Now, I push new commits to A:super and A:sub, giving this:
>  A:super:new' => A:sub:new
>  B:super:master => B:sub:master
>
> Now, A and B are both internally coherent, but I have a problem if I
> try to do this:
>  A:super:master' => B:sub:new
>
> This is because the sub:new commit has not made it to B yet, perhaps
> intentionally or perhaps because of latency.
>
> This problem still can occur without your change, so I do not think it
> is a fatal flaw.  It is just a scenario to consider.
>
>>> Moreover, each time you clone the repository you may get different
>>> results. This would be confusing.
>>
>> Again, I fail to see the difference with the current state. If the
>> commit is specified, you will always get the same results, now or with
>> my suggested addition.
>
> The existing implementation has some flaws, and I think the multiple
> URLs option may expose the flaws further.  Again, not a fatal flaw to
> your idea; just something to keep in mind.

Would an alternative be something like:
    git submodule update <module> --from <remote>

so that the user can state which of the current submodule's remotes should 
be used for fetching the desired update.

For compatibility, the update may need to use the '--reference' or something 
similar, or perhaps a different command word
    `git submodule fetch <module> --from <remote>`

I was just stung on msysgit (submodule `git`) because its NetInstall clones 
from the main repos, but I had forked my own copies, so the submodule URLs 
weren't right for me (doh). Luckily I have a patch of my silently overwritten 
changes...

>
> Something else to keep in mind:  What you are proposing amounts to a
> feature which identifies mirror repositories to use for submodules
> when the primary remote repo cannot be reached.  Superprojects have no
> such feature.  Why not?
>
> Meanwhile, I really like the other feature you started off with, where
> the "embedded" submodule repos could be used as the primary origin.
>
>>> I don't think there is any need for a new 'clone' command since the
>>> clone porcelain already understands submodules.
>>
>> What do you mean? When I clone a repo with submodules, they are not
>> cloned as well.
>
> Since git v1.6.5 or so, clone has known the --recursive option.
>
>  git clone --recursive superproject
>
> Since about v1.7.3, fetch and pull also know how to recurse and can do
> so by default.
>
> Phil

Philip 


* Re: Location-agnostic submodules
  2012-05-01 17:19       ` Philip Oakley
@ 2012-05-01 17:57         ` Junio C Hamano
  2012-05-01 19:58           ` Philip Oakley
  2012-05-02 16:55           ` Heiko Voigt
  0 siblings, 2 replies; 8+ messages in thread
From: Junio C Hamano @ 2012-05-01 17:57 UTC
  To: Philip Oakley; +Cc: Phil Hord, Pierre Thierry, git, Jens Lehmann

"Philip Oakley" <philipoakley@iee.org> writes:

> Would an alternative be something like:
>    git submodule update <module> --from <remote>
>
> so that the user can state which of the current submodule's remotes
> should be used for fetching the desired update.

Are you assuming that the <remote> in the above example will be different
per invocation for a single user?  I would imagine not---it would be more
like "the upstream has this URL in .gitmodules, but this other mirror is
closer to my network environment", i.e.

	cd <module's directory> && git config remote.origin.url $there

no?
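
The one-liner above can be wrapped as a small helper for illustration.
This is only a sketch under stated assumptions: the name
use_mirror_for_submodule is hypothetical, and the submodule is assumed to
be cloned already. Only the submodule's own configuration changes, so
.gitmodules stays as upstream wrote it; a later "git submodule sync"
would copy the .gitmodules URL back over the override.

```shell
# Hypothetical helper: repoint an already-cloned submodule at a mirror.
# The body runs in a subshell, so the caller's directory is unchanged.
# Usage: use_mirror_for_submodule <submodule-dir> <url>
use_mirror_for_submodule() (
    cd "$1" || exit 1
    git config remote.origin.url "$2" &&
    git config --get remote.origin.url   # echo the value now in effect
)
```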


* Re: Location-agnostic submodules
  2012-05-01 17:57         ` Junio C Hamano
@ 2012-05-01 19:58           ` Philip Oakley
  2012-05-02 16:55           ` Heiko Voigt
  1 sibling, 0 replies; 8+ messages in thread
From: Philip Oakley @ 2012-05-01 19:58 UTC
  To: Junio C Hamano; +Cc: Phil Hord, Pierre Thierry, Git List, Jens Lehmann

From: "Junio C Hamano" <gitster@pobox.com> Sent: Tuesday, May 01, 2012 6:57 PM
> "Philip Oakley" <philipoakley@iee.org> writes:
>
>> Would an alternative be something like:
>>    git submodule update <module> --from <remote>
>>
>> so that the user can state which of the current submodule's remotes
>> should be used for fetching the desired update.
>
> Are you assuming that the <remote> in the above example will be different
> per invocation for a single user?

Possibly, but more likely the user would have identified which is the best
remote to use to find her missing sha1.

>         I would imagine not---it would be more
> like "the upstream has this URL in .gitmodules, but this other mirror is
> closer to my network environment", i.e.
>
> cd <module's directory> && git config remote.origin.url $there
>
I was presuming a reverse time sequence, where the user had already set up
the desired remote, but hadn't managed to change the URL in .gitmodules; but
either way, the user then lets 'git submodule' do the hard work of fetching
the correct sha1 to checkout.

I didn't think that there was a command yet to do the URL update, which
would most likely match one of the sub-module's URLs.

> no?
>
>
Philip


* Re: Location-agnostic submodules
  2012-05-01 17:57         ` Junio C Hamano
  2012-05-01 19:58           ` Philip Oakley
@ 2012-05-02 16:55           ` Heiko Voigt
  1 sibling, 0 replies; 8+ messages in thread
From: Heiko Voigt @ 2012-05-02 16:55 UTC
  To: Junio C Hamano
  Cc: Philip Oakley, Phil Hord, Pierre Thierry, git, Jens Lehmann

Hi,

On Tue, May 01, 2012 at 10:57:07AM -0700, Junio C Hamano wrote:
> "Philip Oakley" <philipoakley@iee.org> writes:
> 
> > Would an alternative be something like:
> >    git submodule update <module> --from <remote>
> >
> > so that the user can state which of the current submodule's remotes
> > should be used for fetching the desired update.
> 
> Are you assuming that the <remote> in the above example will be different
> per invocation for a single user?  I would imagine not---it would be more
> like "the upstream has this URL in .gitmodules, but this other mirror is
> closer to my network environment", i.e.
> 
> 	cd <module's directory> && git config remote.origin.url $there
> 
> no?

Yes, I think this is an important point. If we start working on this I
would like to emphasize the fork use case Philip brought up. When
cloning a forked repository with submodules you always have the problem
of changing/adding the forked submodules' remotes afterwards.

For me it would be more like an additional lookup mechanism of the
"official" urls / names. Since I like to stay as close as possible to
the upstream repository I usually refrain from changing the .gitmodules
file. A changed .gitmodules file with additional urls (possibly some
private ones) is not something you can propagate upstream.

What I would like is a mechanism that, given a wanted sha1, would look up
the correct remote to clone/fetch from. But I have to admit that even
though I have been thinking about this for some time now, I have not
arrived at a satisfying answer for myself yet.
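
One brute-force way to approximate such a lookup with existing commands
is sketched below. It is not a proposal for the actual mechanism: the
function name find_remote_with_commit is hypothetical, it must be run
inside the submodule's working tree, it fetches from every configured
remote in turn, and it only finds commits reachable from a branch of
that remote (tags are ignored).

```shell
# Hypothetical helper, to be run inside the submodule's working tree:
# print the first configured remote that has <sha1> reachable from one
# of its branches.
# Usage: find_remote_with_commit <sha1>
find_remote_with_commit() {
    sha=$1
    for remote in $(git remote); do
        # Skip remotes that cannot be reached right now.
        git fetch --quiet "$remote" 2>/dev/null || continue
        # List remote-tracking branches containing the commit; this fails
        # (and prints nothing) while the commit is still unknown locally.
        if git branch -r --contains "$sha" 2>/dev/null |
            grep -q "^ *$remote/"; then
            printf '%s\n' "$remote"
            return 0
        fi
    done
    return 1
}
```

The obvious cost is one fetch per remote until a match is found, which
is part of why this is only a stopgap rather than a satisfying answer.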

Cheers Heiko


Thread overview: 8+ messages
2012-04-27 14:37 Location-agnostic submodules Pierre Thierry
2012-04-30 20:39 ` Phil Hord
2012-04-30 22:02   ` Pierre Thierry
2012-05-01 15:16     ` Phil Hord
2012-05-01 17:19       ` Philip Oakley
2012-05-01 17:57         ` Junio C Hamano
2012-05-01 19:58           ` Philip Oakley
2012-05-02 16:55           ` Heiko Voigt
