git.vger.kernel.org archive mirror
* [PATCH] Teach git submodule update to use distributed repositories
@ 2008-07-17 12:08 Nigel Magnay
  2008-07-17 12:13 ` Johannes Schindelin
  2008-07-17 14:38 ` Petr Baudis
  0 siblings, 2 replies; 23+ messages in thread
From: Nigel Magnay @ 2008-07-17 12:08 UTC (permalink / raw)
  To: Git Mailing List

When doing a git submodule update, it fetches any missing submodule
commits from the repository specified in .gitmodules. If you instead
want to pull from another repository, you currently need to do a fetch
in each submodule by hand.

Signed-off-by: Nigel Magnay <nigel.magnay@gmail.com>
---
This is my first attempt at adding things to help everyday usage of
git submodule.

I don't usually write much shell script, and it's my first patch, so
it's possible there are better ways to do these things.

 git-submodule.sh |   33 +++++++++++++++++++++++++++++++--
 1 files changed, 31 insertions(+), 2 deletions(-)

diff --git a/git-submodule.sh b/git-submodule.sh
index 9228f56..40e1aa1 100755
--- a/git-submodule.sh
+++ b/git-submodule.sh
@@ -5,7 +5,7 @@
 # Copyright (c) 2007 Lars Hjemli

 USAGE="[--quiet] [--cached] \
-[add <repo> [-b branch] <path>]|[status|init|update [-i|--init]|summary [-n|--summary-limit <n>] [<commit>]] \
+[add <repo> [-b branch] <path>]|[status|init|update [-i|--init] [-o|--origin <repository>] [-r|--refspec <refspec>]|summary [-n|--summary-limit <n>] [<commit>]] \
 [--] [<path>...]"
 OPTIONS_SPEC=
 . git-sh-setup
@@ -15,6 +15,8 @@ command=
 branch=
 quiet=
 cached=
+repository=
+refspec=

 #
 # print stuff on stdout unless -q was specified
@@ -270,6 +272,14 @@ cmd_update()
 			shift
 			cmd_init "$@" || return
 			;;
+		-o|--origin)
+			shift
+			repository=$1
+			;;
+		-r|--refspec)
+			shift
+			refspec=$1
+			;;
 		--)
 			shift
 			break
@@ -311,7 +321,9 @@ cmd_update()

 		if test "$subsha1" != "$sha1"
 		then
-			(unset GIT_DIR; cd "$path" && git-fetch &&
+			set_submodule_repository "$repository" "$path"
+
+			(unset GIT_DIR; cd "$path" && git-fetch "$subrepo" "$refspec" &&
 				git-checkout -q "$sha1") ||
 			die "Unable to checkout '$sha1' in submodule path '$path'"

@@ -320,6 +332,23 @@ cmd_update()
 	done
 }

+#
+# If we asked for a repository such as 'origin', just pass this through
+# otherwise, try to calculate what the repository URL might be by
+# adding the submodule path to the url, subtracting any /.git first
+#
+set_submodule_repository() {
+
+	if [ -z "$(echo "$1" | grep '/')" ]
+	then
+		# This is not a URL - just use the name
+		subrepo="$1"
+	else
+		# This is a URL. Chop off /.git if it's there, and add submodule path
+		subrepo="${1%/.git}/$2"
+	fi
+}
+
 set_name_rev () {
 	revname=$( (
 		unset GIT_DIR
-- 
1.5.6.2
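
For reference, the URL derivation that the patch adds can be tried
outside git-submodule.sh. What follows is only a minimal sketch, assuming
plain POSIX sh. The function name derive_subrepo and the example URL are
illustrative and are not part of the patch.

```shell
# Hypothetical stand-alone version of the patch's helper.
# $1: remote name or superproject URL, $2: submodule path.
derive_subrepo() {
	case "$1" in
	*/*)
		# Looks like a URL: drop a trailing /.git, then append the path.
		printf '%s/%s\n' "${1%/.git}" "$2"
		;;
	*)
		# No slash: treat it as a remote name and pass it through.
		printf '%s\n' "$1"
		;;
	esac
}

derive_subrepo fred module
derive_subrepo ssh://git@fred.local/pub/scm/git/workspace/thing/.git module
```

The first call passes the remote name through unchanged; the second
strips the trailing /.git and appends the submodule path.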


* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-17 12:08 [PATCH] Teach git submodule update to use distributed repositories Nigel Magnay
@ 2008-07-17 12:13 ` Johannes Schindelin
       [not found]   ` <320075ff0807170520r200e546ejbad2ed103bd65f82@mail.gmail.com>
  2008-07-17 14:38 ` Petr Baudis
  1 sibling, 1 reply; 23+ messages in thread
From: Johannes Schindelin @ 2008-07-17 12:13 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Git Mailing List

Hi,

On Thu, 17 Jul 2008, Nigel Magnay wrote:

> When doing a git submodule update, it fetches any missing submodule
> commits from the repository specified in .gitmodules.

Huh?  It takes what is in .git/config!  Not what is in .gitmodules.

So if you have another remote (or URL, e.g. if you have ssh:// access, but 
the .gitmodules file lists git://), just edit .git/config.

I meant, that is the whole _point_ of having a two-step init/update 
procedure.

Ciao,
Dscho
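
The two-step init/update procedure described above might look like this
in practice. This is an illustration only: the helper name
use_submodule_url, the submodule name "module" and the URL are
assumptions made for the example.

```shell
# Hypothetical helper: after "git submodule init" has copied the URLs
# from .gitmodules into .git/config, override one of them locally.
# $1: submodule name, $2: URL to clone from instead.
use_submodule_url() {
	git config "submodule.$1.url" "$2"
}

# Typical sequence in a fresh superproject clone (not run here):
#   git submodule init
#   use_submodule_url module ssh://git@repo/pub/scm/git/module.git
#   git submodule update
```

The override lives only in the local .git/config, so .gitmodules and the
committed state of the superproject stay untouched.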


* Re: [PATCH] Teach git submodule update to use distributed repositories
       [not found]   ` <320075ff0807170520r200e546ejbad2ed103bd65f82@mail.gmail.com>
@ 2008-07-17 12:21     ` Nigel Magnay
  2008-07-17 12:58       ` Johannes Schindelin
  0 siblings, 1 reply; 23+ messages in thread
From: Nigel Magnay @ 2008-07-17 12:21 UTC (permalink / raw)
  To: Git Mailing List

On Thu, Jul 17, 2008 at 1:13 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Thu, 17 Jul 2008, Nigel Magnay wrote:
>
>> When doing a git submodule update, it fetches any missing submodule
>> commits from the repository specified in .gitmodules.
>
> Huh?  It takes what is in .git/config!  Not what is in .gitmodules.
>

Huh? And where does .git/config get it from? Oh, that's right, .gitmodules.

> So if you have another remote (or URL, e.g. if you have ssh:// access, but
> the .gitmodules file lists git://), just edit .git/config.
>

So for my use case, you'd have me go in and change *every single one* of
my submodule refs from the centralised repository, *every time* I want
to do a peer review?

Doesn't the current system strike you as being somewhat centralised in nature?

> I meant, that is the whole _point_ of having a two-step init/update
> procedure.
>

Are you just determined that submodules should remain useless for "the
rest of us"?

> Ciao,
> Dscho
>


* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-17 12:21     ` Nigel Magnay
@ 2008-07-17 12:58       ` Johannes Schindelin
  2008-07-17 14:03         ` Nigel Magnay
  0 siblings, 1 reply; 23+ messages in thread
From: Johannes Schindelin @ 2008-07-17 12:58 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Git Mailing List

Hi,

On Thu, 17 Jul 2008, Nigel Magnay wrote:

> On Thu, Jul 17, 2008 at 1:13 PM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>
> > On Thu, 17 Jul 2008, Nigel Magnay wrote:
> >
> >> When doing a git submodule update, it fetches any missing submodule 
> >> commits from the repository specified in .gitmodules.
> >
> > Huh?  It takes what is in .git/config!  Not what is in .gitmodules.
> 
> Huh? And where does .git/config get it from? Oh, that's right, 
> .gitmodules.

Oh, that's right, after "git submodule init".  Right before you are 
supposed to change them if your setup commands that.

> > So if you have another remote (or URL, e.g. if you have ssh:// access, 
> > but the .gitmodules file lists git://), just edit .git/config.
> 
> So for my use case, you'd have me go in and change *every single one* of 
> my submodule refs from the centralised repository, *every time* I want 
> to do a peer review?

No.

> Doesn't the current system strike you as being somewhat centralised in 
> nature?

No.

> > I meant, that is the whole _point_ of having a two-step init/update 
> > procedure.
> 
> Are you just determined that submodules should remain useless for "the 
> rest of us"?

No.

If you really need to change the "origin" back and forth between reviews, 
while the committed state of the superproject stays the same, then 
something is seriously awkward and needs to be streamlined in your setup.

Because when the superproject's revision stays the same, "git submodule 
update" may fetch additional objects if you specify another remote, but it 
will check out just the same revisions of the submodules.  Because they 
were committed as such.

But if you want to get objects from another server (as opposed to update 
the submodules' working directories to the latest committed revisions), 
which happens to have the identical layout of the principal server (which 
I would deem another setup peculiarity to be fixed), you might want to 
look into the recurse patch that was flying about on this list a few 
months back.

Hth,
Dscho


* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-17 12:58       ` Johannes Schindelin
@ 2008-07-17 14:03         ` Nigel Magnay
  2008-07-17 14:16           ` Johannes Schindelin
  0 siblings, 1 reply; 23+ messages in thread
From: Nigel Magnay @ 2008-07-17 14:03 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List

On Thu, Jul 17, 2008 at 1:58 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Thu, 17 Jul 2008, Nigel Magnay wrote:
>
>> On Thu, Jul 17, 2008 at 1:13 PM, Johannes Schindelin
>> <Johannes.Schindelin@gmx.de> wrote:
>>
>> > On Thu, 17 Jul 2008, Nigel Magnay wrote:
>> >
>> >> When doing a git submodule update, it fetches any missing submodule
>> >> commits from the repository specified in .gitmodules.
>> >
>> > Huh?  It takes what is in .git/config!  Not what is in .gitmodules.
>>
>> Huh? And where does .git/config get it from? Oh, that's right,
>> .gitmodules.
>
> Oh, that's right, after "git submodule init".  Right before you are
> supposed to change them if your setup commands that.
>
>> > So if you have another remote (or URL, e.g. if you have ssh:// access,
>> > but the .gitmodules file lists git://), just edit .git/config.
>>
>> So for my use case, you'd have me go in and change *every single one* of
>> my submodule refs from the centralised repository, *every time* I want
>> to do a peer review?
>
> No.
>
>> Doesn't the current system strike you as being somewhat centralised in
>> nature?
>
> No.
>
>> > I meant, that is the whole _point_ of having a two-step init/update
>> > procedure.
>>
>> Are you just determined that submodules should remain useless for "the
>> rest of us"?
>
> No.
>
> If you really need to change the "origin" back and forth between reviews,
> while the committed state of the superproject stays the same, then
> something is seriously awkward and needs to be streamlined in your setup.
>
> Because when the superproject's revision stays the same, "git submodule
> update" may fetch additional objects if you specify another remote, but it
> will check out just the same revisions of the submodules.  Because they
> were committed as such.
>
> But if you want to get objects from another server (as opposed to update
> the submodules' working directories to the latest committed revisions),
> which happens to have the identical layout of the principal server (which
> I would deem another setup peculiarity to be fixed), you might want to
> look into the recurse patch that was flying about on this list a few
> months back.

The layout wouldn't be the same - the submodules would be in the
corresponding subdirectories (I guess it could have some other,
stranger layout, but I'd consider that peculiar). So you're right, the
layout is different, which makes editing the config all the more
tedious.

I don't want to change the *origin* back and forth. I want to be able
to use repos with submodules in them as easily and as transparently
and in the same distributed way as git allows me to do if they don't
contain submodules. I.e., I don't want it to be such a Sisyphean
challenge every time with umpteen scripts to complete a usecase that
really ought to be supported as standard. The very first thing that
I've hit is that submodule update only talks to origin, so 'git pull
fred && git submodule update' falls flat on its face. Why am I being
forced to update config just to have a look-see at fred's project?

Your attitude seems to be that the status-quo is in some way
desirable; "It's no wonder that this tool is awkward to use in your
workflow.". This workflow is really common, and there are actual, real
people on this list complaining about it. Don't we think it could be
improved to be non-awkward?

In the ideal UI, it ought to be possible to make the use of projects
with submodules (almost) completely transparent, like it is in the
vcs-that-dare-not-speak-its-name.


* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-17 14:03         ` Nigel Magnay
@ 2008-07-17 14:16           ` Johannes Schindelin
  2008-07-17 15:07             ` Nigel Magnay
  0 siblings, 1 reply; 23+ messages in thread
From: Johannes Schindelin @ 2008-07-17 14:16 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Git Mailing List

Hi,

On Thu, 17 Jul 2008, Nigel Magnay wrote:

> Your attitude seems to be that the status-quo is in some way desirable; 
> "It's no wonder that this tool is awkward to use in your workflow.". 
> This workflow is really common, and there's actual, real people on this 
> list complaining about it. Don't we think it could be improved to be 
> non-awkward ?

I do not think that the status quo is the best possible.

But I think that the way you go makes things so confusing that those who 
use it apart from you will have problems.

For example, in your setup everybody would have to install _different_ 
remotes in every submodule.

And then some would ask themselves why the original origin was not good 
enough.

And others would specify "-o origin" all the time, thinking it was 
required.

There must be a better way to promote submodules to a usable state,
Dscho


* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-17 12:08 [PATCH] Teach git submodule update to use distributed repositories Nigel Magnay
  2008-07-17 12:13 ` Johannes Schindelin
@ 2008-07-17 14:38 ` Petr Baudis
  1 sibling, 0 replies; 23+ messages in thread
From: Petr Baudis @ 2008-07-17 14:38 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Git Mailing List

On Thu, Jul 17, 2008 at 01:08:19PM +0100, Nigel Magnay wrote:
> When doing a git submodule update, it fetches any missing submodule
> commits from the repository specified in .gitmodules. If you instead
> want to pull from another repository, you currently need to do a fetch
> in each submodule by hand.
> 
> Signed-off-by: Nigel Magnay <nigel.magnay@gmail.com>

I don't think it is a good idea to hijack git submodule update for this.
This command has a specific purpose:

	"When I pulled new version of the main tree, bring my
	submodule checkouts in line with whatever is specified
	within the new tree revision."

Your usage scenario has nothing to do with that, it is about "batch
manipulation" of all the submodules at once in a certain way. I think
using the same command for two conceptually pretty much unrelated
purposes will only clutter up the UI, and we should think of a better
general interface pattern for these operations.

In the new git-submodule description, it is said that

	"This command will manage the tree entries and contents of the
	gitmodules file for you."

and I think we should keep it at this; anything that is related to
submodules, but does not do this directly, would IMHO live better
as some kind of "submodule-recursive" extension of other existing
commands. Say, would this particular need of yours be served by a
hypothetical command like

	git checkout --submodules nifty

to check out branch nifty of all submodules, or am I misunderstanding
what you are trying to achieve?

If not, then actually an even _much_ more elegant solution for this
particular problem would be to store submodule.*.branch in .gitmodules,
corresponding to the -b parameter of git submodule add. Then, in branch
'nifty' of the main project, you would set submodule.*.branch to 'nifty'
too.  Then, in order to bring all the submodules to the latest version,
I could imagine something like

	git pull --submodules

(and possibly just abort at the first sight of a conflict, for
starters).

Let's figure out some UI that is nifty and clean. ;-)

-- 
				Petr "Pasky" Baudis
GNU, n. An animal of South Africa, which in its domesticated state
resembles a horse, a buffalo and a stag. In its wild condition it is
something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce


* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-17 14:16           ` Johannes Schindelin
@ 2008-07-17 15:07             ` Nigel Magnay
  2008-07-17 18:22               ` Petr Baudis
  0 siblings, 1 reply; 23+ messages in thread
From: Nigel Magnay @ 2008-07-17 15:07 UTC (permalink / raw)
  To: Johannes Schindelin; +Cc: Git Mailing List

On Thu, Jul 17, 2008 at 3:16 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Thu, 17 Jul 2008, Nigel Magnay wrote:
>
>> Your attitude seems to be that the status-quo is in some way desirable;
>> "It's no wonder that this tool is awkward to use in your workflow.".
>> This workflow is really common, and there's actual, real people on this
>> list complaining about it. Don't we think it could be improved to be
>> non-awkward ?
>
> I do not think that the status quo is the best possible.
>
> But I think that the way you go makes things so confusing that those who
> use it apart from you will have problems.
>
Ok

> For example, in your setup everybody would have to install _different_
> remotes in every submodule.
>
> And then some would ask themselves why the original origin was not good
> enough.
>
> And others would specify "-o origin" all the time, thinking it was
> required.
>
> There must be a better way to promote submodules to a usable state,

My attempt was to try and do some small simple things, but you could
well be right, that might make some commands bloat out with
unnecessary options just to get something done, and that would be
bad.

Stepping back - let's try to come up with a better way (please comment
and critique)

What we'd like (to start with) is for
$ git pull fred

perhaps with --submodules (as Petr mentions), perhaps with config
settings and caveats, to produce a result that means you don't need to
be aware that there were submodules, they're automatically fetched and
updated based on commits that may only exist in fred's repository.

So currently, you can do
$ git pull origin && git submodule init && git submodule update

And it works, but

$ git pull fred
$ git submodule update

Can leave you with problems, because if a submodule wasn't pushed to
origin, you won't have it available. This is because the commands are
equivalent to

$ git pull fred
for each submodule()
  cd submodule
  git fetch origin
  git checkout <sha1>

So somehow, you need to replace 'git fetch origin' with the "correct"
repository (on fred's computer). My patch was really just about being
able to pass parameters to 'git fetch'. The problems are that if you
did

$ git submodule update fred

Unless each submodule had a [remote] specified for "fred", you'd be
stuffed. But what you could do is either by passing the right URL, or
looking at the superproject [remote] for "fred" - i.e: If in the
superproject you have

[remote "fred"]
        url = ssh://git@fred.local/pub/scm/git/workspace/thing/.git
[submodule "module"]
        url = ssh://git@repo/pub/scm/git/module.git

Then the submodule "module" on fred, if it's a working-copy, can be calculated as
       ssh://git@fred.local/pub/scm/git/workspace/thing/module/.git

If it isn't a WC then you'd have to have a [remote "fred"] in that
submodule, but I'm thinking that'd be a rare case.

I'd assumed (possibly wrongly?) that there was resistance to putting
any of the submodule logic in things other than git-submodules.

As a starter for 10, how about
- a '--submodules' option to git fetch / pull
- using the remote name if known, calculate it if not based on the above

WDYT?
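
The per-submodule fetch described above could be sketched roughly as
follows. This is only an illustration under stated assumptions: the
function name fetch_submodules_from is made up, the submodule paths are
passed in by hand rather than read from .gitmodules, and fred is assumed
to keep each submodule checked out at the same relative path inside his
working copy.

```shell
# Hypothetical sketch of "fetch every submodule from fred".
# $1: URL of fred's superproject working copy
# remaining arguments: submodule paths relative to the superproject
fetch_submodules_from() {
	base=${1%/.git}
	shift
	for sm_path in "$@"
	do
		# Derive <base>/<path>/.git and fetch from it inside the
		# local checkout of that submodule.
		(cd "$sm_path" && git fetch "$base/$sm_path/.git") || return 1
	done
}
```

Called as "fetch_submodules_from
ssh://git@fred.local/pub/scm/git/workspace/thing/.git module", it would
run git fetch against
ssh://git@fred.local/pub/scm/git/workspace/thing/module/.git from inside
module/, so that a later git submodule update has a chance of finding
the commits it needs.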


* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-17 15:07             ` Nigel Magnay
@ 2008-07-17 18:22               ` Petr Baudis
  2008-07-18  8:11                 ` Nigel Magnay
  0 siblings, 1 reply; 23+ messages in thread
From: Petr Baudis @ 2008-07-17 18:22 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Johannes Schindelin, Git Mailing List

On Thu, Jul 17, 2008 at 04:07:11PM +0100, Nigel Magnay wrote:
> And it works, but
> 
> $ git pull fred
> $ git submodule update
> 
> Can leave you with problems, because if a submodule wasn't pushed to
> origin, you won't have it available. This is because the commands are
> equivalent to
> 
> $ git pull fred
> for each submodule()
>   cd submodule
>   git fetch origin
>   git checkout <sha1>

Oh! So, only after replying to most of your mail, I have realized what
you are talking about all the time - _just_ this particular failure
mode:

	"Someone pushed out a repository repointing submodules to
	invalid commits, and instead of waiting for the person to fix
	this breakage, we want to do a one-off fetch of all submodules
	from a different repository."

There's nothing else you're trying to solve by this, right?


Now, I think that this is a completely wrong problem to solve. Your
gitweb is going to be broken, everyone has to jump through hoops because
of this, and that all just because of a single mistake. It shouldn't
have _happened_ in the first place.

So the proper solution for this should be to make an update hook that
will simply not _let_ you push out a tree that's broken like this.
Something like this (completely untested):

die() { echo "$@"; exit 1; }
git rev-list ^$2 $3 | while read commit; do
	git show $commit:.gitmodules >/tmp/gm$$
	git config -f /tmp/gm$$ --get-regexp 'submodule\..*\.path' |
		cut -d ' ' -f 1 |
		sed 's/^submodule\.//; s/\.path$//' |
		while read submodule; do
			path=$(git config -f /tmp/gm$$ "submodule.$submodule.path")
			url=$(git config -f /tmp/gm$$ "submodule.$submodule.url")
			entry=$(git ls-tree $commit "$path")
			[ -n "$entry" ] || die "submodule $submodule points at a non-existing path"
			[ "$(echo "$entry" | cut -d ' ' -f 1)" = "160000" ] || die "submodule $submodule does not point to a gitlink entry"
			
			subcommit="$(echo "$entry" | awk '{print $3}')"
			urlhash="$(echo "$url" | sha1sum | cut -d ' ' -f 1)"
			# We keep local copies of submodule repositories
			# for commit existence checking
			echo "Please wait, updating $url cache..."
			if [ -d /tmp/ucache/$urlhash ]; then
			        (cd /tmp/ucache/$urlhash && git fetch)
			else
			        git clone --bare "$url" /tmp/ucache/$urlhash
			fi
			[ "$(git --git-dir=/tmp/ucache/$urlhash cat-file -t "$subcommit" 2>/dev/null)" = "commit" ] || die "submodule $submodule does not point at an existing commit"
		done
	done

Comments? If it seems good, it might be worth including in
contrib/hooks/. Maybe even in the default update hook, controlled by
a config option.
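
One detail such a hook depends on is the layout of a git ls-tree line,
which is "<mode> SP <type> SP <object> TAB <path>". A minimal sketch of
pulling the commit ID out of a gitlink entry, assuming POSIX sh;
gitlink_commit is a hypothetical helper name:

```shell
# Print the commit recorded by a gitlink entry, or fail if the
# ls-tree line does not describe a gitlink (mode 160000).
# $1: one line of "git ls-tree <commit> <path>" output.
gitlink_commit() {
	tab=$(printf '\t')
	meta=${1%%"$tab"*}	# "<mode> <type> <object>", path stripped
	set -- $meta
	[ "$1" = 160000 ] && [ "$2" = commit ] || return 1
	printf '%s\n' "$3"
}
```

Splitting at the tab first keeps paths containing spaces from confusing
the field positions.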

All the troubles here stem from the fact that normally, Git will not let
you push any invalid state to the server. This is not completely true in
this case, but we should prevent this behaviour instead of inventing
hacks to work around it.

> Unless each submodule had a [remote] specified for "fred", you'd be
> stuffed. But what you could do is either by passing the right URL, or
> looking at the superproject [remote] for "fred" - i.e: If in the
> superproject you have
> 
> [remote "fred"]
>         url = ssh://git@fred.local/pub/scm/git/workspace/thing/.git
> [submodule "module"]
>         url = ssh://git@repo/pub/scm/git/module.git
> 
> Then the submodule "module" on fred, if it's a working-copy, can be calculated
>        ssh://git@fred.local/pub/scm/git/workspace/thing/module/.git
> 
> If it isn't a WC then you'd have to have a [remote "fred"] in that
> submodule, but I'm thinking that'd be a rare case.

This is ultra-evil. I think assuming things like this is extremely dirty
and not reasonable for universal code, _unless_ we explicitly decide
that this is a new convention you want to introduce as a recommendation.
But you should've been very clear about this upfront.

_If_ you still insist on the one-off fetches for some reason, I think
it's reasonable to provide your own simple script for your users that
will autogenerate these URLs appropriately for your particular setup.
I don't think there is any real need for a more generic solution.

> I'd assumed (possibly wrongly?) that there was resistance to putting
> any of the submodule logic in things other than git-submodules.

Are you following the thread about submodule support for git mv, git rm?

-- 
				Petr "Pasky" Baudis
GNU, n. An animal of South Africa, which in its domesticated state
resembles a horse, a buffalo and a stag. In its wild condition it is
something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce


* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-17 18:22               ` Petr Baudis
@ 2008-07-18  8:11                 ` Nigel Magnay
  2008-07-18  8:45                   ` Jakub Narebski
  2008-07-18  9:16                   ` Petr Baudis
  0 siblings, 2 replies; 23+ messages in thread
From: Nigel Magnay @ 2008-07-18  8:11 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Schindelin, Git Mailing List

On Thu, Jul 17, 2008 at 7:22 PM, Petr Baudis <pasky@suse.cz> wrote:
> On Thu, Jul 17, 2008 at 04:07:11PM +0100, Nigel Magnay wrote:
>> And it works, but
>>
>> $ git pull fred
>> $ git submodule update
>>
>> Can leave you with problems, because if a submodule wasn't pushed to
>> origin, you won't have it available. This is because the commands are
>> equivalent to
>>
>> $ git pull fred
>> for each submodule()
>>   cd submodule
>>   git fetch origin
>>   git checkout <sha1>
>
> Oh! So, only after replying to most of your mail, I have realized what
> are you talking about all the time - _just_ this particular failure
> mode:
>
>        "Someone pushed out a repository repointing submodules to
>        invalid commits, and instead of waiting for the person to fix
>        this breakage, we want to do a one-off fetch of all submodules
>        from a different repository."
>
> There's nothing else you're trying to solve by this, right?
>

No.
"Someone says 'please review the state of my tree, _before_ I push it
out to a (central) repository'"

Fred is a person (and != origin). His tree(s) are entirely correct and
consistent, and he doesn't yet wish to push to origin (and perhaps he
cannot, because he does not have permission to do so).

All the tutorials give credit to the fact that in git you don't need a
central server - you can pull directly from people. Except in the case
where you're using submodules, where you're basically forced to
hand-modify .git/config (in this instance, to point to where 'fred' is
storing his submodule trees) before doing a submodule update. This
makes git complicated for users.

I'm trying to improve the UI for projects using submodules to make it
mostly transparent; the best way I can come up with is to pick on
individual use cases and show that they're a particular pain and that
perhaps they don't need to be.

>
> Now, I think that this is a completely wrong problem to solve. Your
> gitweb is going to be broken, everyone has to jump through hoops because
> of this, and that all just because of a single mistake. It shouldn't
> have _happened_ in the first place.
>
> So the proper solution for this should be to make an update hook that
> will simply not _let_ you push out a tree that's broken like this.
> Something like this (completely untested):
>
> die() { echo "$@"; exit 1; }
> git rev-list ^$2 $3 | while read commit; do
>        git show $commit:.gitmodules >/tmp/gm$$
>        git config -f /tmp/gm$$ --get-regexp 'submodule\..*\.path' |
>                cut -d ' ' -f 1 |
>                sed 's/^submodule\.//; s/\.path$//' |
>                while read submodule; do
>                        path=$(git config -f /tmp/gm$$ "submodule.$submodule.path")
>                        url=$(git config -f /tmp/gm$$ "submodule.$submodule.url")
>                        entry=$(git ls-tree $commit "$path")
>                        [ -n "$entry" ] || die "submodule $submodule points at a non-existing path"
>                        [ "$(echo "$entry" | cut -d ' ' -f 1)" = "160000" ] || die "submodule $submodule does not point to a gitlink entry"
>
>                        subcommit="$(echo "$entry" | awk '{print $3}')"
>                        urlhash="$(echo "$url" | sha1sum | cut -d ' ' -f 1)"
>                        # We keep local copies of submodule repositories
>                        # for commit existence checking
>                        echo "Please wait, updating $url cache..."
>                        if [ -d /tmp/ucache/$urlhash ]; then
>                                (cd /tmp/ucache/$urlhash && git fetch)
>                        else
>                                git clone --bare "$url" /tmp/ucache/$urlhash
>                        fi
>                        [ "$(git --git-dir=/tmp/ucache/$urlhash cat-file -t "$subcommit" 2>/dev/null)" = "commit" ] || die "submodule $submodule does not point at an existing commit"
>                done
>        done
>
> Comments? If it seems good, it might be worth including in
> contrib/hooks/. Maybe even in the default update hook, controlled by
> a config option.
>
> All the troubles here stem from the fact that normally, Git will not let
> you push any invalid state to the server. This is not completely true in
> this case, but we should prevent this behaviour instead of inventing
> hacks to work it around.
>
>> Unless each submodule had a [remote] specified for "fred", you'd be
>> stuffed. But what you could do is either by passing the right URL, or
>> looking at the superproject [remote] for "fred" - i.e: If in the
>> superproject you have
>>
>> [remote "fred"]
>>         url = ssh://git@fred.local/pub/scm/git/workspace/thing/.git
>> [submodule "module"]
>>         url = ssh://git@repo/pub/scm/git/module.git
>>
>> Then the submodule "module" on fred, if it's a working-copy, can be calculated
>>        ssh://git@fred.local/pub/scm/git/workspace/thing/module/.git
>>
>> If it isn't a WC then you'd have to have a [remote "fred"] in that
>> submodule, but I'm thinking that'd be a rare case.
>
> This is ultra-evil. I think assuming things like this is extremely dirty
> and not reasonable for a universal code, _unless_ we explicitly decide
> that this is a new convention you want to introduce as a recommendation.
> But you should've been very clear about this upfront.
>
> _If_ you still insist on the one-off fetches for some reason, I think
> it's reasonable to provide your own simple script for your users that
> will autogenerate these URLs appropriately for your particular setup.
> I don't think there is any real need for a more generic solution.
>
>> I'd assumed (possibly wrongly?) that there was resistance to putting
>> any of the submodule logic in things other than git-submodules.
>
> Are you following the thread about submodule support for git mv, git rm?
>
> --
>                                Petr "Pasky" Baudis
> GNU, n. An animal of South Africa, which in its domesticated state
> resembles a horse, a buffalo and a stag. In its wild condition it is
> something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce
>


* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18  8:11                 ` Nigel Magnay
@ 2008-07-18  8:45                   ` Jakub Narebski
  2008-07-18  9:00                     ` Junio C Hamano
  2008-07-18  9:16                   ` Petr Baudis
  1 sibling, 1 reply; 23+ messages in thread
From: Jakub Narebski @ 2008-07-18  8:45 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Petr Baudis, Johannes Schindelin, Git Mailing List

"Nigel Magnay" <nigel.magnay@gmail.com> writes:

> On Thu, Jul 17, 2008 at 7:22 PM, Petr Baudis <pasky@suse.cz> wrote:
>> On Thu, Jul 17, 2008 at 04:07:11PM +0100, Nigel Magnay wrote:
>>> And it works, but
>>>
>>> $ git pull fred
>>> $ git submodule update
>>>
>>> Can leave you with problems, because if a submodule wasn't pushed to
>>> origin, you won't have it available. This is because the commands are
>>> equivalent to
>>>
>>> $ git pull fred
>>> for each submodule()
>>>   cd submodule
>>>   git fetch origin
>>>   git checkout <sha1>

> "Someone says 'please review the state of my tree, _before_ I push it
> out to a (central) repository'"
> 
> Fred is a person (and != origin). His tree(s) are entirely correct and
> consistent, and he doesn't yet wish to push to origin (and perhaps he
> cannot, because he does not have permission to do so).
> 
> All the tutorials give credit to the fact that in git you don't need a
> central server - you can pull directly from people. Except in the case
> where you're using submodules, where you're basically forced to
> hand-modify .git/config (in this instance, to point to where 'fred' is
> storing his submodule trees) before doing a submodule update. This
> makes git complicated for users.
> 
> I'm trying to improve the UI for projects using submodules to make it
> mostly transparent; the best way I can come up with is to pick on
> individual usecases and show that they're a particular pain and that
> perhaps they don't need to be.

I _think_ that you can currently work around this problem by using
URL rewriting (url.<base>.insteadOf).
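A minimal sketch of such a rewrite rule, assuming the upstream submodules
live under git://repo.or.cz/ and Fred's copies under git://fredcomputer/
(both prefixes are illustrative, not taken from a real setup). It writes
to a scratch file so that nothing in an existing repository is touched:

```shell
# Illustrative locations only: upstream submodules under git://repo.or.cz/,
# Fred's copies under git://fredcomputer/.
cfg=$(mktemp)

# URLs that start with the insteadOf value are rewritten to start with
# the url.<base> prefix instead.
git config --file "$cfg" url."git://fredcomputer/".insteadOf "git://repo.or.cz/"

# Print the resulting [url "..."] stanza.
cat "$cfg"
```

Dropping --file "$cfg" would write the same stanza into the repository's
own .git/config, so this remains a configuration change.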

-- 
Jakub Narebski
Poland
ShadeHawk on #git

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18  8:45                   ` Jakub Narebski
@ 2008-07-18  9:00                     ` Junio C Hamano
  2008-07-18  9:07                       ` Jakub Narebski
  0 siblings, 1 reply; 23+ messages in thread
From: Junio C Hamano @ 2008-07-18  9:00 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Nigel Magnay, Petr Baudis, Johannes Schindelin, Git Mailing List

Jakub Narebski <jnareb@gmail.com> writes:

> "Nigel Magnay" <nigel.magnay@gmail.com> writes:
>
>> Fred is a person (and != origin). His tree(s) are entirely correct and
>> consistent, and he doesn't yet wish to push to origin (and perhaps he
>> cannot, because he does not have permission to do so).
>> ...
> I _think_ that you can currently work around this problem by using
> URL rewriting (url.<base>.insteadOf).

Doesn't it also involve config modification?

I think the right thing to do for this kind of "trial merge" should be the
same as cases that do not involve submodules.  You *DO NOT* give a handy
way to muck with your configuration to make "origin" point at fred.
Instead, you would do something like:

	$ git fetch ../fred master
	$ git checkout FETCH_HEAD
	... review test fix ...
	... when you are done, go back, discarding the state from Fred
	$ git checkout master

What submodules change in the above workflow is what happens after
you switch to the trial state (the above example detaches HEAD temporarily
while peeking into Fred's history).  It is understandable that you would
want to script something that recurses into the submodules that you have
checked out (or submodules that Fred wants you to look at), do the
equivalent of "git fetch ../fred" you did at the toplevel to automate that
step, but I very much agree with Pasky here in that it feels very wrong to
hijack "submodule update" for it.

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18  9:00                     ` Junio C Hamano
@ 2008-07-18  9:07                       ` Jakub Narebski
  2008-07-18  9:18                         ` Nigel Magnay
  0 siblings, 1 reply; 23+ messages in thread
From: Jakub Narebski @ 2008-07-18  9:07 UTC (permalink / raw)
  To: Junio C Hamano
  Cc: Nigel Magnay, Petr Baudis, Johannes Schindelin, Git Mailing List

Junio C Hamano wrote:

> [...] It is understandable that you would
> want to script something that recurses into the submodules that you have
> checked out (or submodules that Fred wants you to look at), do the
> equivalent of "git fetch ../fred" you did at the toplevel to automate that
> step, but I very much agree with Pasky here in that it feels very wrong to
> hijack "submodule update" for it.

There were two proposals for how to deal with fetching all submodules:
(a) git-submodule recursing into submodules, IIRC even with some
implementation; (b) a new "git submodule fetch" command.

-- 
Jakub Narebski
Poland

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18  8:11                 ` Nigel Magnay
  2008-07-18  8:45                   ` Jakub Narebski
@ 2008-07-18  9:16                   ` Petr Baudis
  2008-07-18  9:36                     ` Nigel Magnay
  1 sibling, 1 reply; 23+ messages in thread
From: Petr Baudis @ 2008-07-18  9:16 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Johannes Schindelin, Git Mailing List

  Hi,

  _please_, trim the parts of quoted e-mails that you are not reacting
to. It makes your mails easier to read.

On Fri, Jul 18, 2008 at 09:11:53AM +0100, Nigel Magnay wrote:
> No.
> "Someone says 'please review the state of my tree, _before_ I push it
> out to a (central) repository"
> 
> Fred is a person (and != origin). His tree(s) are entirely correct and
> consistent, and he doesn't yet wish to push to origin (and perhaps he
> cannot, because he does not have permission to do so).
> 
> All the tutorials give credit to the fact that in git you don't need a
> central server - you can pull directly from people. Except in the case
> where you're using submodules, where you're basically forced to
> hand-modify .git/config (in this instance, to point to where 'fred' is
> storing his submodule trees) before doing a submodule update. This
> makes git complicated for users.

Ok! Handling this case makes sense, though I would have wished you to
word this use case this clearly from the beginning; or maybe I'm just
slow. :-)

Now, although we (at least we two) agree that this use case is worth
supporting, I still don't like the solution you propose. The problem
that we are trying to solve is:

	"How do we mass-supply custom submodule URLs when publishing the
	customized main repository at a custom location too?"

Now, the most natural solution is for Fred to actually customize
.gitmodules content when committing the submodule updates:

  (i) Either just give submodule update a hypothetical flag that will
ignore .git/config for that particular run or,

  (ii) even much better, actually change logical submodule names in
.gitmodules; this is appropriate as you want the modules to actually
point at a significantly different repository. Thus,

	[submodule "boo"]
	path=boo
	url=git://repo.or.cz/boo.git

will become

	[submodule "boo/fred"]
	path=boo
	url=git://repo.or.cz/boo/fred.git

  Also, you will be able to redefine the URL of boo/fred too in
.git/config (e.g. you're behind a firewall that lets only HTTP
through; I'm actually behind such a firewall these days at my
(non-SUSE ;) work).


This should be reasonably elegant and works with no Git changes, but it
still has one significant problem: you very much do not want such a
.gitmodules change in any of the commits you merge, since it breaks
bisectability in case Fred or his repositories go away.

In that case, several possibilities come to mind:

  (1) Fred will prepare a special branch for testing with modified
.gitmodules and then for a merge he offers a different branch with clean
.gitmodules. This works, but it is obnoxious.

  (2) Fred will pass a patch for .gitmodules as a part of his review
request. This works too and is obnoxious in slightly different ways
than (1).

  (3) Fred will offer a rewrite rule that you will pass to submodule
update, as your proposed solution does, but much more universal so that
it is not tailored just to your particular repository hierarchy. A simple
sed script could work fine.

  (4) Something else that I'm not realizing.
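
A minimal sketch of what option (3) could look like, assuming Fred hands
over a sed expression that maps upstream submodule URLs to his own. The
helper name rewrite_url and the rule itself are hypothetical; the URLs
reuse the boo example from above:

```shell
# rewrite_url RULE URL: apply a sed expression to a single submodule URL.
rewrite_url() {
    printf '%s\n' "$2" | sed -e "$1"
}

# Hypothetical rule supplied by Fred: move every repo.or.cz submodule
# into his own namespace (boo.git becomes boo/fred.git).
rule='s|^git://repo\.or\.cz/\(.*\)\.git$|git://repo.or.cz/\1/fred.git|'

rewrite_url "$rule" "git://repo.or.cz/boo.git"
```

URLs that the rule does not match pass through unchanged, so one rule can
be applied blindly to every entry in .gitmodules.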

-- 
				Petr "Pasky" Baudis
GNU, n. An animal of South Africa, which in its domesticated state
resembles a horse, a buffalo and a stag. In its wild condition it is
something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18  9:07                       ` Jakub Narebski
@ 2008-07-18  9:18                         ` Nigel Magnay
  0 siblings, 0 replies; 23+ messages in thread
From: Nigel Magnay @ 2008-07-18  9:18 UTC (permalink / raw)
  To: Jakub Narebski
  Cc: Junio C Hamano, Petr Baudis, Johannes Schindelin,
	Git Mailing List

On Fri, Jul 18, 2008 at 10:07 AM, Jakub Narebski <jnareb@gmail.com> wrote:
> Junio C Hamano wrote:
>
>> [...] It is understandable that you would
>> want to script something that recurses into the submodules that you have
>> checked out (or submodules that Fred wants you to look at), do the
>> equivalent of "git fetch ../fred" you did at the toplevel to automate that
>> step, but I very much agree with Pasky here in that it feels very wrong to
>> hijack "submodule update" for it.
>
> There were two proposals how to deal with fetching all submodules:
> (a) git-submodule recursing into submodules, IIRC even with some
> implementation (b) new "git submodule fetch" command.
>

Yes - I think there are a few more options and possible combinations:

a. git submodule update having <repository> <refspec> to recurse into
submodules (a)(original patch)
b. git submodule fetch
c. git fetch --submodules
d. git fetch (automatically recurse if there are submodules)
e. git fetch (automatically recurse if there is some setting in .git/config)

I started at (a) and agree that it's a bad choice.
Any of b-e would work for me.
My (personal) preferences would be for d/e, then c, then b - but -
that's based on my belief that submodules are a pretty fundamental
thing and having a separate UI is bad.

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18  9:16                   ` Petr Baudis
@ 2008-07-18  9:36                     ` Nigel Magnay
  2008-07-18 10:00                       ` Petr Baudis
  0 siblings, 1 reply; 23+ messages in thread
From: Nigel Magnay @ 2008-07-18  9:36 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Schindelin, Git Mailing List

On Fri, Jul 18, 2008 at 10:16 AM, Petr Baudis <pasky@suse.cz> wrote:
> snip
>
>        "How do we mass-supply custom submodule URLs when publishing the
>        customized main repository at a custom location too?"
>
Yes - that is an additional problem.

If I may expand the usecase just so it's clear (and to check we're
talking the same language)

I do something like
$ git remote add fred git://fredcomputer/superproject/.git
$ git fetch --submodules fred

And when the recursive fetching enters a submodule, it is trying
itself to do something like
$ git fetch fred

At which point
1) the submodule also has a remote specified for fred. In which case
it can continue
2) the submodule doesn't have a remote specified for fred. How to solve
this case? (i.e., how does 'my' git 'discover' where fred's git
repositories are for the submodules?)
 a) By getting some information from fred, either in *Fred's*
superproject .git/config (or some other readable file)
 b) By reading some information out of the superproject .gitmodules
that has been fetched from fred
 c) By calculating a relative URL based on the supposition that fred
has working copies laid out in the filesystem.

I was tentatively suggesting (c), with a backup of (a) for the minority
cases where you weren't pulling from a person but from a mirror or
something. Having the client edit config files just feels like a hack
to me, regardless of whether scripts are enabled to do it.
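
A minimal sketch of the calculation in (c), assuming fred's submodules
are checked out inside his superproject working copy at the paths the
superproject records. The helper name submodule_url_from_worktree and
the path moduleA are hypothetical:

```shell
# submodule_url_from_worktree SUPER_URL SUBMODULE_PATH
# Derive the URL of a submodule's repository from the URL of the
# superproject's .git directory and the submodule's path.
submodule_url_from_worktree() {
    super=${1%/}          # drop one trailing slash, if any
    super=${super%/.git}  # git://host/super/.git -> git://host/super
    printf '%s/%s/.git\n' "$super" "$2"
}

submodule_url_from_worktree git://fredcomputer/superproject/.git moduleA
```

The result could then be handed to "git fetch" inside the submodule, with
no remote named fred configured there.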

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18  9:36                     ` Nigel Magnay
@ 2008-07-18 10:00                       ` Petr Baudis
  2008-07-18 11:20                         ` Nigel Magnay
  0 siblings, 1 reply; 23+ messages in thread
From: Petr Baudis @ 2008-07-18 10:00 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Johannes Schindelin, Git Mailing List

On Fri, Jul 18, 2008 at 10:36:51AM +0100, Nigel Magnay wrote:
> On Fri, Jul 18, 2008 at 10:16 AM, Petr Baudis <pasky@suse.cz> wrote:
> > snip
> >
> >        "How do we mass-supply custom submodule URLs when publishing the
> >        customized main repository at a custom location too?"
> >
> Yes - that is an additional problem.

Wait, I'm lost again - _additional_ problem? How does it differ from the
_original_ problem, how does it differ from what you're explaining below
and how does what you're explaining below differ from the original
problem?

Or are we talking exclusively about what I summed up above now?

> If I may expand the usecase just so it's clear (and to check we're
> talking the same language)
> 
> I do something like
> $ git remote add fred git://fredcomputer/superproject/.git
> $ git fetch --submodules fred

I think you mean git pull --submodules fred. Well, actually, you want to
pull the main repository, then submodule update (_not_ pull in the
submodules). See? This is part of the "semantic swamp" I mentioned
before.

I think it should be somehow part of the _main_ project's fred branch
that in this branch, the subprojects should be fetched from a different
location; thus, you would still do

	$ git remote add fred git://fredcomputer/superproject/.git
	$ git pull fred
	$ git submodule update

where either the submodule update takes the info from fred's adjusted
.gitmodules, or it is an implicit part of the branch as in fred tells
you to run the update command with some extra arguments.

However, I still believe the information should primarily stem from the
main repository; consider e.g. if you do not have some of the submodules
checked out when you switch to fred, then figure out that in fred's
branch, you really do want them checked out.

-- 
				Petr "Pasky" Baudis
GNU, n. An animal of South Africa, which in its domesticated state
resembles a horse, a buffalo and a stag. In its wild condition it is
something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18 10:00                       ` Petr Baudis
@ 2008-07-18 11:20                         ` Nigel Magnay
  2008-07-18 14:43                           ` Petr Baudis
  0 siblings, 1 reply; 23+ messages in thread
From: Nigel Magnay @ 2008-07-18 11:20 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Schindelin, Git Mailing List

On Fri, Jul 18, 2008 at 11:00 AM, Petr Baudis <pasky@suse.cz> wrote:
> On Fri, Jul 18, 2008 at 10:36:51AM +0100, Nigel Magnay wrote:
>> On Fri, Jul 18, 2008 at 10:16 AM, Petr Baudis <pasky@suse.cz> wrote:
>> > snip
>> >
>> >        "How do we mass-supply custom submodule URLs when publishing the
>> >        customized main repository at a custom location too?"
>> >
>> Yes - that is an additional problem.
>
> Wait, I'm lost again - _additional_ problem? How does it differ from the
> _original_ problem, how does it differ from what you're explaining below
> and how does what you're explaining below differ from the original
> problem?
>
In addition to the problem of needing to execute multiple commands and
edit files to achieve what is a rather simple usecase, there is the
additional problem of discovering (for a third party) a url for where
their submodules are stored.

> Or are we talking exclusively about what I summed up above now?
>
In this part of the thread. The first part seems to have broad
agreement that a command could be added / modified, but not yet what
it should look like.

>> If I may expand the usecase just so it's clear (and to check we're
>> talking the same language)
>>
>> I do something like
>> $ git remote add fred git://fredcomputer/superproject/.git
>> $ git fetch --submodules fred
>
> I think you mean git pull --submodules fred. Well, actually, you want to
> pull the main repository, then submodule update (_not_ pull in the
> submodules). See? This is part of the "semantic swamp" I mentioned
> before.

Ah - I understand. You're saying "you can't pull submodules when you
pull the supermodule, because you don't know which submodules might be
needed until you also merge / checkout the desired revision" ?

Ack.

>
> I think it should be somehow part of the _main_ project's fred branch
> that in this branch, the subprojects should be fetched from a different
> location; thus, you would still do
>
>        $ git remote add fred git://fredcomputer/superproject/.git
>        $ git pull fred
>        $ git submodule update
>

Yes, that makes sense.

> where either the submodule update takes the info from fred's adjusted
> .gitmodules, or it is an implicit part of the branch as in fred tells
> you to run the update command with some extra arguments.
>
> However, I still believe the information should primarily stem from the
> main repository; consider e.g. if you do not have some of the submodules
> checked out when you switch to fred, then figure out that in fred's
> branch, you really do want them checked out.
>

Yes.
Referring to your earlier mail, I'm now preferring "(4) Something else
that I'm not realizing." ;-)

Hm. It feels like each person could have some 'local' info in their
.gitmodules, and rules around merging; but I'm not sure of exactly
what, or how.

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18 11:20                         ` Nigel Magnay
@ 2008-07-18 14:43                           ` Petr Baudis
  2008-07-18 15:09                             ` Nigel Magnay
  0 siblings, 1 reply; 23+ messages in thread
From: Petr Baudis @ 2008-07-18 14:43 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Johannes Schindelin, Git Mailing List

On Fri, Jul 18, 2008 at 12:20:13PM +0100, Nigel Magnay wrote:
> On Fri, Jul 18, 2008 at 11:00 AM, Petr Baudis <pasky@suse.cz> wrote:
> > On Fri, Jul 18, 2008 at 10:36:51AM +0100, Nigel Magnay wrote:
> >> On Fri, Jul 18, 2008 at 10:16 AM, Petr Baudis <pasky@suse.cz> wrote:
> >> > snip
> >> >
> >> >        "How do we mass-supply custom submodule URLs when publishing the
> >> >        customized main repository at a custom location too?"
> >> >
> >> Yes - that is an additional problem.
> >
> > Wait, I'm lost again - _additional_ problem? How does it differ from the
> > _original_ problem, how does it differ from what you're explaining below
> > and how does what you're explaining below differ from the original
> > problem?
> >
> In addition to the problem of needing to execute multiple commands and
> edit files to achieve what is a rather simple usecase, there is the
> additional problem of discovering (for a third party) a url for where
> their submodules are stored.

I see. That's interconnected as a single "How to check Fred's work"
problem for me. :-)

> >> If I may expand the usecase just so it's clear (and to check we're
> >> talking the same language)
> >>
> >> I do something like
> >> $ git remote add fred git://fredcomputer/superproject/.git
> >> $ git fetch --submodules fred
> >
> > I think you mean git pull --submodules fred. Well, actually, you want to
> > pull the main repository, then submodule update (_not_ pull in the
> > submodules). See? This is part of the "semantic swamp" I mentioned
> > before.
> 
> Ah - I understand. You're saying "you can't pull submodules when you
> pull the supermodule, because you don't know which submodules might be
> needed until you also merge / checkout the desired revision" ?
> 
> Ack.

That is something I might agree with, but my point is that within the
submodule,

	git pull

simply isn't a sensible operation at all! You don't want to do any
merges or whatever, just bring the submodule to a defined commit id.
So you want to do something significantly different:

	git fetch
	git reset --hard <commitid>

And that's what 'git submodule update' already does.
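
The commit id in that sequence is the one the superproject records for
the submodule's path. A minimal sketch of reading it from "git ls-tree"
output, which lists a submodule as a mode 160000 "commit" entry; the
helper name gitlink_sha1, the path boo and the sample id are
hypothetical:

```shell
# gitlink_sha1: read `git ls-tree` output on stdin and print the commit
# id of the first submodule (mode 160000, type "commit") entry.
gitlink_sha1() {
    awk '$1 == "160000" && $2 == "commit" { print $3; exit }'
}

# Sample ls-tree line for a submodule at path "boo" (made-up commit id).
printf '160000 commit %s\t%s\n' \
    0123456789abcdef0123456789abcdef01234567 boo | gitlink_sha1

# In a real superproject the two steps would look roughly like:
#   sha1=$(git ls-tree HEAD -- boo | gitlink_sha1)
#   (cd boo && git fetch && git reset --hard "$sha1")
```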

> Hm. It feels like each person could have some 'local' info in their
> .gitmodules, and rules around merging; but I'm not sure of exactly
> what, or how.

Again, when customizing .gitmodules, you need to either give up on

	(i) bisectability; it's not good enough to restore the canonical
	.gitmodules contents on merge, since the bisect can run into one
	of the commits on fred's branches

	(ii) publishing the exact same branch for testing and merging

But I start to feel that the tradeoff of (ii) is really not so bad at
all and this would perhaps be the most elegant solution. You can either

	(a) make two parallel branches, one with your .gitmodules and
	one with the upstream's

	(b) probably better, stick a commit at the top of your branch
	that will change .gitmodules to your locations; others can
	check out fred, test things out, then merge fred^; you can even
	generally go back in fred's commits if you just 'git submodule
	update' right after checking fred out, since all the required
	submodule commits will be probably already fetched.

So I say just go for the (ii)-(b) combination. :-)

-- 
				Petr "Pasky" Baudis
As in certain cults it is possible to kill a process if you know
its true name.  -- Ken Thompson and Dennis M. Ritchie

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18 14:43                           ` Petr Baudis
@ 2008-07-18 15:09                             ` Nigel Magnay
  2008-07-18 15:49                               ` Petr Baudis
  0 siblings, 1 reply; 23+ messages in thread
From: Nigel Magnay @ 2008-07-18 15:09 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Schindelin, Git Mailing List

>> Ah - I understand. You're saying "you can't pull submodules when you
>> pull the supermodule, because you don't know which submodules might be
>> needed until you also merge / checkout the desired revision" ?
>>
>> Ack.
>
> That is something I might agree with, but my point is that within the
> submodule,
>
>        git pull
>
> simply isn't a sensible operation at all! You don't want to do any
> merges or whatever, just bring the submodule to a defined commit id.
> So you want to do something significantly different:
>
>        git fetch
>        git reset --hard <commitid>
>
> And that's what 'git submodule update' already does.
>

I wasn't wanting to do pull there - but either way, I agree :-)

>> Hm. It feels like each person could have some 'local' info in their
>> .gitmodules, and rules around merging; but I'm not sure of exactly
>> what, or how.
>
> Again, when customizing .gitmodules, you need to either give up on
>
>        (i) bisectability; it's not good enough to restore the canonical
>        .gitmodules contents on merge, since the bisect can run into one
>        of the commits on fred's branches
>
>        (ii) publishing the exact same branch for testing and merging
>
> But I start to feel that the tradeoff of (ii) is really not so bad at
> all and this would perhaps be the most elegant solution. You can either
>
>        (a) make two parallel branches, one with your .gitmodules and
>        one with the upstream's
>
>        (b) probably better, stick a commit at the top of your branch
>        that will change .gitmodules to your locations; others can
>        check out fred, test things out, then merge fred^; you can even
>        generally go back in fred's commits if you just 'git submodule
>        update' right after checking fred out, since all the required
>        submodule commits will be probably already fetched.
>
> So I say just go for the (ii)-(b) combination. :-)
>

Hmm. Locally modifying my .gitmodules still feels bad because I don't
like either of those tradeoffs (but I don't have any sensible
suggestion yet).

As a bit of background (as to why I'd dislike (a) and (b)), we had a
team switch to git, and one of the really nice things is the ability
to share stuff around and branch freely - but the flipside of that is
that we tend to push to a central repo more rarely, so the advantages
of a continuous integration server diminish. What we did was to
tell a centralised CI server the URLs of all the team's git
repositories, and it would periodically pull from them, speculatively
compile anything new, and run the big suite of tests - finishing up by
emailing the owner a heads-up that a particular state in their repo is
'bad'.

This was really popular as it was demonstrably better than anything we
could do with svn, and best of all, it's pretty much transparent - as
a user you don't have to do anything at all.

I could do it now by hacking about with files; it'd just be nice to
keep it transparent and make it a directly supported feature.

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18 15:09                             ` Nigel Magnay
@ 2008-07-18 15:49                               ` Petr Baudis
  2008-07-18 22:38                                 ` Mark Levedahl
  2008-07-21 10:59                                 ` Nigel Magnay
  0 siblings, 2 replies; 23+ messages in thread
From: Petr Baudis @ 2008-07-18 15:49 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Johannes Schindelin, Git Mailing List

On Fri, Jul 18, 2008 at 04:09:40PM +0100, Nigel Magnay wrote:
> Hmm. Locally modifying my .gitmodules still feels bad because I don't
> like either of those tradeoffs (but I don't have any sensible
> suggestion yet).
> 
> As a bit of background (as to why I'd dislike (a) and (b)), we had a
> team switch to git, and one of the really nice things is the ability
> to share stuff around and branch freely - but the flipside of that is
> that we tend to push to a central repo more rarely, so the advantages
> of a continuous integration server diminish. What we did was to
> tell a centralised CI server the URLs of all the team's git
> repositories, and it would periodically pull from them, speculatively
> compile anything new, and run the big suite of tests - finishing up by
> emailing the owner a heads-up that a particular state in their repo is
> 'bad'.
> 
> This was really popular as it was demonstrably better than anything we
> could do with svn, and best of all, it's pretty much transparent - as
> a user you don't have to do anything at all.
> 
> I could do it now by hacking about with files; it'd just be nice to
> keep it transparent and make it a directly supported feature.

In that case you would need the "URL mappings", perhaps as a per-remote
attribute. That is, you could configure:

	"When I am doing git pull fred, do git submodule update but
	apply remote.fred.subrewrite sed script on each URL before
	fetching the submodule."

Still, that feels quite hackish to me, and I'm not convinced that your
workflow cannot be adjusted so that users merge only the next-to-last
commit of a branch instead of the last one.

-- 
				Petr "Pasky" Baudis
As in certain cults it is possible to kill a process if you know
its true name.  -- Ken Thompson and Dennis M. Ritchie

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18 15:49                               ` Petr Baudis
@ 2008-07-18 22:38                                 ` Mark Levedahl
  2008-07-21 10:59                                 ` Nigel Magnay
  1 sibling, 0 replies; 23+ messages in thread
From: Mark Levedahl @ 2008-07-18 22:38 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Nigel Magnay, Johannes Schindelin, Git Mailing List

Petr Baudis wrote:
> On Fri, Jul 18, 2008 at 04:09:40PM +0100, Nigel Magnay wrote:
> 
> In that case you would need the "URL mappings", perhaps as a per-remote
> attribute. That is, you could configure:
> 
> 	"When I am doing git pull fred, do git submodule update but
> 	apply remote.fred.subrewrite sed script on each URL before
> 	fetching the submodule."
> 
> Still, that feels quite hackish to me, and I'm not convinced that your
> workflow cannot be adjusted so that users merge only the next-to-last
> commit of a branch instead of the last one.
> 

There really are two distinct forms of submodule URLs supported by
git-submodule: absolute and relative. The first says "always go to repository x 
on server y", and is the correct form for a *very* loosely coupled submodule. 
However, it requires a lot of hacking to support fetching from an alternate 
location.

The relative form says "go to this location *relative* to the superproject's 
repository". Using this form greatly eases the use case. Basically, fred has his 
tree of trees on his system, arranged exactly as they are on the main server. If 
you do a git fetch "fred" into superproject, then submodule update, submodule 
should be able to find the related submodules on "fred" and get the data
relatively easily.
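
A simplified sketch of how a relative URL could be resolved against
whichever superproject URL was fetched from. This is an illustration
under stated assumptions, not git-submodule's actual code; the function
name and all URLs are hypothetical:

```shell
# resolve_relative_url BASE REL
# Resolve a relative submodule URL (e.g. ../boo.git) against the URL of
# the superproject repository it is recorded in.
resolve_relative_url() {
    base=${1%/}
    rel=$2
    while :; do
        case $rel in
            ../*) base=${base%/*}; rel=${rel#../} ;;  # go up one level
            ./*)  rel=${rel#./} ;;                    # stay at this level
            *)    break ;;
        esac
    done
    printf '%s/%s\n' "$base" "$rel"
}

# The same .gitmodules entry resolves differently per superproject URL.
resolve_relative_url git://repo.or.cz/projects/super.git ../boo.git
resolve_relative_url git://fredcomputer/projects/super.git ../boo.git
```

With the main server as base this yields a repo.or.cz URL, and with
fred's superproject as base it yields the matching location on
fredcomputer, which is why the relative form needs no per-person edits.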

I actually submitted a patch series a while back that does this, but no-one on 
the list cared for the use case it supported so that series died.

Mark

* Re: [PATCH] Teach git submodule update to use distributed repositories
  2008-07-18 15:49                               ` Petr Baudis
  2008-07-18 22:38                                 ` Mark Levedahl
@ 2008-07-21 10:59                                 ` Nigel Magnay
  1 sibling, 0 replies; 23+ messages in thread
From: Nigel Magnay @ 2008-07-21 10:59 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Schindelin, Git Mailing List

> In that case you would need the "URL mappings", perhaps as a per-remote
> attribute. That is, you could configure:
>
>        "When I am doing git pull fred, do git submodule update but
>        apply remote.fred.subrewrite sed script on each URL before
>        fetching the submodule."
>
> Still, that feels quite hackish to me, and I'm not convinced that your
> workflow cannot be adjusted so that users merge only the next-to-last
> commit of a branch instead of the last one.
>

Hm - I'm still disliking having 'special' commits to change
.gitmodules. I can hack scripts to make it work, but it would be nice
to have a UI that is generally useful.

Thinking out loud, could we have in .git/config something like

[submodule "moduleA"]
   # Current place of origin
   url = git://origin.com/path/to/.git
   # where fred declares moduleA to come from
   fred.url = git://fredcomputer/path/to/freds/moduleA.git
   # where other people can get access to *my* moduleA repo
   local = git://myhost/working/copy/super/moduleA/.git

So if I look in the git repository of fred (as specified in my [remote
"fred"]), I can see their "local" entry, and enter that as fred.url in
my config.

And the ability to do (e.g)

$ git submodule init fred
$ git submodule update fred

?

end of thread, other threads:[~2008-07-21 11:00 UTC | newest]

Thread overview: 23+ messages
2008-07-17 12:08 [PATCH] Teach git submodule update to use distributed repositories Nigel Magnay
2008-07-17 12:13 ` Johannes Schindelin
     [not found]   ` <320075ff0807170520r200e546ejbad2ed103bd65f82@mail.gmail.com>
2008-07-17 12:21     ` Nigel Magnay
2008-07-17 12:58       ` Johannes Schindelin
2008-07-17 14:03         ` Nigel Magnay
2008-07-17 14:16           ` Johannes Schindelin
2008-07-17 15:07             ` Nigel Magnay
2008-07-17 18:22               ` Petr Baudis
2008-07-18  8:11                 ` Nigel Magnay
2008-07-18  8:45                   ` Jakub Narebski
2008-07-18  9:00                     ` Junio C Hamano
2008-07-18  9:07                       ` Jakub Narebski
2008-07-18  9:18                         ` Nigel Magnay
2008-07-18  9:16                   ` Petr Baudis
2008-07-18  9:36                     ` Nigel Magnay
2008-07-18 10:00                       ` Petr Baudis
2008-07-18 11:20                         ` Nigel Magnay
2008-07-18 14:43                           ` Petr Baudis
2008-07-18 15:09                             ` Nigel Magnay
2008-07-18 15:49                               ` Petr Baudis
2008-07-18 22:38                                 ` Mark Levedahl
2008-07-21 10:59                                 ` Nigel Magnay
2008-07-17 14:38 ` Petr Baudis
