git.vger.kernel.org archive mirror
* git submodules and commit
       [not found] <320075ff0807160331j30e8f832m4de3e3bbe9c26801@mail.gmail.com>
@ 2008-07-16 10:32 ` Nigel Magnay
  2008-07-16 10:47   ` Johannes Sixt
  2008-07-16 15:43   ` Avery Pennarun
  0 siblings, 2 replies; 14+ messages in thread
From: Nigel Magnay @ 2008-07-16 10:32 UTC (permalink / raw)
  To: Git Mailing List

I wonder if this is a fairly common pattern. We tend to have modules
as git repositories, and projects that tie together those git
repositories as submodules. In general, > 90% of the work is done in
one module, and the following stanza gets used a lot:

cd /proj/modA
git commit -s -m "Some change"
git push

cd ..
git add modA
git commit -s -m "Some change (modA)"
git push

But since this is much more cumbersome than (say) "svn ci", what often
happens is developers just commit into modA, then carry on. Or people
just learning git sometimes screw up, and push the parent proj but not
the child modA.

This is a shame, as it means any external people pulling updates
directly from proj will not get this change (e.g. CI tools
speculatively compiling against every developer tree).
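
The stanza above can be folded into one helper, so that the submodule
is always committed and pushed before the parent records it. This is a
minimal sketch, not an existing git command: commit_module_and_parent
is a hypothetical name, and it assumes that the changes in the submodule
are already staged and that a plain "git push" works in both
repositories.

```sh
#!/bin/sh
# commit_module_and_parent <submodule-path> <message>
# Hypothetical helper; run it from the top of the parent project.
commit_module_and_parent() {
    mod=$1
    msg=$2
    # Commit and push inside the submodule first, so the parent never
    # refers to a commit that other people cannot fetch.
    (cd "$mod" && git commit -s -m "$msg" && git push) || return 1
    # Then record the new submodule commit in the parent and push that.
    git add "$mod" &&
        git commit -s -m "$msg ($mod)" &&
        git push
}
```

From /proj this would be called as:
commit_module_and_parent modA "Some change"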

For me, in some really high proportion of cases, I think I want 'git
commit' to mean 'commit to any child repositories, any sibling
repositories, and any parent repositories (updating the submodule sha1
as appropriate)'. In other words, 'pretend like the whole thing is one
big repo'.

I guess it probably gets sticky when there are merge conflicts. Is
anyone working on this kind of thing; I might be able to give some
time to help work on it?

* Re: git submodules and commit
  2008-07-16 10:32 ` git submodules and commit Nigel Magnay
@ 2008-07-16 10:47   ` Johannes Sixt
  2008-07-16 11:02     ` Nigel Magnay
  2008-07-16 15:43   ` Avery Pennarun
  1 sibling, 1 reply; 14+ messages in thread
From: Johannes Sixt @ 2008-07-16 10:47 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Git Mailing List

Nigel Magnay schrieb:
> For me, in some really high proportion of cases, I think I want 'git
> commit' to mean 'commit to any child repositories, any sibling
> repositories, and any parent repositories (updating the submodule sha1
> as appropriate). In other words, 'pretend like the whole thing is one
> big repo'.

And I think that this is the problem: If this way of committing your
changes is *required* in the *majority* of cases, then you are IMO outside
the intended use-case of submodules. You are better served by really
making this one big repo.

IMO, submodules are to be used if you can afford to advance parent project
and submodules at different paces; i.e. if the parent project can work
with newer versions of the submodules (and possibly in a degraded mode
even with outdated versions).

-- Hannes

* Re: git submodules and commit
  2008-07-16 10:47   ` Johannes Sixt
@ 2008-07-16 11:02     ` Nigel Magnay
  2008-07-16 11:35       ` Johannes Sixt
  0 siblings, 1 reply; 14+ messages in thread
From: Nigel Magnay @ 2008-07-16 11:02 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Git Mailing List

On Wed, Jul 16, 2008 at 11:47 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
> Nigel Magnay schrieb:
>> For me, in some really high proportion of cases, I think I want 'git
>> commit' to mean 'commit to any child repositories, any sibling
>> repositories, and any parent repositories (updating the submodule sha1
>> as appropriate). In other words, 'pretend like the whole thing is one
>> big repo'.
>
> And I think that this is the problem: If this way of committing your
> changes is *required* in the *majority* of cases, then you are IMO outside
> the intended use-case of submodules. You are better served by really
> making this one big repo.
>

Hm - then my contention is that the scope of submodules needs to be
expanded (or something needs to be built on top).

One-big-repo doesn't fly - > 75% of the code volume (the 'other'
modules) is shared between multiple projects. In SVN these are just
svn:externals (which has its own imperfections).

I think it's a common usecase. You have 'shared' modules and
'project-specific' modules[*]. The 'shared' modules you hope don't
change very much, but they are part of the overall project
configuration - it's really nice that you can branch so easily in git,
then get the module owner to merge those changes into the next release
at their leisure. The superproject then represents the correct
configuration of submodule trees to make a valid build.

The machinery has everything that's required, it's just the user
experience sucks :(

[*] actually there's more subtlety, there's 'shared', 'product' and
'project', so some 'specific' modules are potentially re-shared
elsewhere.

> IMO, submodules are to be used if you can afford to advance parent project
> and submodules at different paces; i.e. if the parent project can work
> with newer versions of the submodules (and possibly in a degraded mode
> even with outdated versions).
>
> -- Hannes
>

* Re: git submodules and commit
  2008-07-16 11:02     ` Nigel Magnay
@ 2008-07-16 11:35       ` Johannes Sixt
  2008-07-16 12:11         ` Petr Baudis
  2008-07-16 12:48         ` Nigel Magnay
  0 siblings, 2 replies; 14+ messages in thread
From: Johannes Sixt @ 2008-07-16 11:35 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Git Mailing List

Nigel Magnay schrieb:
> On Wed, Jul 16, 2008 at 11:47 AM, Johannes Sixt <j.sixt@viscovery.net> wrote:
>> Nigel Magnay schrieb:
>>> For me, in some really high proportion of cases, I think I want 'git
>>> commit' to mean 'commit to any child repositories, any sibling
>>> repositories, and any parent repositories (updating the submodule sha1
>>> as appropriate). In other words, 'pretend like the whole thing is one
>>> big repo'.
>> And I think that this is the problem: If this way of committing your
>> changes is *required* in the *majority* of cases, then you are IMO outside
>> the intended use-case of submodules. You are better served by really
>> making this one big repo.
>>
> 
> Hm - then my contention is that the scope of submodules needs to be
> expanded (or something needs to be built on top).
> 
> One-big-repo doesn't fly - > 75% of the code volume (the 'other'
> modules) is shared between multiple projects. In SVN these are just
> svn:externals (which has its own imperfections).
> 
> I think it's a common usecase. You have 'shared' modules and
> 'project-specific' modules[*]. The 'shared' modules you hope don't
> change very much, but they are part of the overall project
> configuration - it's really nice that you can branch so easily in git,
> then get the module owner to merge those changes into the next release
> at their leisure. The superproject then represents the correct
> configuration of submodule trees to make a valid build.

Ah, is this your actual scenario? Just to make sure we are talking about
the same thing:

- You own superproject P.
- $Maintainer owns submodule S.
- You use S in P.
- You make changes to S that you would like $Maintainer to include in the
next release.
x You use in P your changes to S while $Maintainer has not yet released a
new version of S with your changes.
- Finally your changes arrive via the new release of S.

That *is* the intended use-case for submodules. But you have to play the
game by the rules:

- $Maintainer defines the official states of S.

- You must never commit an unofficial state of S in P.

The critical step in above list I marked with x:

- During the period where only *you* have the new changes to S, you must
*not* commit your submodule state to P. Instead, you write P in such a way
that it can work with both the old version of S and the upcoming release
that will have your changes[*]. This way you make sure that your consumers
of P always have a working version regardless of which version of S they use.

- After you have received the new release of S from $Maintainer, you
commit the new state of S in P. And if you are nice to your consumers of
P, then you *do not* remove the workaround from P just yet, so that you
don't force them to upgrade S. You will remove it later only if it becomes
a maintenance burden.

[*] If it is not possible to make P work with old and new versions, then
you have to work closely with the $Maintainer so that you never need
commit an unofficial state of S into P.
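
The rule "never commit an unofficial state of S in P" can be checked
mechanically. A rough sketch, assuming that "official" means "reachable
from origin/master in S" (the thread does not pin it down that
precisely); is_official_state is a hypothetical name, not part of git.

```sh
#!/bin/sh
# is_official_state <submodule-path> [<official-ref>]
# Run from the top of P. Succeeds if the submodule commit recorded in
# P's index is contained in <official-ref> of S.
is_official_state() {
    sub=$1
    ref=${2:-origin/master}   # ASSUMPTION: official states live on this ref
    # A submodule appears in the index as a gitlink entry with mode 160000:
    #   160000 <sha1> 0<TAB><path>
    recorded=$(git ls-files --stage -- "$sub" |
        awk '$1 == "160000" { print $2 }')
    [ -n "$recorded" ] || return 2   # not a submodule
    # merge-base prints the recorded commit itself exactly when that
    # commit is an ancestor of (or equal to) the official ref.
    base=$(cd "$sub" && git merge-base "$recorded" "$ref") || return 1
    [ "$base" = "$recorded" ]
}
```

Something like this could be run by hand before committing in P, for
example: is_official_state S || echo "S is at an unofficial commit"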

-- Hannes

* Re: git submodules and commit
  2008-07-16 11:35       ` Johannes Sixt
@ 2008-07-16 12:11         ` Petr Baudis
  2008-07-16 12:48         ` Nigel Magnay
  1 sibling, 0 replies; 14+ messages in thread
From: Petr Baudis @ 2008-07-16 12:11 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Nigel Magnay, Git Mailing List

On Wed, Jul 16, 2008 at 01:35:24PM +0200, Johannes Sixt wrote:
> Ah, is this your actual scenario? Just to make sure we are talking about
> the same thing:
> 
> - You own superproject P.
> - $Maintainer owns submodule S.
> - You use S in P.
> - You make changes to S that you would like $Maintainer to include in the
> next release.
> x You use in P your changes to S while $Maintainer has not yet released a
> new version of S with your changes.
> - Finally your changes arrive via the new release of S.
> 
> That *is* the intended use-case for submodules. But you have to play the
> game by the rules:
> 
> - $Maintainer defines the official states of S.
> 
> - You must never commit an unofficial state of S in P.

I think the issue here is that $Maintainer = him (or Maintainers(P) =
Maintainers(S), in general); the workflow you described still works, but
is overly complicated and that is the original complaint.

				Petr "Pasky" Baudis

* Re: git submodules and commit
  2008-07-16 11:35       ` Johannes Sixt
  2008-07-16 12:11         ` Petr Baudis
@ 2008-07-16 12:48         ` Nigel Magnay
  2008-07-16 13:38           ` Johannes Sixt
  1 sibling, 1 reply; 14+ messages in thread
From: Nigel Magnay @ 2008-07-16 12:48 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Git Mailing List

> Ah, is this your actual scenario? Just to make sure we are talking about
> the same thing:
>
> - You own superproject P.
> - $Maintainer owns submodule S.
> - You use S in P.
> - You make changes to S that you would like $Maintainer to include in the
> next release.
> x You use in P your changes to S while $Maintainer has not yet released a
> new version of S with your changes.
> - Finally your changes arrive via the new release of S.
>
> That *is* the intended use-case for submodules. But you have to play the
> game by the rules:
>

Yes, that is the situation - with the proviso that it's not always
clear in company environments who $Maintainer actually is. For
example, if the only changes occurring in S come from me, then chances
are come release cycle, $Maintainer == me.

P and S aren't distant projects, they're closely coupled.

> - $Maintainer defines the official states of S.
>
Yes - there is one branch ('master') which the changes eventually
should be merged to, and on which releases will be performed.

> - You must never commit an unofficial state of S in P.
>

If by that you mean that the only person to move the branch 'master'
is $Maintainer, then I agree.
If by that you mean that you can't commit at all to the S tree (and
the S submodule pointer) then I don't agree, and I think that's a
serious limitation in productivity.

> The critical step in above list I marked with x:
>
> - During the period where only *you* have the new changes to S, you must
> *not* commit your submodule state to P. Instead, you write P in such a way
> that it can work with both the old version of S and the upcoming release
> that will have your changes[*]. This way you make sure that your consumers
> of P always have a working version regardless of which version of S they use.
>

Just to be clear - there's more than just 'me' working on P - there's
a whole team of people working on it. And there's Q R S and T teams
also working on projects that also have S.

Changes that happen to S are, often, new features or bug fixes. We
can't just stop because there isn't an 'official' version of S yet
(and the official version might end up simply being a FF anyway), so
saying 'don't commit your submodule state to P' is unrealistic.

And that should be the big advantage of git. If we suddenly find we
need some additional functionality in S, we just add it to our
P-branch-of-S. The $Maintainer (if he exists) can review these
upcoming changes in the tree, and merge them to master as appropriate
(or work with the projects to iron out cross-branch
incompatibilities). The best example is that S is a "product", and (by
management decree), the only product changes that happen will occur
because of *projects* (like P). And we can do this (and it's
infinitely better than svn, where it's 'ooh, branches too hard, everyone
in [P-T] just commit to trunk'). But the UI is an ache.


> - After you have received the new release of S from $Maintainer, you
> commit the new state of S in P. And if you are nice to your consumers of
> P, then you *do not* remove the workaround from P just yet, so that you
> don't force them to upgrade S. You will remove it later only if it becomes
> a maintenance burden.
>
Maintaining backwards compatibility isn't an issue at all for us.

> [*] If it is not possible to make P work with old and new versions, then
> you have to work closely with the $Maintainer so that you never need
> commit an unofficial state of S into P.
>
> -- Hannes
>
>

* Re: git submodules and commit
  2008-07-16 12:48         ` Nigel Magnay
@ 2008-07-16 13:38           ` Johannes Sixt
  2008-07-16 14:03             ` Nigel Magnay
  0 siblings, 1 reply; 14+ messages in thread
From: Johannes Sixt @ 2008-07-16 13:38 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Git Mailing List

Nigel Magnay schrieb:
> P and S aren't distant projects, they're closely coupled.

And I'm saying that submodules are designed for *loosely* coupled projects.

It's no wonder that this tool is awkward to use in your workflow.

-- Hannes

* Re: git submodules and commit
  2008-07-16 13:38           ` Johannes Sixt
@ 2008-07-16 14:03             ` Nigel Magnay
  2008-07-16 14:17               ` Petr Baudis
  0 siblings, 1 reply; 14+ messages in thread
From: Nigel Magnay @ 2008-07-16 14:03 UTC (permalink / raw)
  To: Johannes Sixt; +Cc: Git Mailing List

On Wed, Jul 16, 2008 at 2:38 PM, Johannes Sixt <j.sixt@viscovery.net> wrote:
> Nigel Magnay schrieb:
>> P and S aren't distant projects, they're closely coupled.
>
> And I'm saying that submodules are designed for *loosely* coupled projects.
>
> It's no wonder that this tool is awkward to use in your workflow.
>

Ok in a sense. I don't think it's particularly clear from the
documentation that this is a limitation of submodules though.

Given that
- The only way in git to separate out re-usable modules is by the use
of submodules
and
- It's a pretty common usecase for these submodules to be interrelated
and
- Looking over the list archives, it seems this is quite a common complaint:

"I really like the git submodule implementation, I just don't like how
hard it is to work with"

 "The current behaviour strongly encourages me to avoid submodules
when I would otherwise like to use them, just to keep the rest of my
team members (who are not git experts) from going insane."

 "For my use case, I passionately dislike the fact that a submodule is
not updated automatically.  There's never a time when I don't want to
update the submodule.  The submodule is a very important piece of our
project and the super-project depends on it being at the right
version."

and
- All the technical capability is there, it's just the porcelain
that's causing the friction.
then
 would this not seem to be an area that could be improved? Even if it
were an optional mode of working?

* Re: git submodules and commit
  2008-07-16 14:03             ` Nigel Magnay
@ 2008-07-16 14:17               ` Petr Baudis
  2008-07-16 14:31                 ` Nigel Magnay
  0 siblings, 1 reply; 14+ messages in thread
From: Petr Baudis @ 2008-07-16 14:17 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Johannes Sixt, Git Mailing List

On Wed, Jul 16, 2008 at 03:03:41PM +0100, Nigel Magnay wrote:
> - All the technical capability is there, it's just the porcelain
> that's causing the friction.
> then
>  would this not seem to be an area that could be improved? Even if it
> were an optional mode of working?

So, were there already any patches posted to add such a functionality
that were rejected? If not, apparently no one cared _enough_, yet. ;-)
You may be the first!

I don't know if there are any _present_ "free developers" willing to
pick up this task now.  For many (most?) Git developers, submodules
simply aren't a priority.  For me, they actually currently are, but I
probably won't want to use them in your way either (even though I can
agree that your sentiments are valid), so I will personally invest my
time in doing other things than figuring out the precise semantics
these operations should have etc.

-- 
				Petr "Pasky" Baudis
GNU, n. An animal of South Africa, which in its domesticated state
resembles a horse, a buffalo and a stag. In its wild condition it is
something like a thunderbolt, an earthquake and a cyclone. -- A. Pierce

* Re: git submodules and commit
  2008-07-16 14:17               ` Petr Baudis
@ 2008-07-16 14:31                 ` Nigel Magnay
  0 siblings, 0 replies; 14+ messages in thread
From: Nigel Magnay @ 2008-07-16 14:31 UTC (permalink / raw)
  To: Petr Baudis; +Cc: Johannes Sixt, Git Mailing List

> On Wed, Jul 16, 2008 at 03:03:41PM +0100, Nigel Magnay wrote:
>> - All the technical capability is there, it's just the porcelain
>> that's causing the friction.
>> then
>>  would this not seem to be an area that could be improved? Even if it
>> were an optional mode of working?
>
> So, were there already any patches posted to add such a functionality
> that were rejected? If not, apparently no one cared _enough_, yet. ;-)
> You may be the first!
>
> I don't know if there are any _present_ "free developers" willing to
> pick up this task now.  For many (most?) Git developers, submodules
> simply aren't a priority.  For me, they actually currently are, but I
> probably won't want to use them in your way either (even though I can
> agree that your sentiments are valid), so I will personally invest my
> time in doing other things than figuring out the precise semantics
> these operations should have etc.
>

That's cool. I was guessing it might be the case (or alternatively
that someone might say 'yeah, but it's 25% of the way there'); my
original query was also one of an offer of help ;-) My guess though is
that the core-devs have much more connected neural pathways at
thinking about the problems around the edge cases to be able to give
warnings of  'there be dragons'!


Nigel

* Re: git submodules and commit
  2008-07-16 10:32 ` git submodules and commit Nigel Magnay
  2008-07-16 10:47   ` Johannes Sixt
@ 2008-07-16 15:43   ` Avery Pennarun
  2008-07-17  9:47     ` Nigel Magnay
  2008-07-18 16:11     ` Ping Yin
  1 sibling, 2 replies; 14+ messages in thread
From: Avery Pennarun @ 2008-07-16 15:43 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Git Mailing List

On 7/16/08, Nigel Magnay <nigel.magnay@gmail.com> wrote:
> I wonder if this is a fairly common pattern. We tend to have modules
>  as git repositories, and projects that tie together those git
>  repositories as submodules. [and submodules are necessary because they're
>  shared between multiple supermodules].

I have exactly the same problem as you, and have been working on
improving my own workflow so that someday I can offer patches that
might be generally applicable.

In the meantime, my solution is... some shell scripts checked in at
the top level of my project. :)

In one of my applications, I have a /wv submodule, which provides a
cross-platform build environment.  That environment respectively
contains a /wv/wvstreams submodule, which is a library that we use.

When I make a change to wvstreams that's needed for my application, I
need to check into wvstreams, then check that link into wv, then check
that link into the application.  Then, when I push, I have to make
sure to always push wvstreams first, then wv, then application, or
else other users can end up with "commit id xxxxxx not found" type
errors.

So basically, committing is always harmless, since I can do anything I
want in my own repo (and I want to be able to update wvstreams
*without* always updating wv, and so on).  The tricky part is pushing.
 Here's the script I wrote to make sure I don't screw up when pushing:


~/src/vx-lin $ cat push-git-modules
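#!/bin/sh -x
set -e
# The innermost submodule must be checked out before anything is pushed.
test -e wv/wvstreams/Makefile
# Push innermost first so no pushed commit refers to an unpushed one.
(cd wv/wvstreams && git push origin HEAD:master) &&
(cd wv && git push origin HEAD:master) &&
git push origin HEAD:master ||
{ echo "Failed!" >&2; exit 1; }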


Now, this script is pretty flawed.  Notably, it always pushes to the
'master' branch, which is stupid.  However, it works in our particular
workflow, because wvstreams isn't being modified by too many
developers and it's okay if we all commit to master.  This is also
aided by the fact that people are trained to push only after they've
made all the unit tests pass, etc.  And further, individual apps don't
have to update their wvstreams to the latest anyway unless they really
need the latest changes, which is a wonderful feature of git
submodules.
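
One possible generalisation of push-git-modules, sketched under stated
assumptions: submodules are discovered from .gitmodules rather than
named, each repository pushes the branch it currently has checked out
instead of always 'master', and the remote is still assumed to be
'origin'. The function names are made up for this example.

```sh
#!/bin/sh
# Print the submodule paths listed in ./.gitmodules, one per line.
submodule_paths() {
    [ -f .gitmodules ] || return 0
    git config --file .gitmodules --get-regexp '^submodule\..*\.path$' |
        sed 's/^[^ ]* //'
}

# Print directory $1, then every nested submodule directory, parents first.
list_modules() {
    echo "$1"
    (cd "$1" && submodule_paths) | while read -r p; do
        list_modules "$1/$p"
    done
}

# Push the checked-out branch of each repository, deepest first.
push_modules() {
    list_modules . |
        # Reverse the parents-first list so children are pushed first.
        awk '{ line[NR] = $0 } END { for (i = NR; i > 0; i--) print line[i] }' |
        (
            while read -r dir; do
                (
                    cd "$dir" || exit 1
                    # Refuse to guess a target branch on a detached HEAD.
                    branch=$(git symbolic-ref -q --short HEAD) || {
                        echo "$dir: detached HEAD, not pushing" >&2
                        exit 1
                    }
                    git push origin "HEAD:$branch"   # ASSUMPTION: remote is origin
                ) </dev/null || exit 1
            done
        )
}
```

Run push_modules from the top of the application. It stops at the first
repository that fails, and it returns non-zero in that case. Submodules
left on a detached HEAD by 'git submodule update' are refused rather
than guessed at, which is the hard part this sketch does not solve.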

Now, sometimes the above push script will fail.  In my experience,
this is only when someone else has pushed in something before you,
which means a fast-forward is not possible on at least one of the
repos.  When that happens, you have to pull first, using this script:

~/src/vx-lin $ cat newest-git-modules
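#!/bin/sh -x
set -e
# The innermost submodule must be checked out before anything is pulled.
test -e wv/wvstreams/Makefile
# Pull outermost first; stop at the first failure (e.g. a merge conflict).
git pull origin master &&
(cd wv && git pull origin master) &&
(cd wv/wvstreams && git pull origin master) ||
{ echo "Failed!" >&2; exit 1; }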

This pulls in the latest version of application, wv, and wvstreams, in
that order, and stops in case of any merge conflicts so that you can
resolve them by hand.  It's safe to run the above script more than
once in case you're not sure if it's done or not.

After pulling the new modules, you may need to make new commits to
update to the latest submodule commits - if that's indeed what you
want.  And then you can run push-git-modules, and be reasonably
assured that it will work (unless someone made another push while you
were fixing conflicts).

Finally, I have another script that retrieves the *currently linked*
version of the git modules.  I wish git-checkout would do this
automatically, but it doesn't, for apparently-difficult-to-resolve
safety reasons.  Anyway, note that this script uses the existence of
submodule/Makefile as "proof" that the submodule was checked out
correctly.


~/src/vx-lin $ cat get-git-modules
#!/bin/sh -x
set -e
git submodule init
git submodule update
test -e wv/Makefile
(cd wv && git submodule init && git submodule update)
test -e wv/wvstreams/Makefile


>  I guess it probably gets sticky when there are merge conflicts. Is
>  anyone working on this kind of thing; I might be able to give some
>  time to help work on it?

So as you can see, my scripts are crappy.  However, they have already
drastically reduced the number of mistakes made by developers in my
group (especially commits lost due to 'git submodule update' at the
wrong time, and pushes of the supermodule before the submodule).

If you want to work with me on my new submodule workflow (and I'd
certainly appreciate it!) then I'd suggest one or more of the
following starting points:

- Take the recursive push, pull, and update operations described
above, make them general (ie. not referring to my submodules by name
:)), and add them as commands in the real git-submodule script.  The
trickiest part here will be figuring out which remote branch to
push/pull.

- Perhaps add a "recursive commit" operation that recursively
auto-commits submodule refs, for use after running the
newest-git-modules script.  The commit message could be auto-generated
using something like "git-whatchanged" on the submodule.

- See what can be done about making git-checkout automatically
git-submodule-update *if and only if* the currently checked-out commit
of the submodule exactly matches the one that was checked out last
time, *and* the desired commit is already available in the submodule
repo (which is not necessarily the case, if you haven't fetched it
yet).  That is, as with any file in git, if it hasn't changed from the
one in the repo, you know you won't lose any information if you just
auto-replace it with the new version.

- Fix git-submodule-update to not just switch submodule branches if
you've made checkins in that submodule.  Right now, commits to a
submodule by default don't go to any branch, so if you subsequently
run git-submodule-update, your commits are lost (except for the
reflog).  This is very un-git-like in general, and
git-submodule-update should be much more polite.
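
The condition in the third bullet above can be written down as a small
predicate. This is only a sketch with a hypothetical name,
can_auto_update; it checks the two conditions listed there and nothing
else (it does not look for uncommitted edits in the submodule, for
instance).

```sh
#!/bin/sh
# can_auto_update <submodule-path> <previously-recorded-sha1> <wanted-sha1>
# Succeeds only if moving the submodule to <wanted-sha1> cannot lose work.
can_auto_update() {
    sub=$1
    old=$2
    new=$3
    (
        cd "$sub" || exit 1
        # 1. The submodule is still exactly at the commit that was checked
        #    out last time, i.e. nobody has committed on top of it.
        [ "$(git rev-parse HEAD)" = "$old" ] || exit 1
        # 2. The wanted commit is already present in the submodule repo.
        git cat-file -e "$new^{commit}"
    )
}
```

A caller would run 'git submodule update' for that path only when the
predicate succeeds, and otherwise leave the submodule alone and say so.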

Note that git-submodule is only about 800 lines of shell.  It's
remarkably straightforward to make it do whatever you want.  The hard
part is figuring out what you want, and making sure you don't stomp on
*other* people's workflows while you're there.

Also note that even if you don't contribute any of the above, I'm
planning to someday make time to do it myself :)  But don't hold your
breath.  I've been busy.

Have fun,

Avery

* Re: git submodules and commit
  2008-07-16 15:43   ` Avery Pennarun
@ 2008-07-17  9:47     ` Nigel Magnay
  2008-07-17 15:12       ` Avery Pennarun
  2008-07-18 16:11     ` Ping Yin
  1 sibling, 1 reply; 14+ messages in thread
From: Nigel Magnay @ 2008-07-17  9:47 UTC (permalink / raw)
  To: Avery Pennarun; +Cc: Git Mailing List

On Wed, Jul 16, 2008 at 4:43 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
> On 7/16/08, Nigel Magnay <nigel.magnay@gmail.com> wrote:
>> I wonder if this is a fairly common pattern. We tend to have modules
>>  as git repositories, and projects that tie together those git
>>  repositories as submodules. [and submodules are necessary because they're
>>  shared between multiple supermodules].
>
> I have exactly the same problem as you, and have been working on
> improving my own workflow so that someday I can offer patches that
> might be generally applicable.
>
> In the meantime, my solution is... some shell scripts checked in at
> the top level of my project. :)
>
> In one of my applications, I have a /wv submodule, which provides a
> cross-platform build environment.  That environment respectively
> contains a /wv/wvstreams submodule, which is a library that we use.
>
> When I make a change to wvstreams that's needed for my application, I
> need to check into wvstreams, then check that link into wv, then check
> that link into the application.  Then, when I push, I have to make
> sure to always push wvstreams first, then wv, then application, or
> else other users can end up with "commit id xxxxxx not found" type
> errors.
>
> So basically, committing is always harmless, since I can do anything I
> want in my own repo (and I want to be able to update wvstreams
> *without* always updating wv, and so on).  The tricky part is pushing.
>  Here's the script I wrote to make sure I don't screw up when pushing:
>
>
> ~/src/vx-lin $ cat push-git-modules
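> #!/bin/sh -x
> set -e
> # The innermost submodule must be checked out before anything is pushed.
> test -e wv/wvstreams/Makefile
> # Push innermost first so no pushed commit refers to an unpushed one.
> (cd wv/wvstreams && git push origin HEAD:master) &&
> (cd wv && git push origin HEAD:master) &&
> git push origin HEAD:master ||
> { echo "Failed!" >&2; exit 1; }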
>
>
> Now, this script is pretty flawed.  Notably, it always pushes to the
> 'master' branch, which is stupid.  However, it works in our particular
> workflow, because wvstreams isn't being modified by too many
> developers and it's okay if we all commit to master.  This is also
> aided by the fact that people are trained to push only after they've
> made all the unit tests pass, etc.  And further, individual apps don't
> have to update their wvstreams to the latest anyway unless they really
> need the latest changes, which is a wonderful feature of git
> submodules.
>

Yes - I use something rather similar on my desktop. The unfortunate
thing is that I know how submodules work, and am happy with the
scripts. My users are sometimes the 'git gui' types - not as
technically literate, and likely on Windows.

> Now, sometimes the above push script will fail.  In my experience,
> this is only when someone else has pushed in something before you,
> which means a fast-forward is not possible on at least one of the
> repos.  When that happens, you have to pull first, using this script:
>
> ~/src/vx-lin $ cat newest-git-modules
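> #!/bin/sh -x
> set -e
> # The innermost submodule must be checked out before anything is pulled.
> test -e wv/wvstreams/Makefile
> # Pull outermost first; stop at the first failure (e.g. a merge conflict).
> git pull origin master &&
> (cd wv && git pull origin master) &&
> (cd wv/wvstreams && git pull origin master) ||
> { echo "Failed!" >&2; exit 1; }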
>
> This pulls in the latest version of application, wv, and wvstreams, in
> that order, and stops in case of any merge conflicts so that you can
> resolve them by hand.  It's safe to run the above script more than
> once in case you're not sure if it's done or not.
>
> After pulling the new modules, you may need to make new commits to
> update to the latest submodule commits - if that's indeed what you
> want.  And then you can run push-git-modules, and be reasonably
> assured that it will work (unless someone made another push while you
> were fixing conflicts).
>

Yeah - this happens a lot. If someone else commits to the
super-project before you, it's always a conflict. What's annoying is
there's no way around it (though resolution is easy - force to
current), and this is a big part of what confuses my users. They say
'but I already resolved the merges in the submodule itself'. I'm not
sure there's an easy way around it though - and this is part of my
worry that there's hidden complexity with trying to make it 'look like
1 big repo'.

> Finally, I have another script that retrieves the *currently linked*
> version of the git modules.  I wish git-checkout would do this
> automatically, but it doesn't, for apparently-difficult-to-resolve
> safety reasons.  Anyway, note that this script uses the existence of
> submodule/Makefile as "proof" that the submodule was checked out
> correctly.
>
>
> ~/src/vx-lin $ cat get-git-modules
> #!/bin/sh -x
> set -e
> git submodule init
> git submodule update
> test -e wv/Makefile
> (cd wv && git submodule init && git submodule update)
> test -e wv/wvstreams/Makefile
>
>
>>  I guess it probably gets sticky when there are merge conflicts. Is
>>  anyone working on this kind of thing; I might be able to give some
>>  time to help work on it?
>
> So as you can see, my scripts are crappy.  However, they have already
> drastically reduced the number of mistakes made by developers in my
> group (especially commits lost due to 'git submodule update' at the
> wrong time, and pushes of the supermodule before the submodule).
>

Yeah. I have an additional usecase, which is around pulling from
another user. If they've made changes in their tree(s) that they want
to get reviewed, normally I could do something like

git fetch ssh://joebloggs.computer/blah +refs/heads/*:refs/remotes/joebloggs/*

But if they've made cross-module changes, I'm SOL: their super-project
will reference commits that aren't in the repo mentioned in
.gitmodules (they exist only in joebloggs's tree), so doing 'git
submodule update' doesn't help. I have to go into each submodule and
fetch explicitly. It feels weirdly centralised for this otherwise
distributed tool.
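
For illustration, the manual per-submodule fetch could be scripted
roughly as below. This is only a sketch under assumptions: the helper
names peer_refspec and fetch_peer_submodules are made up, and it
assumes the other user's submodules are reachable at the same relative
paths beneath their super-project URL, which won't hold for every
setup. It only handles the top level of submodules.

```shell
#!/bin/sh
# Hypothetical helpers, not part of git.

# peer_refspec <name>: print the refspec that maps all of a peer's
# branches to refs/remotes/<name>/*.
peer_refspec() {
    printf '+refs/heads/*:refs/remotes/%s/*\n' "$1"
}

# fetch_peer_submodules <name> <base-url>: run from the top of the
# super-project. Assumes (!) each submodule lives at <base-url>/<path>.
fetch_peer_submodules() {
    peer=$1
    base=$2
    # List the paths of the checked-out submodules, one per line.
    git submodule --quiet foreach 'echo "$path"' |
    while read -r sub; do
        # </dev/null stops ssh from swallowing the rest of the list.
        (cd "$sub" &&
         git fetch "$base/$sub" "$(peer_refspec "$peer")" </dev/null)
    done
}
```

Usage would be something like 'fetch_peer_submodules joebloggs
ssh://joebloggs.computer/blah' after fetching the super-project itself.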

> If you want to work with me on my new submodule workflow (and I'd
> certainly appreciate it!) then I'd suggest one or more of the
> following starting points:
>
> - Take the recursive push, pull, and update operations described
> above, make them general (ie. not referring to my submodules by name
> :)), and add them as commands in the real git-submodule script.  The
> trickiest part here will be figuring out which remote branch to
> push/pull.
>

What's bugging me is I'm not sure that it's the right place. It seems
(to me) that having the 'git submodule' script be the only place that
knows about submodules isn't right. What users want is for 'git fetch
<blah>' to do the lot: for the most part, it ought to do the submodule
init, update and clever stuff automatically. If 'git fetch' is
porcelain, then the porcelain needs to call the git-submodule stuff.

But - perhaps it's best to approach it as scripts for now :)

> - Perhaps add a "recursive commit" operation that recursively
> auto-commits submodule refs, for use after running the
> newest-git-modules script.  The commit message could be auto-generated
> using something like "git-whatchanged" on the submodule.
>
Hm - I'd be happy with the same commit message in all modules. What I
want is to be able to do (from the top) 'git commit -a' or the same
with the GUI, and see all the files to be committed regardless of
whether they're in a submodule or not.

I'm guessing you probably need to build a tree of submodules, and
commit from the tips backwards towards the top level superproject.

This is what the users want - something that mirrors 'svn ci' at the
top level - "Please Check All My stuff in".

> - See what can be done about making git-checkout automatically
> git-submodule-update *if and only if* the currently checked-out commit
> of the submodule exactly matches the one that was checked out last
> time, *and* the desired commit is already available in the submodule
> repo (which is not necessarily the case, if you haven't fetched it
> yet).  That is, as with any file in git, if it hasn't changed from the
> one in the repo, you know you won't lose any information if you just
> auto-replace it with the new version.
>
> - Fix git-submodule-update to not just switch submodule branches if
> you've made checkins in that submodule.  Right now, commits to a
> submodule by default don't go to any branch, so if you subsequently
> run git-submodule-update, your commits are lost (except for the
> reflog).  This is very un-git-like in general, and
> git-submodule-update should be much more polite.
We always move back onto a branch immediately after submodule update,
which is another thing to forget!
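
A guard along these lines might help with the forgetting. It's a
sketch, not anything git-submodule provides: the function names
on_branch and check_submodule_branches are hypothetical, and it relies
on 'git -C', which needs a reasonably recent Git.

```shell
#!/bin/sh
# Hypothetical helpers, not part of git.

# on_branch <dir>: succeed if HEAD in <dir> points at a branch,
# fail if it is detached.
on_branch() {
    git -C "$1" symbolic-ref -q HEAD >/dev/null
}

# check_submodule_branches: run from the top of the super-project.
# Warns about each checked-out submodule whose HEAD is detached, so
# you can switch it to a branch before committing there.
check_submodule_branches() {
    git submodule --quiet foreach 'echo "$path"' |
    while read -r sub; do
        on_branch "$sub" ||
            echo "warning: $sub is on a detached HEAD" >&2
    done
}
```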

>
> Note that git-submodule is only about 800 lines of shell.  It's
> remarkably straightforward to make it do whatever you want.  The hard
> part is figuring out what you want, and making sure you don't stomp on
> *other* people's workflows while you're there.
>
Totally.

> Also note that even if you don't contribute any of the above, I'm
> planning to someday make time to do it myself :)  But don't hold your
> breath.  I've been busy.
>
Ditto.

> Have fun,
>
> Avery
>

* Re: git submodules and commit
  2008-07-17  9:47     ` Nigel Magnay
@ 2008-07-17 15:12       ` Avery Pennarun
  0 siblings, 0 replies; 14+ messages in thread
From: Avery Pennarun @ 2008-07-17 15:12 UTC (permalink / raw)
  To: Nigel Magnay; +Cc: Git Mailing List

On 7/17/08, Nigel Magnay <nigel.magnay@gmail.com> wrote:
> Yeah - this happens a lot. If someone else commits to the
>  super-project before you, it's always a conflict. What's annoying is
>  there's no way around it (though resolution is easy - force to current
>  - but this is a big bit of what confuses my users. They say 'but I
>  already resolved the merges in the submodule itself'. I'm not sure
>  there's an easy way around it though - and this is part of my worry
>  that there's hidden complexity with trying to make it 'look like 1 big
>  repo').

This might not be as hard as it sounds.  We probably just need to
teach the supermodule how to merge gitlinks safely.  So basically, if
I moved the gitlink from A to B, and he moved it from A to C, then it
needs to check whether a fast forward merge already exists for the
submodule to combine B and C.  This is easier than it sounds, because
if I *already* ran my newest-git-modules script in the inner module,
then I've already manually resolved the merge in question, so that B
*does* actually contain C.

Right now, such a thing results in a conflict.  It isn't really a
conflict though, it's a fast forward, and the supermodule's merge
should ideally just notice that and run with it.

Sadly I know very little about the merge code.  But I would be happy
to help you test a patch that implemented this :)

A slightly more advanced version of the same would automatically walk
into the submodule and ask it to merge B and C.  I suspect that is way
more complicated than it sounds at first glance, though (particularly
if the new B or C gitlink doesn't have A as a parent at all, which
couldn't happen in a unified git repo, but is perfectly allowable with
submodules).

With anything like this, there's always the question of what happens
if you haven't done a "fetch" in the submodule yet; I think reverting
to the current behaviour is fine in that case, because I can make
newest-git-modules always fetch before trying anything anyway.
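
To make the fast-forward check concrete, here is a rough sketch done
from the outside in shell rather than in the merge code. The function
name resolve_gitlink is hypothetical, and it assumes a Git new enough
to have 'git -C' and 'git merge-base --is-ancestor'. If either commit
hasn't been fetched into the submodule yet, it simply fails, which
corresponds to falling back to the current conflict behaviour.

```shell
#!/bin/sh
# Hypothetical helper, not part of git.
# resolve_gitlink <submodule-dir> <ours> <theirs>
# Prints the commit the gitlink should point at when one side already
# contains the other; returns 1 when it is a real conflict (or when a
# commit is missing from the submodule repo).
resolve_gitlink() {
    sub=$1
    ours=$2
    theirs=$3
    if git -C "$sub" merge-base --is-ancestor "$theirs" "$ours" 2>/dev/null
    then
        # Ours already contains theirs (e.g. B was merged with C).
        echo "$ours"
    elif git -C "$sub" merge-base --is-ancestor "$ours" "$theirs" 2>/dev/null
    then
        # Theirs is strictly ahead of ours: plain fast forward.
        echo "$theirs"
    else
        return 1
    fi
}
```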

> Yeah. I have an additional usecase, which is around pulling from
>  another user. If they've made changes in their tree(s) that they want
>  to get reviewed, normally I could do something like
>
>  git fetch ssh://joebloggs.computer/blah +refs/heads/*:refs/remotes/joebloggs/*
>
>  But if they've made cross-module changes, I'm SOL, as fetching their
>  super-project will have references to commits that aren't in the repo
>  mentioned in .gitmodules (only in joebloggs's tree) - so doing git
>  submodule update doesn't help. I have to go into each submodule and
>  explicitly fetch. It feels weirdly centralised for this otherwise
>  distributed tool.

One slightly non-obvious option here is to actually use the *same*
repo for all your supermodules and submodules, then use "." as the
repo path in your .gitmodules.  The original clone is huge that way,
but it makes it obvious how to get any objects that you're missing.

Then you could construct your submodules using --reference pointing at
the supermodule.  Thus, when you fetch your user's supermodule, you'll
also get all the objects it references.

Note that I've only tried this technique in a basic way.  I think it's
the one for me, but I haven't experimented with it enough to know any
pitfalls.  When I've brought it up on the list, it's been shot down
because it wouldn't work for gigantic mega-repositories like KDE where
the whole point of submodules is to *not* download all the modules
every time.  It works for me, though, because my software doesn't even
*build* unless I have all the modules.

(And before anyone asks, yes, it still makes sense to use submodules
because some of the modules are shared with other projects.)

> What's bugging me is I'm not sure that it's the right place. It seems
>  (to me) that having the only place that knows about submodules being
>  the 'git submodules' script isn't right. What users want is 'git fetch
>  <blah>' to do the lot - that, for the most, it ought to do the
>  submodule init, update and clever stuff automatically. That if 'git
>  fetch' is porcelain, then the porcelain needs to call the
>  git-submodule stuff.

There is some architectural elegance to the fact that the gitlink
stuff is almost completely abstract (just a number, really) in the
core of git, and is only made "real" by running git-submodule, which
actually extracts files and makes .git dirs and fetches submodules and
whatnot.

However, it's architectural elegance, not UI elegance.  As a user, I
mostly don't want to have to care whether a particular directory is a
"submodule" or not, so the extra push and commit and fetch steps
become tedious.  From the point of view of UI, I agree with you.

Perhaps this is a plumbing vs. porcelain issue.  I don't think
git-submodule has made an attempt to separate the two, since it seems
to be porcelain, but there's no "submodule plumbing" underneath
(AFAICS) that things like git-fetch and git-commit and git-push can
plug into.

>  But - perhaps it's best to approach it as scripts for now :)

I suspect so :)

> Hm - I'd be happy with the same commit message in all modules. What I
>  want is to be able to do (from the top) 'git commit -a' or the same
>  with the GUI, and see all the files to be committed regardless of
>  whether they're in a submodule or not.

That actually wouldn't work very well for me.  I do need the commits
separated, because that's why I'm using submodules in the first place
instead of the "subtree" merge strategy.

Basically, I'm still planning on contributing patches to my class
library upstream, and the patches need to talk about how they affect
the *library*, not what I changed in my application.  So I *would*
want to write separate commit messages in all cases.  I can see how
other people might not, though.

>  This is what the users want - something that mirrors 'svn ci' at the
>  top level - "Please Check All My stuff in".

Note that submodules are more like svn:externals, which also require
you to commit each module separately.  One big difference there is
that you don't need to commit to the supermodule each time you commit
to the submodule, but that's only because svn:externals by default
links to a branch, not to a particular revision.  The
revision-specific linking is very worthwhile, I think, so requiring an
extra commit is mostly okay here.

Perhaps automating the extra commit would be nice in some cases, but
for me, for example, I tend to combine my "update to newest version of
submodule" commit with some changes to the supermodule, since the
reason I updated was to implement this new feature.

Have fun,

Avery

* Re: git submodules and commit
  2008-07-16 15:43   ` Avery Pennarun
  2008-07-17  9:47     ` Nigel Magnay
@ 2008-07-18 16:11     ` Ping Yin
  1 sibling, 0 replies; 14+ messages in thread
From: Ping Yin @ 2008-07-18 16:11 UTC (permalink / raw)
  To: Avery Pennarun; +Cc: Nigel Magnay, Git Mailing List

On Wed, Jul 16, 2008 at 11:43 PM, Avery Pennarun <apenwarr@gmail.com> wrote:
> On 7/16/08, Nigel Magnay <nigel.magnay@gmail.com> wrote:

> If you want to work with me on my new submodule workflow (and I'd
> certainly appreciate it!) then I'd suggest one or more of the
> following starting points:
>
> - Take the recursive push, pull, and update operations described
> above, make them general (ie. not referring to my submodules by name
> :)), and add them as commands in the real git-submodule script.  The
> trickiest part here will be figuring out which remote branch to
> push/pull.

See http://article.gmane.org/gmane.comp.version-control.git/69834
([PATCH] Added recurse command to git submodule)
Or search for "submodule recursive" on gmane.

Recursive pull, diff, and status for submodules have been implemented
by Imran M Yousuf. IIRC, with this patch you can walk through the
submodule hierarchy and execute any command.




-- 
Ping Yin

Thread overview: 14+ messages
     [not found] <320075ff0807160331j30e8f832m4de3e3bbe9c26801@mail.gmail.com>
2008-07-16 10:32 ` git submodules and commit Nigel Magnay
2008-07-16 10:47   ` Johannes Sixt
2008-07-16 11:02     ` Nigel Magnay
2008-07-16 11:35       ` Johannes Sixt
2008-07-16 12:11         ` Petr Baudis
2008-07-16 12:48         ` Nigel Magnay
2008-07-16 13:38           ` Johannes Sixt
2008-07-16 14:03             ` Nigel Magnay
2008-07-16 14:17               ` Petr Baudis
2008-07-16 14:31                 ` Nigel Magnay
2008-07-16 15:43   ` Avery Pennarun
2008-07-17  9:47     ` Nigel Magnay
2008-07-17 15:12       ` Avery Pennarun
2008-07-18 16:11     ` Ping Yin
