* Managing submodules on large multi-user projects
@ 2009-05-29 18:41 R. Tyler Ballance
  2009-05-29 19:53 ` Avery Pennarun
                   ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: R. Tyler Ballance @ 2009-05-29 18:41 UTC
  To: git

As some of you may recall from my last swath of emails to the list
regarding memory usage and repository size, we have quite a large
repository. About a month ago, I added a submodule to the primary repo
in an effort to start to segment where possible, particularly around
third party modules.

I've noticed that keeping submodules updated is an absolute pain,
particularly with a large multiuser setup with *lots* of branches. 


What will tend to happen is that the submodule reference will be updated
in the master branch (we use a centralized model) and then committed 
(imagine the commit reference was advanced from A->B).

Other developers with other branches will then periodically merge master 
into their project/topic branches but will either neglect to run 
`git submodule update` or our bootstrap script (which also executes the
submodule update command). At this point they'll have outstanding
changes of their own, and the submodule will be marked as "modified" as
well. Usually what will then happen is they'll `git commit -a` without
thinking and the submodule's reference will be changed (typically from
B->A, undoing the previous change).


Are there any saner ways of managing this? I've been trying to get the
`git submodule update` command to run with as many hooks as possible
(pre-commit, post-update) to make sure that developers aren't
inadvertently breaking things, but nothing seems to ensure that
*everybody* is up to date and that *nobody* inadvertently
commits changes to the submodule.
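
One possible guard against the accidental B->A revert is a pre-commit hook
that refuses any commit touching a submodule entry unless the developer
explicitly opts in. The following is only a sketch in Python, under stated
assumptions: `ALLOW_SUBMODULE_CHANGE` is a made-up environment variable, not
a git feature. It relies on `git diff --cached --raw` reporting submodule
entries (gitlinks) with mode 160000.

```python
#!/usr/bin/env python3
"""Sketch of a pre-commit hook that blocks unintended submodule pointer changes."""
import os
import subprocess
import sys

GITLINK_MODE = "160000"  # file mode git records for a submodule entry


def staged_gitlink_changes(raw_diff):
    """Return paths of submodule entries listed in `git diff --cached --raw` output."""
    paths = []
    for line in raw_diff.splitlines():
        if not line.startswith(":"):
            continue
        # Raw format: ":<src mode> <dst mode> <src sha> <dst sha> <status>\t<path>"
        meta, _, path = line.partition("\t")
        fields = meta[1:].split()
        if len(fields) < 5:
            continue
        if GITLINK_MODE in (fields[0], fields[1]):
            paths.append(path)
    return paths


def main():
    raw = subprocess.run(
        ["git", "diff", "--cached", "--raw"],
        check=True, capture_output=True, text=True,
    ).stdout
    changed = staged_gitlink_changes(raw)
    # ALLOW_SUBMODULE_CHANGE is a hypothetical opt-in switch, not part of git.
    if changed and os.environ.get("ALLOW_SUBMODULE_CHANGE") != "1":
        sys.stderr.write(
            "Refusing to commit submodule changes: %s\n"
            "Run `git submodule update`, or set ALLOW_SUBMODULE_CHANGE=1 "
            "if the change is intended.\n" % ", ".join(changed)
        )
        return 1
    return 0


# Only act when invoked as the installed hook file (.git/hooks/pre-commit).
if __name__ == "__main__" and os.path.basename(sys.argv[0]) == "pre-commit":
    sys.exit(main())
```

Saved as `.git/hooks/pre-commit` and made executable, this should reject a
careless `git commit -a` that includes a submodule change; a deliberate bump
would be committed with `ALLOW_SUBMODULE_CHANGE=1` set in the environment.
Hooks are not copied by `git clone`, so each developer (or a bootstrap
script) would still have to install it.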


Feeling trapped in a box of PEBKAC.

Cheers
-- 
-R. Tyler Ballance
Slide, Inc.

* Re: Managing submodules on large multi-user projects
  2009-05-29 18:41 Managing submodules on large multi-user projects R. Tyler Ballance
@ 2009-05-29 19:53 ` Avery Pennarun
  2009-05-29 20:09   ` R. Tyler Ballance
  2009-05-29 22:58 ` Felipe Contreras
  2009-05-31 13:39 ` Alex Riesen
  2 siblings, 1 reply; 7+ messages in thread
From: Avery Pennarun @ 2009-05-29 19:53 UTC
  To: R. Tyler Ballance; +Cc: git

On Fri, May 29, 2009 at 2:41 PM, R. Tyler Ballance <tyler@slide.com> wrote:
> As some of you may recall from my last swath of emails to the list
> regarding memory usage and repository size, we have quite a large
> repository. About a month ago, I added a submodule to the primary repo
> in an effort to start to segment where possible, particularly around
> third party modules.
>
> I've noticed that keeping submodules updated is an absolute pain,
> particularly with a large multiuser setup with *lots* of branches.

Just so I understand, is the reason you're splitting into submodules
*just* to avoid memory usage / repository size issues?  I can sort of
understand the memory usage issues - sort of - but how does it reduce
repository size if you need to check out all the submodule
repositories along with the main project anyway?

Just looking to clarify for myself.  (I'm continuing my work on
git-subtree, which is getting more and more positive feedback.  It
solves all the *other* problems that you listed vs. submodules, but it
certainly doesn't resolve any repository size issues.)

Have fun,

Avery

* Re: Managing submodules on large multi-user projects
  2009-05-29 19:53 ` Avery Pennarun
@ 2009-05-29 20:09   ` R. Tyler Ballance
  2009-05-29 20:18     ` Avery Pennarun
  0 siblings, 1 reply; 7+ messages in thread
From: R. Tyler Ballance @ 2009-05-29 20:09 UTC
  To: Avery Pennarun; +Cc: git

On Fri, May 29, 2009 at 03:53:26PM -0400, Avery Pennarun wrote:
> On Fri, May 29, 2009 at 2:41 PM, R. Tyler Ballance <tyler@slide.com> wrote:
> > As some of you may recall from my last swath of emails to the list
> > regarding memory usage and repository size, we have quite a large
> > repository. About a month ago, I added a submodule to the primary repo
> > in an effort to start to segment where possible, particularly around
> > third party modules.
> >
> > I've noticed that keeping submodules updated is an absolute pain,
> > particularly with a large multiuser setup with *lots* of branches.
> 
> Just so I understand, is the reason you're splitting into submodules
> *just* to avoid memory usage / repository size issues?  I can sort of
> understand the memory usage issues - sort of - but how does it reduce
> repository size if you need to check out all the submodule
> repositories along with the main project anyway?

I've got an eye on submodules as a way of avoiding the need to require a
whole tree clone to just work on parts of it, but that's not really
relevant to my query, just explaining our environment and setting the stage ;)

We're using submodules right now similar to how we used svn externals in
the past (except better, clearly), to incorporate outside components
(like open source projects) that our stack depends on.

> Just looking to clarify for myself.  (I'm continuing my work on
> git-subtree, which is getting more and more positive feedback.  It
> solves all the *other* problems that you listed vs. submodules, but it
> certainly doesn't resolve any repository size issues.)

Good to know, we're still on Git 1.6.1, are there any benefits or
additional features in more recent releases of Git that help alleviate
the submodules issues I outlined at the top of the thread?


Cheers
-- 
-R. Tyler Ballance
Slide, Inc.

* Re: Managing submodules on large multi-user projects
  2009-05-29 20:09   ` R. Tyler Ballance
@ 2009-05-29 20:18     ` Avery Pennarun
  0 siblings, 0 replies; 7+ messages in thread
From: Avery Pennarun @ 2009-05-29 20:18 UTC
  To: R. Tyler Ballance; +Cc: git

On Fri, May 29, 2009 at 4:09 PM, R. Tyler Ballance <tyler@slide.com> wrote:
> On Fri, May 29, 2009 at 03:53:26PM -0400, Avery Pennarun wrote:
>> Just so I understand, is the reason you're splitting into submodules
>> *just* to avoid memory usage / repository size issues?  I can sort of
>> understand the memory usage issues - sort of - but how does it reduce
>> repository size if you need to check out all the submodule
>> repositories along with the main project anyway?
>
> I've got an eye on submodules as a way of avoiding the need to require a
> whole tree clone to just work on parts of it, but that's not really
> relevant to my query, just explaining our environment and setting the stage ;)
>
> We're using submodules right now similar to how we used svn externals in
> the past (except better, clearly), to incorporate outside components
> (like open source projects) that our stack depends on.

That makes sense.

>> Just looking to clarify for myself.  (I'm continuing my work on
>> git-subtree, which is getting more and more positive feedback.  It
>> solves all the *other* problems that you listed vs. submodules, but it
>> certainly doesn't resolve any repository size issues.)
>
> Good to know, we're still on Git 1.6.1, are there any benefits or
> additional features in more recent releases of Git that help alleviate
> the submodules issues I outlined at the top of the thread?

git-subtree is my own little project that hasn't been accepted into
git proper yet.  It does work with git 1.6.1 (and also git 1.5.4, at
least) just by dropping the script into your PATH.  Google "git
subtree" for more.

AFAIK, the particular issues you outlined with submodules continue to
exist in latest git.  They are certainly fixable (they aren't
*fundamental* problems), but nobody has fixed them yet.  I looked at
the issues for a long time and failed miserably to find a good general
solution, but that's just me.

Have fun,

Avery

* Re: Managing submodules on large multi-user projects
  2009-05-29 18:41 Managing submodules on large multi-user projects R. Tyler Ballance
  2009-05-29 19:53 ` Avery Pennarun
@ 2009-05-29 22:58 ` Felipe Contreras
  2009-06-23 22:58   ` R. Tyler Ballance
  2009-05-31 13:39 ` Alex Riesen
  2 siblings, 1 reply; 7+ messages in thread
From: Felipe Contreras @ 2009-05-29 22:58 UTC
  To: R. Tyler Ballance; +Cc: git

On Fri, May 29, 2009 at 9:41 PM, R. Tyler Ballance <tyler@slide.com> wrote:
> As some of you may recall from my last swath of emails to the list
> regarding memory usage and repository size, we have quite a large
> repository. About a month ago, I added a submodule to the primary repo
> in an effort to start to segment where possible, particularly around
> third party modules.
>
> I've noticed that keeping submodules updated is an absolute pain,
> particularly with a large multiuser setup with *lots* of branches.
>
>
> What will tend to happen is that the submodule reference will be updated
> in the master branch (we use a centralized model) and then committed
> (imagine the commit reference was advanced from A->B).
>
> Other developers with other branches will then periodically merge master
> into their project/topic branches but will either neglect to run
> `git submodule update` or our bootstrap script (which also executes the
> submodule update command). At this point they'll have outstanding
> changes of their own, and the submodule will be marked as "modified" as
> well. Usually what will then happen is they'll `git commit -a` without
> thinking and the submodule's reference will be changed (typically from
> B->A, undoing the previous change).
>
>
> Are there any saner ways of managing this? I've been trying to get the
> `git submodule update` command to run with as many hooks as possible
> (pre-commit, post-update) to make sure that developers aren't
> inadvertently breaking things, but nothing seems to ensure that
> *everybody* is up to date and that *nobody* inadvertently
> commits changes to the submodule.

Have you tried repo?
http://source.android.com/download/using-repo

-- 
Felipe Contreras

* Re: Managing submodules on large multi-user projects
  2009-05-29 18:41 Managing submodules on large multi-user projects R. Tyler Ballance
  2009-05-29 19:53 ` Avery Pennarun
  2009-05-29 22:58 ` Felipe Contreras
@ 2009-05-31 13:39 ` Alex Riesen
  2 siblings, 0 replies; 7+ messages in thread
From: Alex Riesen @ 2009-05-31 13:39 UTC
  To: R. Tyler Ballance; +Cc: git

2009/5/29 R. Tyler Ballance <tyler@slide.com>:
> Other developers with other branches will then periodically merge master
> into their project/topic branches but will either neglect to run
> `git submodule update` or our bootstrap script (which also executes the
> submodule update command). At this point they'll have outstanding
> changes of their own, and the submodule will be marked as "modified" as
> well. Usually what will then happen is they'll `git commit -a` without
> thinking and the submodule's reference will be changed (typically from
> B->A, undoing the previous change).

This (the fact that "git commit -a" updates submodules in the index after merge)
is probably our bug (or at least an unfinished feature).
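
The out-of-sync state that leads to this can at least be detected before
committing. `git submodule status` prefixes a line with `+` when the
checked-out submodule commit does not match the superproject index, with
`-` when the submodule is not initialized, and (on newer git versions) with
`U` when it has merge conflicts. A minimal Python sketch that parses that
output follows; the helper names are illustrative and not part of git.

```python
import subprocess

# Meaning of the one-character prefixes printed by `git submodule status`.
STATES = {
    "+": "checked-out commit differs from the superproject index",
    "-": "not initialized",
    "U": "merge conflict",
}


def parse_submodule_status(text):
    """Return {path: state} for every submodule that the status output flags."""
    flagged = {}
    for line in text.splitlines():
        if not line or line[0] not in STATES:
            continue  # a leading space means the submodule is in sync
        _sha, _, rest = line[1:].partition(" ")
        # Initialized submodules carry a trailing " (<describe output>)".
        if rest.endswith(")") and " (" in rest:
            rest = rest[: rest.rindex(" (")]
        flagged[rest] = STATES[line[0]]
    return flagged


def check_repo():
    """Run `git submodule status` in the current repository and parse it."""
    text = subprocess.run(
        ["git", "submodule", "status"],
        check=True, capture_output=True, text=True,
    ).stdout
    return parse_submodule_status(text)
```

After merging master without running `git submodule update`, `check_repo()`
would be expected to report the submodule under the `+` state, which is the
moment at which a `git commit -a` would record the old commit again.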

* Re: Managing submodules on large multi-user projects
  2009-05-29 22:58 ` Felipe Contreras
@ 2009-06-23 22:58   ` R. Tyler Ballance
  0 siblings, 0 replies; 7+ messages in thread
From: R. Tyler Ballance @ 2009-06-23 22:58 UTC
  To: Felipe Contreras; +Cc: git

Hey Felipe, reply inline..

On Sat, 30 May 2009, Felipe Contreras wrote:

> On Fri, May 29, 2009 at 9:41 PM, R. Tyler Ballance <tyler@slide.com> wrote:
> > I've noticed that keeping submodules updated is an absolute pain,
> > particularly with a large multiuser setup with *lots* of branches.
> >
> >
> > What will tend to happen is that the submodule reference will be updated
> > in the master branch (we use a centralized model) and then committed
> > (imagine the commit reference was advanced from A->B).
> >
> > Other developers with other branches will then periodically merge master
> > into their project/topic branches but will either neglect to run
> > `git submodule update` or our bootstrap script (which also executes the
> > submodule update command). At this point they'll have outstanding
> > changes of their own, and the submodule will be marked as "modified" as
> > well. Usually what will then happen is they'll `git commit -a` without
> > thinking and the submodule's reference will be changed (typically from
> > B->A, undoing the previous change).
> >
> >
> > Are there any saner ways of managing this? I've been trying to get the
> > `git submodule update` command to run with as many hooks as possible
> > (pre-commit, post-update) to make sure that developers aren't
> > inadvertently breaking things, but nothing seems to ensure that
> > *everybody* is up to date and that *nobody* inadvertently
> > commits changes to the submodule.
> 
> Have you tried repo?
> http://source.android.com/download/using-repo

No, I've not tried repo, and getting our now 100+ user organization to
switch over is highly unlikely.

Since I originally posted to this thread, I've had to entirely *remove*
the submodule from the super-project and just dump the code in (boo,
hiss) since it just caused too much damn trouble.

I'm going to give a newer version of Git a try and hope that everything
is better now, since the need has arisen for a git submodule again and
things will get gnarly if I have to do another source dump.

:(

-R. Tyler Ballance
Slide, Inc.
