* [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
@ 2009-03-25 20:14 P Baker
2009-03-30 15:32 ` Shawn O. Pearce
0 siblings, 1 reply; 12+ messages in thread
From: P Baker @ 2009-03-25 20:14 UTC (permalink / raw)
To: git
Greetings, I've been working on this for a while, but figured I'd send
it out while I've still got some time left before I submit it!
Any comments/questions would be welcome, as I'd really love to spend a
summer working on git.
Abstract:
This project focuses on upgrading git-submodule to manage code
created in external projects in ways that allow users to safely branch
and merge that code without loss of data or routine merge conflicts.
This will incorporate some changes made on the ‘pu’ branch, but will
also include making substantial changes to git-submodule's underlying
code.
Content:
git-submodule is currently a good tool designed to allow developers to
leverage the work of others by incorporating external code into a
project. However, its implementation is underdeveloped as most core
users/developers are not heavy users of the application. In contrast
to much of the rest of the project, these holes create usage problems
when git does not act according to developers’ expectations. This
project would devote a summer of work to filling in the gaps so that
git-submodule contains the features necessary to fully exploit its
potential to track, update and edit external codebases incorporated
into a super-project.
As opposed to “remotes,” which also incorporate external code into a
project, submodules maintain the distinct nature of code and separate
the projects’ history. This is perfect for the intended nature of
“embedding foreign repositories in dedicated subdirectories of the
source tree,” for example when managing plug-ins within a larger,
standalone project that depends on the plug-ins. However, the
shortcomings of git-submodule create headaches for developers
attempting to use git and might prevent its adoption among those
developers not willing to either create laborious workarounds or
explicitly create manual management techniques.
Adding the features to fully enable git-submodule would allow heavy
users of projects built on other actively developed projects to use
git to manage this interaction in intuitive and predictable ways.
Adding this feature set would give current users a desired tool, boost
git’s credibility by providing a common feature among revision
systems, make git’s adoption for new and existing projects easier and,
as a result, likely boost git’s usage.
This project will consist of several stages: an initial community
based design review and investigation of specific requirements;
specifying and documenting the planned changes; writing and debugging
the code and related tests; and finally merging it into a public
release. The tentative timeline is:
End of May – Conclusively finish the public discussion regarding where
git-submodule needs to go
Beginning of June – Produce final specifications (including method stubs)
Middle of July – Finish active code and test development
End of July – Merge code into production release, fix publicly submitted bugs
Middle of August – Prepare code for final release and finish
user-facing documentation
This timeline should allow adequate flexibility while establishing
deadlines that ensure that the project will be completed in a timely
and efficient manner.
A few specific changes that this project will likely include are:
*use .git instead of .gitmodules
*move objects of submodules into .git/ directory
*git submodule update --init should initialize nested levels of submodules
*protect changes in local submodules when doing “git submodule update”
These changes, compiled from feature requests on the git mailing list
and formulated in response to blog posts regarding git-submodule’s
issues, are representative of the full list of changes. Most
development will need to occur within git-submodule.sh, without
changing much plumbing; however, other files might be affected by more
substantial changes.
While git-submodule is stable and operational, it is not frequently
updated and has not seen much change beyond bug-fixes. Since it reached
its current feature-complete status, the last time a feature of any
novelty was included in a public release was August 2008. However,
some work has already been started on the ‘pu’ branch, which will need
to be reviewed and probably incorporated into this project. At the
conclusion of the project, one metric by which to evaluate its success
will be its acceptance in online communities. The final goal is to
make git a top-tier version control system in its management of
external code repositories.
My main usage of git started when managing a summer project as an
intern that had many of the requirements that make git-submodule
problematic: built in Ruby on Rails and dependent on other plug-ins,
some of which were managed in SVN. Even though the Ruby on Rails
community has popularized within itself the use of sub-modules to
manage plug-ins and quite a bit has been written on the topic, a
significant portion end either in frustration or convoluted
work-arounds. The problems and extremely confusing nature of
git-submodule led me to give up on it altogether (an unfortunately
common occurrence), and manage the code and updates to it by hand.
I currently am finishing my sophomore year as an Electrical Engineer
at the University of Pennsylvania (and dual-majoring in the Wharton
School of Business). I started to develop professionally when I took a
year off between high school and college to work for a small firm in
Silicon Valley, California developing diagnostic imaging software for
quality assurance and research on their products. Last summer I
developed a Ruby on Rails-based engine to test user-designed
investment strategies for QED Benchmark, a boutique hedge fund.
Thanks,
Phill Baker
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
From: Shawn O. Pearce @ 2009-03-30 15:32 UTC (permalink / raw)
To: P Baker; +Cc: git
Hi! As someone who has coded around git submodule by creating "repo"
for Android, I'm certainly interested in git submodule improvements,
so this is a great idea for a GSoC project. I have some comments
below that may help improve your proposal before you submit it.
P Baker <me@retrodict.com> wrote:
> Abstract:
> This project focuses on upgrading git-submodule to manage code
> created in external projects in ways that allow users to safely branch
> and merge that code without loss of data or routine merge conflicts.
> This will incorporate some changes made on the ‘pu’ branch, but will
> also include making substantial changes to git-submodule underlying
> code.
...
> As opposed to “remotes,” which also incorporate external code into a
> project,
I'm not sure what you mean by that. Typically a "remote" in Git is
thought to be a configuration that says where to download a fork of
this project from. By default you get one remote, called "origin",
which is where you initially cloned your fork from, but you can add
many more, such as other developers you frequently collaborate with.
This is quite different from the problem that submodule tries
to address, as it's dealing with forks of the *same* project.
But a submodule is trying to point to forks of *other* projects,
whose histories are (possibly) unrelated to this project's history.
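The distinction can be tried out with throwaway repositories. What follows is only a sketch, not anything taken from the proposal: the repository names `project`, `fork`, and `plugin`, the remote name `colleague`, and the path `vendor/plugin` are all made up, and the `protocol.file.allow` override is there only because recent Git releases refuse local-path submodule clones by default.

```shell
#!/bin/sh
# Sketch: a remote names another copy of the SAME project; a submodule
# embeds a DIFFERENT project. All repository names here are hypothetical.
set -e
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two unrelated projects, one commit each.
git init -q project
(cd project && echo app > app.txt && git add app.txt && git commit -q -m "app")
git init -q plugin
(cd plugin && echo lib > lib.txt && git add lib.txt && git commit -q -m "lib")

# A fork of the same project: reachable through a remote.
git clone -q project fork
cd project
git remote add colleague ../fork
git fetch -q colleague

# Another project, embedded in a subdirectory: a submodule.
git -c protocol.file.allow=always submodule --quiet add "$tmp/plugin" vendor/plugin
git commit -q -m "add plugin as submodule"

git remote              # prints: colleague
git submodule status    # one line naming vendor/plugin
```

The remote shares history with `project`, while the submodule's history stays entirely separate; the superproject records only which commit of `plugin` it expects.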
> The tentative timeline is:
>
> End of May – Conclusively finish the public discussion regarding where
> git-submodules needs to go
> Beginning of June – Produce final specifications (including method stubs)
> Middle of July – Finish active code and test development
> End of July – Merge code into production release, fix public submitted bugs
> Middle of August – Prepare code for final release and finish
> user-facing documentation
IMHO, this is too vague. *What exactly* are the features you want
to add to git submodule? Break this down by features, not by phases
of coding.
Further, you spend roughly a month writing method stubs.
My experience with such development practices is that you will
get frustrated by not having the code working, get bored with it,
and walk away. Or at best, you'll be able to stub it all out,
but will need to redo most of the stubs because you find later on
while writing the implementation code that you need to pass data
through that you didn't initially anticipate.
Also, we very much prefer Git patches to update the documentation
at the same time that the code changes. Maybe it's done in the same
patch, if the code+doc updates are relatively small, or maybe it's done
in two patches in the same series (code change, then doc update),
but the general guideline is that both code and documentation should
be updated at roughly the same time (e.g. same day for Junio when
he merges the series down to master). This way the documentation
doesn't stray too far from the code it's describing.
> A few specific changes that this project will likely include are:
>
> *use .git instead of .gitmodules
> *move objects of submodules into .git/ directory
> *git submodule update --init should initialize nested levels of submodules
> *protect changes in local submodules when doing “git submodule update”
As I said above, I'd like to see this described in the timeline
better. Each of these could be done independently, so you could work
on one item, try to get it completed, tested, documented, and merged
into Junio's tree, and then start the next item. At worst at the
end of the summer you'll have a fraction of these done, merged,
and available for users, which is better than trying to do it all
and failing to get any merged.
I'd like to know more about each of these items, and less about
the general reasoning of where you got these feature ideas from.
What exactly are you talking about changing, and why? I don't need
to see detailed code at this stage, but I'd like a better description
of the user-visible changes that each bullet point might cause,
and why you feel this change is better than what we have today.
--
Shawn.
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
From: P Baker @ 2009-03-31 15:30 UTC (permalink / raw)
To: Shawn O. Pearce; +Cc: git
Great, thanks Shawn. I've included my responses to your
questions/comments below, and would love to continue the dialog - any
help with the application would be much appreciated. I'll work my
responses into my application and send it out again.
On 3/30/09, Shawn O. Pearce <spearce@spearce.org> wrote:
> Hi! As someone who has coded around git submodule by creating "repo"
> for Android, I'm certainly interested in git submodule improvements,
> so this is a great idea for a GSoC project. I have some comments
> below that may help improve your proposal before you submit it.
>
I was going to send out another request for comments to the list, but
you beat me to it. The first part of my proposal is to get community
involvement. Is this mailing list the best place to do that, or would
something external, like a publicly editable site (e.g. a survey or
wiki), be better?
>
> > As opposed to “remotes,” which also incorporate external code into a
> > project,
>
>
> I'm not sure what you mean by that. Typically a "remote" in Git is
> thought to be a configuration that says where to download a fork of
> this project from. By default you get one remote, called "origin",
> which is where you initially cloned your fork from, but you can add
> many more, such as other developers you frequently collaborate with.
>
> This is quite different from the problem that submodule tries
> to address, as it's dealing with forks of the *same* project.
> But a submodule is trying to point to forks of *other* projects,
> whose histories are (possibly) unrelated to this project's history.
>
Right. My point is that due to either perceived or real problems with
git-submodules, some folks have abandoned it and instead used
workarounds (see
http://flavoriffic.blogspot.com/2008/05/managing-git-submodules-with-gitrake.html?showComment=1210125780000#c1605130977296198852
for one example of such a comment). That so much effort has been put
into solutions external to git shows that there is some pent-up
demand for a solution built into git.
>
> > The tentative timeline is:
> >
> > End of May – Conclusively finish the public discussion regarding where
> > git-submodules needs to go
> > Beginning of June – Produce final specifications (including method stubs)
> > Middle of July – Finish active code and test development
> > End of July – Merge code into production release, fix public submitted bugs
> > Middle of August – Prepare code for final release and finish
> > user-facing documentation
>
>
> IMHO, this is too vague. *What exactly* are the features you want
> to add to git submodule? Break this down by features, not by phases
> of coding.
>
The features I have been considering are:
*move objects of submodules into base .git/ directory
**This would, as I understand it, protect submodules from being
overwritten, and their changes from being lost, when switching between
branches of the superproject that might or might not contain the
submodules, and would centralize their management in one location. The
added benefit of fully using git's ability to branch and merge
submodules makes it worth adding some complexity within the .git
directory.
*use .git instead of .gitmodules
**I actually don't know why this was included with the project
description. I searched for an explanation of the desired name change
on the mailing list and in commit messages, but came up with nothing.
*git submodule update --init should initialize nested levels of submodules
**As an ease-of-use measure, either an additional flag to recurse can
be added, or recursion can be the default. As a requested feature on
the mailing list, this is worth implementing.
*ability to update submodule pulled from svn repo
**One workaround is to clone it as a local copy using git-svn and then
import that local clone as a submodule; clearly a clunky solution.
There are many requests for this feature (see
http://panthersoftware.com/articles/view/4/git-svn-dcommit-workaround-for-git-submodules
for a typical example), and it makes sense that integrating git-submodule
with git-svn would expand submodule's usefulness.
*make submodules deal with updated references
**Instead of issuing merge conflicts on updated submodule references,
this would let submodules, which sit on a detached HEAD by default,
commit changes from the local repo without first pulling changes
from the shared repo. See
http://flavoriffic.blogspot.com/2008/05/managing-git-submodules-with-gitrake.html?showComment=1211380200000#c3897235118548537475
for an explanation of how this made 'submodules...unsuitable for
active development'. Clearly losing this kind of functionality impairs
the overall usability of git and should be fixed.
*protect changes in local submodules when doing “git submodule update”
**This is similar to the previous point, in that changes need to be
protected or merged or warnings issued when updating the submodule.
The potential to lose work with no warning is a big no-no.
*make git submodules easy to remove
** See http://pitupepito.homelinux.org/?p=24 for an example of why
this is a pain. Adding a submodule has a UI; removing one should as
well.
> Further, you spend roughly a month writing method stubs.
Week max: end of May to beginning of June, but if I
>
> Also, we very much prefer Git patches to update the documentation
> at the same time that the code changes.
Fair enough. I was planning on starting documentation during the
stubbing phase, and then finishing it once the code is written.
>
>
> > A few specific changes that this project will likely include are:
> >
> > *use .git instead of .gitmodules
> > *move objects of submodules into .git/ directory
> > *git submodule update --init should initialize nested levels of submodules
>
> > *protect changes in local submodules when doing “git submodule update”
>
> As I said above, I'd like to see this described in the timeline
> better, each of these could be done independently, so you could work
> on one item try to get it completed, tested, documented, and merged
> into Junio's tree, and then start the next item.
Ok, makes sense. I can redo the timeline like that. I'll order them by
my priority and look for community input on re-ranking them and adding
or subtracting features.
>
> I'd like to know more about each of these items, and less about
> the general reasoning of where you got these feature ideas from.
> What exactly are you talking about changing, and why? I don't need
> to see detailed code at this stage, but I'd like a better description
> of the user-visible changes that each bullet point might cause,
> and why you feel this change is better than what we have today.
>
See above.
Phill Baker
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
From: Johannes Schindelin @ 2009-03-31 15:57 UTC (permalink / raw)
To: P Baker; +Cc: Shawn O. Pearce, git
Hi,
I am a rather intense user of submodules, so I am quite interested.
Please take my comments as encouragement rather than discouragement.
On Tue, 31 Mar 2009, P Baker wrote:
> On 3/30/09, Shawn O. Pearce <spearce@spearce.org> wrote:
> > IMHO, this is too vague. *What exactly* are the features you want to
> > add to git submodule? Break this down by features, not by phases of
> > coding.
>
> The features I have been considering are:
> *move objects of submodules into base .git/ directory
> **This would, as I understand it: protect submodules from being
> overwritten and changes lost when switching between branches of the
> superproject that might or might not contain the submodules and
> centralize their management into one location. The added benefits of
> fully using git's ability to branch and merge submodules makes it
> worth adding some complexity within the .git directory.
The main problem with renaming/deleting is not the repository of the
submodule, but the working directory.
> *use .git instead of .gitmodules
> **I actually don't know why this was included with the project
> description, I searched for an explanation of the desired name change
> on the mailing list and in commit messages, but came up with nothing.
AFAICT somebody thought that the information about the locations of the
submodules should be in .git/ rather than in the working directory. But
of course, that is wrong: you want it to be tracked.
> *git submodule update --init should initialize nested levels of submodules
> **As an ease of use command, either an additional flag to recurse can
> be added, or it can act by default. As a requested feature on the
> mailing list, this is worth implementing.
I thought there was a patch to support "git submodule recurse"? That
would be rather less limited than yet another option to submodule update.
> *ability to update submodule pulled from svn repo
> **One workaround is to clone it as local copy using git-svn and then
> import that local clone as a submodule; clearly a clunky solution.
> There are many requests for this feature (see
> http://panthersoftware.com/articles/view/4/git-svn-dcommit-workaround-for-git-submodules
> for a typical example), and it makes sense integrating git-submodule
> with git-svn would expand submodule's usefulness.
I do not think that this would be good. Both "git svn" and "git
submodule" are rather complex by now, and mixing them would only
complicate code.
> *make submodules deal with updated references
> **Instead of issuing merge conflicts on updated submodule references,
> this will allow submodules on default detached HEAD so that changes
> from the local repo can be committed without first pulling changes
> from the shared repo.
I'd rather call this "make git-submodule help with merging".
> *protect changes in local submodules when doing “git submodule update”
> **This is similar to the previous point, in that changes need to be
> protected or merged or warnings issued when updating the submodule.
> The potential to lose work with no warning is a big no-no.
One word: Reflogs.
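To make the reflog point concrete, here is a minimal sketch under stated assumptions: the repositories `plugin` and `super` are hypothetical, and the `protocol.file.allow` override is only needed on recent Git releases. Note that it covers only work that was committed inside the submodule; a reflog records where HEAD has pointed, so uncommitted edits in the working tree are outside its reach.

```shell
#!/bin/sh
# Sketch: a commit made on a submodule's detached HEAD stays reachable
# through the submodule's reflog after "git submodule update".
# Repository names are hypothetical.
set -e
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q plugin
(cd plugin && echo lib > lib.txt && git add lib.txt && git commit -q -m "lib")
git init -q super
cd super
git -c protocol.file.allow=always submodule --quiet add "$tmp/plugin" plugin
git commit -q -m "add plugin"

# Commit local work on a detached HEAD inside the submodule.
cd plugin
git checkout -q --detach
echo local > local.txt
git add local.txt
git commit -q -m "local work"
local_sha=$(git rev-parse HEAD)
cd ..

# Move the submodule back to the commit recorded in the superproject.
git -c protocol.file.allow=always submodule --quiet update

# The local commit is no longer checked out, but the reflog remembers it.
cd plugin
git reflog -n 2
git rev-parse 'HEAD@{1}'    # same value as $local_sha
```

From there, something like `git checkout -b rescued 'HEAD@{1}'` inside the submodule would put the work back on a branch.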
> *make git submodules easy to remove
> ** See http://pitupepito.homelinux.org/?p=24, for an example of why
> this is a pain. Adding a submodule has ui, removing one should as
> well.
AFAIR there was already a patch to implement this, but the OP apparently
did not address all issues.
> > Further, you spend roughly a month writing method stubs.
>
> Week max: end of May to beginning of June, but if I
... yes?
Ciao,
Dscho
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
From: P Baker @ 2009-03-31 22:32 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Shawn O. Pearce, git
Thanks for the comments, my replies are included. It's good to see
some core folks are big users!
On Tue, Mar 31, 2009 at 11:57 AM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> I am a rather intense user of submodules, so I am quite interested.
>
I'm curious, where/under what circumstances do you tend to use it?
> Please take my comments as encouragement rather than discouragement.
>
Always.
>> *move objects of submodules into base .git/ directory
>> **This would, as I understand it: protect submodules from being
>> overwritten and changes lost when switching between branches of the
>> superproject that might or might not contain the submodules and
>> centralize their management into one location. The added benefits of
>> fully using git's ability to branch and merge submodules makes it
>> worth adding some complexity within the .git directory.
>
> The main problem with renaming/deleting is not the repository of the
> submodule, but the working directory.
>
My understanding is that since the submodule's objects (history) are
stored in a .git directory in the subdirectory where the submodule is
located, removing that subdirectory during checkout of a branch that
does not include that submodule eliminates the .git directory as well.
Moving the objects from the submodule's .git directory to the base
.git directory would seem to alleviate this problem.
>> *use .git instead of .gitmodules
>> **I actually don't know why this was included with the project
>> description, I searched for an explanation of the desired name change
>> on the mailing list and in commit messages, but came up with nothing.
>
> AFAICT somebody thought that the information about the locations of the
> submodules should be in .git/ rather than in the working directory. But
> of course, that is wrong: you want it to be tracked.
>
So, in looking back through the archives of the mailing list there
seems to be some disagreement between using .gitmodules and
.git/config to track submodules.
>> *git submodule update --init should initialize nested levels of submodules
>> **As an ease of use command, either an additional flag to recurse can
>> be added, or it can act by default. As a requested feature on the
>> mailing list, this is worth implementing.
>
> I thought there was a patch to support "git submodule recurse"? That
> would be rather less limited than yet another option to submodule update.
>
There is a git submodule foreach command, but it doesn't look like the
patch for git submodule recurse
(http://marc.info/?l=git&m=120997867213008&w=2) has been incorporated
into a public release.
That is one route. On the other hand, the default action is also open
to question. When I update a submodule, I would probably expect that
anything it depends on is also updated. The default action probably
should be recursive.
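For comparison, nested submodules can be initialized one level down by hand with `git submodule foreach`. This is only a rough sketch: the repositories `inner`, `middle`, `super`, and `work` are hypothetical, the `allow` variable is just a local shorthand, and the `protocol.file.allow` override matters only on recent Git releases.

```shell
#!/bin/sh
# Sketch: initialize a second level of submodules via "git submodule foreach".
# Repository names are hypothetical.
set -e
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
allow="-c protocol.file.allow=always"

# inner <- middle <- super: each level embeds the previous one.
git init -q inner
(cd inner && echo i > i.txt && git add i.txt && git commit -q -m "inner")
git init -q middle
(cd middle && git $allow submodule --quiet add "$tmp/inner" inner &&
 git commit -q -m "middle")
git init -q super
(cd super && git $allow submodule --quiet add "$tmp/middle" middle &&
 git commit -q -m "super")

git clone -q super work
cd work

# Plain update --init populates only the first level (middle).
git $allow submodule --quiet update --init

# Running the same command inside each submodule reaches the next level.
git submodule --quiet foreach "git $allow submodule --quiet update --init"
ls middle/inner
```

Whether that second step should happen by default is exactly the question under discussion; the sketch only shows that it is possible to opt in explicitly.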
>> *ability to update submodule pulled from svn repo
>> **One workaround is to clone it as local copy using git-svn and then
>> import that local clone as a submodule; clearly a clunky solution.
>> There are many requests for this feature (see
>> http://panthersoftware.com/articles/view/4/git-svn-dcommit-workaround-for-git-submodules
>> for a typical example), and it makes sense integrating git-submodule
>> with git-svn would expand submodule's usefulness.
>
> I do not think that this would be good. Both "git svn" and "git
> submodule" are rather complex by now, and mixing them would only
> complicate code.
>
Hm, point well taken, but it would seem to have enormous benefit for a
lot of people. I can move it down the priority list, but I'd like to
include it in the proposal - complexity alone isn't a good reason to
avoid something.
I also believe that the workaround described, if incorporated into
git-submodule.sh in an appropriate way, might open up possibilities
for further improvement. The UI wouldn't change much; it seems like it'd
just be detection of pointing to an SVN repo instead of a git repo and
then hooking into git svn calls instead of regular git calls. This
brings up the possibility that git submodule should abstract its
repository handling in much the same way that git does. I'm not
familiar with the code, but this seems more like calling other
plumbing hooks than anything else.
>> *make submodules deal with updated references
>> **Instead of issuing merge conflicts on updated submodule references,
>> this will allow submodules on default detached HEAD so that changes
>> from the local repo can be committed without first pulling changes
>> from the shared repo.
>
> I'd rather call this "make git-submodule help with merging".
>
Better name. Duly noted. Will change.
>> *protect changes in local submodules when doing “git submodule update”
>> **This is similar to the previous point, in that changes need to be
>> protected or merged or warnings issued when updating the submodule.
>> The potential to lose work with no warning is a big no-no.
>
> One word: Reflogs.
>
I haven't used reflogs, but they don't seem to fix the problem (maybe
you can explain?): simply knowing where/what the reference is doesn't
mean that git-submodule looks at it, obeys the reference or issues
warnings when it should. The problem as stated
(http://flavoriffic.blogspot.com/2008/05/managing-git-submodules-with-gitrake.html?showComment=1211380200000#c3897235118548537475)
was that git submodule update would silently overwrite any local
changes with the remote version (i.e. git did not check to see if the
local reference was different than the remote reference when
updating).
>> *make git submodules easy to remove
>> ** See http://pitupepito.homelinux.org/?p=24, for an example of why
>> this is a pain. Adding a submodule has ui, removing one should as
>> well.
>
> AFAIR there was already a patch to implement this, but the OP apparently
> did not address all issues.
>
Yep, found it on the mailing list. Obviously, part of the project
would be to resolve those final issues.
Phillip Baker
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
From: Johannes Schindelin @ 2009-03-31 23:05 UTC (permalink / raw)
To: P Baker; +Cc: Shawn O. Pearce, git
Hi,
On Tue, 31 Mar 2009, P Baker wrote:
> On Tue, Mar 31, 2009 at 11:57 AM, Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
>
> >> *move objects of submodules into base .git/ directory
> >> **This would, as I understand it: protect submodules from being
> >> overwritten and changes lost when switching between branches of the
> >> superproject that might or might not contain the submodules and
> >> centralize their management into one location. The added benefits of
> >> fully using git's ability to branch and merge submodules makes it
> >> worth adding some complexity within the .git directory.
> >
> > The main problem with renaming/deleting is not the repository of the
> > submodule, but the working directory.
> >
>
> My understanding is that since the submodule objects (history) is
> stored in a .git directory in the subdirectory where the submodule is
> located, removing that subdirectory during checkout of a branch that
> does not include that submodule eliminates the .git directory as well.
> Moving the objects from the submodule's .git directory to the base
> .git directory would seem to alleviate this problem.
My point was more about "you cannot just remove the subdirectory, or you
_will_ lose data".
> >> *use .git instead of .gitmodules
> >> **I actually don't know why this was included with the project
> >> description, I searched for an explanation of the desired name change
> >> on the mailing list and in commit messages, but came up with nothing.
> >
> > AFAICT somebody thought that the information about the locations of the
> > submodules should be in .git/ rather than in the working directory. But
> > of course, that is wrong: you want it to be tracked.
>
> So, in looking back through the archives of the mailing list there
> seems to be some disagreement between using .gitmodules and
> .git/config to track submodules.
No. .gitmodules has the default information, and "git submodule init"
brings that into .git/config, to be overridden by the user if she so
likes.
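A small sketch of that division of labour, with hypothetical repositories `plugin`, `super`, and `work`, and a made-up override path `my-plugin-fork` (the `protocol.file.allow` override is only needed on recent Git releases):

```shell
#!/bin/sh
# Sketch: .gitmodules carries the tracked default URL; "git submodule init"
# copies it into .git/config, where it can be overridden locally.
# Repository names and the override path are hypothetical.
set -e
tmp=$(mktemp -d)
cd "$tmp"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q plugin
(cd plugin && echo lib > lib.txt && git add lib.txt && git commit -q -m "lib")
git init -q super
(cd super &&
 git -c protocol.file.allow=always submodule --quiet add "$tmp/plugin" plugin &&
 git commit -q -m "add plugin")

# A fresh clone has the tracked default but no local submodule setting yet.
git clone -q super work
cd work
git config -f .gitmodules submodule.plugin.url
git config submodule.plugin.url || echo "(not in .git/config yet)"

# "git submodule init" copies the default into .git/config ...
git submodule --quiet init
git config submodule.plugin.url

# ... where it can be overridden without touching the tracked file.
git config submodule.plugin.url "$tmp/my-plugin-fork"
git config submodule.plugin.url
git status --short    # prints nothing: the override is not a tracked change
```

Because the override lives only in `.git/config`, other clones keep seeing the default from `.gitmodules`.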
> >> *git submodule update --init should initialize nested levels of submodules
> >> **As an ease of use command, either an additional flag to recurse can
> >> be added, or it can act by default. As a requested feature on the
> >> mailing list, this is worth implementing.
> >
> > I thought there was a patch to support "git submodule recurse"? That
> > would be rather less limited than yet another option to submodule update.
>
> There is a git submodule foreach command, but it doesn't look like the
> patch for git submodule recurse
> (http://marc.info/?l=git&m=120997867213008&w=2) has been incorporated
> into a public release.
>
> That is one route, on the other hand, the default action is also open
> to question. When I update a submodule, I would probably expect that
> anything it depends on is also updated. The default action probably
> should be recursive.
No. Not at all. At least in my usage, submodules are mostly optional.
IOW I have ways in my projects to cope with the absence of a checkout.
> >> *ability to update submodule pulled from svn repo
> >> **One workaround is to clone it as local copy using git-svn and then
> >> import that local clone as a submodule; clearly a clunky solution.
> >> There are many requests for this feature (see
> >> http://panthersoftware.com/articles/view/4/git-svn-dcommit-workaround-for-git-submodules
> >> for a typical example), and it makes sense integrating git-submodule
> >> with git-svn would expand submodule's usefulness.
> >
> > I do not think that this would be good. Both "git svn" and "git
> > submodule" are rather complex by now, and mixing them would only
> > complicate code.
>
> Hm, point well taken, but it would seem to have enormous benefit for a
> lot of people. I can move it down the priority list, but I'd like to
> include it in the proposal - complexity alone isn't a good reason to
> avoid something.
Complexity is often a good sign of bad design.
In this case, I want to point out that there has been a better design
already:
http://thread.gmane.org/gmane.comp.version-control.git/114545
(Unfortunately, Daniel decided to post the follow-up patches in different
threads; that will make it hard for you to find them.)
Ciao,
Dscho
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
2009-03-31 23:05 ` Johannes Schindelin
@ 2009-03-31 23:49 ` P Baker
2009-04-01 0:58 ` Johannes Schindelin
0 siblings, 1 reply; 12+ messages in thread
From: P Baker @ 2009-03-31 23:49 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Shawn O. Pearce, git
I'll paraphrase to see if I understand your points:
*Moving objects from submodule .git directories into the base .git/
directory would protect the submodules and is a good idea.
*Moving to a .git/ file from .gitmodules should be taken off of the
goal list (I went back and read this thread:
http://thread.gmane.org/gmane.comp.version-control.git/78605; seemed
to clear things up).
*git submodule recurse would be a good option (not as a default), if
the remaining issues are resolved.
*It would be a good idea for git submodule to work with foreign VCS,
through Daniel's patches.
I appreciate the guidance; it's helping me to see that some of this
work has already been done and needs to be finished and pushed into a
public release. As an intense user of submodules, what does it do
poorly/not do for your needs?
Thanks,
Phillip Baker
On Tue, Mar 31, 2009 at 7:05 PM, Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
> Hi,
>
> On Tue, 31 Mar 2009, P Baker wrote:
>
>> On Tue, Mar 31, 2009 at 11:57 AM, Johannes Schindelin
>> <Johannes.Schindelin@gmx.de> wrote:
>>
>> >> *move objects of submodules into base .git/ directory
>> >> **This would, as I understand it: protect submodules from being
>> >> overwritten and changes lost when switching between branches of the
>> >> superproject that might or might not contain the submodules and
>> >> centralize their management into one location. The added benefits of
>> >> fully using git's ability to branch and merge submodules makes it
>> >> worth adding some complexity within the .git directory.
>> >
>> > The main problem with renaming/deleting is not the repository of the
>> > submodule, but the working directory.
>> >
>>
>> My understanding is that since the submodule objects (history) is
>> stored in a .git directory in the subdirectory where the submodule is
>> located, removing that subdirectory during checkout of a branch that
>> does not include that submodule eliminates the .git directory as well.
>> Moving the objects from the submodule's .git directory to the base
>> .git directory would seem to alleviate this problem.
>
> My point was more about "you cannot just remove the subdirectory, or you
> _will_ lose data".
>
>> >> *use .git instead of .gitmodules
>> >> **I actually don't know why this was included with the project
>> >> description, I searched for an explanation of the desired name change
>> >> on the mailing list and in commit messages, but came up with nothing.
>> >
>> > AFAICT somebody thought that the information about the locations of the
>> > submodules should be in .git/ rather than in the working directory. But
>> > of course, that is wrong: you want it to be tracked.
>>
>> So, in looking back through the archives of the mailing list there
>> seems to be some disagreement between using .gitmodules and
>> .git/config to track submodules.
>
> No. .gitmodules has the default information, and "git submodule init"
> brings that into .git/config, to be overridden by the user if she so
> likes.
>
>> >> *git submodule update --init should initialize nested levels of submodules
>> >> **As an ease of use command, either an additional flag to recurse can
>> >> be added, or it can act by default. As a requested feature on the
>> >> mailing list, this is worth implementing.
>> >
>> > I thought there was a patch to support "git submodule recurse"? That
>> > would be rather less limited than yet another option to submodule update.
>>
>> There is a git submodule foreach command, but it doesn't look like the
>> patch for git submodule recurse
>> (http://marc.info/?l=git&m=120997867213008&w=2) has been incorporated
>> into a public release.
>>
>> That is one route, on the other hand, the default action is also open
>> to question. When I update a submodule, I would probably expect that
>> anything it depends on is also updated. The default action probably
>> should be recursive.
>
> No. Not at all. At least in my usage, submodules are mostly optional.
> IOW I have ways in my projects to cope with the absence of a checkout.
>
>> >> *ability to update submodule pulled from svn repo
>> >> **One workaround is to clone it as local copy using git-svn and then
>> >> import that local clone as a submodule; clearly a clunky solution.
>> >> There are many requests for this feature (see
>> >> http://panthersoftware.com/articles/view/4/git-svn-dcommit-workaround-for-git-submodules
>> >> for a typical example), and it makes sense integrating git-submodule
>> >> with git-svn would expand submodule's usefulness.
>> >
>> > I do not think that this would be good. Both "git svn" and "git
>> > submodule" are rather complex by now, and mixing them would only
>> > complicate code.
>>
>> Hm, point well taken, but it would seem to have enormous benefit for a
>> lot of people. I can move it down the priority list, but I'd like to
>> include it in the proposal - complexity alone isn't a good reason to
>> avoid something.
>
> Complexity is often a good sign of bad design.
>
> In this case, I want to point out that there has been a better design
> already:
>
> http://thread.gmane.org/gmane.comp.version-control.git/114545
>
> (Unfortunately, Daniel decided to post the follow-up patches in different
> threads; that will make it hard for you to find them.)
>
> Ciao,
> Dscho
>
>
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
2009-03-31 23:49 ` P Baker
@ 2009-04-01 0:58 ` Johannes Schindelin
2009-04-01 2:47 ` P Baker
2009-04-01 6:26 ` Andreas Ericsson
0 siblings, 2 replies; 12+ messages in thread
From: Johannes Schindelin @ 2009-04-01 0:58 UTC (permalink / raw)
To: P Baker; +Cc: Shawn O. Pearce, git
Hi,
On Tue, 31 Mar 2009, P Baker wrote:
> I'll paraphrase to see if I understand your points:
>
> *Moving objects from submodule .git directories into the base .git/
> directory would protect the submodules and is a good idea.
No, I did not say that.
I said that submodules' working directories need to be protected when
renaming/deleting submodules.
Even worse, I think that moving the .git/ directory into the
superproject's .git/ would be at least quite a bit awkward in the nested
case.
> *Moving to a .git/ file from .gitmodules should be taken off of the
> goal list (I went back and read this thread:
> http://thread.gmane.org/gmane.comp.version-control.git/78605; seemed
> to clear things up).
Can't follow links here, as I am reading this offline, so cannot comment.
> *git submodule recurse would be a good option (not as a default), if
> the remaining issues are resolved.
Definitely.
> *It would be a good idea for git submodule to work with foreign VCS,
> through Daniel's patches.
But that would not only apply to submodules, but rather all repositories,
to the point that "git submodule" does not need any change.
> I appreciate the guidance, it's helping me to see that some of this work
> has already been done, it needs to be finished and pushed into a public
> release. As an intense user of submodules, what does it do poorly/not do
> for your needs?
One gripe I have, but which should be rather easy to fix: "git checkout --
submodule/" does not update the index, last time I checked. (It correctly
does not touch the submodule's working directory.)
Another one: The most common mistake with submodules is to commit and push
the superproject, after having committed (but not pushed) in the
submodule. Not sure how that could be helped.
Further, often it would come in rather handy to be able to say something
like "git diff $REVISION_AS_COMMITTED_IN_THE_SUPERPROJECT" from within
the submodule...
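As a rough illustration of where that revision lives: the superproject's tree records the submodule as a "gitlink" entry, which "git ls-tree" prints with mode 160000 and type "commit". The Python sketch below only parses one such output line; the helper name `gitlink_commit` and the sample values are made up for illustration, and it does not run git itself.

```python
def gitlink_commit(ls_tree_line):
    """Return the commit recorded for a submodule, or None.

    Expects one line of `git ls-tree <rev> <path>` output, which has
    the form "<mode> <type> <object><TAB><path>". A submodule entry
    (gitlink) has mode 160000 and type "commit".
    """
    meta, _, _path = ls_tree_line.partition("\t")
    fields = meta.split()
    if len(fields) != 3:
        return None
    mode, objtype, sha = fields
    if mode == "160000" and objtype == "commit":
        return sha
    return None


# Hypothetical sample lines (the object names are placeholders).
sub_line = "160000 commit " + "a" * 40 + "\tsubmodule"
file_line = "100644 blob " + "b" * 40 + "\tREADME"
recorded = gitlink_commit(sub_line)
```

The returned name is what one would then hand to "git diff" inside the submodule; doing that lookup automatically from within the submodule is the missing piece.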
git submodule summary should output to the pager by default.
Oh, and it would not hurt performance on Windows at all if git-submodule
would be finally made a builtin.
Ciao,
Dscho
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
2009-04-01 0:58 ` Johannes Schindelin
@ 2009-04-01 2:47 ` P Baker
2009-04-01 16:10 ` Johannes Schindelin
2009-04-01 6:26 ` Andreas Ericsson
1 sibling, 1 reply; 12+ messages in thread
From: P Baker @ 2009-04-01 2:47 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: Shawn O. Pearce, git
> > *Moving objects from submodule .git directories into the base .git/
> > directory would protect the submodules and is a good idea.
>
>
> No, I did not say that.
>
> Even worse, I think that moving the .git/ directory into the
> superproject's .git/ would be at least quite a bit awkward in the nested
> case.
>
The initial prompt for the proposal was: "Rewrite git-submodule,
placing the repository for each referenced submodules in the
superproject's $GIT_DIR/modules...This resolves issues related to
switching between versions of the superproject..." The prompt, and
past experience with git, helped me to form my proposal which it seems
would fix numerous problems with git submodule, with the implied cost
of some awkwardness/complexity. Am I misunderstanding the prompt? Or
do you think this could be accomplished more elegantly?
> I said that submodules' working directories need to be protected when
> renaming/deleting submodules.
I'm sorry, I still don't understand. Where would this occur? What is
being protected? What is the submodules' working directory? I'm still
learning the intricacies of git, so I'd appreciate any pointers you
can give.
>
>
> > *It would be a good idea for git submodule to work with foreign VCS,
> > through Daniel's patches.
>
>
> But that would not only apply to submodules, but rather all repositories,
> to the point that "git submodule" does not need any change.
>
>
Fair enough. There's plenty of other work to be done!
> > I appreciate the guidance, it's helping me to see that some of this work
> > has already been done, it needs to be finished and pushed into a public
> > release. As an intense user of submodules, what does it do poorly/not do
> > for your needs?
>
>
> One gripe I have, but which should be rather easy to fix: "git checkout --
> submodule/" does not update the index, last time I checked. (It correctly
> does not touch the submodule's working directory.)
>
I'll add it to the list. In terms of general gripes: git submodule add
(or all of git submodule?) handles relative links poorly (see
http://kerneltrap.org/mailarchive/git/2007/12/10/485597). There are
also the 'Gotchas' listed at
http://git.or.cz/gitwiki/GitSubmoduleTutorial#head-a3cba9cbd1e125c0667dfb3b9249100be7f815ad.
> Another one: The most common mistake with submodules is to commit and push
> the superproject, after having committed (but not pushed) in the
> submodule. Not sure how that could be helped.
>
Seems like this is on the git submodule wiki 'Gotcha' list, too.
There's a spectrum of options: failing, warning, generating an output
message, etc. I think it is worth working on. What is git's policy on
interrupting users when their actions _could_ be counterproductive to
their intentions? Would hooks on the submodule's commit written by the
user fix this? That's not a built-in solution.
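Whatever the policy ends up being (fail, warn, or just print a message), the detection step might look like the Python sketch below. It assumes the caller has already asked each submodule which remote-tracking branches contain the commit recorded in the superproject, for example with "git branch -r --contains <sha>", and passes the results in as lists. The function name and the sample paths are hypothetical.

```python
def unpushed_submodules(containing_branches):
    """List submodule paths whose recorded commit looks unpushed.

    containing_branches: {path: [remote-tracking branch names that
    contain the commit recorded in the superproject]}.
    An empty list means no remote-tracking branch has the commit yet.
    """
    return sorted(
        path
        for path, branches in containing_branches.items()
        if not branches
    )


# Hypothetical state: lib/b was committed locally but never pushed.
state = {
    "lib/a": ["origin/master"],
    "lib/b": [],
}
suspects = unpushed_submodules(state)
```

A pre-push check in the superproject could then warn about every path in `suspects`, or refuse to continue.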
> Further, often it would come in rather handy to be able to say something
> like "git diff $REVISION_AS_COMMITTED_IN_THE_SUPERPROJECT" from within
> the submodule...
>
That sounds complex, and would break expectations. This would only
work if git in the submodule working directory knows it's a submodule.
Is there a way to reference its superproject?
> git submodule summary should output to the pager by default.
>
Added to the list.
> Oh, and it would not hurt performance on Windows at all if git-submodule
> would be finally made a builtin.
You mean rewriting git-submodule.sh in C? What other impacts might that have?
Thanks,
Phillip Baker
> Ciao,
> Dscho
>
>
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
2009-04-01 0:58 ` Johannes Schindelin
2009-04-01 2:47 ` P Baker
@ 2009-04-01 6:26 ` Andreas Ericsson
2009-04-01 16:13 ` Johannes Schindelin
1 sibling, 1 reply; 12+ messages in thread
From: Andreas Ericsson @ 2009-04-01 6:26 UTC (permalink / raw)
To: Johannes Schindelin; +Cc: P Baker, Shawn O. Pearce, git
Johannes Schindelin wrote:
> Hi,
>
> On Tue, 31 Mar 2009, P Baker wrote:
>
>> I'll paraphrase to see if I understand your points:
>>
>> *Moving objects from submodule .git directories into the base .git/
>> directory would protect the submodules and is a good idea.
>
> No, I did not say that.
>
> I said that submodules' working directories need to be protected when
> renaming/deleting submodules.
>
> Even worse, I think that moving the .git/ directory into the
> superproject's .git/ would be at least quite a bit awkward in the nested
> case.
>
Not necessarily. The .git directory of a submodule need not be named .git
inside the superproject's .git directory. I could well imagine something
like this:
.git/modules/submod(.git)/modules/nested-submod(.git)
For deeply nested submodules (eurgh), one might run into path length limit
issues though. The point is that we will need some library-like function
to find the repository of the submodule. Once that's done, the same call
with a different $gitdir should be able to find the nested submodule.
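A minimal Python sketch of such a lookup, assuming the layout above without the optional ".git" suffix. The function names are hypothetical, and this only builds paths; it does not check that anything exists on disk.

```python
import posixpath


def submodule_gitdir(gitdir, name):
    """Locate a submodule's repository under its parent's git dir."""
    return posixpath.join(gitdir, "modules", name)


def nested_gitdir(top_gitdir, names):
    """Resolve a chain of nested submodules by repeating the lookup,
    each time starting from the git dir found in the previous step."""
    gitdir = top_gitdir
    for name in names:
        gitdir = submodule_gitdir(gitdir, name)
    return gitdir


path = nested_gitdir(".git", ["submod", "nested-submod"])
```

The path grows by one "modules/<name>" segment per nesting level, which is where the path length concern comes from.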
I'm also thinking of libgit2 here, where each repository will be
represented as a struct that must be passed to the various $gitdir
searching functions. This is necessary to allow a single program to access
multiple repositories, and the .git/modules scheme makes supporting
submodules in the library quite trivial.
--
Andreas Ericsson andreas.ericsson@op5.se
OP5 AB www.op5.se
Tel: +46 8-230225 Fax: +46 8-230231
Considering the successes of the wars on alcohol, poverty, drugs and
terror, I think we should give some serious thought to declaring war
on peace.
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
2009-04-01 2:47 ` P Baker
@ 2009-04-01 16:10 ` Johannes Schindelin
0 siblings, 0 replies; 12+ messages in thread
From: Johannes Schindelin @ 2009-04-01 16:10 UTC (permalink / raw)
To: P Baker; +Cc: Shawn O. Pearce, git
Hi,
On Tue, 31 Mar 2009, P Baker wrote:
> > > *Moving objects from submodule .git directories into the base .git/
> > > directory would protect the submodules and is a good idea.
> >
> > No, I did not say that.
> >
> > Even worse, I think that moving the .git/ directory into the
> > superproject's .git/ would be at least quite a bit awkward in the
> > nested case.
> >
>
> The initial prompt for the proposal was: "Rewrite git-submodule,
> placing the repository for each referenced submodules in the
> superproject's $GIT_DIR/modules...This resolves issues related to
> switching between versions of the superproject..." The prompt, and
> past experience with git, helped me to form my proposal which it seems
> would fix numerous problems with git submodule, with the implied cost
> of some awkwardness/complexity. Am I misunderstanding the prompt? Or
> do you think this could be accomplished more elegantly?
Well, I think the focus here is wrong. The focus should be on the working
directory as hinted here:
> > I said that submodules' working directories need to be protected
> > when renaming/deleting submodules.
>
> I'm sorry, I still don't understand. Where would this occur? What is
> being protected? What is the submodules' working directory? I'm still
> learning the intricacies of git, so I'd appreciate any pointers you can
> give.
If your superproject deletes a submodule, what should happen with the
working directory?
And what should happen if the submodule is _moved_? (Which is not as
easily detected as with renamed files or directories.)
> > Further, often it would come in rather handy to be able to say
> > something like "git diff $REVISION_AS_COMMITTED_IN_THE_SUPERPROJECT"
> > from within the submodule...
> >
>
> That sounds complex, and would break expectations. This would only
> work if git in the submodule working directory knows it's a submodule.
... or can detect easily that it is a submodule.
> Is there a way to reference its superproject?
No.
> > Oh, and it would not hurt performance on Windows at all if
> > git-submodule would be finally made a builtin.
>
> You mean rewriting git-submodule.sh in C? What other impacts might that
> have?
Junio would hate it, I am sure.
Ciao,
Dscho
* Re: [RFC GSoC 2009: git-submodule for multiple, active developers on active trees]
2009-04-01 6:26 ` Andreas Ericsson
@ 2009-04-01 16:13 ` Johannes Schindelin
0 siblings, 0 replies; 12+ messages in thread
From: Johannes Schindelin @ 2009-04-01 16:13 UTC (permalink / raw)
To: Andreas Ericsson; +Cc: P Baker, Shawn O. Pearce, git
Hi,
On Wed, 1 Apr 2009, Andreas Ericsson wrote:
> Johannes Schindelin wrote:
>
> > On Tue, 31 Mar 2009, P Baker wrote:
> >
> > > I'll paraphrase to see if I understand your points:
> > >
> > > *Moving objects from submodule .git directories into the base .git/
> > > directory would protect the submodules and is a good idea.
> >
> > No, I did not say that.
> >
> > I said that submodules' working directories need to be protected when
> > renaming/deleting submodules.
> >
> > Even worse, I think that moving the .git/ directory into the superproject's
> > .git/ would be at least quite a bit awkward in the nested case.
> >
>
> Not necessarily. The .git directory of a submodule need not be named .git
> inside the superproject's .git directory. I could well imagine something
> like this:
>
> .git/modules/submod(.git)/modules/nested-submod(.git)
>
> For deeply nested submodules (eurgh), one might run into path length limit
> issues though. The point is that we will need some library-like function
> to find the repository of the submodule. Once that's done, the same call
> with a different $gitdir should be able to find the nested submodule.
It appears to me as a solution in need of a problem.
> I'm also thinking of libgit2 here, where each repository will be
> represented as a struct that must be passed to the various $gitdir
> searching functions. This is necessary to allow a single program to
> access multiple repositories, and the .git/modules scheme makes
> supporting submodules in the library quite trivial.
First: libgit2 is not an issue for this thread AFAIAC.
Second: accessing submodules' repositories is quite trivial at the moment.
So much so that git-submodule can still get away with being a shell
script.
Ciao,
Dscho
end of thread, other threads:[~2009-04-01 16:16 UTC | newest]
Thread overview: 12+ messages
2009-03-25 20:14 [RFC GSoC 2009: git-submodule for multiple, active developers on active trees] P Baker
2009-03-30 15:32 ` Shawn O. Pearce
2009-03-31 15:30 ` P Baker
2009-03-31 15:57 ` Johannes Schindelin
2009-03-31 22:32 ` P Baker
2009-03-31 23:05 ` Johannes Schindelin
2009-03-31 23:49 ` P Baker
2009-04-01 0:58 ` Johannes Schindelin
2009-04-01 2:47 ` P Baker
2009-04-01 16:10 ` Johannes Schindelin
2009-04-01 6:26 ` Andreas Ericsson
2009-04-01 16:13 ` Johannes Schindelin