* organizing multiple repositories with dependencies
@ 2012-04-16 9:27 Namit Bhalla
2012-04-16 14:30 ` Jakub Narebski
2012-04-24 19:48 ` Eugene Sajine
0 siblings, 2 replies; 34+ messages in thread
From: Namit Bhalla @ 2012-04-16 9:27 UTC (permalink / raw)
To: git@vger.kernel.org
I am looking to track some projects using Git with each project as a
separate repository.
Even after reading the documentation, I am still wondering if there is a
way to organize things as described below.
Consider 2 projects, Project-a and Project-b, which are housed in
repositories Repo-a and Repo-b respectively.
Project-a develops reusable libraries which are needed by Project-b
(otherwise Project-b will not compile).
When a new stable version of Project-a libraries has to be delivered, they
are "checked into" a path in Repo-a.
Now, I would like to set up Repo-b so that when someone starts working on
Project-b, he should be able to retrieve the code from Repo-b as well as
the libraries from Repo-a. Is there any way to achieve that in Git?
Thanks for any pointers!
^ permalink raw reply [flat|nested] 34+ messages in thread
* Re: organizing multiple repositories with dependencies
2012-04-16 9:27 organizing multiple repositories with dependencies Namit Bhalla
@ 2012-04-16 14:30 ` Jakub Narebski
2012-04-16 20:08 ` dag
2012-04-24 19:48 ` Eugene Sajine
1 sibling, 1 reply; 34+ messages in thread
From: Jakub Narebski @ 2012-04-16 14:30 UTC (permalink / raw)
To: Namit Bhalla; +Cc: git@vger.kernel.org
Namit Bhalla <namitbhalla@yahoo.com> writes:
> I am looking to track some projects using Git with each project as a
> separate repository.
> Even after reading the documentation, I am still wondering if there is a
> way to organize things as described below.
>
> Consider 2 projects, Project-a and Project-b, which are housed in
> repositories Repo-a and Repo-b respectively.
> Project-a develops reusable libraries which are needed by Project-b
> (otherwise Project-b will not compile).
> When a new stable version of Project-a libraries has to be delivered, they
> are "checked into" a path in Repo-a.
> Now, I would like to set up Repo-b so that when someone starts working on
> Project-b, he should be able to retrieve the code from Repo-b as well as
> the libraries from Repo-a. Is there any way to achieve that in Git?
Put the reusable library in its own repository, and use submodules to
link it into the project-a and project-b repositories.
HTH
--
Jakub Narebski
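
A minimal sketch of the submodule layout suggested above, for readers
following along. It runs in a throwaway temporary directory; the
repository names (lib, project-b, clone-b) and the libs/lib path are
made up for illustration, and the protocol.file.allow override is only
there because the sketch clones from a local path.

```shell
#!/bin/sh
# Sketch only: all names and paths below are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Identity for the throwaway commits made in this sketch.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The reusable library lives in its own repository.
git init -q lib
echo 'library code' > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m 'Add library'

# The project records one specific library commit at libs/lib.
git init -q project-b
cd project-b
git -c protocol.file.allow=always submodule add "$work/lib" libs/lib
git commit -q -m 'Add library as a submodule'

# A fresh clone retrieves the project and the pinned library together.
cd "$work"
git -c protocol.file.allow=always clone -q --recurse-submodules project-b clone-b
```

Someone who cloned without --recurse-submodules can run
"git submodule update --init" afterwards to get the same result.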
* Re: organizing multiple repositories with dependencies
2012-04-16 14:30 ` Jakub Narebski
@ 2012-04-16 20:08 ` dag
2012-04-17 17:29 ` Hilco Wijbenga
0 siblings, 1 reply; 34+ messages in thread
From: dag @ 2012-04-16 20:08 UTC (permalink / raw)
To: Jakub Narebski; +Cc: Namit Bhalla, git@vger.kernel.org
Jakub Narebski <jnareb@gmail.com> writes:
> Put reusable library in its own repository, and use submodules to link
> it up to project-a and project-b repositories.
git-subtree is another option. It was recently merged into contrib/.
Whether to use submodules or subtrees largely depends on the work style
of your group and how coupled the projects are to each other.
Submodules require a bit more day-to-day maintenance by the user (in my
experience) while with subtrees it's a bit more involved to push changes
from the combined repository back to the separate repositories.
-Dave
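
A rough sketch of the subtree alternative, assuming git-subtree from
contrib/ is installed. The repository names (lib, umbrella) and the
libs/lib prefix are invented for the example; it shows the library
history being imported once and a later library change being pulled in.

```shell
#!/bin/sh
# Sketch only: assumes "git subtree" is available; names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"

# Identity for the throwaway commits made in this sketch.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The separate library repository.
git init -q lib
echo v1 > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m 'Library v1'
branch=$(git -C lib symbolic-ref --short HEAD)

# The combined repository imports the library history under libs/lib.
git init -q umbrella
cd umbrella
git commit -q --allow-empty -m 'Start umbrella project'
git subtree add --prefix=libs/lib "$work/lib" "$branch"

# Later the library moves on...
echo v2 > "$work/lib/lib.txt"
git -C "$work/lib" commit -q -am 'Library v2'

# ...and the combined repository picks the change up explicitly.
git subtree pull --prefix=libs/lib -m 'Merge library v2' "$work/lib" "$branch"
```

After this, libs/lib is an ordinary directory in the umbrella history;
nothing extra is needed when cloning the umbrella repository.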
* Re: organizing multiple repositories with dependencies
2012-04-16 20:08 ` dag
@ 2012-04-17 17:29 ` Hilco Wijbenga
2012-04-17 17:51 ` dag
2012-04-17 18:37 ` Seth Robertson
0 siblings, 2 replies; 34+ messages in thread
From: Hilco Wijbenga @ 2012-04-17 17:29 UTC (permalink / raw)
To: Git Users; +Cc: Jakub Narebski, Namit Bhalla, Dave
On 16 April 2012 13:08, <dag@cray.com> wrote:
> Jakub Narebski <jnareb@gmail.com> writes:
>
>> Put reusable library in its own repository, and use submodules to link
>> it up to project-a and project-b repositories.
>
> git-subtree is another option. It was recently merged into contrib/.
> Whether to use submodules or subtrees largely depends on the work style
> of your group and how coupled the projects are to each other.
> Submodules require a bit more day-to-day maintenance by the user (in my
> experience) while with subtrees it's a bit more involved to push changes
> from the combined repository back to the separate repositories.
(My reply below is based on my experience with Git and submodules from
about a year ago. I would really like to see better support for
including separate repos in Git. It does not appear to be an easy nut
to crack, though.)
If you really have only one or two libraries then submodules will work
just fine but if you have quite a few (we had around 50 when we moved
away from submodules) you will find it pretty much unworkable. It's
fragile and hard to keep the submodules in synch with each other and
the umbrella project. Another problem is branches. Branches are per
submodule but you want them for all submodules. You might want to look
into git-slave if you want to go this route. I haven't used
git-subtree so I can't comment on that. (I do not get the impression
that it really is a big step forward, though. I would be *very* happy
to be proven wrong, though.)
In general, I do not think the blanket statement "one repo per
project" is good advice. If projects depend on each other they should
be in the same repo. At least with the current support in Git for
including separate projects. Please note that I'm not disagreeing with
the notion "one repo per project" itself. It's just not supported well
enough to be feasible if you have a fairly large group of projects
that depend on each other.
* Re: organizing multiple repositories with dependencies
2012-04-17 17:29 ` Hilco Wijbenga
@ 2012-04-17 17:51 ` dag
2012-04-17 18:37 ` Seth Robertson
1 sibling, 0 replies; 34+ messages in thread
From: dag @ 2012-04-17 17:51 UTC (permalink / raw)
To: Hilco Wijbenga; +Cc: Git Users
Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
> Another problem is branches. Branches are per submodule but you want
> them for all submodules.
Branches have a similar problem in git-subtree. It is one of the things
I would like to improve in git-subtree going forward. I don't see any
fundamental reason that some git-slave-like operations can't be included
in git-subtree, though with a slightly different model.
-Dave
* Re: organizing multiple repositories with dependencies
2012-04-17 17:29 ` Hilco Wijbenga
2012-04-17 17:51 ` dag
@ 2012-04-17 18:37 ` Seth Robertson
2012-04-17 19:55 ` Hilco Wijbenga
1 sibling, 1 reply; 34+ messages in thread
From: Seth Robertson @ 2012-04-17 18:37 UTC (permalink / raw)
To: Hilco Wijbenga; +Cc: Git Users, Jakub Narebski, Namit Bhalla, Dave
In message <CAE1pOi1KnvRk4yxK8OQHi9h_ueNnh5Ar3tbKFBKTA69=Aje0TQ@mail.gmail.com>, Hilco Wijbenga writes:
    On 16 April 2012 13:08, <dag@cray.com> wrote:
    > Jakub Narebski <jnareb@gmail.com> writes:
    >> Put reusable library in its own repository, and use submodules to link
    >> it up to project-a and project-b repositories.

    If you really have only one or two libraries then submodules will work
    just fine but if you have quite a few (we had around 50 when we moved
    away from submodules) you will find it pretty much unworkable. [...]
    Branches are per submodule but you want them for all
    submodules. You might want to look into git-slave if you want to
    go this route.

    In general, I do not think the blanket statement "one repo per
    project" is good advice. If projects depend on each other they should
    be in the same repo. At least with the current support in Git for
    including separate projects. Please note that I'm not disagreeing with
    the notion "one repo per project" itself. It's just not supported well
    enough to be feasible if you have a fairly large group of projects
    that depend on each other.

As you mentioned, this is exactly the environment that gitslave was
designed for. It provides the flexibility to work on the subprojects
as if they were standalone independent git repositories (which of
course they are) or treat the entire superproject as one giant git
repository (with only a few cracks showing through). All gitslave
commands are just git commands (s/^git\s/gits /) so training to use it is
rather easy.
Unlike with git-submodules there is no strict binding between the
parent repo's commits and the sub-project's commits except at tag
boundaries. This gives you the flexibility of person A saying that A
is master and B is underneath it while person B says that B is master
and A is underneath it (or alternatively you can also say that A
includes B plus whatever B includes). However, I would in general
recommend that the common library be factored out and be a child of A
and B. gitslave makes it trivial to work with federated git
repositories, if you can handle only having binding between
repositories at tag boundaries.
-Seth Robertson
* Re: organizing multiple repositories with dependencies
2012-04-17 18:37 ` Seth Robertson
@ 2012-04-17 19:55 ` Hilco Wijbenga
2012-04-17 20:51 ` dag
0 siblings, 1 reply; 34+ messages in thread
From: Hilco Wijbenga @ 2012-04-17 19:55 UTC (permalink / raw)
To: Git Users; +Cc: Jakub Narebski, Namit Bhalla, Dave, Seth Robertson
On 17 April 2012 11:37, Seth Robertson <in-gitvger@baka.org> wrote:
> In message <CAE1pOi1KnvRk4yxK8OQHi9h_ueNnh5Ar3tbKFBKTA69=Aje0TQ@mail.gmail.com>, Hilco Wijbenga writes:
>
> On 16 April 2012 13:08, <dag@cray.com> wrote:
> > Jakub Narebski <jnareb@gmail.com> writes:
> >> Put reusable library in its own repository, and use submodules to link
> >> it up to project-a and project-b repositories.
>
> If you really have only one or two libraries then submodules will work
> just fine but if you have quite a few (we had around 50 when we moved
> away from submodules) you will find it pretty much unworkable. [...]
> Branches are per submodule but you want them for all
> submodules. You might want to look into git-slave if you want to
> go this route.
>
> In general, I do not think the blanket statement "one repo per
> project" is good advice. If projects depend on each other they should
> be in the same repo. At least with the current support in Git for
> including separate projects. Please note that I'm not disagreeing with
> the notion "one repo per project" itself. It's just not supported well
> enough to be feasible if you have a fairly large group of projects
> that depend on each other.
>
> As you mentioned, this is exactly the environment that gitslave was
> designed for. It provides the flexibility to work on the subprojects
> as if they were standalone independent git repositories (which of
> course they are) or treat the entire superproject as one giant git
> repository (with only a few cracks showing through). All gitslave
> commands are just git commands (s/^git\s/gits /) so training to use it is
> rather easy.
>
> Unlike with git-submodules there is no strict binding between the
> parent repo's commits and the sub-project's commits except at tag
> boundaries. This gives you the flexibility of person A saying that A
> is master and B is underneath it while person B says that B is master
> and A is underneath it (or alternatively you can also say that A
> includes B plus whatever B includes). However, I would in general
> recommend that the common library be factored out and be a child of A
> and B. gitslave makes it trivial to work with federated git
> repositories, if you can handle only having binding between
> repositories at tag boundaries.
Well, since I seem to have pretty much everyone who is involved with
"subprojects" (be it submodules, subtree, or gitslave) in this thread,
I wanted to clarify what I meant with "not supported well in Git". To
me, subproject support is one of the most important pieces missing in
Git. Git submodules, subtree and gitslave all provide a part of what
I'm looking for: non-invasive subproject support. I would like to be
able to use normal Git commands on the umbrella project that "trickle
down" to the subprojects. If you work on a subproject (in its own
repo) then a subsequent pull in the umbrella project should bring this
new code into the umbrella project (assuming that would make sense
given the branches involved).
After rereading my earlier reply I felt that it might be interpreted
as being disparaging of submodules/subtree/gitslave. I just wanted to
make clear that that was not my intent. I'm hopeful that we can get
some sort of combination of submodules, subtree and gitslave to
provide subproject support in Git.
* Re: organizing multiple repositories with dependencies
2012-04-17 19:55 ` Hilco Wijbenga
@ 2012-04-17 20:51 ` dag
2012-04-17 21:43 ` Hilco Wijbenga
2012-04-18 12:19 ` Jens Lehmann
0 siblings, 2 replies; 34+ messages in thread
From: dag @ 2012-04-17 20:51 UTC (permalink / raw)
To: Hilco Wijbenga; +Cc: Git Users
Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
> If you work on a subproject (in its own repo) then a subsequent pull
> in the umbrella project should bring this new code into the umbrella
> project (assuming that would make sense given the branches involved).
I don't necessarily think this is always what should happen. I can't
comment on git-submodule since I haven't used it in its more recent
incarnation, but one thing I like about git-subtree is that it's
explicit. I have to do a "git subtree pull" on the umbrella project to
pull in the new changes from a subproject. That gives me some degree of
control over when to update sources. I suspect one can do the same by
using "git pull" in submodule directories.
If you want the behavior you describe, a post-receive hook on the
component repositories is easy to implement. I just did that a couple
of weeks ago for a subtree-aggregated repository. When a component
receives something it immediately does a "git subtree pull" on a
workarea on the server and then does a push from that workarea to the
subtree-aggregated repository.
Of course, this is entirely driven by git-subtree's model of actually
incorporating subproject history into one big umbrella repository.
There is no separation between the subprojects and umbrella projects.
It's one giant history. Therefore, push/pull to/from subprojects are
explicit operations. That's probably not the best model for every
situation but I find it very nice.
> After rereading my earlier reply I felt that it might be interpreted
> as being disparaging of submodules/subtree/gitslave.
I didn't interpret it that way at all. I agree with you that
subproject/superproject support could be much better. But I don't agree
that we'll be able to design one model that works for everyone. svn
externals are just one model to aggregate projects but it is not the
only one. It just happens that no one working on Subversion bothered to
implement anything else.
Perhaps a good way to go would be to provide the basic operations (I
think we have most of that) and some hooks in contrib/ or elsewhere to
implement various models. Just like git imposes no particular workflow
model I don't think git should impose one particular aggregation model.
What we do need is better documentation of what the various models and
tools are. For example, I would find a subtree/submodule comparison
highly valuable. It would help people decide which model is best for
them.
-Dave
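
The post-receive arrangement described above could look roughly like
the following. This is a sketch under assumptions, not the hook
actually used: component.git, aggregate.git, workarea and the
"component" prefix are invented names, everything lives in a temporary
directory, and git-subtree from contrib/ is assumed to be installed.

```shell
#!/bin/sh
# Sketch only: hypothetical names; assumes "git subtree" is available.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# "Server side": the component repository and the aggregated repository.
git init -q --bare component.git
git init -q --bare aggregate.git

# A developer publishes the first component commit.
git init -q dev
echo v1 > dev/lib.txt
git -C dev add lib.txt
git -C dev commit -q -m 'Component v1'
branch=$(git -C dev symbolic-ref --short HEAD)
git -C dev push -q "$work/component.git" "$branch"

# Work area on the server that carries the component as a subtree.
git init -q workarea
git -C workarea commit -q --allow-empty -m 'Start aggregate'
(cd workarea && git subtree add --prefix=component "$work/component.git" "$branch")
git -C workarea push -q "$work/aggregate.git" "HEAD:refs/heads/$branch"

# The hook: pull the component into the work area, then push the result.
cat > component.git/hooks/post-receive <<EOF
#!/bin/sh
# Hooks run with GIT_DIR pointing at the receiving repository;
# clear it so the commands below act on the work area instead.
unset GIT_DIR
cd "$work/workarea" || exit 1
git subtree pull --prefix=component -m 'Sync component' "$work/component.git" "$branch" &&
git push -q "$work/aggregate.git" "HEAD:refs/heads/$branch"
EOF
chmod +x component.git/hooks/post-receive

# A later push to the component now updates the aggregate automatically.
echo v2 > dev/lib.txt
git -C dev commit -q -am 'Component v2'
git -C dev push -q "$work/component.git" "$branch"
```

The hook lives in the server-side repository, so it is not copied by
"git clone"; whoever administers the component repository installs it.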
* Re: organizing multiple repositories with dependencies
2012-04-17 20:51 ` dag
@ 2012-04-17 21:43 ` Hilco Wijbenga
2012-04-17 22:25 ` PJ Weisberg
` (2 more replies)
2012-04-18 12:19 ` Jens Lehmann
1 sibling, 3 replies; 34+ messages in thread
From: Hilco Wijbenga @ 2012-04-17 21:43 UTC (permalink / raw)
To: dag; +Cc: Git Users
On 17 April 2012 13:51, <dag@cray.com> wrote:
> Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
>
>> If you work on a subproject (in its own repo) then a subsequent pull
>> in the umbrella project should bring this new code into the umbrella
>> project (assuming that would make sense given the branches involved).
>
> I don't necessarily think this is always what should happen. I can't
> comment on git-submodule since I haven't used it in its more recent
> incarnation, but one thing I like about git-subtree is that it's
> explicit. I have to do a "git subtree pull" on the umbrella project to
> pull in the new changes from a subproject. That gives me some degree of
> control over when to update sources. I suspect one can do the same by
> using "git pull" in submodule directories.
I'm assuming that if you have subproject S in umbrella project U and a
branch "topic" in U then that same branch should exist in S. Any
changes in S's topic should show up in U's topic (probably after some
sort of update command like git fetch/pull). This should be unusual,
though; you should be working in U, not S. If you want to work on
something in S that you don't want to see in U, then you should not be
working in S's topic.
> If you want the behavior you describe, a post-receive hook on the
> component repositories is easy to implement. I just did that a couple
> of weeks ago for a subtree-aggregated repository. When a component
> receives something it immediately does a "git subtree pull" on a
> workarea on the server and then does a push from that workarea to the
> subtree-aggregated repository.
[1] Would such a post-receive hook be something that the user has to
set up? Or would that be automatically set up after git clone?
The main problem with the current submodule support is that there is
so much manual work needed. It is too easy to forget a step. Moreover,
it's not easy to determine *that* you forgot a step or which step you
forgot.
> Of course, this is entirely driven by git-subtree's model of actually
> incorporating subproject history into one big umbrella repository.
> There is no separation between the subprojects and umbrella projects.
> It's one giant history. Therefore, push/pull to/from subprojects are
> explicit operations. That's probably not the best model for every
> situation but I find it very nice.
I do not have enough (okay, any) experience with subtree to comment on
that. The first part seems just what I want. I'm not sure about the
explicit pushing/pulling part. That sounds too much like asking for
the sort of problems that scared us away from submodules. Hopefully,
I'm dead wrong. :-)
>> After rereading my earlier reply I felt that it might be interpreted
>> as being disparaging of submodules/subtree/gitslave.
>
> I didn't interpret it that way at all. I agree with you that
> subproject/superproject support could be much better.
Good. I just wanted to be extra clear because you (and others) are
working on something that is very important to me. The last thing I
want to do is discourage you. :-)
> But I don't agree
> that we'll be able to design one model that works for everyone. svn
> externals are just one model to aggregate projects but it is not the
> only one. It just happens that no one working on Subversion bothered to
> implement anything else.
:-) I think I made it pretty clear that I was listing what *I* want.
What *I* am looking for is something that is as invisible and
automatic as possible.
(I find working with Git really quite enjoyable but it has a very
steep learning curve. E.g., I have (literally) spent hours explaining
rebase and merge to our new developers. Surprisingly, some come from
college/university without having ever used an SCM tool but even for
those that have, learning Git is quite a challenge. And Git's API isn't
always particularly helpful. The "checkout" command is a perfect (bad)
example in that regard. Even those that haven't used SVN/CVS before do
not associate "checkout" with switching branches. And using git
checkout to go back to the HEAD version of a file you've changed?
Sure, it can be explained and learned but it doesn't make automatic
sense. What does switching branches have to do with undoing changes?
[Yes, it makes sense given Git's implementation but *not* from the
user's point of view.]
Given that, I really do *not* want to pile on more just to accommodate
subprojects.)
> Perhaps a good way to go would be to provide the basic operations (I
> think we have most of that) and some hooks in contrib/ or elsewhere to
> implement various models. Just like git imposes no particular workflow
> model I don't think git should impose one particular aggregation model.
> What we do need is better documentation of what the various models and
> tools are. For example, I would find a subtree/submodule comparison
> highly valuable. It would help people decide which model is best for
> them.
That all sounds good. As long as the hooks are automatic (I'm hopeful
you said "no" and "yes" to [1] above). If so, then I can promise you
I'll be taking a look at subtree. :-)
* Re: organizing multiple repositories with dependencies
2012-04-17 21:43 ` Hilco Wijbenga
@ 2012-04-17 22:25 ` PJ Weisberg
2012-04-17 22:49 ` Hilco Wijbenga
2012-04-18 12:09 ` Jens Lehmann
2012-04-24 17:17 ` dag
2 siblings, 1 reply; 34+ messages in thread
From: PJ Weisberg @ 2012-04-17 22:25 UTC (permalink / raw)
To: Hilco Wijbenga; +Cc: dag, Git Users
On Tue, Apr 17, 2012 at 2:43 PM, Hilco Wijbenga
<hilco.wijbenga@gmail.com> wrote:
> I'm assuming that if you have subproject S in umbrella project U and a
> branch "topic" in U then that same branch should exist in S. Any
> changes in S's topic should show up in U's topic (probably after some
> sort of update command like git fetch/pull). This should be unusual,
> though, you should be working in U, not S. If you want to work on
> something in S that you don't want to see in U, then you should not be
> working in S's topic.
This paragraph makes me wonder why you want to use submodules at all.
Wouldn't a sparse checkout be a better fit for what you're trying to
accomplish?
-PJ
Gehm's Corollary to Clark's Law: Any technology distinguishable from
magic is insufficiently advanced.
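
For reference, a sparse checkout of a single repository can be sketched
as below. The layout (one repository "mono" holding proj-a and proj-b)
is invented for illustration, and the sketch uses the
core.sparseCheckout setting together with "git read-tree".

```shell
#!/bin/sh
# Sketch only: repository and directory names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# One repository holding two projects side by side.
git init -q mono
mkdir mono/proj-a mono/proj-b
echo a > mono/proj-a/a.txt
echo b > mono/proj-b/b.txt
git -C mono add .
git -C mono commit -q -m 'Two projects in one repository'

# Clone without populating the work tree, then select only proj-b.
git clone -q --no-checkout mono sparse
cd sparse
git config core.sparseCheckout true
mkdir -p .git/info
echo 'proj-b/' > .git/info/sparse-checkout
git read-tree -m -u HEAD
```

Only proj-b/ is populated in the work tree, and it keeps its position
inside the repository's directory layout. The full history is still
cloned; only the checked-out files are limited.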
* Re: organizing multiple repositories with dependencies
2012-04-17 22:25 ` PJ Weisberg
@ 2012-04-17 22:49 ` Hilco Wijbenga
2012-04-18 10:15 ` Namit Bhalla
0 siblings, 1 reply; 34+ messages in thread
From: Hilco Wijbenga @ 2012-04-17 22:49 UTC (permalink / raw)
To: PJ Weisberg; +Cc: dag, Git Users
On 17 April 2012 15:25, PJ Weisberg <pj@irregularexpressions.net> wrote:
> On Tue, Apr 17, 2012 at 2:43 PM, Hilco Wijbenga
> <hilco.wijbenga@gmail.com> wrote:
>
>> I'm assuming that if you have subproject S in umbrella project U and a
>> branch "topic" in U then that same branch should exist in S. Any
>> changes in S's topic should show up in U's topic (probably after some
>> sort of update command like git fetch/pull). This should be unusual,
>> though, you should be working in U, not S. If you want to work on
>> something in S that you don't want to see in U, then you should not be
>> working in S's topic.
>
> This paragraph makes me wonder why you want to use submodules at all.
> Wouldn't a sparse checkout be a better fit for what you're trying to
> accomplish?
No, I don't think so but I could be wrong. I want to be able to easily
build and release the individual projects separately (manually and on
the build server). I believe that with a sparse checkout I still get
the entire directory tree. This just doesn't work well. I can make it
work but then I lose other nice features (unrelated to Git).
Basically, I want things separate for release management but together
for development.
* Re: organizing multiple repositories with dependencies
2012-04-17 22:49 ` Hilco Wijbenga
@ 2012-04-18 10:15 ` Namit Bhalla
0 siblings, 0 replies; 34+ messages in thread
From: Namit Bhalla @ 2012-04-18 10:15 UTC (permalink / raw)
To: git
Hilco Wijbenga <hilco.wijbenga <at> gmail.com> writes:
>
> Basically, I want things separate for release management but together
> for development.
>
Thank you all for this very informative discussion.
I will give these different approaches a try in the next few days.
* Re: organizing multiple repositories with dependencies
2012-04-17 21:43 ` Hilco Wijbenga
2012-04-17 22:25 ` PJ Weisberg
@ 2012-04-18 12:09 ` Jens Lehmann
2012-04-24 17:17 ` dag
2 siblings, 0 replies; 34+ messages in thread
From: Jens Lehmann @ 2012-04-18 12:09 UTC (permalink / raw)
To: Hilco Wijbenga; +Cc: dag, Git Users
On 17.04.2012 23:43, Hilco Wijbenga wrote:
> The main problem with the current submodule support is that there is
> so much manual work needed. It is too easy to forget a step. Moreover,
> it's not easy to determine *that* you forgot a step or which step you
> forgot.
Looks like you are talking about submodule support as it was a
few years ago. Since 1.7.0 you cannot forget to commit changes in the
submodule anymore, and since 1.7.5 all referenced submodule commits
are fetched when you fetch the superproject. The only thing missing
(with some work done towards that in last year's GSoC) is supporting
the pushing of submodule changes and the transparent update of
submodule content when the superproject is updated, both of which are
currently being worked on.
What else was bothering you so much you dumped submodules?
>> Of course, this is entirely driven by git-subtree's model of actually
>> incorporating subproject history into one big umbrella repository.
>> There is no separation between the subprojects and umbrella projects.
>> It's one giant history. Therefore, push/pull to/from subprojects are
>> explicit operations. That's probably not the best model for every
>> situation but I find it very nice.
>
> I do not have enough (okay, any) experience with subtree to comment on
> that. The first part seems just what I want. I'm not sure about the
> explicit pushing/pulling part. That sounds too much like asking for
> the sort of problems that scared us away from submodules. Hopefully,
> I'm dead wrong. :-)
As I understand subtree, the pushing and pulling of the subprojects
is needed at pretty much the same points in time as when
using submodules (to share the subproject work between different
superprojects via their upstream). The difference is you import all
subproject changes into a single repo when using subtree, while they
stay separate when using submodules (and additionally you have to
record the updated subprojects in the superproject in an extra
commit there). Submodules enforce the distinction between submodules
and the superproject while subtree doesn't, which may or may not be
just what you want ;-)
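
The extra superproject commit mentioned above can be sketched as
follows. The names (lib, super) are invented, the setup runs in a
temporary directory, and protocol.file.allow is only overridden
because the sketch adds the submodule from a local path.

```shell
#!/bin/sh
# Sketch only: repository names and paths are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A library and a superproject that pins it as a submodule.
git init -q lib
echo v1 > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m 'Library v1'
git init -q super
cd super
git -c protocol.file.allow=always submodule add "$work/lib" lib
git commit -q -m 'Add lib submodule'

# The library gains a new commit upstream.
echo v2 > "$work/lib/lib.txt"
git -C "$work/lib" commit -q -am 'Library v2'
new=$(git -C "$work/lib" rev-parse HEAD)

# Update the submodule to the commit you want...
git -C lib fetch -q origin
git -C lib checkout -q "$new"

# ...review and test, then record that commit in the superproject.
git status --short
git add lib
git commit -q -m 'Update lib to v2'
```

Until the final commit the superproject still points at the old
library commit, so rewinding is just a matter of checking out the
previously recorded commit inside the submodule.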
* Re: organizing multiple repositories with dependencies
2012-04-17 20:51 ` dag
2012-04-17 21:43 ` Hilco Wijbenga
@ 2012-04-18 12:19 ` Jens Lehmann
2012-04-24 17:22 ` dag
1 sibling, 1 reply; 34+ messages in thread
From: Jens Lehmann @ 2012-04-18 12:19 UTC (permalink / raw)
To: dag; +Cc: Hilco Wijbenga, Git Users, Junio C Hamano, greened
On 17.04.2012 22:51, dag@cray.com wrote:
> Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
>
>> If you work on a subproject (in its own repo) then a subsequent pull
>> in the umbrella project should bring this new code into the umbrella
>> project (assuming that would make sense given the branches involved).
>
> I don't necessarily think this is always what should happen.
I agree, the reason that we have three different implementations of
subproject support is that there is no model that fits all work flows.
> I can't
> comment on git-submodule since I haven't used it in its more recent
> incarnation, but one thing I like about git-subtree is that it's
> explicit. I have to do a "git subtree pull" on the umbrella project to
> pull in the new changes from a subproject. That gives me some degree of
> control over when to update sources. I suspect one can do the same by
> using "git pull" in submodule directories.
It's explicit too when using submodules: you can update each submodule
to the commit you want, review and test that, and then decide if you want
to commit that (or e.g. its parent) in the superproject or just rewind
the submodule because the new changes don't work for you. For a lot of
use cases, automatically pulling changes you haven't even seen yet and
then automatically promoting them to the superproject (which is how I
understand "git subtree pull", but I might be wrong) is undesirable; for
others it might very well work.
> Perhaps a good way to go would be to provide the basic operations (I
> think we have most of that) and some hooks in contrib/ or elsewhere to
> implement various models. Just like git imposes no particular workflow
> model I don't think git should impose one particular aggregation model.
> What we do need is better documentation of what the various models and
> tools are. For example, I would find a subtree/submodule comparison
> highly valuable. It would help people decide which model is best for
> them.
I agree and am willing to provide information about submodule use cases,
advantages and problems, but I'm not a user of subtree so I can't really
comment on it. Now that subtree is in git core, what about putting such
a comparison under Documentation/subproject-support.txt?
* Re: organizing multiple repositories with dependencies
2012-04-17 21:43 ` Hilco Wijbenga
2012-04-17 22:25 ` PJ Weisberg
2012-04-18 12:09 ` Jens Lehmann
@ 2012-04-24 17:17 ` dag
2012-04-24 18:54 ` Hilco Wijbenga
2 siblings, 1 reply; 34+ messages in thread
From: dag @ 2012-04-24 17:17 UTC (permalink / raw)
To: Hilco Wijbenga; +Cc: Git Users
Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
> I'm assuming that if you have subproject S in umbrella project U and a
> branch "topic" in U then that same branch should exist in S.
No, I think that is actually very rare. If topic branches really should
be mirrored then U and S should be one repository. They are too closely
coupled to be separated. But see the bit about git-subtree and topic
branches below.
For release tags, etc. I agree that this kind of mirrored tag/branch
behavior is the common case.
>> If you want the behavior you describe, a post-receive hook on the
>> component repositories is easy to implement.
>
> [1] Would such a post-receive hook be something that the user has to
> set up? Or would that be automatically set up after git clone?
The user/admin would have to set this up, at least for now.
> The main problem with the current submodule support is that there is
> so much manual work needed. It is too easy to forget a step. Moreover,
> it's not easy to determine *that* you forgot a step or which step you
> forgot.
I agree. We can certainly make things more user-friendly.
>> Of course, this is entirely driven by git-subtree's model of actually
>> incorporating subproject history into one big umbrella repository.
>> There is no separation between the subprojects and umbrella projects.
>> It's one giant history. Therefore, push/pull to/from subprojects are
>> explicit operations. That's probably not the best model for every
>> situation but I find it very nice.
>
> I do not have enough (okay, any) experience with subtree to comment on
> that. The first part seems just what I want. I'm not sure about the
> explicit pushing/pulling part. That sounds too much like asking for
> the sort of problems that scared us away from submodules. Hopefully,
> I'm dead wrong. :-)
With subtrees, a topic branch in the umbrella project WILL be reflected
in the subproject because it is really one big repository. It's a
little inconvenient to subtree push a new tag at the moment. You have
to do a subtree split to a new branch and then push the branch to the
original component repository. That's one thing I want to improve in
the short term. I have found a need for that when creating release
tags.
But still, it seems odd to me that you'd create a topic branch in U and
then want to push it to a separate S repository. Topic branches are by
nature ephemeral and I have never had a need to do something like that.
It just seems to go against the grain of what a topic branch is. As I
said above, release tags and such are in a different category and that
is the main target of the subtree push enhancements I want to make.
>> But I don't agree
>> that we'll be able to design one model that works for everyone. svn
>> externals are just one model to aggregate projects but it is not the
>> only one. It just happens that no one working on Subversion bothered to
>> implement anything else.
>
> :-) I think I made it pretty clear that I was listing what *I* want.
> What *I* am looking for is something that is as invisible and
> automatic as possible.
Absolutely.
> That all sounds good. As long as the hooks are automatic (I'm hopeful
> you said "no" and "yes" to [1] above). If so, then I can promise you
> I'll be taking a look at subtree. :-)
I think at the very least we can provide setup scripts in contrib. To
be honest I haven't thought deeply enough about this to determine if
there's a way to make it more convenient.
-Dave
* Re: organizing multiple repositories with dependencies
2012-04-18 12:19 ` Jens Lehmann
@ 2012-04-24 17:22 ` dag
2012-04-24 17:59 ` Seth Robertson
0 siblings, 1 reply; 34+ messages in thread
From: dag @ 2012-04-24 17:22 UTC (permalink / raw)
To: Jens Lehmann; +Cc: Hilco Wijbenga, Git Users, Junio C Hamano, greened
Jens Lehmann <Jens.Lehmann@web.de> writes:
> It's explicit too when using submodules, you can update each submodule
> to the commit you want, review and test that and then decide if you want
> to commit that (or e.g. its parent) in the superproject or just rewind
> the submodule because the new changes don't work for you.
Yes, that is very useful.
> For a lot of use cases an automatic pull of changes you haven't even
> seen yet and then automatically promote them to the superproject
> (which is how I understand "git subtree pull", but I might be wrong)
> is undesirable, for others it might very well work.
Since subtrees are really just directories in a single-history
repository, a subtree pull does "promote" the changes to the
superproject because there is no superproject/subproject. That's one of
the reasons subtree can be used to create subprojects out of existing
repositories.
Subtrees and submodules really are very different models. I see
advantages and disadvantages to both depending on one's work flow.
> I agree and am willing to provide information about submodule use cases,
> advantages and problems, but I'm not a user of subtree so I can't really
> comment on it. Now that subtree is in git core, what about putting such
> a comparison under Documentation/subproject-support.txt?
That would be great. Do you want to start work on that? I can
contribute some text about git-subtree.
-Dave
* Re: organizing multiple repositories with dependencies
2012-04-24 17:22 ` dag
@ 2012-04-24 17:59 ` Seth Robertson
2012-04-24 20:26 ` Jens Lehmann
2012-04-24 23:25 ` dag
0 siblings, 2 replies; 34+ messages in thread
From: Seth Robertson @ 2012-04-24 17:59 UTC (permalink / raw)
To: dag; +Cc: Jens Lehmann, Hilco Wijbenga, Git Users, Junio C Hamano, greened
In message <nngbomh3uz0.fsf@transit.us.cray.com>, dag@cray.com writes:
    > I agree and am willing to provide information about submodule use cases,
    > advantages and problems, but I'm not a user of subtree so I can't really
    > comment on it. Now that subtree is in git core, what about putting such
    > a comparison under Documentation/subproject-support.txt?
    That would be great. Do you want to start work on that? I can
    contribute some text about git-subtree.
I have a document I created for gitslave which I have cleaned up a bit
and might be the start of such comparison.
----------------------------------------------------------------------
git-submodules is the legacy solution for putting repositories inside
of other repositories. With git submodules, the submodule is checked
out to a semi-fixed commit, typically on a detached HEAD. To make a
change to the subproject, you need to check the submodule repository
out onto the correct branch, make the desired change (possibly
involving pull), commit, and then go into the superproject and commit
the commit (or at least record the new location of the submodule). It
was designed for third-party projects on which you typically are not
doing active development. Many/most git commands performed on the
superproject will not recurse down into the submodules. submodules
give you a tight mapping between subproject commits and superproject
commits (you always know which commit a subproject was in for any
given superproject commit). git-submodules is considered difficult to
use for less experienced git developers who need to modify the
subproject.
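As a rough sketch of the submodule workflow described above (the names
"lib" and "super" and the branch "main" are made up for illustration, and
everything happens in a throwaway temporary directory):

```shell
#!/bin/sh
# Sketch only: "lib", "super" and the branch name "main" are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
# Throwaway identity so the demo commits work on an unconfigured machine.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# A stand-in upstream library with one commit.
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/main
echo v1 > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m "lib: v1"

# A superproject that embeds the library as a submodule.
git init -q super
git -C super symbolic-ref HEAD refs/heads/main
# Newer git releases refuse local-path submodule clones unless the file
# transport is explicitly allowed.
git -C super -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git -C super commit -q -m "add lib as a submodule"

# 1. Put the submodule on a branch (after "git submodule update" it is
#    typically on a detached HEAD).
git -C super/lib checkout -q main
# 2. Make and commit the change inside the submodule.
echo v2 > super/lib/lib.txt
git -C super/lib commit -q -am "lib: v2"
# 3. Record the submodule's new commit in the superproject.
git -C super add lib
git -C super commit -q -m "bump lib to v2"

recorded=$(git -C super rev-parse HEAD:lib)
actual=$(git -C super/lib rev-parse HEAD)
echo "superproject records $recorded"
echo "submodule is at      $actual"
```

In a real setup the submodule commit would also have to be pushed to the
library's own upstream before the superproject commit is shared; that step
is left out of the sketch.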
Another option is to stick everything in one giant repository,
typically by using git-subtree. This might make your repository large
and you have to manually run git-subtree commands to export your
changes back out to the individual non-aggregated repositories. All
git commands run as normal, though when examining pre-subtree history,
you can see the individual lines of development on different branches.
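A minimal sketch of the subtree approach, assuming git-subtree is installed
(repository names and the branch "main" are made up for illustration):

```shell
#!/bin/sh
# Sketch only: "lib", "super" and the branch name "main" are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
# Avoid an editor popping up for the merge commit made by "subtree pull".
export GIT_MERGE_AUTOEDIT=no

# A stand-in upstream library with one commit.
git init -q lib
git -C lib symbolic-ref HEAD refs/heads/main
echo v1 > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m "lib: v1"

# The superproject needs at least one commit before a subtree can be added.
git init -q super
git -C super symbolic-ref HEAD refs/heads/main
echo app > super/app.txt
git -C super add app.txt
git -C super commit -q -m "app: initial"

# Import the library and its history under the directory lib/.
git -C super subtree add -q --prefix=lib "$work/lib" main

# Upstream moves on; merge the new commits into the lib/ directory.
echo v2 > lib/lib.txt
git -C lib commit -q -am "lib: v2"
git -C super subtree pull -q --prefix=lib "$work/lib" main

# One repository, one history: the library commits show up in the log.
git -C super log --format=%s
```

Without --squash, the library's full history becomes part of the
superproject, which is where the "large repository" concern comes from.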
gitslave creates a federation of git repositories (a superproject
repository and a number of slave repositories), all of which may be
developed concurrently and on which all desired git operations will
operate. In a typical use case, you would branch all repositories (to
the same branch name) at the same time, and check out, tag, or get
the status of all repositories at the same time. For essentially any
git command, you simply replace "git" with "gits" to get the specified
git command to run on all repositories. Of course, some commands do
not necessarily make a great deal of sense to run over all git
repositories (eg. `git add filename`), but you are allowed to run any
normal git command on any of the repositories at any time. gitslave
provides a loose binding so that it is not necessarily completely
clear which revision one repository was at when a commit was made in
a different repository, except after tag operations.
Another option is repo from Google, which is used with Android. Repo
seems to work much like gitslave from a high level perspective, but
there isn't a lot of documentation on using it for other projects.
Still another option is kitenet's mr which supports multiple
repository types (CVS, SVN, git, etc). It is absolutely the solution
for multi-SCM projects, but since it works on the lowest common
denominator you would lose much of the expressive power of git.
The final option is to just put git repositories inside of other
repositories. However, you must be sure to add the subproject into
the superproject's .gitignore to prevent `git add` from adding the
subproject as a broken submodule commit (broken because no .gitmodules
or git-config entry will exist for it).
----------------------------------------------------------------------
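The last option above (a plain nested repository hidden via .gitignore) can
be sketched as follows; "vendor-lib" is a made-up directory name:

```shell
#!/bin/sh
# Sketch only: "super" and "vendor-lib" are hypothetical names.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# The superproject, with an independent repository nested inside it.
git init -q super
git init -q super/vendor-lib
echo v1 > super/vendor-lib/lib.txt
git -C super/vendor-lib add lib.txt
git -C super/vendor-lib commit -q -m "lib: v1"

# Ignore the nested repository so that "git add" in the superproject
# does not record it as a submodule entry without .gitmodules.
echo "/vendor-lib/" > super/.gitignore
echo app > super/app.txt
git -C super add .
git -C super commit -q -m "app: initial"

# Only the superproject's own files are tracked.
tracked=$(git -C super ls-files)
printf '%s\n' "$tracked"
```

Nothing ties the two histories together here, so which library revision
belongs to which superproject commit has to be tracked by other means.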
-Seth Robertson
* Re: organizing multiple repositories with dependencies
2012-04-24 17:17 ` dag
@ 2012-04-24 18:54 ` Hilco Wijbenga
2012-04-24 21:09 ` PJ Weisberg
2012-04-24 23:33 ` dag
0 siblings, 2 replies; 34+ messages in thread
From: Hilco Wijbenga @ 2012-04-24 18:54 UTC (permalink / raw)
To: dag; +Cc: Git Users
On 24 April 2012 10:17, <dag@cray.com> wrote:
> Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
>
>> I'm assuming that if you have subproject S in umbrella project U and a
>> branch "topic" in U then that same branch should exist in S.
>
> No, I think that is actually very rare. If topic branches really should
> be mirrored then U and S should be one repository. They are too closely
> coupled to be separated. But see the bit about git-subtree and topic
> branches below.
Too closely coupled? I do not think breaking up a project into a set
of libraries makes everything tightly coupled. I would argue the
opposite. :-) Anyway, you answer my concern below.
>>> Of course, this is entirely driven by git-subtree's model of actually
>>> incorporating subproject history into one big umbrella repository.
>>> There is no separation between the subprojects and umbrella projects.
>>> It's one giant history. Therefore, push/pull to/from subprojects are
>>> explicit operations. That's probably not the best model for every
>>> situation but I find it very nice.
>>
>> I do not have enough (okay, any) experience with subtree to comment on
>> that. The first part seems just what I want. I'm not sure about the
>> explicit pushing/pulling part. That sounds too much like asking for
>> the sort of problems that scared us away from submodules. Hopefully,
>> I'm dead wrong. :-)
>
> With subtrees, a topic branch in the umbrella project WILL be reflected
> in the subproject because it is really one big repository. It's a
> little inconvenient to subtree push a new tag at the moment. You have
> to do a subtree split to a new branch and then push the branch to the
> original component repository. That's one thing I want to improve in
> the short term. I have found a need for that when creating release
> tags.
Okay, that would work fine. I have no problem with a bit of extra work
here. (In fact, this is probably where I would *want* a bit of extra
control.)
What would happen if you had a bunch of commits in the umbrella
project and then did a push? Would that error out? Are there
protections in place to prevent developers from making silly mistakes
like that?
> But still, it seems odd to me that you'd create a topic branch in U and
> then want to push it to a separate S repository. Topic branches are by
> nature ephemeral and I have never had a need to do something like that.
> It just seems to go against the grain of what a topic branch is. As I
> said above, release tags and such are in a different category and that
> is the main target of the subtree push enhancements I want to make.
I had not realized all changes would simply be part of the umbrella
project. Given that, a topic branch in each subproject's repo is
unnecessary.
Cheers,
Hilco
* Re: organizing multiple repositories with dependencies
2012-04-16 9:27 organizing multiple repositories with dependencies Namit Bhalla
2012-04-16 14:30 ` Jakub Narebski
@ 2012-04-24 19:48 ` Eugene Sajine
2012-04-24 22:11 ` Hilco Wijbenga
2012-04-24 23:36 ` dag
1 sibling, 2 replies; 34+ messages in thread
From: Eugene Sajine @ 2012-04-24 19:48 UTC (permalink / raw)
To: Namit Bhalla; +Cc: git@vger.kernel.org
On Mon, Apr 16, 2012 at 5:27 AM, Namit Bhalla <namitbhalla@yahoo.com> wrote:
> I am looking to track some projects using Git with each project as a
> separate repository.
> Even after reading the documentation, I am still wondering if there is a
> way to organize things as described below.
>
> Consider 2 projects, Project-a and Project-b, which are housed in
> repositories Repo-a and Repo-b respectively.
> Project-a develops reusable libraries which are needed by Project-b
> (otherwise Project-b will not compile).
> When a new stable version of Project-a libraries has to be delivered, they
> are "checked into" a path in Repo-a.
> Now, I would like to setup Repo-b so that when someone starts working on
> Project-b, he should be able to retrieve the code from Repo-b as well as the libraries from Repo-a. Is there any way to achieve that in
> Git?
>
> Thanks for any pointers!
We are working in an environment where we have hundreds (700+ and
counting) of projects, with many of them reused by others.
We are following strictly one project = one repo rule without any
subtrees or submodules.
What you are asking about is "integration" and IMHO has nothing to do
with git - i.e. should be VCS independent.
We are using integration on the artifact level and it works amazingly well.
But we also use pretty strict naming and location convention that
allows us to script around the whole setup very easily.
In order to track dependencies between projects we use Ivy.
A project can be compiled locally using local copies of the upstream
project artifacts built by the developer on the same machine; if those
are not present, the current production artifacts are used.
We also have a Jenkins CI server that helps with integration. It is a very
simple and straightforward setup without any unnecessary
complications IMHO.
Feel free to contact me if you need more info about such a setup.
Just my 2 cents.
Thanks,
Eugene
* Re: organizing multiple repositories with dependencies
2012-04-24 17:59 ` Seth Robertson
@ 2012-04-24 20:26 ` Jens Lehmann
2012-04-24 20:52 ` Seth Robertson
2012-04-24 23:21 ` dag
2012-04-24 23:25 ` dag
1 sibling, 2 replies; 34+ messages in thread
From: Jens Lehmann @ 2012-04-24 20:26 UTC (permalink / raw)
To: Seth Robertson; +Cc: dag, Hilco Wijbenga, Git Users, Junio C Hamano, greened
Am 24.04.2012 19:59, schrieb Seth Robertson:
>
> In message <nngbomh3uz0.fsf@transit.us.cray.com>, dag@cray.com writes:
>
> > I agree and am willing to provide information about submodule use cases,
> > advantages and problems, but I'm not a user of subtree so I can't really
> > comment on it. Now that subtree is in git core, what about putting such
> > a comparison under Documentation/subproject-support.txt?
>
> That would be great. Do you want to start work on that? I can
> contribute some text about git-subtree.
>
> I have a document I created for gitslave which I have cleaned up a bit
> and might be the start of such comparison.
Thanks for providing the input. Unfortunately I'll be pretty occupied for
the next three weeks, so I won't be able to put much work into that before
mid-May. But maybe we can get the ball rolling ...
In the end I'd like to see a document people can use to decide what
subproject support suits their needs best. Maybe it should start with
the basic concept behind each of them:
submodules: A submodule is a full-fledged repository of which a certain
commit is recorded in a gitlink entry in each of the superproject's
commits.
The emphasis lies on tightly coupling versions of both while keeping the
boundaries between superproject and submodules visible.
This leads to some extra cost when doing changes in a submodule but makes
it easy to evaluate and select new changes from upstream and push back
local changes to their respective upstream.
subtree: All subprojects become an integral part of the history of the
superproject.
The emphasis lies on incorporating the subtree and its history into the
superproject.
That adds some extra cost when it comes to pushing subtree changes back
to their upstream (starting with the need for careful commit planning for
local commits intended to be pushed out again) and less fine grained
control over importing changes from the subtree's upstream.
gitslave: This creates a federation of full fledged git repositories which
are operated on by the gits commands together (where a git command would
only operate on the superproject).
The emphasis lies on the simultaneous operation of gits commands on all
git repositories.
It does not provide any coupling of the commits in the superproject and the
slave repositories (but you can use tags to have that at some points in the
history).
What do you think? (Please point out anything I misrepresented in the last
two paragraphs, they are based solely on what I picked up on this list and
are not based on any actual experience ;-)
Then we could describe in a table what to do when to fetch new subproject
commits, how to "select" them in the superproject and how to push them
back to their respective upstream. Another interesting question could be
how a bug in a subproject that affects the superproject is handled in each
of the scenarios.
Does that sound like a start?
* Re: organizing multiple repositories with dependencies
2012-04-24 20:26 ` Jens Lehmann
@ 2012-04-24 20:52 ` Seth Robertson
2012-04-24 23:21 ` dag
1 sibling, 0 replies; 34+ messages in thread
From: Seth Robertson @ 2012-04-24 20:52 UTC (permalink / raw)
To: Jens Lehmann; +Cc: dag, Hilco Wijbenga, Git Users, Junio C Hamano, greened
In message <4F970C92.3030704@web.de>, Jens Lehmann writes:
    gitslave: This creates a federation of full fledged git repositories which
    are operated on by the gits commands together (where a git command would
    only operate on the superproject).
    The emphasis lies on the simultaneous operation of gits commands on all
    git repositories.
    It does not provide any coupling of the commits in the superproject and the
    slave repositories (but you can use tags to have that at some points in the
    history).
Well, gitslave is essentially a loop to run the listed git command in
each repository, so there are no atomic operations and you can get
partial success and partial failure, thus "simultaneous operation"
isn't a very good description.
Perhaps a better sentence would be, "The emphasis lies in the
simplicity and convenience of having gits commands run the same git
operation on all linked repositories, with output summarizing."
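To illustrate the "loop over repositories" idea and the partial
success/partial failure it implies, here is a small sketch. It is not
gitslave's actual implementation; run_in_all and the repositories "a" and
"b" are hypothetical:

```shell
#!/bin/sh
# Hypothetical helper, NOT gitslave itself: run one git command in every
# listed repository and summarize which ones succeeded or failed.
set -e

run_in_all() {
    repos=$1
    shift
    failed=""
    for repo in $repos; do
        if git -C "$repo" "$@" >/dev/null 2>&1; then
            echo "ok:     $repo"
        else
            echo "FAILED: $repo"
            failed="$failed $repo"
        fi
    done
    # Non-zero status if any repository failed; nothing is rolled back.
    [ -z "$failed" ]
}

work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Two independent repositories with one commit each.
for name in a b; do
    git init -q "$name"
    echo "$name" > "$name/file.txt"
    git -C "$name" add file.txt
    git -C "$name" commit -q -m "initial"
done
# Only repository "a" has a branch called "topic".
git -C a branch topic

# The checkout succeeds in "a" and fails in "b": a partial success.
status=0
run_in_all "a b" checkout topic || status=$?
echo "exit status: $status"
```

After the run, "a" is on the topic branch while "b" is unchanged, which is
the kind of non-atomic outcome described above.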
Just a FYI: partial success and partial failure in different
repositories isn't a major problem when using git. However, in the
interest of full disclosure, two users racing to push could in theory
cause a broken project given specific combinations of some users
modifying different repositories than others with mutual dependencies
between them. But in all of the years of using gitslave no-one has
ever had/reported such a problem. If you can assume any sort of
sane QA and deployment practice, this should not be able to cause an
operational problem.
-Seth Robertson
* Re: organizing multiple repositories with dependencies
2012-04-24 18:54 ` Hilco Wijbenga
@ 2012-04-24 21:09 ` PJ Weisberg
2012-04-24 22:04 ` Hilco Wijbenga
2012-04-24 23:33 ` dag
1 sibling, 1 reply; 34+ messages in thread
From: PJ Weisberg @ 2012-04-24 21:09 UTC (permalink / raw)
To: Hilco Wijbenga; +Cc: dag, Git Users
On Tue, Apr 24, 2012 at 11:54 AM, Hilco Wijbenga
<hilco.wijbenga@gmail.com> wrote:
> On 24 April 2012 10:17, <dag@cray.com> wrote:
>> Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
>>
>>> I'm assuming that if you have subproject S in umbrella project U and a
>>> branch "topic" in U then that same branch should exist in S.
>>
>> No, I think that is actually very rare. If topic branches really should
>> be mirrored then U and S should be one repository. They are too closely
>> coupled to be separated. But see the bit about git-subtree and topic
>> branches below.
>
> Too closely coupled? I do not think breaking up a project into a set
> of libraries makes everything tightly coupled. I would argue the
> opposite. :-) Anyway, you answer my concern below.
Indeed. But when you make a branch in your main project, wouldn't you
usually still want to use the master branch of the libraries? Or if
there's an experimental branch in a library and you want to use that
branched version, wouldn't you still use the master version of all the
other libraries? What if you have two projects that both use a
library, but are otherwise unrelated? If you create a branch called
'hotfix' in one project, do you automatically find your library
version switching to an unrelated 'hotfix' from another project?
-PJ
Gehm's Corollary to Clark's Law: Any technology distinguishable from
magic is insufficiently advanced.
* Re: organizing multiple repositories with dependencies
2012-04-24 21:09 ` PJ Weisberg
@ 2012-04-24 22:04 ` Hilco Wijbenga
0 siblings, 0 replies; 34+ messages in thread
From: Hilco Wijbenga @ 2012-04-24 22:04 UTC (permalink / raw)
To: PJ Weisberg; +Cc: dag, Git Users
On 24 April 2012 14:09, PJ Weisberg <pj@irregularexpressions.net> wrote:
> On Tue, Apr 24, 2012 at 11:54 AM, Hilco Wijbenga
> <hilco.wijbenga@gmail.com> wrote:
>> On 24 April 2012 10:17, <dag@cray.com> wrote:
>>> Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
>>>
>>>> I'm assuming that if you have subproject S in umbrella project U and a
>>>> branch "topic" in U then that same branch should exist in S.
>>>
>>> No, I think that is actually very rare. If topic branches really should
>>> be mirrored then U and S should be one repository. They are too closely
>>> coupled to be separated. But see the bit about git-subtree and topic
>>> branches below.
>>
>> Too closely coupled? I do not think breaking up a project into a set
>> of libraries makes everything tightly coupled. I would argue the
>> opposite. :-) Anyway, you answer my concern below.
>
> Indeed. But when you make a branch in your main project, wouldn't you
> usually still want to use the master branch of the libraries?
For those that haven't changed? Sure. I just (incorrectly) assumed
that if I had a topic branch in the umbrella project I would need
topic branches in all repos as well, otherwise I would end up with
changes in both various masters and my topic branch. Too much exposure
to submodules, I suppose. :-) But subtree works around that quite
nicely so it really doesn't matter. I simply made one too many
assumptions. :-)
Similar answers to everything below.
> Or if
> there's an experimental branch in a library and you want to use that
> branched version, wouldn't you still use the master version of all the
> other libraries? What if you have two projects that both use a
> library, but are otherwise unrelated? If you create a branch called
> 'hotfix' in one project, do you automatically find your library
> version switching to an unrelated 'hotfix' from another project?
* Re: organizing multiple repositories with dependencies
2012-04-24 19:48 ` Eugene Sajine
@ 2012-04-24 22:11 ` Hilco Wijbenga
2012-04-24 23:38 ` dag
2012-04-24 23:36 ` dag
1 sibling, 1 reply; 34+ messages in thread
From: Hilco Wijbenga @ 2012-04-24 22:11 UTC (permalink / raw)
To: Eugene Sajine; +Cc: Namit Bhalla, git@vger.kernel.org
On 24 April 2012 12:48, Eugene Sajine <euguess@gmail.com> wrote:
> We are working in an environment where we have hundreds (700+ and
> counting) of projects, with many of them reused by others.
> We are following strictly one project = one repo rule without any
> subtrees or submodules.
> What you are asking about is "integration" and IMHO has nothing to do
> with git - i.e. should be VCS independent.
> We are using integration on the artifact level and it works amazingly well.
> But we also use pretty strict naming and location convention that
> allows us to script around the whole setup very easily.
>
> In order to track dependencies between projects we use Ivy.
> The project can be compiled locally using local copies of the upstream
> project artifacts built by developer on the same machine or if not
> present, the current production artifacts are used.
> We also have Jenkins CI server that helps with integration. It is very
> simple and straight forward set up without any unnecessary
> complications IMHO.
> Feel free to contact me if you need more info about such set up.
So how do you handle implementing features/changes that involve more
than one project? Surely, you do not manually create topic branches in
each of the involved repos?
* Re: organizing multiple repositories with dependencies
2012-04-24 20:26 ` Jens Lehmann
2012-04-24 20:52 ` Seth Robertson
@ 2012-04-24 23:21 ` dag
2012-04-28 17:31 ` username localhost
1 sibling, 1 reply; 34+ messages in thread
From: dag @ 2012-04-24 23:21 UTC (permalink / raw)
To: Jens Lehmann
Cc: Seth Robertson, Hilco Wijbenga, Git Users, Junio C Hamano,
greened
Jens Lehmann <Jens.Lehmann@web.de> writes:
[Thanks for working on this! I have a few comments inlined below to
hopefully help make this even better.]
> In the end I'd like to see a document people can use to decide what
> subproject support suits their needs best. Maybe it should start with
> the basic concept behind each of them:
Exactly.
> submodules: A submodule is a full fledged repository of which a certain
> commit is recorded in a gitlink entry in each of the superproject's
> commits.
That's far too technical. I don't even know what that means. :) I
think we want to go for the average user who just wants to make an
informed decision among the various models available.
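For readers wondering what a "gitlink entry" looks like in practice, the
following sketch shows one in a throwaway superproject (the names "lib" and
"super" and the branch "main" are made up for illustration):

```shell
#!/bin/sh
# Sketch only: "lib", "super" and the branch name "main" are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q lib
git -C lib symbolic-ref HEAD refs/heads/main
echo v1 > lib/lib.txt
git -C lib add lib.txt
git -C lib commit -q -m "lib: v1"

git init -q super
git -C super symbolic-ref HEAD refs/heads/main
# Newer git releases need the file transport allowed for local-path clones.
git -C super -c protocol.file.allow=always submodule --quiet add "$work/lib" lib
git -C super commit -q -m "add lib as a submodule"

# The submodule appears in the superproject's tree as a single entry of
# mode 160000 and type "commit": that entry is the gitlink. It names the
# exact submodule commit this superproject commit was made against.
git -C super ls-tree HEAD
entry=$(git -C super ls-tree HEAD lib)
```

In other words, the superproject stores no library files at all, only a
pointer to one commit of the library per superproject commit.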
> The emphasis lies on tightly coupling versions of both while keeping the
> boundaries between superproject and submodules visible.
The above is good but could use some expanding. What exactly does
"tightly coupling" mean? It's kind of a generic phrase.
> This leads to some extra cost when doing changes in a submodule but makes
> it easy to evaluate and select new changes from upstream and push back
> local changes to their respective upstream.
This, I think is a key differentiator for submodules and should be
emphasized.
> subtree: All subprojects become an integral part of the history of the
> superproject.
> The emphasis lies on incorporating the subtree and its history into the
> superproject.
> That adds some extra cost when it comes to pushing subtree changes back
> to their upstream (starting with the need for careful commit planning for
> local commits intended to be pushed out again) and less fine grained
> control over importing changes from the subtree's upstream.
That's a good start. I'll add some text to this later as I think there
are some advantages to the approach that should be called out.
> gitslave: This creates a federation of full fledged git repositories which
> are operated on by the gits commands together (where a git command would
> only operate on the superproject).
> The emphasis lies on the simultaneous operation of gits commands on all
> git repositories.
> It does not provide any coupling of the commits in the superproject and the
> slave repositories (but you can use tags to have that at some points in the
> history).
Should gitslave be covered in this document if gitslave is not in the
upstream git repository? I'm not knocking gitslave, in fact I think
it's cool technology and probably _should_ be contributed upstream. I'm
just asking the question about whether stuff in Documentation/ is or
should be limited to things in the upstream repository.
That said, the above is good but as a user I would want more
clarification on how submodules and gitslave differ. The same is true
for subtrees but I'm assuming I'll handle that. :)
> What do you think? (Please point out anything I misrepresented in the last
> two paragraphs, they are based solely on what I picked up on this list and
> are not based on any actual experience ;-)
It looks very good as a starting point. Thanks!
> Then we could describe in a table what to do when to fetch new subproject
> commits, how to "select" them in the superproject and how to push them
> back to their respective upstream. Another interesting question could be
> how a bug in a subproject that affects the superproject is handled in each
> of the scenarios.
Yes, I was imagining exactly this sort of table.
How about creating a topic branch for this and publishing it so several
of us can collaborate? I think that would make things a bit easier
moving forward.
-Dave
* Re: organizing multiple repositories with dependencies
2012-04-24 17:59 ` Seth Robertson
2012-04-24 20:26 ` Jens Lehmann
@ 2012-04-24 23:25 ` dag
2012-04-25 12:48 ` Seth Robertson
1 sibling, 1 reply; 34+ messages in thread
From: dag @ 2012-04-24 23:25 UTC (permalink / raw)
To: Seth Robertson
Cc: Jens Lehmann, Hilco Wijbenga, Git Users, Junio C Hamano, greened
Seth Robertson <in-gitvger@baka.org> writes:
> I have a document I created for gitslave which I have cleaned up a bit
> and might be the start of such comparison.
I'll look through this in more detail and I think some of the text
can be combined with other contributions. I just asked Jens if he wants
to create a topic branch on which we may collaborate.
My inclination is to limit the documentation to what's in the upstream
repository, which would eliminate repo, mr and unfortunately, for the
time being, git slave, but that's just one opinion and it's a newbie
opinion at that. :) Please don't take it as knocking git-slave because
I think it's really cool. I would like to see it go upstream!
-Dave
* Re: organizing multiple repositories with dependencies
2012-04-24 18:54 ` Hilco Wijbenga
2012-04-24 21:09 ` PJ Weisberg
@ 2012-04-24 23:33 ` dag
2012-04-30 19:25 ` Phil Hord
1 sibling, 1 reply; 34+ messages in thread
From: dag @ 2012-04-24 23:33 UTC (permalink / raw)
To: Hilco Wijbenga; +Cc: Git Users
Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
>>> I'm assuming that if you have subproject S in umbrella project U and a
>>> branch "topic" in U then that same branch should exist in S.
>>
>> No, I think that is actually very rare. If topic branches really should
>> be mirrored then U and S should be one repository. They are too closely
>> coupled to be separated. But see the bit about git-subtree and topic
>> branches below.
>
> Too closely coupled? I do not think breaking up a project into a set
> of libraries makes everything tightly coupled. I would argue the
> opposite. :-) Anyway, you answer my concern below.
If you need the same topic branch for each component they would indeed
seem to be very tightly coupled, even if the code is "physically"
separated. I can't think of a situation where I would need to implement
the same or similar features in multiple components where those
components are not tightly coupled in some way.
> What would happen if you had a bunch of commits in the umbrella
> project and then did a push? Would that error out? Are there
> protections in place to prevent developers from making silly mistakes
> like that?
It would push to the remote/origin of the umbrella project, maintaining
the same "whole project" history. It's an explicit operation to split
the commits on any subproject out and push them to the subproject's
origin.
So let's say you want to branch each subproject for release. You could
do something like this (off the top of my head so don't copy/paste
verbatim):
  git checkout -b release_X                      # In U: create the branch in the umbrella project
  # ... work, work, work ...
  git subtree split --prefix=S1 -b S1_release_X  # Split commits to S1 made on
                                                 # release_X branch
  git subtree split --prefix=S2 -b S2_release_X
  git subtree split --prefix=S3 -b S3_release_X
  git checkout S1_release_X                      # Send commits to S1 to origin,
  git push origin_S1 S1_release_X:release_X      # creating branch release_X
  git checkout S2_release_X
  git push origin_S2 S2_release_X:release_X
  git checkout S3_release_X
  git push origin_S3 S3_release_X:release_X
It's the split/checkout/push sequence that I'd like to optimize and make
simpler.
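For comparison, git-subtree versions that provide the "push" subcommand can
do the split and the push in one step. A sketch under that assumption, using
a throwaway umbrella repository and a bare repository standing in for
origin_S1 (all names here are illustrative):

```shell
#!/bin/sh
# Sketch only: "umbrella", "s1.git" and the file names are hypothetical.
set -e
work=$(mktemp -d)
cd "$work"
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

# Bare repository standing in for the component's upstream.
git init -q --bare s1.git

# Umbrella project with component S1 as a plain subdirectory, created
# directly on the release branch to keep the sketch short.
git init -q umbrella
git -C umbrella symbolic-ref HEAD refs/heads/release_X
mkdir umbrella/S1
echo code > umbrella/S1/s1.txt
echo top > umbrella/README
git -C umbrella add .
git -C umbrella commit -q -m "S1: initial code"
git -C umbrella remote add origin_S1 "$work/s1.git"

# Split out the history of S1/ and push it to branch release_X of
# origin_S1, without creating or checking out a local split branch.
git -C umbrella subtree push -q --prefix=S1 origin_S1 release_X

# The component repository now has release_X containing only S1's files.
git --git-dir="$work/s1.git" ls-tree --name-only release_X
```

The same command would be repeated per component (S2, S3, ...), each with
its own prefix and remote.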
-Dave
* Re: organizing multiple repositories with dependencies
2012-04-24 19:48 ` Eugene Sajine
2012-04-24 22:11 ` Hilco Wijbenga
@ 2012-04-24 23:36 ` dag
1 sibling, 0 replies; 34+ messages in thread
From: dag @ 2012-04-24 23:36 UTC (permalink / raw)
To: Eugene Sajine; +Cc: Namit Bhalla, git@vger.kernel.org
Eugene Sajine <euguess@gmail.com> writes:
> What you are asking about is "integration" and IMHO has nothing to do
> with git - i.e. should be VCS independent.
Indeed, integration is another good strategy. I mainly use submodules
as a convenience because it is a frequent (though not regular)
occurrence that we want to change multiple libraries at the same time.
Plus it provides a convenient way for developers to check out "the
project."
Again, it depends on the development practice and individual situations.
:)
-Dave
* Re: organizing multiple repositories with dependencies
2012-04-24 22:11 ` Hilco Wijbenga
@ 2012-04-24 23:38 ` dag
0 siblings, 0 replies; 34+ messages in thread
From: dag @ 2012-04-24 23:38 UTC (permalink / raw)
To: Hilco Wijbenga; +Cc: Eugene Sajine, Namit Bhalla, git@vger.kernel.org
Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
> On 24 April 2012 12:48, Eugene Sajine <euguess@gmail.com> wrote:
> So how do you handle implementing features/changes that involve more
> than one project? Surely, you do not manually create topic branches in
> each of the involved repos?
That is indeed what integrators usually do for release branches. For
features, developers create topic branches for whatever component
they're working on and these never hit a shared repository anyway so it
doesn't really matter.
Again, it's just another model which works well in a lot of cases, but
not all.
-Dave
* Re: organizing multiple repositories with dependencies
2012-04-24 23:25 ` dag
@ 2012-04-25 12:48 ` Seth Robertson
2012-04-27 14:23 ` dag
0 siblings, 1 reply; 34+ messages in thread
From: Seth Robertson @ 2012-04-25 12:48 UTC (permalink / raw)
To: dag; +Cc: Jens Lehmann, Hilco Wijbenga, Git Users, Junio C Hamano, greened
In message <nngipgo1zn3.fsf@transit.us.cray.com>, dag@cray.com writes:
> My inclination is to limit the documentation to what's in the upstream
> repository, which would eliminate repo, mr and unfortunately, for the
> time being, git slave, but that's just one opinion and it's a newbie
> opinion at that. :) Please don't take it as knocking git-slave because
> I think it's really cool. I would like to see it go upstream!
So would I, and I'm also happy to do work to cause that to happen, but
the question is would it be accepted?
-Seth Robertson
* Re: organizing multiple repositories with dependencies
2012-04-25 12:48 ` Seth Robertson
@ 2012-04-27 14:23 ` dag
0 siblings, 0 replies; 34+ messages in thread
From: dag @ 2012-04-27 14:23 UTC (permalink / raw)
To: Seth Robertson
Cc: Jens Lehmann, Hilco Wijbenga, Git Users, Junio C Hamano, greened
Seth Robertson <in-gitvger@baka.org> writes:
> So would I, and I'm also happy to do work to cause that to happen, but
> the question is would it be accepted?
Start by asking the question. Start a thread on the list and cc Junio.
That's how I started with git-subtree. Of course Avery did some prep
work to soften up the maintainers. :)
-Dave
* Re: organizing multiple repositories with dependencies
2012-04-24 23:21 ` dag
@ 2012-04-28 17:31 ` username localhost
0 siblings, 0 replies; 34+ messages in thread
From: username localhost @ 2012-04-28 17:31 UTC (permalink / raw)
To: git
dag writes:
>
> Should gitslave be covered in this document if gitslave is not in the
> upstream git repository? I'm not knocking gitslave, in fact I think
> it's cool technology and probably _should_ be contributed upstream. I'm
> just asking the question about whether stuff in Documentation/ is or
> should be limited to things in the upstream repository.
I think that it would be better to be inclusive while initially writing such a
document. The work of helping to distinguish gitslave from the others should
also help to flesh out the differences between submodules and git-subtree.
Once the document is in good enough shape that it can be merged, it would be
easy enough to strip gitslave out if it were decided that it should not be
included.
>
> How about creating a topic branch for this and publishing it so several
> of us can collaborate? I think that would make things a bit easier
> moving forward.
I second that. As a git user, I would love to see a document that described
these systems, explaining how they differ, showing how various "common" actions
are done in each, and describing the workflows each is best in, and what
workflows each does poorly with.
If somebody does not create a branch and post a starting point, I'm afraid this
idea will simply die out.
--
username@localhost
* Re: organizing multiple repositories with dependencies
2012-04-24 23:33 ` dag
@ 2012-04-30 19:25 ` Phil Hord
2012-04-30 19:43 ` dag
0 siblings, 1 reply; 34+ messages in thread
From: Phil Hord @ 2012-04-30 19:25 UTC (permalink / raw)
To: dag; +Cc: Hilco Wijbenga, Git Users
On Tue, Apr 24, 2012 at 7:33 PM, <dag@cray.com> wrote:
> Hilco Wijbenga <hilco.wijbenga@gmail.com> writes:
>>> No, I think that is actually very rare. If topic branches really should
>>> be mirrored then U and S should be one repository. They are too closely
>>> coupled to be separated. But see the bit about git-subtree and topic
>>> branches below.
>>
>> Too closely coupled? I do not think breaking up a project into a set
>> of libraries makes everything tightly coupled. I would argue the
>> opposite. :-) Anyway, you answer my concern below.
>
> If you need the same topic branch for each component they would indeed
> seem to be very tightly coupled, even if the code is "physically"
> separated. I can't think of a situation where I would need to implement
> the same or similar features in multiple components where those
> components are not tightly coupled in some way.
I tend to agree. However, I have a use case that I suffer on a daily basis.
We have code that runs on multiple platforms (embedded SoCs). I have
a superproject that has a common library and some vendor-specific code
for each supported platform broken out into submodules.
super-all
+-- CommonAPI
+-- VendorA
+-- VendorB
+-- VendorC
The code in the Vendor submodules contains the proprietary
implementations of the CommonAPI library for each vendor's systems.
When the CommonAPI gets a new feature, it often gets implemented in
all the vendor submodules as well.
We could easily do this without submodules, of course. But this setup
allows us to define alternative super-projects that we can then share
with subcontractors and original vendors without exposing proprietary
third-party code.
super-B
+-- CommonAPI
+-- VendorB
super-C
+-- CommonAPI
+-- VendorC
We could still handle this with git-subtree. But we don't.
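
For readers who want to try the layout above, here is a minimal sketch of
assembling one of the reduced super-projects with submodules. The function
name make_super and the "<path>=<url>" argument format are invented for this
example, and the URLs in the usage note are placeholders, not real
repositories.

```shell
#!/bin/sh
# Sketch: create a superproject that references only the components a given
# party is allowed to see. Each argument after the name is "<path>=<url>".
# make_super and its argument format are hypothetical, for illustration only.
make_super() {
    name=$1            # directory to create, e.g. super-B
    shift
    git init -q "$name" || return 1
    (
        cd "$name" || exit 1
        for spec in "$@"; do
            path=${spec%%=*}   # text before the first '='
            url=${spec#*=}     # text after the first '='
            # Record the component as a submodule at the given path.
            git submodule add "$url" "$path" || exit 1
        done
        # One commit pins every submodule at the revision just cloned.
        git commit -q -m "Add submodules for $name"
    )
}
```

For example, with placeholder URLs:
make_super super-B CommonAPI=https://example.com/CommonAPI.git VendorB=https://example.com/VendorB.git
Because VendorA and VendorC are never added, their code is not reachable
from super-B at all.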
Phil
* Re: organizing multiple repositories with dependencies
2012-04-30 19:25 ` Phil Hord
@ 2012-04-30 19:43 ` dag
0 siblings, 0 replies; 34+ messages in thread
From: dag @ 2012-04-30 19:43 UTC (permalink / raw)
To: Phil Hord; +Cc: Hilco Wijbenga, Git Users
Phil Hord <phil.hord@gmail.com> writes:
>> I can't think of a situation where I would need to implement the same
>> or similar features in multiple components where those components are
>> not tightly coupled in some way.
>
> I tend to agree. However, I have a use case that I suffer on a daily basis.
>
> We have code that runs on multiple platforms (embedded SoCs). I have
> a superproject that has a common library and some vendor-specific code
> for each supported platform broken out into submodules.
>
> super-all
> +-- CommonAPI
> +-- VendorA
> +-- VendorB
> +-- VendorC
>
> The code in the Vendor submodules contains the proprietary
> implementations of the CommonAPI library for each vendor's systems.
> When the CommonAPI gets a new feature, it often gets implemented in
> all the vendor submodules as well.
Ah yes, that's a good example.
> We could easily do this without submodules, of course. But this setup
> allows us to define alternative super-projects that we can then share
> with subcontractors and original vendors without exposing proprietary
> third-party code.
>
> super-B
> +-- CommonAPI
> +-- VendorB
>
> super-C
> +-- CommonAPI
> +-- VendorC
>
> We could still handle this with git-subtree. But we don't.
Yes, I agree that this is a very important use case. This is the case
where subprojects exist because of vendor barriers, not necessarily due
to software engineering concerns.
-Dave
end of thread, other threads:[~2012-04-30 19:46 UTC | newest]
Thread overview: 34+ messages
2012-04-16 9:27 organizing multiple repositories with dependencies Namit Bhalla
2012-04-16 14:30 ` Jakub Narebski
2012-04-16 20:08 ` dag
2012-04-17 17:29 ` Hilco Wijbenga
2012-04-17 17:51 ` dag
2012-04-17 18:37 ` Seth Robertson
2012-04-17 19:55 ` Hilco Wijbenga
2012-04-17 20:51 ` dag
2012-04-17 21:43 ` Hilco Wijbenga
2012-04-17 22:25 ` PJ Weisberg
2012-04-17 22:49 ` Hilco Wijbenga
2012-04-18 10:15 ` Namit Bhalla
2012-04-18 12:09 ` Jens Lehmann
2012-04-24 17:17 ` dag
2012-04-24 18:54 ` Hilco Wijbenga
2012-04-24 21:09 ` PJ Weisberg
2012-04-24 22:04 ` Hilco Wijbenga
2012-04-24 23:33 ` dag
2012-04-30 19:25 ` Phil Hord
2012-04-30 19:43 ` dag
2012-04-18 12:19 ` Jens Lehmann
2012-04-24 17:22 ` dag
2012-04-24 17:59 ` Seth Robertson
2012-04-24 20:26 ` Jens Lehmann
2012-04-24 20:52 ` Seth Robertson
2012-04-24 23:21 ` dag
2012-04-28 17:31 ` username localhost
2012-04-24 23:25 ` dag
2012-04-25 12:48 ` Seth Robertson
2012-04-27 14:23 ` dag
2012-04-24 19:48 ` Eugene Sajine
2012-04-24 22:11 ` Hilco Wijbenga
2012-04-24 23:38 ` dag
2012-04-24 23:36 ` dag