Re: RFC: Subprojects - Linus Torvalds


From: Linus Torvalds <torvalds@osdl.org>
To: Junio C Hamano <junkio@cox.net>
Cc: Johannes Schindelin <Johannes.Schindelin@gmx.de>,
	git@vger.kernel.org, Simon Richter <Simon.Richter@hogyros.de>
Subject: Re: RFC: Subprojects
Date: Sat, 14 Jan 2006 11:16:34 -0800 (PST)
Message-ID: <Pine.LNX.4.64.0601141055210.13339@g5.osdl.org>
In-Reply-To: <7vacdzkww3.fsf@assigned-by-dhcp.cox.net>

On Sat, 14 Jan 2006, Junio C Hamano wrote:

>  + The contained project is kept totally independent and does
>    not have to know it is contained.
> 
>  + The tree for the contained project can be rooted anywhere in
>    the containing project's tree.

Right.

>  - The contained project cannot be rooted at the same level or
>    higher than the containing project; the containing project
>    can only delegate a whole subdirectory to the contained
>    project.

Yes.

However, I think this is actually a _huge_ advantage.

The thing is, if you do the contained projects as "union projects" as you 
suggest, I will bet that it will really really suck, because it ends up 
losing the two positives above.

In particular, any real independent project will have its own "Makefile" 
or "configure-in", and often its own "src" subdirectory or other 
pseudo-standard names.

And the "contained project as a link" approach has zero problems with that 
at all, exactly because it keeps the projects clearly separate - just 
linked (one way).
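
To make the naming problem concrete, here is a minimal sketch (an illustration only, not anything git implements). It assumes each project is just a list of file paths; `overlay_collisions` and `linked_layout` are hypothetical helper names. The first reports paths that two projects would both claim if overlaid at the same root, the second shows that rooting each sub-project in its own subdirectory avoids this by construction.

```python
from typing import Dict, Iterable, List


def overlay_collisions(projects: Dict[str, Iterable[str]]) -> Dict[str, List[str]]:
    """Return paths claimed by more than one project when all of them
    are overlaid at the same root (the "union project" layout)."""
    owners: Dict[str, List[str]] = {}
    for name, paths in projects.items():
        for path in paths:
            owners.setdefault(path, []).append(name)
    return {path: names for path, names in owners.items() if len(names) > 1}


def linked_layout(top: Iterable[str], subs: Dict[str, Iterable[str]]) -> List[str]:
    """Return the combined path list when each sub-project is rooted
    in its own subdirectory of the containing project."""
    paths = list(top)
    for subdir, sub_paths in subs.items():
        paths.extend(f"{subdir}/{p}" for p in sub_paths)
    return paths
```

With a top-level project holding "Makefile" and "src/main.c" and a sub-project holding "Makefile" and "src/lib.c", the overlay reports "Makefile" as claimed twice, while the linked layout yields "sub-project/Makefile" and has no duplicates.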

> What would "git-diff-index/git-diff-tree/git-diff-files" do
> with them?

I would actually argue that git itself wouldn't do a whole lot with them. 
There are real advantages to seeing only the diffs wrt _one_ of the 
projects, and I'd argue that

	git-diff-*

would actually act like they now act for directories that they don't 
recurse into, i.e. you'd see something like

	:160000 160000 5eb57670... 3f1a42aa... M	sub-project

and it would be up to higher-level porcelain to recurse.
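
As a rough sketch of what "higher-level porcelain" might do with that output (an assumption for illustration, not an existing tool): parse the raw diff lines and pick out the entries whose mode is 160000, which are the ones it would then recurse into itself. The names `RawChange`, `parse_raw_line` and `gitlink_changes` are hypothetical, and the parser assumes the raw line format shown above.

```python
from typing import List, NamedTuple

GITLINK_MODE = "160000"  # mode of the sub-project entry in the example line


class RawChange(NamedTuple):
    old_mode: str
    new_mode: str
    old_sha: str
    new_sha: str
    status: str
    path: str


def parse_raw_line(line: str) -> RawChange:
    """Parse one raw diff line such as
    ':160000 160000 5eb57670... 3f1a42aa... M<TAB>sub-project'."""
    # Everything before the first tab is metadata; the rest is the path.
    meta, path = line.rstrip("\n").split("\t", 1)
    old_mode, new_mode, old_sha, new_sha, status = meta.lstrip(":").split(" ")
    return RawChange(old_mode, new_mode, old_sha, new_sha, status, path)


def gitlink_changes(lines: List[str]) -> List[RawChange]:
    """Keep only the entries that refer to a linked sub-project."""
    changes = [parse_raw_line(l) for l in lines if l.startswith(":")]
    return [c for c in changes if GITLINK_MODE in (c.old_mode, c.new_mode)]
```

The low-level programs stay unchanged in this picture; only the wrapper knows that a 160000 entry means "go look inside that directory".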

Why? Partly because that's actually likely enough for a lot of users: you 
_can_ use just the raw git programs by just doing

	cd sub-project
	git diff
	..
	git commit

and so technically you aren't really missing a lot. The capabilities are 
there, you just have to do some more by hand (but in many ways that is 
_good_: it makes it obvious that you're really committing a _different_ 
subproject).

The other reason? A lot of the git infrastructure really does only work on 
the "one project" level. The programs work with _one_ index, not two. 
Reading two trees is perfectly possible, but unless you keep them in 
separate stages, you can't separate them afterwards. IOW, trying to be 
recursive really does end up being a big change, for very little gain (and 
for a lot of potential bugs and instability).

In contrast, doing it at a higher level means that you have a simple and 
reliable lower level that you can trust. Layering is good.

> Fetching/cloning at the core level is easy.  "git-fetch-pack"
> would just need to do one level, but Porcelains need to address
> how to actually arrange the subprojects cloning to happen, which
> is harder.
> 
> "git clone" would say: "Ah, now I see these gitlinks; we need to
> clone them.

Actually, I would say no - that's actually not a "clone" operation so much 
as a "checkout" operation. There are strong arguments that you should 
_not_ clone sub-projects when you clone the top-level project: there's no 
reason to. Anybody else who clones it will have all the information you 
have, so cloning the sub-project is just extra work.

So only if you actually check it out (which is often in practice the 
second stage of the cloning, of course) do you want to fetch the 
subproject too. But even then you might want to ask the user (he may have 
a local repository for that sub-project somewhere else, so going to the 
"canonical name" might be the wrong thing to do - and he might not even 
care, because he might want to work _just_ on the top-level project).
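
A minimal sketch of that checkout-time decision, under stated assumptions: the porcelain knows a "canonical" location for each linked sub-project, and the user may either point a sub-project at a local repository or decline it entirely. The function name `plan_subproject_fetches`, the `SKIP` marker and the example locations are all hypothetical.

```python
from typing import Dict, Optional

SKIP = None  # hypothetical marker: the user does not want this sub-project


def plan_subproject_fetches(
    canonical: Dict[str, str],
    user_choice: Dict[str, Optional[str]],
) -> Dict[str, str]:
    """Decide, per sub-project path, where to fetch from at checkout time.

    canonical   maps sub-project path -> canonical repository location
    user_choice maps sub-project path -> local repository to use instead,
                or SKIP to leave that sub-project unfetched
    """
    plan: Dict[str, str] = {}
    for path, location in canonical.items():
        if path in user_choice:
            choice = user_choice[path]
            if choice is SKIP:
                continue  # user works on the top-level project only
            plan[path] = choice  # user already has a repository elsewhere
        else:
            plan[path] = location  # fall back to the canonical name
    return plan
```

Nothing here happens at clone time: the plan is only computed, and only acted on, when the tree containing the links is actually checked out.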

> Now I'll think aloud about a completely different design.
> 
> We could simply overlay the projects.  I think this is what
> Johannes suggested earlier.
> 
> You keep one branch for each "subproject", and make commits into
> each branch (i.e. if you modified files for the upstream kernel,
> the change is committed to the branch for linux-2.6 subproject),
> but when checking things out, you do an equivalent of octopus
> merge across subprojects.

I think this one has serious disadvantages:

 - it's much less obvious when there are common names and especially 
   common subdirectories.
 - in _practice_, almost all sub-projects are kept in sub-directories. Are 
   you going to change the sub-project git tree? How are you going to 
   merge back to the original sub-project?
 - IOW, I think this only works for sub-projects that are totally 
   controlled by the top-level project - in which case they might as well 
   just be totally merged into the top level (the way we did with the 
   "tools" project, and largely with "gitk").

In the "gitk" case, we could actually continue to keep gitk a separate 
project, but that was really fortunate: it's purely because gitk ends up 
being a single file, with no Makefile at all to build it independently 
etc. The moment we integrated the "tools" sub-project into git, we lost 
the ability to do that, exactly because they now needed to share Makefiles 
etc, making all further development very intertwined.

Put another way: the moment you have linkages going both ways between the 
subproject and the top-level project, it's no longer two separate 
projects. At that point, it in practice becomes one, since the sub-project 
can no longer do independent development without merging becoming a big 
issue.

The advantage of having a "git link" is exactly the fact that the 
dependency goes only one way. The subproject remains truly independent.

			Linus
