Re: RFC: Subprojects - Junio C Hamano

From: Junio C Hamano <junkio@cox.net>
To: Daniel Barkalow <barkalow@iabervon.org>
Cc: Josef Weidendorfer <Josef.Weidendorfer@gmx.de>, git@vger.kernel.org
Subject: Re: RFC: Subprojects
Date: Tue, 17 Jan 2006 17:41:48 -0800
Message-ID: <7vpsmq2tyb.fsf@assigned-by-dhcp.cox.net>
In-Reply-To: Pine.LNX.4.64.0601171150050.25300@iabervon.org

Daniel Barkalow <barkalow@iabervon.org> writes:

> Incidentally, I don't think we'd want "gitlink" objects with the "gitlink" 
> approach; we'd want trees to contain commit objects for subprojects. The 
> "gitlink" thing that corresponds to ".git/HEAD" isn't an object, it's a 
> tree entry, which, like ".git/HEAD" (or, more appropriately, 
> ".git/refs/heads/something") maps a name to the hash of a commit object.

> Hmm... maybe libification should go ahead of subprojects. If access to the 
> index weren't so often open-coded, it would just be a matter of having 
> these entries in the data structure, but not actually returned by any 
> current call, and it would be just like they were in some other structure. 

And libification has been waiting for the core to settle ;-) We
have to start somewhere.

> Actually, it should be easy to have them in the index file but not in the 
> main index data structure, by skipping over them in the for loop near the 
> end of read_cache()....

Yeah, I guess I was vaguely thinking along those lines while I
was driving to work this morning.  I appreciate your spelling it
out to make things clearer.

> Side issue here: this implies that the kernel objects are in the 
> superproject's repository, or at least accessible from it. So prune has to 
> not remove them. So, if you've committed changes to a subproject but not 
> yet committed the fact that you want to use the changed subproject into 
> the superproject, fsck-objects has to find them somewhere.

Yes.  I was planning to have "$GIT_DIR/bind" that says:

	master kernel=linux-2.6/ gcc=gcc-4.0/

meaning:

	The project tracked by the "master" branch binds the
	project tracked by the "kernel" branch as a subproject
	at its linux-2.6/ subdirectory.

or something like that, so when you make a commit, you update
those other branches as needed.  You already raised that issue
at the end of your message, and I will explain how I think that
can/should be done as a response to that part later.
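
To illustrate the proposed format, here is a minimal shell sketch
that splits one such line into its bindings.  It assumes the layout
shown above (a branch name followed by "branch=directory/" words).
"$GIT_DIR/bind" is only a proposal at this point, and
parse_bind_line is a made-up helper name, not part of git.

```shell
# Sketch only: "$GIT_DIR/bind" is a proposal in this thread, and
# parse_bind_line is a hypothetical helper, not a git command.
# Input:  one line, "<superproject-branch> <branch>=<dir>/ ..."
# Output: one "<superproject-branch> <branch> <dir>/" line per binding.
parse_bind_line() {
	# Unquoted on purpose: split the line into words on whitespace.
	set -- $1
	super=$1
	shift
	for binding in "$@"
	do
		branch=${binding%%=*}	# text before the first "="
		prefix=${binding#*=}	# text after the first "="
		printf '%s %s %s\n' "$super" "$branch" "$prefix"
	done
}

parse_bind_line 'master kernel=linux-2.6/ gcc=gcc-4.0/'
```

For the example line this prints "master kernel linux-2.6/" and
"master gcc gcc-4.0/", one per line.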

>> Reading such a commit is easy:
>> 
>> 	$ git-read-tree $tree ;# ;-)
>> 
>> But that is cheating.  
>
> This is for backwards compatibility, I assume?

This is done more to avoid having to touch *anything* that does
"index vs working file", "tree vs index" and "tree vs working
file via index".  It also is the easiest way to keep the "a
commit object name can be used in place of the tree object name
of the tree it contains" invariant.  Also I suspect this
organization might help recursive subprojects, but if that is
the case, it is just a byproduct, not a design goal.

>> When you have such an index, writing out various trees are:
>> 
>> 	$ git-write-tree ;# $tree
>> 	$ git-write-tree --prefix=linux-2.6/ ;# $linuxsub^{tree}
>> 	$ git-write-tree --prefix=gcc-4.0/ ;# $gccsub^{tree}
>> 	$ git-write-tree \
>>           --bound=linux-2.6/ --bound=gcc-4.0/ ;# $primarysub^{tree}
>
> The hard thing here is getting the commits for the trees. The bind lines 
> need commits, which means either identifying that we already have the 
> correct commit object, because we didn't change anything in the 
> subproject, or generating a new commit object with some message and the 
> right parent. And we want to use commit objects, not tree objects, in the 
> bind lines, so that, once we track a problem to the change of which commit 
> is bound, we can treat the subproject as a project and debug it with 
> bisect, rather than just having one tree that works and one that doesn't.

Your wording "get the commit" is a bit misleading.  Even when
the tree for a subproject happens to match a commit in the
subproject in a distant past, we would not want to use it unless
the user explicitly asked for it.  IOW, we do not actively go
and look for a commit.

Our subproject tree either matches the subproject branch head,
in which case we just reuse it, or we make a new commit on top
of that ourselves.

Let's say my project breaks with the latest kernel, and I
suspect that it would work with v2.6.13 sources.  To test that
theory, I could:

	$ git branch -f kernel v2.6.13 ;# rewind

	$ git ls-files linux-2.6/ |
	  xargs git update-index --force-remove
	$ git read-tree --prefix=linux-2.6/ -u kernel

to construct such a tree.  Maybe the latter two-command
"ls-files & read-tree --prefix" sequence deserves to become a
command, "git update-subproject kernel" [*1*].

The result may work as-is, or I may need to do some further
futzing in linux-2.6/ directory before the result works.  Once
the result starts working, I'd want to make a commit:

 - I compare the result of write-tree for linux-2.6/ portion and
   the tree object name contained in the head commit of the
   "kernel" branch.  If they match, then the current "kernel"
   branch head commit is what I'll place on the "bind" line in
   my commit; I do not have to make a new commit in the "kernel"
   subproject in this case.

 - If the tree object does not match the "kernel" head, that
   means I have tweaked the kernel part further, on top of
   v2.6.13.  So I make a commit for the kernel subproject (whose
   parent is obviously v2.6.13), update the kernel branch head
   with that commit, and then record that tip-of-the-tree commit
   for the subproject on the "bind" line in my commit for the
   toplevel.

Or let's say my project builds with the latest kernel (IOW, I
did not do the "branch -f kernel" in the above), and I made some
custom tweaks in the kernel area.  The above procedure would
result in a new commit on top of the latest kernel, update the
"kernel" branch head, and make a commit for the toplevel that
records the updated "kernel" branch head on its "bind" line.
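
The decision at the heart of the procedure above can be sketched
in shell.  This is only an illustration: choose_bind_action is a
hypothetical helper name, and it takes the two tree object names
as arguments rather than running git itself.  With the
"write-tree --prefix" proposed earlier in this thread, the first
would come from "git write-tree --prefix=linux-2.6/" and the
second from "git rev-parse 'kernel^{tree}'".

```shell
# Sketch only; choose_bind_action is a hypothetical helper name.
# $1: tree object name written for the subproject part of the index
# $2: tree object name of the bound branch head commit
choose_bind_action() {
	if test "$1" = "$2"
	then
		# Nothing changed in the subproject: bind the existing
		# branch head commit.
		echo reuse-head
	else
		# The subproject was tweaked: make a commit on top of the
		# branch head, update the branch, and bind that commit.
		echo new-commit
	fi
}
```

In both scenarios above the comparison is against the current
branch head, which is why a rewound "kernel" branch yields a new
commit whose parent is v2.6.13.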

Note that the above procedure did not use the commit object name
recorded on the "bind" line at all in either case.  From the
mechanism point of view, that is the right thing to do.  From the
usability point of view, however, we may want to notice when the
"bind" line commit and the bound branch head do not match, and
remind/warn the user about it.  If they differ because the user
rewound the bound branch to use a known working version, or made
fixes in the subproject and pulled the result into the bound
branch (in which case there is no funny rewinding involved), then
this warning is extraneous.  But in the normal case of continuing
to reuse the same vintage of subprojects (and maybe making
necessary adjustments to subprojects while working on the main
project), the commit object on the "bind" line of the HEAD commit
and the bound branch head should match.
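
A minimal sketch of such a reminder follows, assuming the caller
has already extracted the commit recorded on the "bind" line and
the current head of the bound branch.  The function name and the
message wording are hypothetical, and nothing here decides
whether the mismatch was intentional.

```shell
# Sketch only; warn_if_bind_differs is a hypothetical helper name.
# $1: name of the bound branch (used in the message only)
# $2: commit recorded on the "bind" line of the HEAD commit
# $3: commit at the head of the bound branch
warn_if_bind_differs() {
	# Matching names are the normal case; stay silent.
	test "$2" = "$3" && return 0
	echo "warning: '$1' head $3 differs from bound commit $2" >&2
	return 1
}
```

The non-zero return lets a calling script choose between merely
reminding the user and refusing to go on.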

[Footnote]

*1* One could also do a forward development on the kernel branch
in a separate working tree and fetch from there.  For example,
if our example "superproject" is in embed/ directory, and there
is a linux/ directory next to it to house a kernel repository,
we could:

        $ cd ../linux/
        $ edit && compile && test 
        $ git commit -m 'Fix for upstream, not just for embed'

to make an upstream fix, and then:

        $ cd ../embed/
        $ git fetch ../linux/ master:kernel

to update the "kernel" subproject branch head.  In such a case:

	$ git update-subproject kernel

would bring the subproject working tree and index up to date
with respect to the updated kernel branch.
