git.vger.kernel.org archive mirror
* [RFC] Support projects including other projects
@ 2005-05-12  4:23 Daniel Barkalow
  2005-05-12  4:52 ` Junio C Hamano
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Barkalow @ 2005-05-12  4:23 UTC (permalink / raw)
  To: git; +Cc: Junio C Hamano, Petr Baudis, Linus Torvalds

I've come up with a way to handle projects like cogito which are based on
other projects. I think that it actually solves the real problem with such
projects, and it is actually very simple.

The problem that such projects run into, especially while both the core
and the non-core projects are in a state of substantial flux and when the 
non-core developer(s) contribute needed changes to the core, is that the
two projects not only have to be tracked, they have to be kept in 
sync. That is, a particular version of cogito requires a particular
version of git. There is a bit of convenience to having the tools
magically do the right thing when you check out the child project, but the
thing that really requires tool support is that you need to be able to
find the version of git-pb which matches the version of cogito you're
trying to build (and you might be searching the history for where a bug
was introduced, so you may not be able to use the latest of either).

The solution is to add a header to commits: "include {hash}", which simply
says that the given hash, which is from the core project, is the commit
needed to build this commit of the non-core project. This comes from an
argument to commit-tree ("-I", perhaps), and the parsing code needs to
identify the reference so that fsck-cache stays happy.
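
For illustration only, here is a minimal sketch of what reading such a header could look like, assuming a commit is stored as header lines, a blank line, and then the message. The "include" header is the proposal under discussion and is not part of git; the `Commit` class and the `parse_commit` and `references` names are invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class Commit:
    tree: str = ""
    parents: list = field(default_factory=list)
    includes: list = field(default_factory=list)  # proposed header, hypothetical
    message: str = ""


def parse_commit(raw: str) -> Commit:
    """Split a commit into headers and message, collecting object references."""
    head, _, message = raw.partition("\n\n")
    commit = Commit(message=message)
    for line in head.splitlines():
        key, _, value = line.partition(" ")
        if key == "tree":
            commit.tree = value
        elif key == "parent":
            commit.parents.append(value)
        elif key == "include":  # assumption: same "key hash" shape as "parent"
            commit.includes.append(value)
    return commit


def references(commit: Commit) -> list:
    """Every object id a consistency checker would need to find in the store."""
    return [commit.tree, *commit.parents, *commit.includes]


raw = (
    "tree " + "a" * 40 + "\n"
    "parent " + "b" * 40 + "\n"
    "include " + "c" * 40 + "\n"
    "author A U Thor <author@example.com> 1115871780 -0400\n"
    "\n"
    "Add cg-status\n"
)
commit = parse_commit(raw)
```

Treating the included hash as one more reference alongside the tree and the parents is what would let a checker notice when the included commit is absent.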

Git doesn't do anything more; wrapping layers would be able to take care
of the rest. When the wrapping layer determines that you are checking out
a commit with an include header, it also checks out the included commit,
using a different index file. The core treats everything as if you had a
bunch of non-tracked files in the directory (those being the things in the
other project). When you commit, the wrapping layer first commits any
includes (if needed), identifies the resulting core head, and passes that
as the include argument for the final commit.
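
A rough sketch of how a wrapping layer might plan such a checkout, under stated assumptions: `GIT_INDEX_FILE` is the environment variable git uses to select an index file, and `read-tree` and `checkout-index` are the present-day names of the plumbing commands, used here only for illustration. The `checkout_plan` function and the `index.included` file name are hypothetical. Nothing is executed; the function only returns the steps.

```python
def checkout_plan(commit_id, include_id, git_dir=".git"):
    """Return (environment overrides, argv) steps for a two-index checkout."""
    # One index file per project, so each side sees the other's files
    # as untracked.  The second file name is made up for this sketch.
    targets = [(commit_id, "index")]
    if include_id is not None:
        targets.append((include_id, "index.included"))

    steps = []
    for object_id, index_name in targets:
        env = {"GIT_INDEX_FILE": f"{git_dir}/{index_name}"}
        steps.append((env, ["git", "read-tree", object_id]))
        steps.append((env, ["git", "checkout-index", "-a"]))
    return steps


plan = checkout_plan("c0ffee", "9b1d5e")
```

A commit without an include header yields only the first two steps, so ordinary projects would be unaffected.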

It seems to me like this should work perfectly. The one weakness is that
it's quite annoying to do by hand, since you have to simultaneously track
two index files and remember to pass the argument to commit-tree each
time. (Also, it means that you'd ideally pull git-pb from the cogito
repository with a client that ignores things not reachable from your head,
although Petr could still just copy and prune to match the current
situation).

I've written up the git changes needed, if people are interested in the
patch.

	-Daniel
*This .sig left intentionally blank*



* Re: [RFC] Support projects including other projects
  2005-05-12  4:23 [RFC] Support projects including other projects Daniel Barkalow
@ 2005-05-12  4:52 ` Junio C Hamano
  2005-05-12  5:19   ` Daniel Barkalow
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2005-05-12  4:52 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Petr Baudis, Linus Torvalds

I think that the core of your idea of recording "required
version" of the depended project (core GIT) in the depending
project (Cogito) is a very sound one.  GNU Arch folks do
something similar in their "package-framework" stuff.  

I however do not think that belongs to the core GIT nor even to
Cogito for that matter.  To me, it feels like this is a pure
build infrastructure issue.

I think you could arrange something like that with today's core
GIT tools, like this:

 - Tweak Cogito Makefile so that pure Cogito and core GIT are
   housed in separate subdirectories;

 - Add "required-git-pb" file to Cogito source as a tracked
   source file, and record the required version of git-pb there;

 - Arrange Cogito Makefile to make sure the subtree that has the
   core GIT side meets "required-git-pb" constraints.  The
   constraints could be "at least contains this one", "exactly
   this one".  The policy would differ from one depending
   project to another.  What happens if the requirements are not
   met is also up to the policy of that depending project.
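
The two constraint styles in the last item can be sketched as follows. This is only an illustration: the history is a plain dictionary from commit id to parent ids rather than a real repository, and the `ancestors` and `meets_requirement` names and the policy strings are invented for the example.

```python
def ancestors(commit, parents):
    """All commits reachable from `commit` through parent links, itself included."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def meets_requirement(checked_out, required, policy, parents):
    """Check the checked-out core commit against the recorded requirement."""
    if policy == "exactly":
        return checked_out == required
    if policy == "at-least":
        # "at least contains this one": the required commit is in our history
        return required in ancestors(checked_out, parents)
    raise ValueError(f"unknown policy: {policy}")


# Toy git-pb history: g1 <- g2 <- g3
history = {"g3": ["g2"], "g2": ["g1"], "g1": []}
ok = meets_requirement("g3", "g2", "at-least", history)
```

What to do when the check fails (abort the build, warn, fetch something) is left to the depending project, as the message says.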



* Re: [RFC] Support projects including other projects
  2005-05-12  4:52 ` Junio C Hamano
@ 2005-05-12  5:19   ` Daniel Barkalow
  2005-05-12  5:37     ` Junio C Hamano
  2005-05-12  5:37     ` James Purser
  0 siblings, 2 replies; 14+ messages in thread
From: Daniel Barkalow @ 2005-05-12  5:19 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Petr Baudis, Linus Torvalds

On Wed, 11 May 2005, Junio C Hamano wrote:

> I think that the core of your idea of recording "required
> version" of the depended project (core GIT) in the depending
> project (Cogito) is a very sound one.  GNU Arch folks do
> something similar in their "package-framework" stuff.  
> 
> I however do not think that belongs to the core GIT nor even to
> Cogito for that matter.  To me, it feels like this is a pure
> build infrastructure issue.

If you think about it as git and cogito being entirely separate projects,
where users would be expected to have the right version of git most of the
time (or ever), this is true. But I think that cogito is as closely tied
to git as the kernel is to kbuild or kconfig; the difference is that git
is not solely available with cogito, like kbuild is solely available with
the kernel.

> I think you could arrange something like that with today's core
> GIT tools, like this:
> 
>  - Tweak Cogito Makefile so that pure Cogito and core GIT are
>    housed in separate subdirectories;
> 
>  - Add "required-git-pb" file to Cogito source as a tracked
>    source file, and record the required version of git-pb there;
> 
>  - Arrange Cogito Makefile to make sure the subtree that has the
>    core GIT side meets "required-git-pb" constraints.  The
>    constraints could be "at least contains this one", "exactly
>    this one".  The policy would differ from one depending
>    project to another.  What happens if the requirements are not
>    met is also up to the policy of that depending project.

When a particular cogito commit is made, it is impossible to tell whether
the next git-pb will work with it; the current set of patches could be
rejected in mainline git, and different support for the same functionality
added which requires something different from cogito.

This also means that Petr can't really test changes to git before
committing them (and a new cogito with the constraint changed), because the
cogito build system would then require him to use a version he's not
testing.

Also, either the user has to keep track of two projects without any system
support in the same directory structure and figure out how to follow the
instructions from the build system in getting the right version checked
out in the right place, or the build system is tied to a particular
wrapper layer.

I think your idea is theoretically possible, but that it is just too
impractical for anyone to ever actually use it. It's something that people
could do with CVS (and it would actually work better, due to CVS's
limitations making the issues simpler), but people don't.

	-Daniel
*This .sig left intentionally blank*



* Re: [RFC] Support projects including other projects
  2005-05-12  5:19   ` Daniel Barkalow
@ 2005-05-12  5:37     ` Junio C Hamano
  2005-05-12  6:04       ` Daniel Barkalow
  2005-05-12  6:14       ` Junio C Hamano
  2005-05-12  5:37     ` James Purser
  1 sibling, 2 replies; 14+ messages in thread
From: Junio C Hamano @ 2005-05-12  5:37 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Petr Baudis, Linus Torvalds

>>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:

DB> When a particular cogito commit is made, it is impossible to tell whether
DB> the next git-pb will work with it; the current set of patches could be
DB> rejected in mainline git, and different support for the same functionality
DB> added which requires something different from cogito...

 ... Many problems with the approach of saying "this Cogito
     requires this git-pb" omitted here; I agree that alone
     may not solve problems ...

DB> I think your idea is theoretically possible, but that it is just too
DB> impractical...

I do not think it is my idea.  Maybe I misunderstood what you
meant, but here is what you wrote in the message I responded to.

    ... There is a bit of convenience to having the tools magically
    do the right thing when you check out the child project, but the
    thing that really requires tool support is that you need to be
    able to find the version of git-pb which matches the version of
    cogito you're trying to build (and you might be searching the
    history for where a bug was introduced, so you may not be able
    to use the latest of either).

That part is fine.  I already agreed that recording such version
dependency would be a good thing.  I disagreed with the
"solution", however, of having that recorded at the core level:

    The solution is to add a header to commits: "include {hash}",
    which simply says that the given hash, which is from the core
    project, is the commit needed to build this commit of the
    non-core project. This comes from an argument to commit-tree
    ("-I", perhaps), and the parsing code needs to identify the
    reference so that fsck-cache stays happy.

I do not think the issues you are raising are solved by having
that "include {hash}" thing in the commit like you propose here,
instead of keeping it outside of the commit like I suggested.

What I meant to say was just I do not think having this "version
dependency" in the core or outside of the core would make any
difference.



* Re: [RFC] Support projects including other projects
  2005-05-12  5:19   ` Daniel Barkalow
  2005-05-12  5:37     ` Junio C Hamano
@ 2005-05-12  5:37     ` James Purser
  2005-05-12  5:46       ` Daniel Barkalow
  1 sibling, 1 reply; 14+ messages in thread
From: James Purser @ 2005-05-12  5:37 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, git, Petr Baudis, Linus Torvalds

On Thu, 2005-05-12 at 15:19, Daniel Barkalow wrote:
> If you think about it as git and cogito being entirely separate projects,
> where users would be expected to have the right version of git most of the
> time (or ever), this is true. But I think that cogito is as closely tied
> to git as the kernel is to kbuild or kconfig; the difference is that git
> is not solely available with cogito, like kbuild is solely available with
> the kernel.
I tend to disagree with you on this point. Cogito and Git share
a relationship more akin to xorg and gnome and this is something I think
Linus intended so that it would be very easy to build a layer on top of
the git toolset. Cogito is great and it fills a need but give it time
and other implementations and tool sets will come along that may
supersede it.
-- 
James Purser
http://ksit.dynalias.com



* Re: [RFC] Support projects including other projects
  2005-05-12  5:37     ` James Purser
@ 2005-05-12  5:46       ` Daniel Barkalow
  2005-05-12  6:33         ` James Purser
  0 siblings, 1 reply; 14+ messages in thread
From: Daniel Barkalow @ 2005-05-12  5:46 UTC (permalink / raw)
  To: James Purser; +Cc: Junio C Hamano, git, Petr Baudis, Linus Torvalds

On Thu, 12 May 2005, James Purser wrote:

> On Thu, 2005-05-12 at 15:19, Daniel Barkalow wrote:
> > If you think about it as git and cogito being entirely separate projects,
> > where users would be expected to have the right version of git most of the
> > time (or ever), this is true. But I think that cogito is as closely tied
> > to git as the kernel is to kbuild or kconfig; the difference is that git
> > is not solely available with cogito, like kbuild is solely available with
> > the kernel.
> I tend to disagree with you on this point. Cogito and Git share
> a relationship more akin to xorg and gnome and this is something I think
> Linus intended so that it would be very easy to build a layer on top of
> the git toolset. Cogito is great and it fills a need but give it time
> and other implementations and tool sets will come along that may
> supersede it.

The point of this feature is to support other implementations and tool
sets. If there weren't other things using the git core, there would be no
reason to leave the current situation where cogito simply includes the
complete contents of git-pb. The relationship between cogito and git is,
however, not at all like that between Gnome and x.org; gnome could not be
started until X was essentially completely stable for several years (after
which X could be reimplemented and extended, so long as it retained the
same API). Cogito, on the other hand, is being developed concurrently with
git, and substantially informs git development. The current cogito doesn't
work completely correctly with any mainline git, whereas the current Gnome
works with every x.org release as well as any XFree86 or most other X
servers since the mid 90's.

Also, any particular user is probably only going to use one git-based
system, but will almost certainly use many different X clients.

	-Daniel
*This .sig left intentionally blank*



* Re: [RFC] Support projects including other projects
  2005-05-12  5:37     ` Junio C Hamano
@ 2005-05-12  6:04       ` Daniel Barkalow
  2005-05-12  6:28         ` Junio C Hamano
  2005-05-12  6:14       ` Junio C Hamano
  1 sibling, 1 reply; 14+ messages in thread
From: Daniel Barkalow @ 2005-05-12  6:04 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Petr Baudis, Linus Torvalds

On Wed, 11 May 2005, Junio C Hamano wrote:

> I do not think the issues you are raising are solved by having
> that "include {hash}" thing in the commit like you propose here,
> instead of keeping it outside of the commit like I suggested.
> 
> What I meant to say was just I do not think having this "version
> dependency" in the core or outside of the core would make any
> difference.

I was primarily responding to your idea of it being outside the scope of
cogito as well as outside the core.

My reasons for having it in the core are as follows:

 - All of the porcelain layers have to, at least, agree as to how this is
   represented in order for repositories to be portable; since the
   representation is common, it might as well be core.

 - There are currently no special files which are tracked for cogito (et 
   al) to put the information in.

 - Ideally, the dependency would only be per-commit, not per-tree; if Petr
   releases a new cogito which only merges a new mainline with the git-pb,
   the cogito tree object should be the same (since the cogito content
   didn't change). This means that it can't be anywhere other than the
   commit.

 - If the solution to the issue of finding the necessary git-pb is to
   store it with cogito, then the programs that pull from this repository
   need to know that they need to pull the git-pb portion, and fsck-cache
   needs to know that the cogito references the git-pb.
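
The last point can be sketched as a reachability walk that follows include links as well as parent links. This is not fsck-cache itself, only an illustration over an in-memory dictionary; the `missing_objects` name, the store layout, and the commit ids are assumptions made for the example.

```python
def missing_objects(head, store):
    """Walk from `head`; return referenced commit ids that `store` lacks.

    `store` maps a commit id to {"parents": [...], "includes": [...]}.
    """
    missing, seen, stack = set(), set(), [head]
    while stack:
        commit_id = stack.pop()
        if commit_id in seen:
            continue
        seen.add(commit_id)
        entry = store.get(commit_id)
        if entry is None:
            missing.add(commit_id)
            continue
        stack.extend(entry.get("parents", []))
        # Following includes is what makes a pull bring the core along.
        stack.extend(entry.get("includes", []))
    return missing


store = {
    "cogito2": {"parents": ["cogito1"], "includes": ["gitpb7"]},
    "cogito1": {"parents": [], "includes": []},
}
needed = missing_objects("cogito2", store)
```

A pull tool could use the same walk to decide what to fetch: here the cogito history is complete, but the included git-pb commit still has to be transferred.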

	-Daniel
*This .sig left intentionally blank*



* Re: [RFC] Support projects including other projects
  2005-05-12  5:37     ` Junio C Hamano
  2005-05-12  6:04       ` Daniel Barkalow
@ 2005-05-12  6:14       ` Junio C Hamano
  1 sibling, 0 replies; 14+ messages in thread
From: Junio C Hamano @ 2005-05-12  6:14 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Petr Baudis, Linus Torvalds

>>>>> "JCH" == Junio C Hamano <junkio@cox.net> writes:

Daniel, I am sorry but I realize I completely misunderstood what
you meant by "projects including other projects".  What you are
trying to solve is the problem of feeding core GIT changes and
pure Cogito changes separately to the upstream _within_ _the_
_current_ _source_ _tree_ _structure_ of Cogito, isn't it?
That's where your juggling index files and other complexity
comes from, and I did not realize that was what you were talking
about.  I should have realized it when you mentioned kbuild.

Well, personally I do not think such project _overlays_ are
worth supporting, because they are rare and, to a certain
extent, simply an undisciplined way to organize the source
tree.  The kbuild case may be justified, but I vaguely recall
that a very similar build infrastructure was used by the busybox
folks---they could just be using their own copy of kbuild, for
that matter.

But as you said in a separate message, I agree that core GIT
layer is meant to be independent from what Porcelain you put on
it.  The relationship between Cogito and core GIT is not similar
to kbuild and the kernel.  It is more like a random X11
application and Xlib.  Having them in the same source tree,
intermixed, is less than optimal.

I would not be surprised if a future, if not the next, Cogito
release had its source tree organized more like the JIT sources,
shipping git-pb and cogito in separate directories, managed by
separate GIT_DIR.  That would make Pasky's life a lot simpler.

And once the separation happens, the issue becomes just the simple
package version matching every distribution does (e.g. Debian's
binary package and library dependencies, or source Build-Depends
dependencies), which is something that has already been solved.



* Re: [RFC] Support projects including other projects
  2005-05-12  6:04       ` Daniel Barkalow
@ 2005-05-12  6:28         ` Junio C Hamano
  2005-05-12 16:51           ` Daniel Barkalow
  0 siblings, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2005-05-12  6:28 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Petr Baudis, Linus Torvalds

>>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:

DB> My reasons for having it in the core are as follows:

DB>  - All of the porcelain layers have to, at least, agree as
DB>  to how this is represented in order for repositories to be
DB>  portable; since the representation is common, it might as
DB>  well be core.

That is weak.  .git/refs/heads/master is not core, but it is
something Porcelains need to agree on [*1*].

DB>  - There are currently no special files which are tracked for cogito (et 
DB>    al) to put the information in.

I am somewhat sympathetic to this, but then there are probably
a lot of other things that are more relevant than this "required
version" thing.  One thing that immediately comes to mind is the
dontdiff list.  Also, if you consider Cogito and GIT independent
projects as you said, you would probably need to have "require
{project-name} {commit-id}", not "include {commit-id}".  Things
start smelling much more like the traditional package version
matching issue which is outside of SCM (let alone core GIT).

DB>  - Ideally, the dependency would only be per-commit, not
DB>  per-tree; if Petr releases a new cogito which only merges a
DB>  new mainline with the git-pb, the cogito tree object should
DB>  be the same (since the cogito content didn't change). This
DB>  means that it can't be anywhere other than the commit.

As I already said, I consider the current "overlaid" directory
structure broken and not worth supporting in the toolset
[*2*].

DB>  - If the solution to the issue of finding the necessary
DB>  git-pb is to store it with cogito, then the programs that
DB>  pull from this repository need to know that they need to
DB>  pull the git-pb portion, and fsck-cache needs to know that
DB>  the cogito references the git-pb.

I do not think this is necessary for the same reason as I
dismissed the third point above.

[Footnotes]

*1* I consider git-pull-script one example of Porcelain; JIT
knows about it as well.

*2* "Broken" is probably too strong a word here.  I know Petr
did it that way because it was the simplest way to start, and I
started the same way when I started JIT, until I realized
separating the core and treating the core as something I can
borrow from the neighbouring directory is much easier to manage.
I think Petr knows this, and I further think that is why he
started git-pb.



* Re: [RFC] Support projects including other projects
  2005-05-12  5:46       ` Daniel Barkalow
@ 2005-05-12  6:33         ` James Purser
  0 siblings, 0 replies; 14+ messages in thread
From: James Purser @ 2005-05-12  6:33 UTC (permalink / raw)
  To: git

I've really got to remember that reply-to-all option.
On Thu, 2005-05-12 at 15:46, Daniel Barkalow wrote:
> On Thu, 12 May 2005, James Purser wrote:
> 
> > On Thu, 2005-05-12 at 15:19, Daniel Barkalow wrote:
> > > If you think about it as git and cogito being entirely separate projects,
> > > where users would be expected to have the right version of git most of the
> > > time (or ever), this is true. But I think that cogito is as closely tied
> > > to git as the kernel is to kbuild or kconfig; the difference is that git
> > > is not solely available with cogito, like kbuild is solely available with
> > > the kernel.
> > I tend to disagree with you on this point. Cogito and Git share
> > a relationship more akin to xorg and gnome and this is something I think
> > Linus intended so that it would be very easy to build a layer on top of
> > the git toolset. Cogito is great and it fills a need but give it time
> > and other implementations and tool sets will come along that may
> > supersede it.
> 
> The point of this feature is to support other implementations and tool
> sets. If there weren't other things using the git core, there would be no
> reason to leave the current situation where cogito simply includes the
> complete contents of git-pb. The relationship between cogito and git is,
> however, not at all like that between Gnome and x.org; gnome could not be
> started until X was essentially completely stable for several years (after
> which X could be reimplemented and extended, so long as it retained the
> same API). Cogito, on the other hand, is being developed concurrently with
> git, and substantially informs git development. The current cogito doesn't
> work completely correctly with any mainline git, whereas the current Gnome
> works with every x.org release as well as any XFree86 or most other X
> servers since the mid 90's.
> 
> Also, any particular user is probably only going to use one git-based
> system, but will almost certainly use many different X clients.
> 
>       -Daniel
> *This .sig left intentionally blank*
> 
Okay, gnome/xorg is a bad example. The point I was trying to get
across was that cogito and git are not as intertwined as you say: if
development of cogito stopped tomorrow, git would keep going and
another second-layer app would take its place.

Yes, cogito helps with git development, as it provides a great way to test
different situations in a different environment than you would normally
get by running the bare git tools yourself.

The way I have been reading things (and I may be wrong about this, it
wouldn't be the first time :)) is that git is THE base line providing
the necessary tools and structure for anyone who wishes to build an
application on top. Cogito is an example of that second layer app, built
on top of the toolset and still able to talk to non cogito managed
trees. Sort of like CVS and its various client implementations (Command
Line, GCVS etc).

Again I may have gotten things arse about, if I have then I blame lack
of sleep :)
-- 
James Purser
http://ksit.dynalias.com



* Re: [RFC] Support projects including other projects
  2005-05-12  6:28         ` Junio C Hamano
@ 2005-05-12 16:51           ` Daniel Barkalow
  2005-05-12 17:24             ` David Lang
  2005-05-12 18:47             ` Junio C Hamano
  0 siblings, 2 replies; 14+ messages in thread
From: Daniel Barkalow @ 2005-05-12 16:51 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Petr Baudis, Linus Torvalds

On Wed, 11 May 2005, Junio C Hamano wrote:

> >>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:
> 
> DB> My reasons for having it in the core are as follows:
> 
> DB>  - All of the porcelain layers have to, at least, agree as
> DB>  to how this is represented in order for repositories to be
> DB>  portable; since the representation is common, it might as
> DB>  well be core.
> 
> That is weak.  .git/refs/heads/master is not core, but it is
> something Porcelains need to agree on [*1*].

I think it is a defect of the current core that it fails to completely
specify a portable repository format. Obviously, it is not necessary to
have things in the core for this reason, but it's also not necessary to
have anything at all in the core. We could eliminate commits entirely in
favor of putting the information in special files in trees, and it would
still be as complete as it is, although it would also be unmaintainable.

> DB>  - There are currently no special files which are tracked for cogito (et 
> DB>    al) to put the information in.
> 
> I am somewhat sympathetic to this, but then there are probably
> a lot of other things that are more relevant than this "required
> version" thing.  One thing that immediately comes to mind is the
> dontdiff list.

The dontdiff list isn't expected to change with every commit, however.

> Also, if you consider Cogito and GIT independent projects as you said,
> you would probably need to have "require {project-name} {commit-id}",
> not "include {commit-id}".

I *don't* consider Cogito and GIT to be independent projects. GIT is
independent of Cogito, but Cogito includes GIT as part of it.

If you don't like the structure of Cogito, I have a set of projects at
work, where I have a bunch of microcontroller programs and a library of
common code. Traditionally, there are two possible arrangements: either
they are all separate projects, in which case the user has to figure out
what versions match, or they are the same project, in which case everybody
has to get everything. What I would like is to have the library consider
itself a separate project, but each program consider itself, in some
sense, the same project as the library (but not as other programs).

> Things start smelling much more like the traditional package version 
> matching issue which is outside of SCM (let alone core GIT).

Once the core portion matures to the point where it gets used without
program-specific patches, it can be done outside of SCM. But it doesn't
make sense to have an SCM require that the projects are really mature in
order to work well, since active development is supposed to be what an SCM
is for.

> DB>  - Ideally, the dependency would only be per-commit, not
> DB>  per-tree; if Petr releases a new cogito which only merges a
> DB>  new mainline with the git-pb, the cogito tree object should
> DB>  be the same (since the cogito content didn't change). This
> DB>  means that it can't be anywhere other than the commit.
> 
> As I already said, I consider the current "overlaid" directory
> structure broken and not worth supporting in the toolset

You missed my point here entirely. I think that the cogito tree, including
any non-source files in it (if there are such), should be the same. So the
dependency can't be tracked in the tree.
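
A small sketch of why this holds, assuming object ids are computed the way present-day git computes them (SHA-1 over a "type size" header, a NUL byte, and the body). The "include" header is the proposal from this thread, not a real git header, and the `object_id` and `commit_body` helpers and the 40-character ids are made up for the example.

```python
import hashlib


def object_id(kind: str, body: bytes) -> str:
    """Name an object by hashing "<type> <size>", a NUL byte, and the body."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest()


def commit_body(tree_id: str, include_id: str, message: str) -> bytes:
    # "include" is the hypothetical header; real commits carry more fields.
    return f"tree {tree_id}\ninclude {include_id}\n\n{message}\n".encode()


# Stand-in for the cogito tree: an empty tree keeps the example short.
cogito_tree = object_id("tree", b"")

# Two releases that differ only in which git-pb commit they include.
old = object_id("commit", commit_body(cogito_tree, "1" * 40, "cogito release"))
new = object_id("commit", commit_body(cogito_tree, "2" * 40, "cogito release"))
```

Both commits name the same tree, so only the commit ids differ. Had the dependency been written to a tracked file instead, the tree id itself would have changed.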

> DB>  - If the solution to the issue of finding the necessary
> DB>  git-pb is to store it with cogito, then the programs that
> DB>  pull from this repository need to know that they need to
> DB>  pull the git-pb portion, and fsck-cache needs to know that
> DB>  the cogito references the git-pb.
> 
> I do not think this is necessary for the same reason as I
> dismissed the third point above.

Do you have some solution to the problem of having the porcelain
layer (or the end user) find the version of git that a version of cogito
needs, in some way such that if I'm working on the project and make a
change to cogito and a matching change to git, Petr can get them?

	-Daniel
*This .sig left intentionally blank*



* Re: [RFC] Support projects including other projects
  2005-05-12 16:51           ` Daniel Barkalow
@ 2005-05-12 17:24             ` David Lang
  2005-05-12 18:47             ` Junio C Hamano
  1 sibling, 0 replies; 14+ messages in thread
From: David Lang @ 2005-05-12 17:24 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: Junio C Hamano, git, Petr Baudis, Linus Torvalds

I was thinking about this recently while reading an article on bittorrent 
and how it works, and it occurred to me that perhaps the network access 
model of git should be reexamined.

git produces a large pool of objects, and there are two ways that people 
want to access these objects.

1. pull the current version of a project (either a straight 'checkout' 
type pull or a 'merge' to a local project)

2. pull the objects necessary for past versions of a project (either all 
the way back to the beginning of time or back to some point, that point 
being one of a number of possibilities: date, version, things you don't 
have, etc.)

in either case the key things are the indexes related to a 
particular project; the objects themselves could all be in one huge pool 
for all projects that ever existed (this doesn't make sense if you use 
rsync to copy repositories as Linux originally did, but if you have a 
more git-aware transport it can make sense)

I believe that there are going to be quite a number of cases where the 
same object is used for multiple projects (either because the project is a 
fork of another project or because some functions (or include files) are 
so trivial that they are basically boilerplate and get reused or recreated). 
If you think about a major mirror server distributing a dozen linux 
distros via git you will realize that in many cases the source files, 
scripts, and even the binaries are really going to be 
identical objects for all the distros, so an ftp/http server that used a git 
filesystem could result in a pretty significant saving in disk space.

In addition, when you are doing a pull you can accept data from 
non-authoritative sources since each object (and its index info) includes 
enough info to validate that the object hasn't been tampered with (at least 
until such time as the hashes are sufficiently broken, but that's another 
debate, and we had that one :-). So a bittorrent-like peer sharing system 
to fetch objects identified by the index files would open the potential 
for saving significant bandwidth on the master servers while not 
compromising the trees at all.
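
A minimal sketch of that validation step, assuming object ids are computed as in present-day git (SHA-1 over a "type size" header, a NUL byte, and the body). The `verify` and `fetch` functions are hypothetical, and peers are modelled as plain callables rather than any real network protocol.

```python
import hashlib


def verify(expected_id: str, kind: str, body: bytes) -> bool:
    """True if the body really hashes to the id we asked for."""
    header = f"{kind} {len(body)}\0".encode()
    return hashlib.sha1(header + body).hexdigest() == expected_id


def fetch(expected_id, kind, peers):
    """Ask each peer in turn; accept the first copy that verifies."""
    for peer in peers:
        body = peer(expected_id)
        if body is not None and verify(expected_id, kind, body):
            return body
    return None  # nobody offered a valid copy


# The id of the empty blob, as computed by git.
wanted = "e69de29bb2d1d6434b8b29ae775ad8c2e48c5391"


def tampering_peer(object_id):
    return b"tampered"


def honest_peer(object_id):
    return b""


result = fetch(wanted, "blob", [tampering_peer, honest_peer])
```

Because the id is derived from the content, the receiver does not need to trust whoever served the bytes, only the id it started from.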

Going back (somewhat) to the subject at hand, with something like this you 
should be able to combine as many projects as you want in one repository, 
and the only issue would be the work necessary to go through that 
repository and all the index files that point at it when you want to prune 
old data out of the object pool to save disk space.

Thoughts? Unfortunately I don't have the time to even consider coding 
something like this up, but hopefully it will spark interest for someone 
who does.

David Lang

-- 
There are two ways of constructing a software design. One way is to make it so simple that there are obviously no deficiencies. And the other way is to make it so complicated that there are no obvious deficiencies.
  -- C.A.R. Hoare


* Re: [RFC] Support projects including other projects
  2005-05-12 16:51           ` Daniel Barkalow
  2005-05-12 17:24             ` David Lang
@ 2005-05-12 18:47             ` Junio C Hamano
  2005-05-12 19:12               ` Daniel Barkalow
  1 sibling, 1 reply; 14+ messages in thread
From: Junio C Hamano @ 2005-05-12 18:47 UTC (permalink / raw)
  To: Daniel Barkalow; +Cc: git, Petr Baudis, Linus Torvalds

>>>>> "DB" == Daniel Barkalow <barkalow@iabervon.org> writes:

DB> Do you have some solution to the problem of having the
DB> porcelain layer (or the end user) find the version of git
DB> that a version of cogito needs, in some way such that if I'm
DB> working on the project and make a change to cogito and a
DB> matching change to git, Petr can get them?

I have to think about this a bit but let me understand the
problem first.  Let's say it is a couple of weeks ago, when there
was no cg-status.  You write cg-status, by adding a -t flag to
ls-files.c.  You commit the addition of the -t flag to the git-pb
repository and note the commit id.  You then commit the addition of
cg-status to the cogito repository, and when you do so you want the
party that pulls the latter commit to know it needs the former
commit in the git-pb tree.  Is that what you are solving here?




* Re: [RFC] Support projects including other projects
  2005-05-12 18:47             ` Junio C Hamano
@ 2005-05-12 19:12               ` Daniel Barkalow
  0 siblings, 0 replies; 14+ messages in thread
From: Daniel Barkalow @ 2005-05-12 19:12 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Petr Baudis, Linus Torvalds

On Thu, 12 May 2005, Junio C Hamano wrote:

> I have to think about this a bit but let me understand the
> problem first.  Let's say it is a couple of weeks ago, when there
> was no cg-status.  You write cg-status, by adding a -t flag to
> ls-files.c.  You commit the addition of the -t flag to the git-pb
> repository and note the commit id.  You then commit the addition of
> cg-status to the cogito repository, and when you do so you want the
> party that pulls the latter commit to know it needs the former
> commit in the git-pb tree.  Is that what you are solving here?

Right; and I'm not Petr, so the place that has the -t flag in ls-files
isn't his git-pb repository, and I'm not going to remember to tell him
about two places to pull from or two heads to pull.

Probably my biggest concern here is that it has to not make anything more
difficult for Cogito hackers (or people working on similarly arranged
projects) to have the other project demarcated as separate, or they'd tend
to be lazy and the upstream core will suffer. I believe that this is why
people in practice tend not to bother making projects clean and modular
with current tools. Having it streamlined and automatic would mean that
people in the position that Petr was in when he started would do it by
default.

	-Daniel
*This .sig left intentionally blank*


