* Migrating svn to git with heavy use of externals
@ 2008-03-31 20:59 D. Stuart Freeman
2008-04-08 18:07 ` D. Stuart Freeman
0 siblings, 1 reply; 48+ messages in thread
From: D. Stuart Freeman @ 2008-03-31 20:59 UTC (permalink / raw)
To: git
[-- Attachment #1: Type: text/plain, Size: 1045 bytes --]
I'm a developer on the Sakai project. I think Sakai could benefit
greatly from use of git because we have a huge need to track local
changes while contributing back to a central codebase. I've started
looking at git-svn and have managed to get a copy of our repository into
git, and looked at the stuff to do with submodules as a replacement for
externals. The problem is we rely very heavily on externals, for
instance when we make a tag for release we tag all the modules at the
same time and use an externals file to build the release from those
tags. I realize that's probably not a best practice, but it's what we
do. Our latest release is here:
https://source.sakaiproject.org/svn/sakai/tags/sakai_2-5-0/ if you want
to get an idea of the scope of the problem. How would you convert this
to a git repository? I'm currently looking at
http://blog.alieniloquent.com/2008/03/08/git-svn-with-svnexternals/ but
that doesn't look like it would leave all the old release tags intact.
--
D. Stuart Freeman
Georgia Institute of Technology
[-- Attachment #2: stuart_freeman.vcf --]
[-- Type: text/x-vcard, Size: 162 bytes --]
begin:vcard
fn:D. Stuart Freeman
n:Freeman;Douglas
email;internet:stuart.freeman@et.gatech.edu
tel;work:(404)385-1473
x-mozilla-html:FALSE
version:2.1
end:vcard
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Migrating svn to git with heavy use of externals
2008-03-31 20:59 Migrating svn to git with heavy use of externals D. Stuart Freeman
@ 2008-04-08 18:07 ` D. Stuart Freeman
2008-04-08 20:06 ` Avery Pennarun
0 siblings, 1 reply; 48+ messages in thread
From: D. Stuart Freeman @ 2008-04-08 18:07 UTC (permalink / raw)
To: stuart.freeman; +Cc: git
Maybe I should clarify.
I've imported an svn-managed project into a git repository
with 71 submodules. What I don't understand is this: if I
have a branch called 2-5-x and another called 2-4-x in each of
the submodules and in the superproject, is there a way to
associate those?
D. Stuart Freeman wrote:
> I'm a developer on the Sakai project. I think Sakai could benefit
> greatly from use of git because we have a huge need to track local
> changes while contributing back to a central codebase. I've started
> looking at git-svn and have managed to get a copy of our repository into
> git, and looked at the stuff to do with submodules as a replacement for
> externals. The problem is we rely very heavily on externals, for
> instance when we make a tag for release we tag all the modules at the
> same time and use an externals file to build the release from those
> tags. I realize that's probably not a best practice, but it's what we
> do. Our latest release is here:
> https://source.sakaiproject.org/svn/sakai/tags/sakai_2-5-0/ if you want
> to get an idea of the scope of the problem. How would you convert this
> to a git repository? I'm currently looking at
> http://blog.alieniloquent.com/2008/03/08/git-svn-with-svnexternals/ but
> that doesn't look like it would leave all the old release tags intact.
>
--
D. Stuart Freeman
Georgia Institute of Technology
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Migrating svn to git with heavy use of externals
2008-04-08 18:07 ` D. Stuart Freeman
@ 2008-04-08 20:06 ` Avery Pennarun
2008-04-08 20:49 ` D. Stuart Freeman
0 siblings, 1 reply; 48+ messages in thread
From: Avery Pennarun @ 2008-04-08 20:06 UTC (permalink / raw)
To: stuart.freeman; +Cc: git
On Tue, Apr 8, 2008 at 2:07 PM, D. Stuart Freeman
<stuart.freeman@et.gatech.edu> wrote:
> Maybe I should clarify.
> I've imported an svn managed project into a git repository
> with 71 submodules, what I don't understand though is if I
> have a branch called 2-5-x and another called 2-4-x in each of
> the submodules and the superproject, is there a way to
> associate those?
I don't think git-svn currently knows how to import svn:externals
properly. Basically you'd have to do it yourself, perhaps with the
help of something like git-filter-branch and a shell script.
The equivalent of svn:externals in git is called git-submodule, and
it's actually much more powerful than svn:externals, because you can
link to a *specific revision* and not just a branch. In other words,
I can set up my application to point at r2956 of a library, so even if
the library changes in the future, my application always gets exactly
that version. (To have the app use the later version, you have to
'git pull' in the submodule, then make a commit in the application
module.)
See "man git-submodule" and "man git-filter-branch" for more information.
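A minimal sketch of the pinning behaviour described above. It assumes a reasonably recent git; the repository names "app" and "lib" and the helper g() are made up for illustration, and a nested repository added with plain "git add" stands in for a real submodule (a real setup would use "git submodule add" and carry a .gitmodules file).

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Hypothetical helper: run git with a throwaway identity so commits succeed anywhere.
g() { git -c user.name=Example -c user.email=example@example.invalid "$@"; }

g init -q app
cd app
g init -q lib                                  # nested repo standing in for the library
g -C lib commit -q --allow-empty -m "lib v1"
g add lib 2>/dev/null                          # records lib's current commit in app's index
g commit -q -m "app: pin lib v1"
pinned_v1=$(g rev-parse HEAD:lib)

# The library moves on, but the application still points at v1.
g -C lib commit -q --allow-empty -m "lib v2"
still_v1=$(g rev-parse HEAD:lib)

# Moving to the newer version takes an explicit commit in the application.
g add lib 2>/dev/null
g commit -q -m "app: move lib to v2"
pinned_v2=$(g rev-parse HEAD:lib)
```

The application only follows the library when someone stages the new submodule commit and commits that in the application.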
If I'm wrong and git-svn already supports svn:externals, I'm sure
someone will correct me :)
Have fun,
Avery
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Migrating svn to git with heavy use of externals
2008-04-08 20:06 ` Avery Pennarun
@ 2008-04-08 20:49 ` D. Stuart Freeman
2008-04-08 21:01 ` Avery Pennarun
0 siblings, 1 reply; 48+ messages in thread
From: D. Stuart Freeman @ 2008-04-08 20:49 UTC (permalink / raw)
To: Avery Pennarun; +Cc: git
Avery Pennarun wrote:
> On Tue, Apr 8, 2008 at 2:07 PM, D. Stuart Freeman
> <stuart.freeman@et.gatech.edu> wrote:
>> Maybe I should clarify.
>> I've imported an svn managed project into a git repository
>> with 71 submodules, what I don't understand though is if I
>> have a branch called 2-5-x and another called 2-4-x in each of
>> the submodules and the superproject, is there a way to
>> associate those?
>
> I don't think git-svn currently knows how to import svn:externals
> properly. Basically you'd have to do it yourself, perhaps with the
> help of something like git-filter-branch and a shell script.
>
> The equivalent of svn:externals in git is called git-submodule, and
> it's actually much more powerful than svn:externals, because you can
> link to a *specific revision* and not just a branch. In other words,
> I can set up my application to point at r2956 of a library, so even if
> the library changes in the future, my application always gets exactly
> that version. (To have the app use the later version, you have to
> 'git pull' in the submodule, then make a commit in the application
> module.)
>
> See "man git-submodule" and "man git-filter-branch" for more information.
>
> If I'm wrong and git-svn already supports svn:externals, I'm sure
> someone will correct me :)
>
> Have fun,
>
> Avery
It's possible to have svn:externals point at a specific revision, but
that's not the point. I'm convinced that submodules are the answer; I'm
just not sure how to make them work. Assume "sakai" is the superproject
and "access" is a submodule. I've done:
cd sakai
git checkout work
git submodule add ../access access
And that's cool, but then I do:
cd ../access
git checkout -b 2-5-x sakai_2-5-x # sakai_2-5-x is an svn import
cd ../sakai
git checkout -b 2-5-x sakai_2-5-x
git submodule add -b 2-5-x ../access access
Which gives me an error about access already existing. I'm pretty sure
I'm just not thinking about this the way git does; I blame svn for
damaging my brain.
--
D. Stuart Freeman
Georgia Institute of Technology
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Migrating svn to git with heavy use of externals
2008-04-08 20:49 ` D. Stuart Freeman
@ 2008-04-08 21:01 ` Avery Pennarun
2008-04-08 22:47 ` D. Stuart Freeman
2008-04-09 3:03 ` Roman Shaposhnik
0 siblings, 2 replies; 48+ messages in thread
From: Avery Pennarun @ 2008-04-08 21:01 UTC (permalink / raw)
To: stuart.freeman; +Cc: git
On Tue, Apr 8, 2008 at 4:49 PM, D. Stuart Freeman
<stuart.freeman@et.gatech.edu> wrote:
> It's possible to have svn:externals point at a specific revision, but
> that's not the point.
Right, I forgot that they added that.
> cd ../access
> git checkout -b 2-5-x sakai_2-5-x # sakai_2-5-x is an svn import
> cd ../sakai
> git checkout -b 2-5-x sakai_2-5-x
> git submodule add -b 2-5-x ../access access
>
> Which gives me an error about access already existing.
You should only ever need to add a given submodule once. As far as I
can tell, this is a bit of confusion in the way git-submodule works,
but I don't have any suggestions for what to do about it yet so I
don't complain :)
The way to understand git-submodule's operation is in terms of what it
actually does. Roughly speaking, git-submodule-add puts things into
.gitmodules and .git/config; git-submodule-init copies that stuff from
.gitmodules to .git/config (so if you're the guy who did the add, you
can skip this step). Then git-commit actually checks the submodule
reference into the parent tree, and someone who pulls your parent tree
needs to run git-submodule-update in order to actually retrieve the
new submodule pointer.
In other words, git-submodule is very powerful, but also very
complicated, and at least one of the things I said up above is
probably wrong :) By comparison, svn:externals is at least easy to
understand.
Anyway, in this case, what you need to know is that .git/config
already contains your submodule information. Sadly, .gitmodules was
probably committed only on your original branch, so it probably
doesn't exist on the new one. You could remove the entry from
.git/config by hand and use git-submodule-add again (thus putting it
in both places), or copy the .gitmodules file from the original
branch, or git-cherry-pick the commit where you added it.
You should *also* cd into the access subdir and checkout the right
revision there; at that time, the next commit to the sakai repository
will make sure the submodule reference is to the right place.
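One way the "copy .gitmodules from the original branch, then check out the right revision in the subdir" route could look, as a sketch only. It assumes a reasonably recent git. The layout is invented for the demo: a nested "access" repository is registered by hand instead of through "git submodule add", the helper g() is hypothetical, and the branch names simply mirror the ones in this thread.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Hypothetical helper: run git with a throwaway identity so commits succeed anywhere.
g() { git -c user.name=Example -c user.email=example@example.invalid "$@"; }

g init -q sakai
cd sakai
g symbolic-ref HEAD refs/heads/work            # name the first branch "work"
g commit -q --allow-empty -m "superproject root"
base=$(g rev-parse HEAD)

# Nested repo standing in for the "access" module, with two branches.
g init -q access
g -C access symbolic-ref HEAD refs/heads/work
g -C access commit -q --allow-empty -m "access on work"
g -C access checkout -q -b 2-5-x
g -C access commit -q --allow-empty -m "access on 2-5-x"
g -C access checkout -q work

# Register the module once, on the superproject's "work" branch.
g config -f .gitmodules submodule.access.path access
g config -f .gitmodules submodule.access.url ../access
g add .gitmodules access 2>/dev/null
g commit -q -m "add access submodule"

# A second superproject branch that starts without the submodule.
g checkout -q -b 2-5-x "$base" 2>/dev/null
g checkout work -- .gitmodules                 # copy the file from the original branch
g -C access checkout -q 2-5-x                  # pick the matching revision in the subdir
g add access 2>/dev/null                       # record that revision on this branch
g commit -q -m "pin access to its 2-5-x branch"
```

Each superproject branch now records its own commit for "access", which is how the 2-5-x branches end up associated.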
Phew, I hope that made things more clear instead of less clear. :)
Have fun,
Avery
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Migrating svn to git with heavy use of externals
2008-04-08 21:01 ` Avery Pennarun
@ 2008-04-08 22:47 ` D. Stuart Freeman
2008-04-09 3:03 ` Roman Shaposhnik
1 sibling, 0 replies; 48+ messages in thread
From: D. Stuart Freeman @ 2008-04-08 22:47 UTC (permalink / raw)
To: Avery Pennarun; +Cc: git
Avery Pennarun wrote:
> Anyway, in this case, what you need to know is that .git/config
> already contains your submodule information. Sadly, .gitmodules is
> probably sitting somewhere on your original branch, so it probably
> doesn't exist. You could remove the entry from .git/config by hand
> and use git-submodule-add again (thus putting it in both places), or
> copy the .gitmodules file from the original branch, or git-cherry-pick
> the commit where you added it.
>
> You should *also* cd into the access subdir and checkout the right
> revision there; at that time, the next commit to the sakai repository
> will make sure the submodule reference is to the right place.
>
> Phew, I hope that made things more clear instead of less clear. :)
>
> Have fun,
>
> Avery
OK, this makes a lot more sense now. Thanks.
--
Stuart
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Migrating svn to git with heavy use of externals
2008-04-08 21:01 ` Avery Pennarun
2008-04-08 22:47 ` D. Stuart Freeman
@ 2008-04-09 3:03 ` Roman Shaposhnik
2008-04-09 3:33 ` Avery Pennarun
1 sibling, 1 reply; 48+ messages in thread
From: Roman Shaposhnik @ 2008-04-09 3:03 UTC (permalink / raw)
To: Avery Pennarun; +Cc: stuart.freeman, git
On Apr 8, 2008, at 2:01 PM, Avery Pennarun wrote:
> The way to understand git-submodule's operation is in terms of what it
> actually does. Roughly speaking, git-submodule-add puts things into
> .gitmodules and .git/config;
I could be mistaken, but I don't think "git submodule add" does anything
to the .git/config. In fact, how settings migrate between .gitmodules
and .git/config has been a long standing source of slight confusion
for me.
Please correct me if I'm wrong, but it seems that the only reason
for the file .gitmodules to be there at all is because it can be
revved through commits, just as any file under Git's control.
.git/config doesn't have such a property. Other than that, it is not
really needed, right?
> In other words, git-submodule is very powerful, but also very
> complicated,
Speaking of complications, it took me awhile to realize that 90%
of the Submodule magic seems to be based on the secret ability of
tree objects to hold references not only to blobs and trees but
also to *commits*:
$ git init
$ mkdir foo ; cd foo
$ git init
$ touch file
$ git add file
$ git commit -mInit
Created initial commit 5a61c46: Init
0 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 file
Now observe:
$ cd ..
$ git add foo
$ git ls-files --stage
160000 5a61c4698ca56004c2532ce02e6736cfd2e977d1 0 foo
The '5a61c46' is simply a reference to the commit we've just created.
Of course, it is insufficient to know what commit "foo" refers
to unless we also know what Git repo this commit resides in. And that's
where the mapping between a path ("foo") and a url pointing to
the submodule comes into play.
This is a cool property and it lets you build functionality like
"git submodules" in a variety of different ways. I just wish it
was documented somewhere.
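A small sketch of the same observation after a commit, assuming a reasonably recent git. The helper g() and the repository names are illustrative only. It also checks the point above that the commit id alone is not enough: the object lives in foo's repository, not in the superproject's.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Hypothetical helper: run git with a throwaway identity so commits succeed anywhere.
g() { git -c user.name=Example -c user.email=example@example.invalid "$@"; }

g init -q super
cd super
g init -q foo
g -C foo commit -q --allow-empty -m Init
g add foo 2>/dev/null
g commit -q -m "record foo"

g ls-tree HEAD foo                             # mode 160000, object type "commit"
entry_type=$(g ls-tree HEAD foo | awk '{print $2}')
entry_sha=$(g rev-parse HEAD:foo)

# The referenced commit is not stored in the superproject's object database.
if g cat-file -e "$entry_sha" 2>/dev/null; then in_super=yes; else in_super=no; fi
echo "commit object present in superproject: $in_super"
```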
> Anyway, in this case, what you need to know is that .git/config
> already contains your submodule information. Sadly, .gitmodules is
> probably sitting somewhere on your original branch, so it probably
> doesn't exist. You could remove the entry from .git/config by hand
> and use git-submodule-add again (thus putting it in both places), or
> copy the .gitmodules file from the original branch, or git-cherry-pick
> the commit where you added it.
That's exactly what makes me doubtful about .gitmodules being the
best place for storing the url, but then again, I don't have any
better ideas. :-( Yet ;-)
Thanks,
Roman.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Migrating svn to git with heavy use of externals
2008-04-09 3:03 ` Roman Shaposhnik
@ 2008-04-09 3:33 ` Avery Pennarun
2008-04-09 4:39 ` Roman Shaposhnik
0 siblings, 1 reply; 48+ messages in thread
From: Avery Pennarun @ 2008-04-09 3:33 UTC (permalink / raw)
To: Roman Shaposhnik; +Cc: stuart.freeman, git
On Tue, Apr 8, 2008 at 11:03 PM, Roman Shaposhnik <rvs@sun.com> wrote:
> On Apr 8, 2008, at 2:01 PM, Avery Pennarun wrote:
> > The way to understand git-submodule's operation is in terms of what it
> > actually does. Roughly speaking, git-submodule-add puts things into
> > .gitmodules and .git/config;
>
> I could be mistaken, but I don't think "git submodule add" does anything
> to the .git/config. In fact, how settings migrate between .gitmodules
> and .git/config has been a long standing source of slight confusion
> for me.
>
> Please correct me if I'm wrong, but it seems that the only reason
> for the file .gitmodules to be there at all is because it can be
> revved through commits, just as any file under Git's control.
> .git/config doesn't have such a property. Other than that, it is not
> really needed, right?
You have the last paragraph right, but I think the first paragraph is wrong :)
.gitmodules doesn't do anything unless git-submodule reads it, which
it does in git-submodule-init and git-submodule-add. (You know
git-submodule-add is screwing with .git/config because you don't need
to call git-submodule-init when you use it.) git-submodule-update,
AFAICT, just reads .git/config.
> Speaking of complications, it took me awhile to realize that 90%
> of the Submodule magic seems to be based on the secret ability of
> tree objects to hold references not only to blobs and trees but
> also to *commits*:
Indeed, this is the majority of the coolness right there. The rest of
the screwiness with .gitmodules and so on is really just to support
fetching the objects for the submodules from repos other than the
primary supermodule one.
Also, git-checkout seems to explicitly *not* checkout refs to commits
by itself; you have to call git-submodule-update for that. This is
probably because git-checkout wouldn't know what to do if the
submodule were dirty (ie. the sub-checkout couldn't complete because
files had been changed but not checked in). This is useful in a
not-destroying-my-data way, but the behaviour isn't too obvious or
coherent.
> That's exactly what makes me doubtful about .gitmodules being the
> best place for storing the url, but then again, I don't have any
> better ideas. :-( Yet ;-)
There's definitely no better place; .git/config isn't versioned, and
URLs don't belong in the tree objects themselves, which are otherwise
location-neutral and transport-neutral.
In my own use case, I think having all the objects from the
supermodule *and* submodules all be in the same repo is what I want.
This kind of obviates the need for .gitmodules entirely, if
git-checkout and friends will do the right thing. I think I'll submit
some patches eventually once I have this figured out properly.
Have fun,
Avery
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Migrating svn to git with heavy use of externals
2008-04-09 3:33 ` Avery Pennarun
@ 2008-04-09 4:39 ` Roman Shaposhnik
2008-04-09 6:34 ` Avery Pennarun
0 siblings, 1 reply; 48+ messages in thread
From: Roman Shaposhnik @ 2008-04-09 4:39 UTC (permalink / raw)
To: Avery Pennarun; +Cc: stuart.freeman, git
On Apr 8, 2008, at 8:33 PM, Avery Pennarun wrote:
> On Tue, Apr 8, 2008 at 11:03 PM, Roman Shaposhnik <rvs@sun.com> wrote:
>> On Apr 8, 2008, at 2:01 PM, Avery Pennarun wrote:
>>> The way to understand git-submodule's operation is in terms of
>>> what it actually does. Roughly speaking, git-submodule-add puts
>>> things into .gitmodules and .git/config;
>>
>> I could be mistaken, but I don't think "git submodule add" does
>> anything to the .git/config. In fact, how settings migrate between
>> .gitmodules and .git/config has been a long standing source of
>> slight confusion for me.
>>
>> Please correct me if I'm wrong, but it seems that the only reason
>> for the file .gitmodules to be there at all is because it can be
>> revved through commits, just as any file under Git's control.
>> .git/config doesn't have such a property. Other than that, it is not
>> really needed, right?
>
> You have the last paragraph right, but I think the first paragraph
> wrong :)
Well, maybe we are talking about slightly different things, or
there's a version mismatch, but here's what I get with 1.5.4.5:
$ git init
$ git submodule add /tmp/GIT/1
Initialized empty Git repository in /tmp/GIT/3/1/.git/
$ cat .git/config
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
Now, .gitmodules is there alright, so if I do:
$ git submodule init
I get the migration of settings to .git/config:
$ cat .git/config
[core]
repositoryformatversion = 0
filemode = true
bare = false
logallrefupdates = true
[submodule "1"]
url = /tmp/GIT/1/.git
>> Speaking of complications, it took me awhile to realize that 90%
>> of the Submodule magic seems to be based on the secret ability of
>> tree objects to hold references not only to blobs and trees but
>> also to *commits*:
>
> Indeed, this is the majority of the coolness right there. The rest of
> the screwiness with .gitmodules and so on is really just to support
> fetching the objects for the submodules from repos other than the
> primary supermodule one.
Yeap. It all reminds me a bit of symbolic links in file systems, with
the key difference being that symbolic links can only point to an
object, whereas we can actually reference a particular *state* of that
object. I like this ability very much. It comes in especially handy
when a single component participates in multiple superprojects (being
referenced as a submodule) and every single one of them can reference
a state of the component that they like.
That said, I still can't quite figure out how to do a very basic thing:
how can I change the SHA1 that a tree object refers to without
checking out the corresponding submodule first? IOW, suppose I've
just cloned ~/Superproject into /tmp/super-clone. All of the submodules
are still empty (nothing has been checked out into them yet) and all
I want to do is to bump the version of one submodule,
/tmp/super-clone/foo, and then push the changes back to
~/Superproject, so that everybody who pulls from it gets the newer
foo. The procedure outlined in Git's manual seems pretty heavyweight
for such a simple thing.
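One possible lighter-weight route, sketched under assumptions: "git update-index --cacheinfo" can rewrite the submodule entry directly in the index, so nothing has to be checked out into foo. The comma-separated form used here is the one current git documents (older releases took mode, sha and path as three separate arguments). The repository names and the helper g() are made up, and the push back to the superproject is left out.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Hypothetical helper: run git with a throwaway identity so commits succeed anywhere.
g() { git -c user.name=Example -c user.email=example@example.invalid "$@"; }

# Stand-in for the repository that foo's URL would point at.
g init -q foo-upstream
g -C foo-upstream commit -q --allow-empty -m "foo v1"
old_sha=$(g -C foo-upstream rev-parse HEAD)
g -C foo-upstream commit -q --allow-empty -m "foo v2"
new_sha=$(g -C foo-upstream rev-parse HEAD)

# Superproject that pins foo at v1.
g init -q Superproject
g -C Superproject update-index --add --cacheinfo 160000,"$old_sha",foo
g -C Superproject commit -q -m "pin foo v1"

# Fresh clone: foo is an empty directory, nothing is checked out into it.
g clone -q Superproject super-clone
cd super-clone

# Bump the recorded commit without populating foo.
# The new id must exist in foo's own repository, or later updates will fail.
g update-index --cacheinfo 160000,"$new_sha",foo
g commit -q -m "bump foo to v2"
```

After this, a normal push of the superproject branch would publish the bump.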
>> That's exactly what makes me doubtful about .gitmodules being the
>> best place for storing the url, but then again, I don't have any
>> better ideas. :-( Yet ;-)
>
> There's definitely no better place; .git/config isn't versioned, and
> URLs don't belong in the tree objects themselves, which are otherwise
> location-neutral and transport-neutral.
Agreed. But I guess I'd be less confused if "git submodule" didn't muck
with .git/config at all. Or are there any other consumers of the
information that it puts there (except itself)?
> In my own use case, I think having all the objects from the
> supermodule *and* submodules all be in the same repo is what I want.
> This kind of obviates the need for .gitmodules entirely, if
> git-checkout and friends will do the right thing. I think I'll submit
> some patches eventually once I have this figured out properly.
Hm. But what about those who might want to pull from you? .git/config
doesn't propagate, which means that they'll be kind of stuck, don't
you think?
Thanks,
Roman.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Migrating svn to git with heavy use of externals
2008-04-09 4:39 ` Roman Shaposhnik
@ 2008-04-09 6:34 ` Avery Pennarun
2008-04-09 6:43 ` Junio C Hamano
2008-04-09 19:57 ` Roman Shaposhnik
0 siblings, 2 replies; 48+ messages in thread
From: Avery Pennarun @ 2008-04-09 6:34 UTC (permalink / raw)
To: Roman Shaposhnik; +Cc: stuart.freeman, git
On Wed, Apr 9, 2008 at 12:39 AM, Roman Shaposhnik <rvs@sun.com> wrote:
> Agreed. But I guess I'd be less confused if "git submodule" didn't muck
> with .git/config at all. Or are there any other consumers of the
> information that it puts there (except itself)?
That I don't know. If there aren't any others, then I agree, I'm not
sure what the whole .git/config messing is about.
> > In my own use case, I think having all the objects from the
> > supermodule *and* submodules all be in the same repo is what I want.
> > This kind of obviates the need for .gitmodules entirely, if
> > git-checkout and friends will do the right thing. I think I'll submit
> > some patches eventually once I have this figured out properly.
>
> Hm. But what about those who might want to pull from you? .git/config
> doesn't propagate, which means that they'll be kind of stuck, don't
> you think?
Not exactly. The idea is that if the supermodule and submodules are
all lumped into a single repo (and your refs are set up correctly),
then cloning the supermodule will also clone all the submodules. So
everyone will have all the necessary refs anyway; as long as
git-checkout checks them out, .gitmodules shouldn't have to exist at
all, because there's nothing "special" for git-submodule to do.
Have fun,
Avery
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Migrating svn to git with heavy use of externals
2008-04-09 6:34 ` Avery Pennarun
@ 2008-04-09 6:43 ` Junio C Hamano
2008-04-10 3:43 ` Intricacies of submodules [was: Migrating svn to git with heavy use of externals] Roman Shaposhnik
2008-04-09 19:57 ` Roman Shaposhnik
1 sibling, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2008-04-09 6:43 UTC (permalink / raw)
To: Avery Pennarun; +Cc: Roman Shaposhnik, stuart.freeman, git
"Avery Pennarun" <apenwarr@gmail.com> writes:
> On Wed, Apr 9, 2008 at 12:39 AM, Roman Shaposhnik <rvs@sun.com> wrote:
>> Agreed. But I guess I'd be less confused if "git submodule" didn't muck
>> with .git/config at all. Or are there any other consumers of the
>> information that it puts there (except itself)?
>
> That I don't know. If there aren't any others, then I agree, I'm not
> sure what the whole .git/config messing is about.
It's actually the other way around.
In-tree .gitmodules is used to give hints to prime what is placed in
.git/config, which after initialized should serve as the authoritative
information on managed submodules as far as your repository is concerned.
"git submodule init" may be a handy way to do this "priming", but you do
not necessarily have to use it; you can instead adjust .git/config
manually yourself. This is so that you can configure a remote URL that is
different from what .gitmodules suggests, to suit your local needs.
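A rough sketch of that priming-then-overriding flow, assuming a reasonably recent git. The submodule name "lib", the URL https://example.invalid/lib.git, the local path and the helper g() are all placeholders. The submodule entry is created by hand with "git update-index" so that nothing needs to be fetched.

```shell
set -e
work=$(mktemp -d)
cd "$work"
# Hypothetical helper: run git with a throwaway identity so commits succeed anywhere.
g() { git -c user.name=Example -c user.email=example@example.invalid "$@"; }

# A commit id for the superproject to point at.
g init -q lib
g -C lib commit -q --allow-empty -m "lib"
lib_sha=$(g -C lib rev-parse HEAD)

g init -q super
cd super
g config -f .gitmodules submodule.lib.path lib
g config -f .gitmodules submodule.lib.url https://example.invalid/lib.git
g add .gitmodules
g update-index --add --cacheinfo 160000,"$lib_sha",lib
g commit -q -m "add lib submodule"
mkdir lib                                      # empty directory, as in a fresh clone

g submodule --quiet init                       # copies the hint into .git/config
primed=$(g config submodule.lib.url)

# Local override: only this repository's .git/config changes.
g config submodule.lib.url "$work/lib"
```

The tracked .gitmodules keeps the suggested URL, while this one repository uses the local path.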
Although putting everything in a single repository could work, that does
not have to be the only way to work with submodules. In fact, the basic
submodule design is trying very hard not to force you to grab objects that
are needed for all submodules when you are cloning the superproject, as
neither cloning nor checking out any submodule is the default.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Intricacies of submodules [was: Migrating svn to git with heavy use of externals]
2008-04-09 6:34 ` Avery Pennarun
2008-04-09 6:43 ` Junio C Hamano
@ 2008-04-09 19:57 ` Roman Shaposhnik
2008-04-09 20:27 ` Avery Pennarun
1 sibling, 1 reply; 48+ messages in thread
From: Roman Shaposhnik @ 2008-04-09 19:57 UTC (permalink / raw)
To: Avery Pennarun; +Cc: stuart.freeman, git
On Wed, 2008-04-09 at 02:34 -0400, Avery Pennarun wrote:
> On Wed, Apr 9, 2008 at 12:39 AM, Roman Shaposhnik <rvs@sun.com> wrote:
> > > In my own use case, I think having all the objects from the
> > > supermodule *and* submodules all be in the same repo is what I want.
> > > This kind of obviates the need for .gitmodules entirely, if
> > > git-checkout and friends will do the right thing. I think I'll submit
> > > some patches eventually once I have this figured out properly.
> >
> > Hm. But what about those who might want to pull from you? .git/config
> > doesn't propagate, which means that they'll be kind of stuck, don't
> > you think?
>
> Not exactly. The idea is that if the supermodule and submodules are
> all lumped into a single repo (and your refs are set up correctly),
> then cloning the supermodule will also clone all the submodules.
Interesting! How do you make it happen? Do you use git hooks or
something? On my end, I can't really reproduce that behavior of clone
but I would very much like to:
$ alias mkrepo="git init; touch file; git add file; git commit -mInit"
$ mkdir super ; cd super
$ mkrepo
$ mkdir submodule ; cd submodule
$ mkrepo
$ cd ..
$ git submodule add submodule
Adding existing repo at 'submodule' to the index
$ git commit -mSubmodule
Created commit 5921c87: Submodule
2 files changed, 4 insertions(+), 0 deletions(-)
create mode 100644 .gitmodules
create mode 160000 submodule
Now, when I clone super I don't actually have submodule cloned:
$ git clone super super-clone
$ cd super-clone
$ git submodule status
-7482d0433ed681aa243629f13cd97ca5be242393 submodule
In fact, it seems that I can't even do "submodule update", which
seems like a bug to me, by the way:
$ git submodule init
Submodule 'submodule' (submodule) registered for path 'submodule'
$ git submodule update
Initialized empty Git repository in /tmp/TEST/super-clone/submodule/.git/
fatal: no matching remote head
fetch-pack from 'submodule' failed.
Clone of 'submodule' into submodule path 'submodule' failed
Any ideas on what's going on here? Or what am I doing wrong?
> So everyone will have all the necessary refs anyway; as long as
> git-checkout checks them out, .gitmodules shouldn't have to exist at
> all, because there's nothing "special" for git-submodule to do.
I would very much like to have that, yes. Please do provide additional
details on how your setup is different from mine.
Thanks,
Roman.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Intricacies of submodules [was: Migrating svn to git with heavy use of externals]
2008-04-09 19:57 ` Roman Shaposhnik
@ 2008-04-09 20:27 ` Avery Pennarun
0 siblings, 0 replies; 48+ messages in thread
From: Avery Pennarun @ 2008-04-09 20:27 UTC (permalink / raw)
To: Roman Shaposhnik; +Cc: stuart.freeman, git
On Wed, Apr 9, 2008 at 3:57 PM, Roman Shaposhnik <rvs@sun.com> wrote:
> On Wed, 2008-04-09 at 02:34 -0400, Avery Pennarun wrote:
> > So everyone will have all the necessary refs anyway; as long as
> > git-checkout checks them out, .gitmodules shouldn't have to exist at
> > all, because there's nothing "special" for git-submodule to do.
>
> I would very much like to have that, yes. Please do provide additional
> details on how your setup is different from mine.
Sorry, I wasn't clear. I meant that there's no fundamental reason
that this shouldn't be possible; as far as I know, there's no way to
make git do this in an obvious way (yet).
It's encouraging that other people seem to want the same behaviour as
me, which means I might get to working on it sooner :) Not that this
should discourage you from trying, of course: perhaps I'm just missing
something too.
Note: I think a big part of the secret is using "." as the location of
the submodule's repository in .gitmodules. "git submodule add" seems
to expand . to a full path, but you can change it by hand if you edit
the file.
Have fun,
Avery
^ permalink raw reply [flat|nested] 48+ messages in thread
* Intricacies of submodules [was: Migrating svn to git with heavy use of externals]
2008-04-09 6:43 ` Junio C Hamano
@ 2008-04-10 3:43 ` Roman Shaposhnik
2008-04-10 5:53 ` Intricacies of submodules Junio C Hamano
2008-04-10 16:07 ` Intricacies of submodules [was: Migrating svn to git with heavy use of externals] Ping Yin
0 siblings, 2 replies; 48+ messages in thread
From: Roman Shaposhnik @ 2008-04-10 3:43 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Avery Pennarun, stuart.freeman, git
Hi Junio!
On Apr 8, 2008, at 11:43 PM, Junio C Hamano wrote:
> "Avery Pennarun" <apenwarr@gmail.com> writes:
>
>> On Wed, Apr 9, 2008 at 12:39 AM, Roman Shaposhnik <rvs@sun.com> wrote:
>>> Agreed. But I guess I'd be less confused if "git submodule" didn't
>>> muck with .git/config at all. Or are there any other consumers of
>>> the information that it puts there (except itself)?
>>
>> That I don't know. If there aren't any others, then I agree, I'm not
>> sure what the whole .git/config messing is about.
>
> It's actually the other way around.
Got it. But if you don't mind, I still would like to ask you a few
questions to clarify some things.
> In-tree .gitmodules is used to give hints to prime what is placed in
> .git/config, which after initialized should serve as the authoritative
> information on managed submodules as far as your repository is
> concerned.
> "git submodule init" may be a handy way to do this "priming", but
> you do not necessarily have to use it but instead manually adjust
> .git/config yourself; this is so that you can configure remote url
> that is different from what .gitmodules suggests to suit your local
> needs.
Ok. Now I understand that .git/config is supposed to be the
authoritative
source of information on submodules. Yet we also have .gitmodules
to take care of. This leads to information duplication and makes me
believe that .git/config should be as much as sync with .gitmodules as
possible. Yet, even with the latest version of Git we don't have
"git submodule add" updating .git/config. So here comes the first
question:
* Do you consider this behavior to be a bug or do you have a
reasonable
explanation for it?
Continuing in the same line of thought as far as information
duplication goes,
here's my second question:
* Whenever .gitmodules and .git/config disagree on the URL for a
particular
submodule do you expect .git/config to always take precedence?
And finally, since from your explanation it appears that the only
reason for
.gitmodules existence is to "prime" the .git/config it seems that what
we're
trying to achieve is a way for Git settings that are usually part
of .git/config
to be resident within the repository itself. That would give these
settings
a benefit of percolating through clone/fetch/push operations, yet be
overridden by individual .git/config settings. And so I have my final
question:
* Has an idea of having a regular file (subject to having
history, etc.)
called something like .gitconfig at the top level of Git's
repository ever
been considered (implemented?). That way a repository
maintainer
would be able to force a particular set of settings on all of
its clones
yet clones will be able to override them in .git/config if
needed.
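The lookup order being asked about here can be sketched as follows. This is only an illustration of the proposal, assuming a layered lookup in which the local .git/config wins over tracked defaults; Git itself does not read an in-tree .gitconfig, and the function name `effective_settings` and the sample values are made up for the example.

```python
from collections import ChainMap


def effective_settings(local_config, in_tree_defaults):
    """Return a view in which local settings shadow the tracked defaults."""
    # ChainMap searches its mappings left to right, so local_config wins.
    return ChainMap(local_config, in_tree_defaults)


# Hypothetical contents of the proposed in-tree file and of .git/config.
in_tree_defaults = {"merge.tool": "project-merge-tool", "color.diff": "auto"}
local_config = {"merge.tool": "vimdiff"}

settings = effective_settings(local_config, in_tree_defaults)
print(settings["merge.tool"])   # value overridden locally
print(settings["color.diff"])   # value inherited from the tracked defaults
```

A clone that sets nothing locally would see only the tracked defaults, which is the "percolating" behaviour described above.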
> Although putting everything in a single repository could work, that
> does
> not have to be the only way to work with submodules. In fact, the
> basic
> submodule design is trying very hard not to force you to grab
> objects that
> are needed for all submodules when you are cloning the superproject,
> as
> not cloning nor checking out any submodule is the default.
Indeed. This is a very beneficial setup for large projects. In fact,
what I'm working
on right now is a prototype of a build infrastructure that would be
smart enough
to import "cached" binaries of the build of a particular submodule if
the submodule
itself hasn't been checked out yet. SHA1 lets me do the versioning
properly and
once developers do check out the sources of any submodule the build system
will stop importing "cached" binaries and start building it for real. All
without developers
actually doing anything special.
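The cache lookup described above can be sketched roughly as follows. This is an illustration of the idea only, not the actual prototype: the function `plan_build`, the dictionary layout, and the sample paths and SHA-1 values are all assumptions made up for the example.

```python
def plan_build(pinned, checked_out, cache):
    """Decide, per submodule, whether to build from source or import binaries.

    pinned      -- {submodule path: SHA-1 recorded in the superproject}
    checked_out -- set of submodule paths whose sources are present
    cache       -- {(path, SHA-1): location of prebuilt binaries}
    """
    plan = {}
    for path, sha1 in pinned.items():
        if path in checked_out:
            # Sources are available, so build them for real.
            plan[path] = ("build", path)
        elif (path, sha1) in cache:
            # No sources: fall back to binaries built from exactly this commit.
            plan[path] = ("import", cache[(path, sha1)])
        else:
            plan[path] = ("unavailable", None)
    return plan


# Hypothetical data, for illustration only.
pinned = {"libfoo": "a" * 40, "libbar": "b" * 40}
cache = {
    ("libfoo", "a" * 40): "/cache/libfoo-a.tar",
    ("libbar", "b" * 40): "/cache/libbar-b.tar",
}
plan = plan_build(pinned, checked_out={"libbar"}, cache=cache)
print(plan["libfoo"])
print(plan["libbar"])
```

Keying the cache on the pinned SHA-1 means a binary is only reused when it was built from the commit the superproject records.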
Thanks,
Roman.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Intricacies of submodules
2008-04-10 3:43 ` Intricacies of submodules [was: Migrating svn to git with heavy use of externals] Roman Shaposhnik
@ 2008-04-10 5:53 ` Junio C Hamano
2008-04-10 20:32 ` Roman Shaposhnik
2008-04-12 4:02 ` Ping Yin
2008-04-10 16:07 ` Intricacies of submodules [was: Migrating svn to git with heavy use of externals] Ping Yin
1 sibling, 2 replies; 48+ messages in thread
From: Junio C Hamano @ 2008-04-10 5:53 UTC (permalink / raw)
To: Roman Shaposhnik; +Cc: Avery Pennarun, stuart.freeman, git
Roman Shaposhnik <rvs@sun.com> writes:
> ... Yet, even with the latest version of Git we don't have
> "git submodule add" updating .git/config. So here comes the first
> question:
> * Do you consider this behavior to be a bug or do you have a
> reasonable
> explanation for it?
I would say that the part is simply underdeveloped. I do not think many
people from the core circle of the git community are heavy users of
submodules.
The "submodule add" command was done primarily by and for people who
wanted to initially register a commit as a submodule from a subdirectory
repository, back when there was not much actual propagation support (that
is, "what should happen when somebody cloned such a toplevel project with
submodule?") designed yet. The command simply records the global hint to
the .gitmodules file, so that people who get such a commit that records a
submodule in its tree can also learn where to turn to when they do want to
get to the submodule by looking at in-tree .gitmodules file.
However, the way others will obtain a copy of the submodule repository
will be quite different from the way you access it (you already have it,
so you do not need to clone it from elsewhere to initialize it). It may
not make much sense to record the URL that you tell others to use in your
own .git/config in the repository of the originator of such a superproject
vs submodule combination. So in that sense, I am not sure if not mucking
with .git/config is even a bad thing.
The side that registers data in .git/config using what is in .gitmodules
as a hint, which is what "git submodule init" is about, is not very much
developed either (yet). It does not have enough user interaction to allow
the user to tailor the URL for his own needs, for example. There is no
duplication per-se; .gitmodules may give you git:// URL but you might need
to rewrite it to corresponding http:// URL because of your networking
situation. And that is the reason why .git/config should be the
authoritative copy. There may be bugs in the implementation -- I dunno --
but at least that is the intent. IOW, at runtime, .gitmodules should not
be consulted for purposes other than to update .git/config entries.
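The intended division of labour can be sketched like this. It is a toy model under stated assumptions, not the git-submodule code: plain dictionaries stand in for .gitmodules and .git/config, and the names `prime_config` and `effective_url` are invented for the example.

```python
def prime_config(gitmodules, local_config):
    """Copy hinted URLs into the local config without touching existing entries."""
    for name, hinted_url in gitmodules.items():
        # setdefault keeps whatever URL the user already chose locally.
        local_config.setdefault(name, hinted_url)
    return local_config


def effective_url(name, local_config):
    """At run time only the local configuration is consulted."""
    return local_config[name]


# The hint says git://, but this user already rewrote the URL to http://.
gitmodules = {"project": "git://A.xz/project.git"}
local_config = {"project": "http://A.xz/project.git/"}

prime_config(gitmodules, local_config)
print(effective_url("project", local_config))
```

The hint only fills in entries that are missing, so a locally adjusted URL survives any later priming.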
The original discussion that led to the current implementation dates back
to the May-June timeframe of 2007. I would not be surprised if not all of the
good ideas were incorporated in the current implementation. For example,
one thing that we may want to do is to record what contents we've seen in
the .gitmodules file in order to prime each entry in .git/config, so that
we can give users a chance to adjust what is in .git/config when we notice
the entry in .gitmodules has changed.
For example, consider that .gitmodules said the submodule should be taken
from repository URL git://A.xz/project.git when you cloned. You may have
used the given URL as-is to prime your .git/config, or you may have chosen
to use http://A.xz/project.git/ for networking reasons.
After working with the project for a while (i.e. you pull and perhaps push
back or send patches upstream), .gitmodules file changes and it now says
the repository resides at host B.xz because the project relocated. You
would want the next "git submodule update" to notice that your .git/config
records a URL you derived from git://A.xz/project.git/, and that you have
not seen this new URL git://B.xz/project.git/, and give you a chance to
make adjustments if needed.
After that happens, if you seeked to an old version (perhaps you wanted to
work on an old bug), .gitmodules file that is checked out of that old
version may say the "upstream" is at A.xz, but the entry in .git/config
may already be based on B.xz. But because you have already seen this old
URL in .gitmodules, you may not want to get asked about adjusting the
entry in .git/config merely because you checked out an old version. What
this means is that it is not enough to just record "What the current URL
you chose to use is" in .git/config (which is obvious), and it is also not
enough to record "what URL .gitmodules had when you made that choice", but
you would also need to record "What URLs you have _seen_ when making that
choice".
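The bookkeeping described in the scenario above can be sketched as follows. This is a minimal model of the unimplemented idea, assuming one record per submodule; the class `SubmoduleEntry`, the function `on_update` and the `choose` callback are hypothetical names, not part of git.

```python
class SubmoduleEntry:
    """Per-submodule state that would live in .git/config."""

    def __init__(self):
        self.url = None    # the URL the user chose to use
        self.seen = set()  # every .gitmodules URL shown to the user so far


def on_update(entry, hinted_url, choose):
    """Ask the user only when .gitmodules shows a URL never seen before.

    Returns True if the user was asked, False otherwise.
    """
    if hinted_url in entry.seen:
        return False
    entry.seen.add(hinted_url)
    entry.url = choose(hinted_url)
    return True


old_url = "git://A.xz/project.git/"
new_url = "git://B.xz/project.git/"

entry = SubmoduleEntry()
# Clone (A), project relocates (B), then an old version is checked out (A).
asked = [on_update(entry, url, lambda hint: hint) for url in (old_url, new_url, old_url)]
print(asked)
print(entry.url)
```

Because the old URL is already in the seen set, checking out the old version does not trigger another prompt and the chosen URL stays based on B.xz.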
The above is one thing I remember seeing in the original discussion but I
do not think it is implemented in the current code. I strongly suspect there
are other good design bits left unimplemented in the discussion. There
definitely are other things people who are interested in submodules may
want to improve.
> * Has an idea of having a regular file (subject to having
> history, etc.)
> called something like .gitconfig at the top level of Git's
> repository ever
> been considered (implemented?). That way a repository
> maintainer
> would be able to force a particular set of settings on all of
> its clones
> yet clones will be able to override them in .git/config if
> needed.
Considered, yes, implemented, no. Not because nobody bothered to, but
because it is unclear if it is a good thing to do in general to begin
with. What's recorded in .git/config is pretty much personal (e.g. "who
you are known as to this project?", "what's the SMTP host, user and
password when sending out patches from here?", "do you want to use color
in diff?"), dependent on local needs (e.g. "what protocol a particular
remote repository should be reached via"), or what the repository (as
opposed to "project") is about (e.g. "is this a bare, shared distribution
point, or is this a developer repository with a work tree?").
Project policies do not belong to .git/config and should be propagated
in-tree. For example, "indenting with more than 8 spaces is a whitespace
error for *.c files" is described in .gitattributes and given to all
cloners.
There may be some behaviour that is currently controlled by what is
recorded in .git/config but should be enforced project-wide. If there are
such things, we may want to have a mechanism that reads from in-tree data,
just like the attributes code does.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Intricacies of submodules [was: Migrating svn to git with heavy use of externals]
2008-04-10 3:43 ` Intricacies of submodules [was: Migrating svn to git with heavy use of externals] Roman Shaposhnik
2008-04-10 5:53 ` Intricacies of submodules Junio C Hamano
@ 2008-04-10 16:07 ` Ping Yin
2008-04-10 19:27 ` Roman Shaposhnik
1 sibling, 1 reply; 48+ messages in thread
From: Ping Yin @ 2008-04-10 16:07 UTC (permalink / raw)
To: Roman Shaposhnik; +Cc: Junio C Hamano, Avery Pennarun, stuart.freeman, git
On Thu, Apr 10, 2008 at 11:43 AM, Roman Shaposhnik <rvs@sun.com> wrote:
> Hi Junio!
>
> * Has an idea of having a regular file (subject to having history,
> etc.)
> called something like .gitconfig at the top level of Git's repository
> ever
> been considered (implemented?). That way a repository maintainer
> would be able to force a particular set of settings on all of its
> clones
> yet clones will be able to override them in .git/config if needed.
>
I like this idea, it's another common/special requirement just like
.gitignore vs. $GIT_DIR/info/exclude.
--
Ping Yin
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Intricacies of submodules [was: Migrating svn to git with heavy use of externals]
2008-04-10 16:07 ` Intricacies of submodules [was: Migrating svn to git with heavy use of externals] Ping Yin
@ 2008-04-10 19:27 ` Roman Shaposhnik
0 siblings, 0 replies; 48+ messages in thread
From: Roman Shaposhnik @ 2008-04-10 19:27 UTC (permalink / raw)
To: Ping Yin; +Cc: Junio C Hamano, Avery Pennarun, stuart.freeman, git
On Fri, 2008-04-11 at 00:07 +0800, Ping Yin wrote:
> On Thu, Apr 10, 2008 at 11:43 AM, Roman Shaposhnik <rvs@sun.com> wrote:
> > Hi Junio!
> >
>
> > * Has an idea of having a regular file (subject to having history,
> > etc.)
> > called something like .gitconfig at the top level of Git's repository
> > ever
> > been considered (implemented?). That way a repository maintainer
> > would be able to force a particular set of settings on all of its
> > clones
> > yet clones will be able to override them in .git/config if needed.
> >
>
> I like this idea, it's another common/special requirement just like
> .gitignore vs. $GIT_DIR/info/exclude.
Well, I guess if enough of us like it there's a chance it can be
implemented, right? ;-)
To some extent it seems that you've solved this particular issue for
submodules with your PATCH/RFC 3/7. Now, in a general case, if
git-config(1) can be patched to take into account one extra place
for retrieving options from (.gitconfig) it seems that
retiring .gitmodules completely would be just one benefit of many.
Other benefits would include propagating setting like most of the
core.* and quite a few other things I see listed in git-config(1)
man page.
It seems that the only downside here would be a need for a bit
of special handling when a setting needs to be recorded. Otherwise
it looks like a pretty clean and general idea.
Thanks,
Roman.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Intricacies of submodules
2008-04-10 5:53 ` Intricacies of submodules Junio C Hamano
@ 2008-04-10 20:32 ` Roman Shaposhnik
2008-04-11 5:20 ` Junio C Hamano
2008-04-12 4:02 ` Ping Yin
1 sibling, 1 reply; 48+ messages in thread
From: Roman Shaposhnik @ 2008-04-10 20:32 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Avery Pennarun, stuart.freeman, git
On Wed, 2008-04-09 at 22:53 -0700, Junio C Hamano wrote:
> Roman Shaposhnik <rvs@sun.com> writes:
>
> > ... Yet, even with the latest version of Git we don't have
> > "git submodule add" updating .git/config. So here comes the first
> > question:
> > * Do you consider this behavior to be a bug or do you have a
> > reasonable
> > explanation for it?
>
> I would say that the part is simply underdeveloped.
Fair enough. Which is good news for me, since I can not really imagine
how the repositories I need to establish can survive without solid
Submodule support. I'm very interested in getting this functionality
right with git-submodule. And I can be either your guinea pig or
a frenetic hamster. After all, you don't mind complete newcomers
to the development process sending you code, do you? ;-)
> I do not think many people from the core circle of the git community are
> heavy users of submodules.
I wonder why. Most of the software projects that I have to deal with
seem to be pretty hefty collections of loosely coupled things. In that
other thread I had on git-repack there was a typical example of what
happens if you put something like that into a single repo (700
subdirectories at the top level -- that's what :-(). Does it all mean
that the core Git community mostly works on projects like Linux kernel
and not things like OpenOffice or Mozilla, etc?
> However, the way others will obtain a copy of the submodule repository
> will be quite different from the way you access it (you already have it,
> so you do not need to clone it from elsewhere to initialize it). It may
> not make much sense to record the URL that you tell others to use in your
> own .git/config in the repository of the originator of such a superproject
> vs submodule combination. So in that sense, I am not sure if not mucking
> with .git/config is even a bad thing.
It is all about consistency as far as I see it. One huge advantage of
Git is that it is a DSCM. It makes things totally symmetric. The
paragraph that I quoted above hints at a possibility of treating the
initial repo somewhat differently from its copies. That would break
a nice symmetry. And it would do that unnecessarily.
> After working with the project for a while (i.e. you pull and perhaps push
> back or send patches upstream), .gitmodules file changes and it now says
> the repository resides at host B.xz because the project relocated. You
> would want the next "git submodule update" to notice that your .git/config
> records a URL you derived from git://A.xz/project.git/, and that you have
> not seen this new URL git://B.xz/project.git/, and give you a chance to
> make adjustments if needed.
I guess something like that could be implemented via Git hooks, right?
> After that happens, if you seeked to an old version (perhaps you wanted to
> work on an old bug), .gitmodules file that is checked out of that old
> version may say the "upstream" is at A.xz, but the entry in .git/config
> may already be based on B.xz. But because you have already seen this old
> URL in .gitmodules, you may not want to get asked about adjusting the
> entry in .git/config merely because you checked out an old version. What
> this means is that it is not enough to just record "What the current URL
> you chose to use is" in .git/config (which is obvious), and it is also not
> enough to record "what URL .gitmodules had when you made that choice", but
> you would also need to record "What URLs you have _seen_ when making that
> choice".
Good point.
> > * Has an idea of having a regular file (subject to having
> > history, etc.)
> > called something like .gitconfig at the top level of Git's
> > repository ever
> > been considered (implemented?). That way a repository
> > maintainer
> > would be able to force a particular set of settings on all of
> > its clones
> > yet clones will be able to override them in .git/config if
> > needed.
>
> Considered, yes, implemented, no. Not because nobody bothered to, but
> because it is unclear if it is a good thing to do in general to begin
> with. What's recorded in .git/config is pretty much personal (e.g. "who
> you are known as to this project?", "what's the SMTP host, user and
> password when sending out patches from here?", "do you want to use color
> in diff?"), dependent on local needs (e.g. "what protocol a particular
> remote repository should be reached via"), or what the repository (as
> opposed to "project") is about (e.g. "is this a bare, shared distribution
> point, or is this a developer repository with a work tree?").
Some of it is personal, yes. But sometimes those personal preferences
need to be enforced on a project level (of course, giving everybody
a way to override the setting if they really want to). For a big
software organization with a mix of senior and junior engineers I need
a way to set up *my* workspace in such a way that everybody who
clones/pulls from it get not only the source code, but also "Git best
practices". That would simplify things a great deal for me, because
I can always say: "just pull my latest .gitconfig, make sure you
don't have any extra stuff in your .git/config and everything
in Git will work for you". That would also simplify things for junior
guys as well -- they can be sure that whatever needs to be done
with Git's setup I can do for them and all they have to do is pull.
Perhaps, this model is different from what the majority of Git
developers use (especially when working on open source projects) but I'm
pretty sure it is quite widespread within corporate firewalls.
> Project policies do not belong to .git/config and should be propagated
> in-tree. For example, "indenting with more than 8 spaces is a whitespace
> error for *.c files" is described in .gitattributes and given to all
> cloners.
Agreed. But what I have in mind is more of: in this project everybody
shall use super-duper-GUI-merge by default. I can't really propagate
merge.tool in any way, can I? But I really need to. Since the health
of my project really depends on junior engineers not being confused
when doing the merges.
What is also not clear to me is the difference between what is
considered to be an attribute (part of .gitattributes) and
what is considered to be a setting (part of .git/config). It seems
that the line gets quite blurry (at least for me it does) and thus
I'd totally appreciate any help understanding this difference
better.
> There may be some behaviour that is currently controlled by what is
> recorded in .git/config but should be enforced project-wide. If there are
> such things, we may want to have a mechanism that reads from in-tree data,
> just like the attributes code does.
That's exactly what I have in mind.
Thanks,
Roman.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Intricacies of submodules
2008-04-10 20:32 ` Roman Shaposhnik
@ 2008-04-11 5:20 ` Junio C Hamano
2008-04-11 16:04 ` Ping Yin
2008-04-14 19:56 ` Roman Shaposhnik
0 siblings, 2 replies; 48+ messages in thread
From: Junio C Hamano @ 2008-04-11 5:20 UTC (permalink / raw)
To: Roman Shaposhnik; +Cc: Avery Pennarun, stuart.freeman, git
Roman Shaposhnik <rvs@sun.com> writes:
> ... I'm very interested in getting this functionality
> right with git-submodule. And I can be either your guinea pig or
> a frenetic hamster. After all, you don't mind complete newcomers
> to the development process sending you code, do you? ;-)
Everybody starts out as a total stranger. Linus had never worked with me
when I started, and many people who are the core members of the git community
had never worked with me before either.
>> However, the way others will obtain a copy of the submodule repository
>> will be quite different from the way you access it (you already have it,
>> so you do not need to clone it from elsewhere to initialize it). It may
>> not make much sense to record the URL that you tell others to use in your
>> own .git/config in the repository of the originator of such a superproject
>> vs submodule combination. So in that sense, I am not sure if not mucking
>> with .git/config is even a bad thing.
>
> It is all about consistency as far as I see it. One huge advantage of
> Git is that it is a DSCM. It makes things totally symmetric. The
> paragraph that I quoted above hints at a possibility of treating the
> initial repo somewhat differently from its copies. That would break
> a nice symmetry. And it would do that unnecessarily.
I do not think being distributed is about such symmetry.
Being distributed is more about each repository being able to serve its
own purpose, being able to get configured suitably and individually,
without disturbing others, and allowing a workflow around it that
_potentially_ treats everybody as equals.
Not having the kind of symmetry you talk about is not anything new about
submodules, nor is it necessarily a bad thing. You create a history here,
you push it into there. Somebody else clones your history from there and
starts hacking.
The way that somebody's clone interacts with the intermediary and the way
your original repository interacts with the intermediary _are_ different,
and they ought to stay different if that intermediary is _your_ owned
publishing repository. You can push into it, but that somebody else
should not be able to. There should be no symmetry about that repository.
That somebody else may have his own publishing repository where he pushes
the result of his work into and you fetch from. Taken together, each of
you and that somebody else having his own repository to allow others to
fetch from, makes you two the equals in the global picture.
You would only need the symmetry of your kind if there is a single
intermediary that is _the central location_, a shared repository where
everybody meets. Only in that case, you _may_, after priming the process
by initially creating the superproject - submodule combination in the
originating repository and pushing it to the shared repository, want to
clone it back to a new work tree you will use as your usual working place
(and nuke the originating one, which is not needed anymore as the process
has been primed now). At that point, your usual working place and
everybody else's working place would look symmetrical, as everybody
including you cloned from a single shared location.
I am not saying that is necessarily a bad thing to wish for. I am only
saying that the kind of symmetry you talk about does not have much to do
with being distributed. If anything, that symmetry is more closely tied
to using a centralized work flow, not distributed.
>> After working with the project for a while (i.e. you pull and perhaps push
>> back or send patches upstream), .gitmodules file changes and it now says
>> the repository resides at host B.xz because the project relocated. You
>> would want the next "git submodule update" to notice that your .git/config
>> records a URL you derived from git://A.xz/project.git/, and that you have
>> not seen this new URL git://B.xz/project.git/, and give you a chance to
>> make adjustments if needed.
>
> I guess something like that could be implemented via Git hooks, right?
I do not see a reason to bring in hooks here. To answer "Yes" to your
"right?" question, "git submodule update" ought to call out to a hook in
such a situation, which it doesn't right now. So the answer for the
current implementation would be "no". To make it "Yes", the command needs
to be modified to call out a hook, but should it be implemented as a hook,
when it is already so clearly specified what needs to happen?
>> Considered, yes, implemented, no. Not because nobody bothered to, but
>> because it is unclear if it is a good thing to do in general to begin
>> with. What's recorded in .git/config is pretty much personal (e.g. "who
>> you are known as to this project?", "what's the SMTP host, user and
>> password when sending out patches from here?", "do you want to use color
>> in diff?"), dependent on local needs (e.g. "what protocol a particular
>> remote repository should be reached via"), or what the repository (as
>> opposed to "project") is about (e.g. "is this a bare, shared distribution
>> point, or is this a developer repository with a work tree?").
>
> Some of it is personal, yes. But sometimes those personal preferences
> need to be enforced on a project level (of course, giving everybody
> a way to override the setting if they really want to). For a big
> software organization with a mix of senior and junior engineers I need
> a way to set up *my* workspace in such a way that everybody who
> clones/pulls from it get not only the source code, but also "Git best
> practices". That would simplify things a great deal for me, because
> I can always say: "just pull my latest .gitconfig, make sure you
> don't have any extra stuff in your .git/config and everything
> in Git will work for you".
I think the way you stated the above speaks for itself. The issue you are
solving is mostly human (social), and solution is majorly instruction with
slight help from mechanism. The instruction "Use this latest thing, do
not have anything in .git/config" can be substituted with "Use this latest
update-git-config.sh which mucks with your .git/config to conform to our
project standard", without losing simplicity and with much enhanced
robustness, as you can now enforce that the users do not have anything
that would interfere with and countermand your policy you would want to
implement.
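What such a conformance script would need to compute can be sketched as below. This is only an illustration, written in Python rather than shell for clarity: the helper `conformance_commands` and the sample policy are assumptions, and nothing here is an existing update-git-config.sh. The only Git interface relied on is the ordinary `git config <key> <value>` form.

```python
def conformance_commands(policy, current):
    """Return the `git config` invocations needed to match the project policy.

    policy  -- {key: value the project requires}
    current -- {key: value currently set in .git/config}
    """
    commands = []
    for key, wanted in sorted(policy.items()):
        if current.get(key) != wanted:
            # Missing or conflicting setting: overwrite it with the policy value.
            commands.append(["git", "config", key, wanted])
    return commands


# Hypothetical policy and local state, for illustration only.
policy = {"merge.tool": "project-merge-tool", "core.whitespace": "indent-with-non-tab"}
current = {"merge.tool": "vimdiff", "core.whitespace": "indent-with-non-tab"}

for command in conformance_commands(policy, current):
    print(" ".join(command))
```

A real script would then run each command, for example with `subprocess.run`, inside the repository being brought into line.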
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Intricacies of submodules
2008-04-11 5:20 ` Junio C Hamano
@ 2008-04-11 16:04 ` Ping Yin
2008-04-11 22:32 ` Junio C Hamano
2008-04-14 19:56 ` Roman Shaposhnik
1 sibling, 1 reply; 48+ messages in thread
From: Ping Yin @ 2008-04-11 16:04 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Roman Shaposhnik, Avery Pennarun, stuart.freeman, git
On Fri, Apr 11, 2008 at 1:20 PM, Junio C Hamano <gitster@pobox.com> wrote:
> > Some of it is personal, yes. But sometimes those personal preferences
> > need to be enforced on a project level (of course, giving everybody
> > a way to override the setting if they really want to). For a big
> > software organization with a mix of senior and junior engineers I need
> > a way to set up *my* workspace in such a way that everybody who
> > clones/pulls from it get not only the source code, but also "Git best
> > practices". That would simplify things a great deal for me, because
> > I can always say: "just pull my latest .gitconfig, make sure you
> > don't have any extra stuff in your .git/config and everything
> > in Git will work for you".
>
> I think the way you stated the above speaks for itself. The issue you are
> solving is mostly human (social), and solution is majorly instruction with
> slight help from mechanism. The instruction "Use this latest thing, do
> not have anything in .git/config" can be substituted with "Use this latest
> update-git-config.sh which mucks with your .git/config to conform to our
> project standard", without losing simplicity and with much enhanced
> robustness, as you can now enforce that the users do not have anything
> that would interfere with and countermand your policy you would want to
> implement.
>
But how should one handle the case where there are several policies
for different projects?
--
Ping Yin
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Intricacies of submodules
2008-04-11 16:04 ` Ping Yin
@ 2008-04-11 22:32 ` Junio C Hamano
2008-04-12 3:13 ` Roman Shaposhnik
2008-04-12 3:20 ` Ping Yin
0 siblings, 2 replies; 48+ messages in thread
From: Junio C Hamano @ 2008-04-11 22:32 UTC (permalink / raw)
To: Ping Yin; +Cc: Roman Shaposhnik, Avery Pennarun, stuart.freeman, git
"Ping Yin" <pkufranky@gmail.com> writes:
> On Fri, Apr 11, 2008 at 1:20 PM, Junio C Hamano <gitster@pobox.com> wrote:
>> > Some of it is personal, yes. But sometimes those personal preferences
>> > need to be enforced on a project level (of course, giving everybody
>> > a way to override the setting if they really want to). For a big
>> > software organization with a mix of senior and junior engineers I need
>> > a way to set up *my* workspace in such a way that everybody who
>> > clones/pulls from it get not only the source code, but also "Git best
>> > practices". That would simplify things a great deal for me, because
>> > I can always say: "just pull my latest .gitconfig, make sure you
>> > don't have any extra stuff in your .git/config and everything
>> > in Git will work for you".
>>
>> I think the way you stated the above speaks for itself. The issue you are
>> solving is mostly human (social), and solution is majorly instruction with
>> slight help from mechanism. The instruction "Use this latest thing, do
>> not have anything in .git/config" can be substituted with "Use this latest
>> update-git-config.sh which mucks with your .git/config to conform to our
>> project standard", without losing simplicity and with much enhanced
>> robustness, as you can now enforce that the users do not have anything
>> that would interfere with and countermand your policy you would want to
>> implement.
>>
> But, how to handle the case that there are more than one policies
> for different projects?
"How to"? You would handle the case just like either of us suggested
above.
Are you talking about a single project with several policies A, B,
C, ... that conflict with each other? Or are you talking about several
projects, each of which has a single project-wide policy?
I do not think the former makes sense, and it would not be helped by the
in-tree file that overrides .git/config that Roman discussed either.
The latter would be helped equally well whether that in-tree policy file is
called .gitconfig or update-git-config.sh.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Intricacies of submodules
2008-04-11 22:32 ` Junio C Hamano
@ 2008-04-12 3:13 ` Roman Shaposhnik
2008-04-12 5:11 ` Junio C Hamano
2008-04-12 3:20 ` Ping Yin
1 sibling, 1 reply; 48+ messages in thread
From: Roman Shaposhnik @ 2008-04-12 3:13 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Ping Yin, Avery Pennarun, stuart.freeman, git
On Fri, 2008-04-11 at 15:32 -0700, Junio C Hamano wrote:
> > But, how to handle the case that there are more than one policies
> > for different projects?
>
> "How to"? You would handle the case just like either of us suggested
> above.
>
> Are you talking about a single project with more than one policies A, B,
> C, ... that conflict with each other? Or are you talking about more than
> one projects, each of which has a single project-wide policy?
>
> I do not think the former makes sense and won't be helped with in-tree
> file that overrides .git/config Roman discussed either.
>
> The latter would be helped equally well whether that in-tree policy file is
> called .gitconfig or update-git-config.sh.
I believe Fedor addressed the social aspects of this issue quite well,
so I'm just going to focus on a technical aspect here: there is a
difference between .gitconfig and update-git-config.sh approaches
that I would like you to acknowledge. With update-git-config.sh you
are allowing for a repository to be in a state that is inconsistent
with the policies that need to be enforced, without novice users even
realizing that. Contrast this with .gitconfig where policies get
enforced right from the minute your clone operation finishes and there's
much less opportunity for the user to shoot himself in the foot. In
fact "shooting in the foot" (senselessly overriding default policies
via .git/config) becomes an *explicit* action on user's part. He is
the one to blame.
Thanks,
Roman.
^ permalink raw reply [flat|nested] 48+ messages in thread
* Re: Intricacies of submodules
2008-04-11 22:32 ` Junio C Hamano
2008-04-12 3:13 ` Roman Shaposhnik
@ 2008-04-12 3:20 ` Ping Yin
1 sibling, 0 replies; 48+ messages in thread
From: Ping Yin @ 2008-04-12 3:20 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Roman Shaposhnik, Avery Pennarun, stuart.freeman, git
On Sat, Apr 12, 2008 at 6:32 AM, Junio C Hamano <gitster@pobox.com> wrote:
>
> "Ping Yin" <pkufranky@gmail.com> writes:
>
> > On Fri, Apr 11, 2008 at 1:20 PM, Junio C Hamano <gitster@pobox.com> wrote:
> >> > Some of it is personal, yes. But sometimes those personal preferences
> >> > need to be enforced on a project level (of course, giving everybody
> >> > a way to override the setting if they really want to). For a big
> >> > software organization with a mix of senior and junior engineers I need
> >> > a way to set up *my* workspace in such a way that everybody who
> >> > clones/pulls from it get not only the source code, but also "Git best
> >> > practices". That would simplify things a great deal for me, because
> >> > I can always say: "just pull my latest .gitconfig, make sure you
> >> > don't have any extra stuff in your .git/config and everything
> >> > in Git will work for you".
> >>
> >> I think the way you stated the above speaks for itself. The issue you are
> >> solving is mostly human (social), and solution is majorly instruction with
> >> slight help from mechanism. The instruction "Use this latest thing, do
> >> not have anything in .git/config" can be substituted with "Use this latest
> >> update-git-config.sh which mucks with your .git/config to conform to our
> >> project standard", without losing simplicity and with much enhanced
> >> robustness, as you can now enforce that the users do not have anything
> >> that would interfere with and countermand your policy you would want to
> >> implement.
> >>
> > But, how to handle the case that there are more than one policies
> > for different projects?
>
> "How to"? You would handle the case just like either of us suggested
> above.
>
> Are you talking about a single project with more than one policies A, B,
> C, ... that conflict with each other? Or are you talking about more than
> one projects, each of which has a single project-wide policy?
>
> I do not think the former makes sense and won't be helped with in-tree
> file that overrides .git/config Roman discussed either.
>
> The latter would be helped equally well whether that in-tree policy file is
> called .gitconfig or update-git-config.sh.
I meant more than one project, each of which has a different
project-wide policy. I originally thought update-git-config.sh couldn't
help, but I was wrong, since it can update $GIT_DIR/config instead of
$HOME/.gitconfig.
However, I think an in-tree .gitconfig is better, since it is more
consistent with other analogous files.
--
Ping Yin
* Re: Intricacies of submodules
2008-04-10 5:53 ` Intricacies of submodules Junio C Hamano
2008-04-10 20:32 ` Roman Shaposhnik
@ 2008-04-12 4:02 ` Ping Yin
2008-04-12 5:25 ` Junio C Hamano
1 sibling, 1 reply; 48+ messages in thread
From: Ping Yin @ 2008-04-12 4:02 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Roman Shaposhnik, Avery Pennarun, stuart.freeman, git
On Thu, Apr 10, 2008 at 1:53 PM, Junio C Hamano <gitster@pobox.com> wrote:
> The original discussion that led to the current implementation dates back
> in May-June timeframe of 2007. I would not be surprised if not all of the
> good ideas were incorporated in the current implementation. For example,
> one thing that we may want to do is to record what contents we've seen in
> the .gitmodules file in order to prime each entry in .git/config, so that
> we can give users a chance to adjust what is in .git/config when we notice
> the entry in .gitmodules has changed.
>
> For example, consider that .gitmodules said the submodule should be taken
> from repository URL git://A.xz/project.git when you cloned. You may have
> used the given URL as-is to prime your .git/config, or you may have chosen
> to use http://A.xz/project.git/ for networking reasons.
>
> After working with the project for a while (i.e. you pull and perhaps push
> back or send patches upstream), .gitmodules file changes and it now says
> the repository resides at host B.xz because the project relocated. You
> would want the next "git submodule update" to notice that your .git/config
> records a URL you derived from git://A.xz/project.git/, and that you have
> not seen this new URL git://B.xz/project.git/, and give you a chance to
> make adjustments if needed.
I think this should be done if "git submodule update" fails. The
failure can have different causes, such as the newest commits not having
been pushed out or the subproject having relocated, so it can only give
hints qualified with "maybe".
However, how do we detect that the URL has changed in .gitmodules? By
comparing the latest two versions of .gitmodules?
And if only the protocol or domain of the submodule URL differs between
$GIT_DIR/config and .gitmodules, I think the
"url.<usethis>.insteadOf = <otherurl>" form introduced in v1.5.5 is
more helpful.
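For readers unfamiliar with the insteadOf form mentioned above, here is a
rough Python model of the rewriting it performs: a URL that starts with a
configured prefix has that prefix replaced by the base, and the longest
matching prefix wins. The function name rewrite_url and the rules dictionary
are made up for illustration; this is only a sketch of the idea, not git's
own code.

```python
def rewrite_url(url, instead_of):
    """Rewrite url using {base: [prefix, ...]} rules; the longest matching prefix wins."""
    best_base, best_prefix = None, ""
    for base, prefixes in instead_of.items():
        for prefix in prefixes:
            # Only a strictly longer match replaces the current best one.
            if url.startswith(prefix) and len(prefix) > len(best_prefix):
                best_base, best_prefix = base, prefix
    if best_base is None:
        return url  # no rule applies; the URL is used unchanged
    return best_base + url[len(best_prefix):]


# Hypothetical rule: fetch over http:// whatever .gitmodules lists as git://A.xz/
rules = {"http://A.xz/": ["git://A.xz/"]}
print(rewrite_url("git://A.xz/project.git", rules))
```

With a rule like this, the URL in .gitmodules can stay as published while
each user picks the transport locally.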
>
> After that happens, if you seeked to an old version (perhaps you wanted to
> work on an old bug), .gitmodules file that is checked out of that old
> version may say the "upstream" is at A.xz, but the entry in .git/config
> may already be based on B.xz. But because you have already seen this old
> URL in .gitmodules, you may not want to get asked about adjusting the
> entry in .git/config merely because you checked out an old version. What
> this means is that it is not enough to just record "What the current URL
> you chose to use is" in .git/config (which is obvious), and it is also not
> enough to record "what URL .gitmodules had when you made that choice", but
> you would also need to record "What URLs you have _seen_ when making that
> choice".
>
When a bug happens, I only care about the submodule commit recorded in
the index and whether I can check out that old submodule commit. Does it
really matter what the URL of the submodule is?
--
Ping Yin
* Re: Intricacies of submodules
2008-04-12 3:13 ` Roman Shaposhnik
@ 2008-04-12 5:11 ` Junio C Hamano
2008-04-14 19:52 ` Roman Shaposhnik
0 siblings, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2008-04-12 5:11 UTC (permalink / raw)
To: Roman Shaposhnik; +Cc: Ping Yin, Avery Pennarun, stuart.freeman, git
Roman Shaposhnik <rvs@sun.com> writes:
> ... Contrast this with .gitconfig where policies get
> enforced right from the minute your clone operation finishes and there's
> much less opportunity for the user to shoot himself in the foot.
Why? Even if you expect .git/config in a new repository to be vanilla
(which you can't really, as a crazy sysadmin can have an /etc/gitconfig or
a template that overrides what you do), $HOME/.gitconfig would be in effect
the minute you clone.
As you cannot reasonably expect that your project is the _only_ project
your cloners would use, you cannot dictate what $HOME/.gitconfig has.
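To make the layering concrete, here is a simplified Python model of how the
three files combine: /etc/gitconfig is read first, then $HOME/.gitconfig,
then .git/config, and a later file overrides an earlier one for the same
key. The keys and values below are invented for illustration, and the model
ignores multi-valued keys.

```python
def effective_config(*layers):
    """Merge config layers in order; later layers override earlier ones."""
    merged = {}
    for layer in layers:
        merged.update(layer)
    return merged


# Hypothetical contents of the three files, from least to most specific.
system = {"core.pager": "less"}                         # /etc/gitconfig
user = {"core.pager": "more", "merge.summary": "true"}  # $HOME/.gitconfig
repo = {"merge.summary": "false"}                       # .git/config

cfg = effective_config(system, user, repo)
print(cfg)
```

The point for this discussion is that a freshly cloned repository already
inherits whatever the first two layers say, before any project policy is
applied.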
A policy issue needs to be addressed at the human level anyway, so I do
not really see major difference either way. You need to trust your users
to follow the guideline at some point, and all you can do is to make it
easy for them to do so, and (optionally) verify that they are actually
following the guideline. We need to suggest an easy-to-use and robust
mechanism to allow you to do so as the BCP.
Convenience and robustness need to be considered at the same time. In
that area, I would say a custom "sane environment setup script" would be
the more flexible, as it rolls the customization and verification into one
step.
Trust must be mutual: your users need to be able to trust you, too. If
the config mechanism blindly starts reading from in-tree .gitconfig, you
can do nasty things with aliases for example. So the "sane environment
setup script" would also be a good idea in that sense, too --- the users,
perhaps only the most suspicious and untrusting kind, have a way to verify
it does not mean any harm before running it.
Don't get me wrong. I am not saying that everybody should start rolling
their own "sane environment setup script" and ship their project with it.
I am only suggesting it as a possible way to do your "policy enforcement"
without having to introduce in-tree .gitconfig, which I unfortunately see
no fundamental upsides but definite downsides (security included).
* Re: Intricacies of submodules
2008-04-12 4:02 ` Ping Yin
@ 2008-04-12 5:25 ` Junio C Hamano
2008-04-12 6:26 ` Ping Yin
0 siblings, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2008-04-12 5:25 UTC (permalink / raw)
To: Ping Yin
Cc: Junio C Hamano, Roman Shaposhnik, Avery Pennarun, stuart.freeman,
git
"Ping Yin" <pkufranky@gmail.com> writes:
>> After working with the project for a while (i.e. you pull and perhaps push
>> back or send patches upstream), .gitmodules file changes and it now says
>> the repository resides at host B.xz because the project relocated. You
>> would want the next "git submodule update" to notice that your .git/config
>> records a URL you derived from git://A.xz/project.git/, and that you have
>> not seen this new URL git://B.xz/project.git/, and give you a chance to
>> make adjustments if needed.
>
> I think this should be done if "git submodule update" fails. The
> reason it fails may be different, such as newest commits not pushed
> out and the subproject relocated etc. So it can only give some hints
> with "maybe".
>
> However, how to detect the url has changed in .gitmodules? Compare the
> latest two version of .gitmodules?
That's why I suggested (and Roman seems to have got it, so I do not think
what I wrote was too confusing to be understood) you should record the set
of _all_ URLs you have _seen_ in .git/config. If the URL in .gitmodules
checked out is included in that set, you do not do anything. Otherwise
you ask.
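The check described above can be sketched in a few lines of Python. This is
only an illustration of the proposal, not something "git submodule update"
does; the function names are hypothetical, and how the set of seen URLs
would be stored in .git/config is left open.

```python
def should_prompt(gitmodules_url, seen_urls):
    """Return True when the URL in .gitmodules has never been seen before."""
    return gitmodules_url not in seen_urls


def check_before_update(gitmodules_url, seen_urls):
    """Record the URL as seen and report whether the user should be asked."""
    ask = should_prompt(gitmodules_url, seen_urls)
    seen_urls.add(gitmodules_url)
    return ask


# The repository was cloned when .gitmodules pointed at A.xz.
seen = {"git://A.xz/project.git/"}
first = check_before_update("git://B.xz/project.git/", seen)   # project relocated
second = check_before_update("git://A.xz/project.git/", seen)  # old commit checked out
third = check_before_update("git://B.xz/project.git/", seen)   # back on the tip
print(first, second, third)
```

Only the first call asks the user. Checking out an old commit, or returning
to the tip afterwards, stays silent because both URLs are already in the set.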
I think "git submodule update" is a good place to do that check, but I'd
prefer it be done _before_ it actually goes to the network to start
accessing potentially stale URL. The old URL may not be defunct but the
project decided not to advertise it to be used for some non-technical
reason (e.g. the site owner asked them not to point at it and instead use
some other mirrors).
> When bug happens, i only care the commit in the index of submodule and
> whether i can check out the old submodule commit. However, does it
> really matter that what the url of the submodule is?
No. The discussion was what should _not_ happen when you run "git
submodule update" from that state. Usually in a steadily advancing
history, you _want_ "git submodule update" to notice that the suggested
remote URL has changed in .gitmodules and give the user a chance to adjust
the URL _before_ it hits the network, but you obviously do not want it to
happen only because you happened to be at a seeked back commit when you
initiated "git submodule update". In other words, you are agreeing with
me without really reading what I wrote ;-) It does not matter, and
recording the URLs you have _seen_ (not "the last one you saw", or "the
one you initialized .git/config with") is a way to make sure that the
fixed "git submodule update" agrees with us on that point.
* Re: Intricacies of submodules
2008-04-12 5:25 ` Junio C Hamano
@ 2008-04-12 6:26 ` Ping Yin
0 siblings, 0 replies; 48+ messages in thread
From: Ping Yin @ 2008-04-12 6:26 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Roman Shaposhnik, Avery Pennarun, stuart.freeman, git
On Sat, Apr 12, 2008 at 1:25 PM, Junio C Hamano <gitster@pobox.com> wrote:
> "Ping Yin" <pkufranky@gmail.com> writes:
>
>
> >> After working with the project for a while (i.e. you pull and perhaps push
> >> back or send patches upstream), .gitmodules file changes and it now says
> >> the repository resides at host B.xz because the project relocated. You
> >> would want the next "git submodule update" to notice that your .git/config
> >> records a URL you derived from git://A.xz/project.git/, and that you have
> >> not seen this new URL git://B.xz/project.git/, and give you a chance to
> >> make adjustments if needed.
> >
> > I think this should be done if "git submodule update" fails. The
> > reason it fails may be different, such as newest commits not pushed
> > out and the subproject relocated etc. So it can only give some hints
> > with "maybe".
> >
> > However, how to detect the url has changed in .gitmodules? Compare the
> > latest two version of .gitmodules?
>
> That's why I suggested (and Roman seems to have got it, so I do not think
> what I wrote was too confusing to be understood) you should record the set
> of _all_ URLs you have _seen_ in .git/config. If the URL in .gitmodules
> checked out is included in that set, you do not do anything. Otherwise
> you ask.
I don't think it is worth such a change (that is, recording the history
of URLs in $GIT_DIR/config) just to ask the user whether to change the
URL in $GIT_DIR/config when the URL in .gitmodules changes to a new one.
Actually, I think this is an ugly solution :-)
>
> I think "git submodule update" is a good place to do that check, but I'd
> prefer it be done _before_ it actually goes to the network to start
> accessing potentially stale URL. The old URL may not be defunct but the
> project decided not to advertise it to be used for some non-technical
> reason (e.g. the site owner asked them not to point at it and instead use
> some other mirrors).
If only the protocol (such as http:// -> git://) differs between the
URLs in $GIT_DIR/config and .gitmodules, I think using
"url.base.insteadOf = newbase" is simpler.
If the URLs are totally different, then when the URL in .gitmodules
changes, there is little chance that the URL in $GIT_DIR/config will
also change.
--
Ping Yin
* Re: Intricacies of submodules
2008-04-12 5:11 ` Junio C Hamano
@ 2008-04-14 19:52 ` Roman Shaposhnik
2008-04-15 1:13 ` Junio C Hamano
0 siblings, 1 reply; 48+ messages in thread
From: Roman Shaposhnik @ 2008-04-14 19:52 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Ping Yin, Avery Pennarun, stuart.freeman, git
On Fri, 2008-04-11 at 22:11 -0700, Junio C Hamano wrote:
> Roman Shaposhnik <rvs@sun.com> writes:
>
> > ... Contrast this with .gitconfig where policies get
> > enforced right from the minute your clone operation finishes and there's
> > much less opportunity for the user to shoot himself in the foot.
>
> Why? Even if you expect .git/config in a new repository would be vanilla
> (which you can't really, as crazy sysadmin can have /etc/gitconfig or
> template to override what you do), $HOME/.gitconfig would be in effect the
> minute you clone.
I think I understand where you are going with this. Although, truth be
told, to me ~/.gitconfig is much less of a concern. Why? Well, because
by definition if the user is smart enough to edit ~/.gitconfig I'm
not concerned about him. As I pointed out my main concern is about
junior developers for whom the only way to screw things up would
be to have a global /etc/gitconfig, which is still quite rare.
> As you cannot reasonably expect that your project is the _only_ project
> your cloners would use, you cannot dictate what $HOME/.gitconfig has.
See, that's exactly why I would love to have in-tree .gitconfig ;-)
~/.gitconfig is not flexible enough to have settings for multiple
projects and .git/config needs to be managed by scripts. In-tree
.gitconfig just works.
> A policy issue needs to be addressed at the human level anyway, so I do
> not really see major difference either way. You need to trust your users
> to follow the guideline at some point, and all you can do is to make it
> easy for them to do so, and (optionally) verify that they are actually
> following the guideline. We need to suggest an easy-to-use and robust
> mechanism to allow you to do so as the BCP.
And that's where it becomes a matter of preference. I can now see your
point very clearly and I tend to slightly disagree with it. But! This
is definitely not a technical issue anymore (in-tree .gitconfig and
in-tree shell script for managing .git/config are technically
equivalent). So, I think I don't have any more arguments to add to the
discussion. I do have one question left (see below) and one comment
to make: my experience has been that it is much easier to trust
volunteer and open source developers compared to corporate ones.
I do get it 100% that Git is "for the kernel folks; by the kernel folks"
and I actually think that it is a healthy environment for an SCM to
grow in. But!
All I'm saying is that if the needs of the corporate folks can be taken
into account without doing Git's architecture any harm I think they
should be.
> Don't get me wrong. I am not saying that everybody should start rolling
> their own "sane environment setup script" and ship their project with it.
> I am only suggesting it as a possible way to do your "policy enforcement"
> without having to introduce in-tree .gitconfig, which I unfortunately see
> no fundamental upsides but definite downsides (security included).
And here comes my question: could you, please, elaborate on *technical*
drawbacks of in-tree .gitconfig (such as security that you've
mentioned).
Thanks,
Roman.
* Re: Intricacies of submodules
2008-04-11 5:20 ` Junio C Hamano
2008-04-11 16:04 ` Ping Yin
@ 2008-04-14 19:56 ` Roman Shaposhnik
1 sibling, 0 replies; 48+ messages in thread
From: Roman Shaposhnik @ 2008-04-14 19:56 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Avery Pennarun, stuart.freeman, git, rvs
On Thu, 2008-04-10 at 22:20 -0700, Junio C Hamano wrote:
> Roman Shaposhnik <rvs@sun.com> writes:
>
> > ... I'm very interested in getting this functionality
> > right with git-submodule. And I can be either your guinea pig or
> > a frenetic hamster. After all, you don't mind complete newcomers
> > to the development process sending you code, do you? ;-)
>
> Everybody starts out as a total stranger. Linus has never worked with me
> when I started, and many people who are the core members of git community
> have never worked with me before either.
Cool! I do have a couple of questions on the development etiquette,
but I think I'll ask them off-line unless somebody can point me
to an FAQ on how Git's development is set up. The section
"Community and Development" doesn't seem to answer many
of my questions.
Thanks,
Roman.
* Re: Intricacies of submodules
2008-04-14 19:52 ` Roman Shaposhnik
@ 2008-04-15 1:13 ` Junio C Hamano
2008-04-15 2:13 ` Ping Yin
2008-04-16 3:49 ` Roman V. Shaposhnik
0 siblings, 2 replies; 48+ messages in thread
From: Junio C Hamano @ 2008-04-15 1:13 UTC (permalink / raw)
To: Roman Shaposhnik; +Cc: Ping Yin, Avery Pennarun, stuart.freeman, git
Roman Shaposhnik <rvs@sun.com> writes:
>> Don't get me wrong. I am not saying that everybody should start rolling
>> their own "sane environment setup script" and ship their project with it.
>> I am only suggesting it as a possible way to do your "policy enforcement"
>> without having to introduce in-tree .gitconfig, which I unfortunately see
>> no fundamental upsides but definite downsides (security included).
>
> And here comes my question: could you, please, elaborate on *technical*
> drawbacks of in-tree .gitconfig (such as security that you've
> mentioned).
Just to name a few, as I do not see a point in spending time elaborating
in detail when there is an alternative without such security downsides.
One of your examples was about a forced use of custom merge tool.
Consider in-tree .gitconfig that is always read for everybody that
describes such a tool. A malicious script named there is a security risk
for people who clone such a project. A smudge filter is even worse, as it
kicks in the minute you try to check out the project.
These executables (not just merge tools or attribute filters) are designed
to be named by .git/config exactly because .git/config is designed to be
personal (i.e. "that _particular repository only_") and you can afford to
be environment and platform specific there. If you start describing them
in in-tree .gitconfig, they must be cross platform and (worse yet)
you have to make sure they are installed everywhere.
There are states recorded by git-submodule whether the particular
repository has seen and is interested in which submodule (i.e. "submodule
init" has been run).
I'm too lazy to make a laundry list of what you can have in .git/config
with the current system (see Documentation/config.txt), but that part of
the system is built around the design that the configuration is specific
to the repository (and sharing what the user records in ~/.gitconfig
across repositories is in line with it).
Unless you are willing to sift through all of them, mark which ones can be
overridden by in-tree .gitconfig and which ones cannot, and implement an
easy to use (by both the developers and the users) mechanism to enforce
the distinction, just changing the git_config() function to read from one
new place (i.e. in-tree .gitconfig) would not be a sufficient solution for
what you seem to want to do.
* Re: Intricacies of submodules
2008-04-15 1:13 ` Junio C Hamano
@ 2008-04-15 2:13 ` Ping Yin
2008-04-16 3:49 ` Roman V. Shaposhnik
1 sibling, 0 replies; 48+ messages in thread
From: Ping Yin @ 2008-04-15 2:13 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Roman Shaposhnik, Avery Pennarun, stuart.freeman, git
On Tue, Apr 15, 2008 at 9:13 AM, Junio C Hamano <gitster@pobox.com> wrote:
> Roman Shaposhnik <rvs@sun.com> writes:
>
>
> I'm too lazy to make a laundry list of what you can have in .git/config
> with the current system (see Documentation/config.txt), but that part of
> the system is built around the design that the configuration is specific
> to the repository (and sharing what the user records in ~/.gitconfig
> across repositories is in line with it).
>
I can give some which can be enforced in a project level
merge.summary
status.submodulesummary
core.whitespace
--
Ping Yin
* Re: Intricacies of submodules
2008-04-15 1:13 ` Junio C Hamano
2008-04-15 2:13 ` Ping Yin
@ 2008-04-16 3:49 ` Roman V. Shaposhnik
2008-04-17 18:09 ` Jeremy Maitin-Shepard
1 sibling, 1 reply; 48+ messages in thread
From: Roman V. Shaposhnik @ 2008-04-16 3:49 UTC (permalink / raw)
To: Junio C Hamano; +Cc: Ping Yin, Avery Pennarun, stuart.freeman, git
On Mon, 2008-04-14 at 18:13 -0700, Junio C Hamano wrote:
> Roman Shaposhnik <rvs@sun.com> writes:
>
> >> Don't get me wrong. I am not saying that everybody should start rolling
> >> their own "sane environment setup script" and ship their project with it.
> >> I am only suggesting it as a possible way to do your "policy enforcement"
> >> without having to introduce in-tree .gitconfig, which I unfortunately see
> >> no fundamental upsides but definite downsides (security included).
> >
> > And here comes my question: could you, please, elaborate on *technical*
> > drawbacks of in-tree .gitconfig (such as security that you've
> > mentioned).
>
> Just to name a few, as I do not see a point in spending time elaborating
> in detail when there is an alternative without such security downsides.
>
> One of your examples was about a forced use of custom merge tool.
> Consider in-tree .gitconfig that is always read for everybody that
> describes such a tool. A malicious script named there is a security risk
> for people who clone such a project. A smudge filter is even worse, as it
> kicks in the minute you try to check out the project.
I'm sorry, but I don't buy this argument. If you have a malicious user
gaining access to the repository all bets are off. To single out
in-tree .gitconfig as the only place which could be hacked seems to
be a bit shortsighted and unfair. Any "executable" portion of your
project that rarely gets eyeballed (such as Makefile infrastructure)
could be used. In fact, under your scenario in-tree .gitconfig is
likely to be the least of your worries.
And here's one more thing: in-tree .gitconfig and in-tree
update-my-git-settings.sh are absolutely identical as far
as their security ramifications are concerned. If you are really paranoid
you have to eyeball either of them.
> These executables (not just merge tools or attribute filters) are designed
> to be named by .git/config exactly because .git/config is designed to be
> personal (i.e. "that _particular repository only_") and you can afford to
> be environment and platform specific there. If you start describing them
> in in-tree .gitconfig, they must be cross platform and (worse yet)
> you have to make sure they are installed everywhere.
I don't buy this argument either. First of all, there's a $PATH. On top
of that even automounters learned how to deal with heterogeneous
hosts efficiently ($HOST, $CPU, etc.) so I really don't think Git should
have any problems. But the most obvious counterargument to your
statement would be that quite a few developers (myself included) don't
have the luxury of developing on a single architecture. Thus in-tree
.gitconfig doesn't change anything -- *my* single Git repository has to
provide settings that work on: [sparc|intel]-[Solaris|Linux]. I do
have a .git/config that accomplishes that. I see no reason for an in-tree
.gitconfig to not be able to.
> I'm too lazy to make a laundry list of what you can have in .git/config
> with the current system (see Documentation/config.txt), but that part of
> the system is built around the design that the configuration is specific
> to the repository (and sharing what the user records in ~/.gitconfig
> across repositories is in line with it).
>
> Unless you are willing to sift through all of them, mark which ones can be
> overridden by in-tree .gitconfig and which ones cannot, and implement an
> easy to use (by both the developers and the users) mechanism to enforce
> the distinction, just changing the git_config() function to read from one
> new place (i.e. in-tree .gitconfig) would not be a sufficient solution for
> what you seem to want to do.
Why? I'm really confused here. Unless I'm given a clear example of at
least one setting that somehow becomes dangerous when stored inside
in-tree .gitconfig, I really do consider such an enforcement to be
as meaningful as enforcing that Git MUST manage source code and nothing
else. You seemed to mention the trust issue. Well, why don't you trust
the user to place whatever he wants in in-tree .gitconfig? And yes,
we are talking about trustworthy users here and repositories that
haven't been compromised.
Thanks,
Roman.
P.S. Junio, I really don't want to waste your time especially since
I get a feeling that our discussion has clearly moved into a domain
of taste and preferences. But I had to refute your security and
heterogeneity arguments simply because they don't seem to have any
substance to them.
* Re: Intricacies of submodules
2008-04-16 3:49 ` Roman V. Shaposhnik
@ 2008-04-17 18:09 ` Jeremy Maitin-Shepard
2008-04-17 19:06 ` Linus Torvalds
2008-04-17 19:50 ` Roman V. Shaposhnik
0 siblings, 2 replies; 48+ messages in thread
From: Jeremy Maitin-Shepard @ 2008-04-17 18:09 UTC (permalink / raw)
To: Roman V. Shaposhnik
Cc: Junio C Hamano, Ping Yin, Avery Pennarun, stuart.freeman, git
"Roman V. Shaposhnik" <rvs@sun.com> writes:
[snip]
> I'm sorry, but I don't buy this argument. If you have a malicious user
> gaining access to the repository all bets are off. To single out
> in-tree .gitconfig as the only place which could be hacked seems to
> be a bit shortsighted and unfair. Any "executable" portion of your
> project that rarely gets eyeballed (such as Makefile infrastructure)
> could be used. In fact, under your scenario in-tree .gitconfig is
> likely to be the least of your worries.
> And here's one more thing: in-tree .gitconfig and in-tree
> update-my-git-settings.sh are absolutely identical as far
> as their security ramifications are concerned. If you are really paranoid
> you have to eyeball either of them.
There is a huge difference: if you allow in-tree .gitconfig by default,
then git clone <some-repository> becomes an unsafe operation. I can't
even inspect some arbitrary repository to _see_ if I like the code and
think it is safe very easily, since I'd normally do that by cloning the
repository.
Obviously actually executing untrusted code is unsafe regardless of
whether you type "git clone" or "make" to do it, but not everyone
intends to type "make" after checking out an unknown repository, and the
user is explicitly invoking make with the knowledge that it is running
whatever code is in the repository. Similarly, if the user explicitly
calls some shell script in order to set things up, he is conscious that
he is performing a potentially unsafe operation.
As a silly analogy, it is currently perfectly safe to clone a repository
that has a text document containing instructions about committing
suicide, because there is the assumption that the instructions are not
automatically executed simply because they are on the user's hard drive.
[snip]
> Why? I'm really confused here. Unless I'm given a clear example of at
> least one setting that somehow becomes dangerous when stored inside
> in-tree .gitconfig, I really do consider such an enforcement to be
> as meaningful as enforcing that Git MUST manage source code and nothing
> else. You seemed to mention the trust issue. Well, why don't you trust
> the user to place whatever he wants in in-tree .gitconfig? And yes,
> we are talking about trustworthy users here and repositories that
> haven't been compromised.
Obviously any configuration option that specifies a shell command to run
is unsafe to specify in an in-tree .gitconfig. As Junio noted,
smudge/clean commands are especially unsafe because they will be
executed even if the user only uses the clone command.
You actually seem to be the one assuming that a Git repository must
store source code (in particular source code that is then blindly
executed by anyone that clones the repository), as that is the only case
in which an in-tree .gitconfig can introduce no additional security
risk, since your security is then already completely dependent on
trusting the contents of the repository.
--
Jeremy Maitin-Shepard
* Re: Intricacies of submodules
2008-04-17 18:09 ` Jeremy Maitin-Shepard
@ 2008-04-17 19:06 ` Linus Torvalds
2008-04-17 20:04 ` Junio C Hamano
2008-04-17 19:50 ` Roman V. Shaposhnik
1 sibling, 1 reply; 48+ messages in thread
From: Linus Torvalds @ 2008-04-17 19:06 UTC (permalink / raw)
To: Jeremy Maitin-Shepard
Cc: Roman V. Shaposhnik, Junio C Hamano, Ping Yin, Avery Pennarun,
stuart.freeman, git
On Thu, 17 Apr 2008, Jeremy Maitin-Shepard wrote:
>
> There is a huge difference: if you allow in-tree .gitconfig by default,
> then git clone <some-repository> becomes an unsafe operation.
I have to agree.
The git config file is rather powerful, with things like aliases etc that
can be used to run external programs (and with the external diff
functionality that includes it for very basic and default operations), and
ways of subtly (and not so subtly) rewriting repository information etc
etc.
So if we do end up doing a "tracked config file", I'd personally very much
prefer it be limited in some way. For example, we obviously track the
.gitignore and .gitattributes files, but they are much more limited in
their effects. Maybe we could have a "limited config file" that allows for
*some* config options to be set?
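One way to picture such a "limited config file" is an allowlist applied to
whatever the tracked file contains. The Python sketch below uses the three
keys Ping Yin listed earlier in the thread as the allowlist; the function
name, the sample entries, and the idea of filtering by key name alone are
assumptions for illustration, not an existing git feature.

```python
# Keys suggested earlier in the thread as candidates for project-level policy.
SAFE_KEYS = {"merge.summary", "status.submodulesummary", "core.whitespace"}


def filter_tracked_config(entries, safe_keys=SAFE_KEYS):
    """Split in-tree config entries into (accepted, rejected) by key name."""
    accepted, rejected = {}, {}
    for key, value in entries.items():
        # Simplification: compare the whole key case-insensitively.
        target = accepted if key.lower() in safe_keys else rejected
        target[key] = value
    return accepted, rejected


# Hypothetical tracked file: one harmless setting and two that name commands.
tracked = {
    "core.whitespace": "trailing-space",
    "alias.st": "!echo pwned",
    "filter.x.smudge": "some-script",
}
accepted, rejected = filter_tracked_config(tracked)
print(accepted)
print(sorted(rejected))
```

An allowlist fails closed: a key nobody has reviewed, including any future
option that names a command to run, is ignored rather than honored.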
Linus
* Re: Intricacies of submodules
2008-04-17 18:09 ` Jeremy Maitin-Shepard
2008-04-17 19:06 ` Linus Torvalds
@ 2008-04-17 19:50 ` Roman V. Shaposhnik
2008-04-17 20:06 ` Martin Langhoff
` (3 more replies)
1 sibling, 4 replies; 48+ messages in thread
From: Roman V. Shaposhnik @ 2008-04-17 19:50 UTC (permalink / raw)
To: Jeremy Maitin-Shepard
Cc: Junio C Hamano, Ping Yin, Avery Pennarun, stuart.freeman, git
On Thu, 2008-04-17 at 14:09 -0400, Jeremy Maitin-Shepard wrote:
> > And here's one more thing: in-tree .gitconfig and in-tree
> > update-my-git-settings.sh are absolutely identical as far
> > as their security ramifications are concerned. If you're really paranoid
> > you have to eyeball either of them.
>
> There is a huge difference: if you allow in-tree .gitconfig by default,
> then git clone <some-repository> becomes an unsafe operation. I can't
> even inspect some arbitrary repository to _see_ if I like the code and
> think it is safe very easily, since I'd normally do that by cloning the
> repository.
Are you saying that a *remote* in-tree .gitconfig would be capable of
affecting *local* system before the end of the clone operation? I find
it very hard to believe. And if it is so, I'd love to be educated on the
subject matter. What I (and to some extent Ping Yin) have been proposing
is a completely different semantics -- the in-tree .gitconfig would only
be able to affect your *local* operations. Doing a clone of the *remote*
repository is a safe operation under such assumptions. Once you've cloned
it, you might need to eyeball the content of .gitconfig if you're really
paranoid.
> As a silly analogy, it is currently perfectly safe to clone a repository
> that has a text document containing instructions about committing
> suicide, because there is the assumption that the instructions are not
> automatically executed simply because they are on the user's hard drive.
The same holds true for the semantics being proposed. The instructions are
*not* executed until you actually try to do something with your
repository. There's a window of opportunity in which inspecting the
content of .gitconfig is absolutely possible.
> > Why? I'm really confused here. Unless I'm given a clear example of at
> > least one setting that somehow becomes dangerous when stored inside
> > in-tree .gitconfig, I really do consider such an enforcement to be
> > as meaningful as enforcing that Git MUST manage source code and nothing
> > else. You seemed to mention the trust issue. Well, why don't you trust
> > the user to place whatever he wants in in-tree .gitconfig? And yes,
> > we are talking about trustworthy users here and repositories that
> > haven't been compromised.
>
> Obviously any configuration option that specifies a shell command to run
> is unsafe to specify in an in-tree .gitconfig. As Junio noted,
> smudge/clean commands are especially unsafe because they will be
> executed even if the user only uses the clone command.
I'm sorry, but I guess that went over my head. Is this an example of
something that can affect the local repository (and host!) during the
clone operation? I tried to find documentation on the subject, but
googling for "git smudge" returns very few useful hits, and the bits
of documentation in gitattributes(5) don't really explain much.
> You actually seem to be the one assuming that a Git repository must
> store source code (in particular source code that is then blindly
> executed by anyone that clones the repository), as that is the only case
> in which an in-tree .gitconfig can introduce no additional security
> risk, since your security is then already completely dependent on
> trusting the contents of the repository.
There are two things at play: first of all, I usually *do* trust the
content of the repository. Call it a matter of personal preference,
but *for me* if you start with distrust -- there's very little you
can do with that repository to begin with. To me it is a bit of a
red herring. On the other hand, I understand where you're coming from
and I definitely appreciate the need for a clone operation to be
safe. So far, the only example of an unsafe setting that I have been
given is smudge/clean filters. Maybe this is enough to shoot the
very idea of an in-tree .gitconfig down, but I still don't really
understand the *complete* semantics of these things. Can somebody
explain, please?
I hope this is not too much to ask.
Thanks,
Roman.
* Re: Intricacies of submodules
2008-04-17 19:06 ` Linus Torvalds
@ 2008-04-17 20:04 ` Junio C Hamano
[not found] ` <32541b130804181128j57d76edcsbbd5fb8d4c782ae7@mail.gmail.com>
0 siblings, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2008-04-17 20:04 UTC (permalink / raw)
To: Linus Torvalds
Cc: Jeremy Maitin-Shepard, Roman V. Shaposhnik, Ping Yin,
Avery Pennarun, stuart.freeman, git
Linus Torvalds <torvalds@linux-foundation.org> writes:
> So if we do end up doing a "tracked config file", I'd personally very much
> prefer it be limited in some way. For example, we obviously track the
> .gitignore and .gitattributes files, but they are much more limited in
> their effects. Maybe we could have a "limited config file" that allows for
> *some* config options to be set?
Yes, that's all I have been trying to say ;-)
* Re: Intricacies of submodules
2008-04-17 19:50 ` Roman V. Shaposhnik
@ 2008-04-17 20:06 ` Martin Langhoff
2008-04-17 20:44 ` Junio C Hamano
2008-04-17 22:29 ` Dmitry Potapov
` (2 subsequent siblings)
3 siblings, 1 reply; 48+ messages in thread
From: Martin Langhoff @ 2008-04-17 20:06 UTC (permalink / raw)
To: Roman V. Shaposhnik
Cc: Jeremy Maitin-Shepard, Junio C Hamano, Ping Yin, Avery Pennarun,
stuart.freeman, git
On Thu, Apr 17, 2008 at 4:50 PM, Roman V. Shaposhnik <rvs@sun.com> wrote:
> There are two things at play: first of all, I usually *do* trust the
> content of the repository. Call it a matter of personal preference,
I think most people here split the trust into "before or after
compilation". I must trust that I can clone/checkout code safely so I
can review it.
Running the code contained in the repo we are discussing is a completely
different matter. Even before compilation, Makefiles and configure
scripts may shoot you in the foot or in the face. But you had at least
a chance to review them.
cheers,
n
--
martin.langhoff@gmail.com
martin@laptop.org -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff
* Re: Intricacies of submodules
2008-04-17 20:06 ` Martin Langhoff
@ 2008-04-17 20:44 ` Junio C Hamano
2008-04-17 21:00 ` Sverre Rabbelier
0 siblings, 1 reply; 48+ messages in thread
From: Junio C Hamano @ 2008-04-17 20:44 UTC (permalink / raw)
To: Martin Langhoff
Cc: Roman V. Shaposhnik, Jeremy Maitin-Shepard, Junio C Hamano,
Ping Yin, Avery Pennarun, stuart.freeman, git
"Martin Langhoff" <martin.langhoff@gmail.com> writes:
> On Thu, Apr 17, 2008 at 4:50 PM, Roman V. Shaposhnik <rvs@sun.com> wrote:
>> There are two things at play: first of all, I usually *do* trust the
>> content of the repository. Call it a matter of personal preference,
>
> I think most people here split the trust into "before or after
> compilation". I must trust that I can clone/checkout code safely so I
> can review it.
I think that summarizes the arguments so far pretty well.
Having said that, the current "clone" implementation may happen to ignore
anything in-tree, e.g. an ident filter defined in .gitattributes may not be
applied due to the chicken-and-egg issue of not having .gitattributes
initially in the work tree when you check everything out to an empty work
tree for the first time.
But I consider that is not by design, but is a limitation of the current
implementation that can be improved.
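The chicken-and-egg issue can be illustrated with a toy model. This is only
a sketch under a stated assumption, namely that attribute rules become
visible only after .gitattributes itself has been written to the work tree.
The function toy_checkout, the write order and the suffix matching are all
hypothetical and are not how git implements checkout.

```python
def toy_checkout(paths_in_order, filtered_suffixes):
    """Toy model of a first checkout into an empty work tree.

    paths_in_order: paths in the order they are written out.
    filtered_suffixes: file suffixes that the tracked .gitattributes assigns
    a filter to (a hypothetical stand-in for real attribute rules).
    Returns the set of paths that actually had the filter applied.
    """
    attributes_on_disk = False
    filtered = set()
    for path in paths_in_order:
        # Rules are only visible once .gitattributes itself has been written.
        if attributes_on_disk and path.endswith(tuple(filtered_suffixes)):
            filtered.add(path)
        if path == ".gitattributes":
            attributes_on_disk = True
    return filtered
```

In this model, any file written before .gitattributes silently misses the
filter, which is the kind of ordering dependence described above.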
* Re: Intricacies of submodules
2008-04-17 20:44 ` Junio C Hamano
@ 2008-04-17 21:00 ` Sverre Rabbelier
2008-04-17 21:25 ` Martin Langhoff
0 siblings, 1 reply; 48+ messages in thread
From: Sverre Rabbelier @ 2008-04-17 21:00 UTC (permalink / raw)
To: Junio C Hamano
Cc: Martin Langhoff, Roman V. Shaposhnik, Jeremy Maitin-Shepard,
Ping Yin, Avery Pennarun, stuart.freeman, git
On Thu, Apr 17, 2008 at 10:44 PM, Junio C Hamano <gitster@pobox.com> wrote:
> Having said that, the current "clone" implementation may happen to ignore
> anything in-tree,
<snip>
> But I consider that is not by design, but is a limitation of the current
> implementation that can be improved.
I think it -should- be by design that it ignores everything unless we
are certain that it is safe to do so. So as long as an in-tree file doesn't
provide any hooks to execute things (which of course includes changing
the environment) it should be fine, but if it does, it should be ignored
till after clone has finished.
Because of that an in-tree '.gitconfig' would have no security risks
as long as it is not 'used' until after the clone. This would be easy
to make sure of by not syncing it with the real '.gitconfig' until
after cloning. (That is assuming there will be some sort of syncing to
the real '.gitconfig' from the in-tree '.gitconfig'; if a fall-back
type of mechanism is chosen, that might be more difficult.)
Cheers,
Sverre Rabbelier
* Re: Intricacies of submodules
2008-04-17 21:00 ` Sverre Rabbelier
@ 2008-04-17 21:25 ` Martin Langhoff
2008-04-17 21:27 ` Sverre Rabbelier
0 siblings, 1 reply; 48+ messages in thread
From: Martin Langhoff @ 2008-04-17 21:25 UTC (permalink / raw)
To: sverre
Cc: Junio C Hamano, Roman V. Shaposhnik, Jeremy Maitin-Shepard,
Ping Yin, Avery Pennarun, stuart.freeman, git
On Thu, Apr 17, 2008 at 6:00 PM, Sverre Rabbelier <alturin@gmail.com> wrote:
> provide any hooks to execute things (which of course includes changing
> the environment) it should be fine, but if it does, it should be ignored
> till after clone has finished.
It should not be allowed at all. After the clone is the review, and
that has to be safe too.
> Because of that an in-tree '.gitconfig' would have no security risks
> as long as it is not 'used' until after the clone.
This is not true. A pre-commit hook or pre-checkout hook could be destructive.
cheers,
m
--
martin.langhoff@gmail.com
martin@laptop.org -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff
* Re: Intricacies of submodules
2008-04-17 21:25 ` Martin Langhoff
@ 2008-04-17 21:27 ` Sverre Rabbelier
2008-04-17 21:31 ` Martin Langhoff
0 siblings, 1 reply; 48+ messages in thread
From: Sverre Rabbelier @ 2008-04-17 21:27 UTC (permalink / raw)
To: Martin Langhoff
Cc: Junio C Hamano, Roman V. Shaposhnik, Jeremy Maitin-Shepard,
Ping Yin, Avery Pennarun, stuart.freeman, git
On Thu, Apr 17, 2008 at 11:25 PM, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Thu, Apr 17, 2008 at 6:00 PM, Sverre Rabbelier <alturin@gmail.com> wrote:
> > provide any hooks to execute things (which of course includes changing
> > the environment) it should be fine, but if it does, it should be ignored
> > till after clone has finished.
>
> It should not be allowed at all. After the clone is the review, and
> that has to be safe too.
I reckon review is done without using git, so I don't see how it would
pose a security risk.
> > Because of that an in-tree '.gitconfig' would have no security risks
> > as long as it is not 'used' until after the clone.
>
> This is not true. A pre-commit hook or pre-checkout hook could be destructive.
But, those won't be executed till after the review, so everything
would be good still, wouldn't it?
Cheers,
Sverre Rabbelier
* Re: Intricacies of submodules
2008-04-17 21:27 ` Sverre Rabbelier
@ 2008-04-17 21:31 ` Martin Langhoff
2008-04-18 1:41 ` Ping Yin
0 siblings, 1 reply; 48+ messages in thread
From: Martin Langhoff @ 2008-04-17 21:31 UTC (permalink / raw)
To: sverre
Cc: Junio C Hamano, Roman V. Shaposhnik, Jeremy Maitin-Shepard,
Ping Yin, Avery Pennarun, stuart.freeman, git
On Thu, Apr 17, 2008 at 6:27 PM, Sverre Rabbelier <alturin@gmail.com> wrote:
> > > Because of that an in-tree '.gitconfig' would have no security risks
> > > as long as it is not 'used' until after the clone.
> >
> > This is not true. A pre-commit hook or pre-checkout hook could be destructive.
>
> But, those won't be executed till after the review, so everything
> would be good still, wouldn't it?
No. A local review can be quite "active", involving changing branches,
moving patches around, and fixing sh*t up. The hooks available offer
plenty of danger if the repo can set them and make them active:
$ ls .git/hooks/
applypatch-msg post-commit post-update pre-commit update
commit-msg post-receive pre-applypatch pre-rebase
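To see which of those hooks are actually live in a given clone, a minimal
Python sketch follows. It assumes a POSIX filesystem, where git only runs
hook files that are executable. The helper name active_hooks is
hypothetical.

```python
from pathlib import Path


def active_hooks(hooks_dir):
    """Return the names of hook files that have an execute bit set, sorted."""
    hooks = []
    for entry in Path(hooks_dir).iterdir():
        # Skip directories and the ".sample" templates some git versions ship.
        if not entry.is_file() or entry.name.endswith(".sample"):
            continue
        if entry.stat().st_mode & 0o111:
            hooks.append(entry.name)
    return sorted(hooks)
```

Calling active_hooks(".git/hooks") from the top of a work tree lists the
hooks that would fire during an "active" review.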
cheers,
m
--
martin.langhoff@gmail.com
martin@laptop.org -- School Server Architect
- ask interesting questions
- don't get distracted with shiny stuff - working code first
- http://wiki.laptop.org/go/User:Martinlanghoff
* Re: Intricacies of submodules
2008-04-17 19:50 ` Roman V. Shaposhnik
2008-04-17 20:06 ` Martin Langhoff
@ 2008-04-17 22:29 ` Dmitry Potapov
2008-04-17 22:32 ` Linus Torvalds
2008-04-18 14:02 ` Jakub Narebski
3 siblings, 0 replies; 48+ messages in thread
From: Dmitry Potapov @ 2008-04-17 22:29 UTC (permalink / raw)
To: Roman V. Shaposhnik
Cc: Jeremy Maitin-Shepard, Junio C Hamano, Ping Yin, Avery Pennarun,
stuart.freeman, git
On Thu, Apr 17, 2008 at 12:50:08PM -0700, Roman V. Shaposhnik wrote:
> Doing a clone of the *remote*
> repository is a safe operation under such assumptions. Once you've cloned
> it, you might need to eyeball the content of .gitconfig if you're really
> paranoid.
No, I don't think it is right. It is absolutely unacceptable to expect
all users to be aware of some hidden file and to eyeball it just to be
sure that the next 'git log' (or some other normal git operation) will
not remove all their files from the disk.
Perhaps, I have not followed this discussion carefully, so I am not sure
what .gitconfig is intended to solve. But if you think that _blindly_
adding some options to other people's configurations is a good idea, I
have to disagree with you. Some options may be useful in some cases or
for some platforms, but not for others. So, having a single .gitconfig
is going to be a bad fit for some users. Thus a more flexible and more
secure solution is needed, and it already exists.
You can put git-configure at the top of your repository and tell people
to run it after cloning. In this way, anyone can inspect this script and,
if they trust it, run it. This script can check what system git
is running on, and maybe ask questions, etc., so it can be really helpful
for a wide category of users.
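As a rough illustration of such a script, here is a minimal Python sketch.
The particular settings are hypothetical project preferences chosen only
for the example, and the script prints the git config commands for review
instead of running them, so the user stays in control.

```python
import platform


def suggest_settings(system=None):
    """Return {config key: value} suggestions for the given platform name."""
    system = system or platform.system()
    # Hypothetical preferences; a real project would choose its own.
    if system == "Windows":
        return {"core.autocrlf": "true", "core.fileMode": "false"}
    return {"core.autocrlf": "input"}


def as_commands(settings):
    """Render settings as git config commands for the user to review."""
    return [f"git config {key} {value}" for key, value in sorted(settings.items())]


if __name__ == "__main__":
    # Print, do not execute: the user decides what to apply.
    for command in as_commands(suggest_settings()):
        print(command)
```

Because the script is an ordinary tracked file that nothing runs
automatically, it can be read in full before anyone chooses to use it.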
Dmitry
* Re: Intricacies of submodules
2008-04-17 19:50 ` Roman V. Shaposhnik
2008-04-17 20:06 ` Martin Langhoff
2008-04-17 22:29 ` Dmitry Potapov
@ 2008-04-17 22:32 ` Linus Torvalds
2008-04-18 1:48 ` Ping Yin
2008-04-18 14:02 ` Jakub Narebski
3 siblings, 1 reply; 48+ messages in thread
From: Linus Torvalds @ 2008-04-17 22:32 UTC (permalink / raw)
To: Roman V. Shaposhnik
Cc: Jeremy Maitin-Shepard, Junio C Hamano, Ping Yin, Avery Pennarun,
stuart.freeman, git
On Thu, 17 Apr 2008, Roman V. Shaposhnik wrote:
>
> Are you saying that a *remote* in-tree .gitconfig would be capable of
> affecting *local* system before the end of the clone operation?
No. But what do you do after a "git clone"?
Do you, for example, do something like "git log -p" to actually see the
commits?
And what happens if that runs an external diff viewer script that just
happens to do a "rm -rf $HOME"?
See? The .git/config file allows you to set those kinds of things. They
should *not* be things you download!
Linus
* Re: Intricacies of submodules
2008-04-17 21:31 ` Martin Langhoff
@ 2008-04-18 1:41 ` Ping Yin
0 siblings, 0 replies; 48+ messages in thread
From: Ping Yin @ 2008-04-18 1:41 UTC (permalink / raw)
To: Martin Langhoff
Cc: sverre, Junio C Hamano, Roman V. Shaposhnik,
Jeremy Maitin-Shepard, Avery Pennarun, stuart.freeman, git
On Fri, Apr 18, 2008 at 5:31 AM, Martin Langhoff
<martin.langhoff@gmail.com> wrote:
> On Thu, Apr 17, 2008 at 6:27 PM, Sverre Rabbelier <alturin@gmail.com> wrote:
> > > > Because of that an in-tree '.gitconfig' would have no security risks
> > > > as long as it is not 'used' until after the clone.
> > >
> > > This is not true. A pre-commit hook or pre-checkout hook could be destructive.
> >
> > But, those won't be executed till after the review, so everything
> > would be good still, wouldn't it?
>
> No. A local review can be quite "active", involving changing branches,
> moving patches around, and fixing sh*t up. The hooks available offer
> plenty of danger if the repo can set them and make them active:
>
> $ ls .git/hooks/
> applypatch-msg post-commit post-update pre-commit update
> commit-msg post-receive pre-applypatch pre-rebase
>
AFAIK, hooks are not cloned automatically. So where do the destructive
hooks come from?
--
Ping Yin
* Re: Intricacies of submodules
2008-04-17 22:32 ` Linus Torvalds
@ 2008-04-18 1:48 ` Ping Yin
0 siblings, 0 replies; 48+ messages in thread
From: Ping Yin @ 2008-04-18 1:48 UTC (permalink / raw)
To: Linus Torvalds
Cc: Roman V. Shaposhnik, Jeremy Maitin-Shepard, Junio C Hamano,
Avery Pennarun, stuart.freeman, git
On Fri, Apr 18, 2008 at 6:32 AM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
>
> On Thu, 17 Apr 2008, Roman V. Shaposhnik wrote:
> >
> > Are you saying that a *remote* in-tree .gitconfig would be capable of
> > affecting *local* system before the end of the clone operation?
>
> No. But what do you do after a "git clone"?
>
> Do you, for example, do something like "git log -p" to actually see the
> commits?
>
> And what happens if that runs an external diff viewer script that just
> happens to do a "rm -rf $HOME"?
>
Good point. This is the best example (maybe the only one so far) I
have seen that demonstrates the danger of an in-tree .gitconfig. So I
vote for Linus's idea of a limited in-tree .gitconfig.
--
Ping Yin
* Re: Intricacies of submodules
2008-04-17 19:50 ` Roman V. Shaposhnik
` (2 preceding siblings ...)
2008-04-17 22:32 ` Linus Torvalds
@ 2008-04-18 14:02 ` Jakub Narebski
3 siblings, 0 replies; 48+ messages in thread
From: Jakub Narebski @ 2008-04-18 14:02 UTC (permalink / raw)
To: git
Roman V. Shaposhnik wrote:
> On Thu, 2008-04-17 at 14:09 -0400, Jeremy Maitin-Shepard wrote:
>>> And here's one more thing: in-tree .gitconfig and in-tree
>>> update-my-git-settings.sh are absolutely identical as far
>>> as their security ramifications are concerned. If you're really paranoid
>>> you have to eyeball either of them.
>>
>> There is a huge difference: if you allow in-tree .gitconfig by default,
>> then git clone <some-repository> becomes an unsafe operation. I can't
>> even inspect some arbitrary repository to _see_ if I like the code and
>> think it is safe very easily, since I'd normally do that by cloning the
>> repository.
[...]
>> Obviously any configuration option that specifies a shell command to run
>> is unsafe to specify in an in-tree .gitconfig. As Junio noted,
>> smudge/clean commands are especially unsafe because they will be
>> executed even if the user only uses the clone command.
>
> Are you saying that a *remote* in-tree .gitconfig would be capable of
> affecting *local* system before the end of the clone operation?
At the end of a clone operation you usually do a checkout, so clean/smudge
commands could wipe out your disk at the end of the clone. And one usually
does a checkout to view the contents of a repository (the alternative is to
use the plumbing command git-cat-file, which does not use .gitattributes).
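A minimal Python sketch of that alternative: clone without checking
anything out, then read a single file through cat-file. The helper names
are hypothetical, and the URL, destination and path are whatever the caller
supplies. By default, "git cat-file -p" prints the blob as stored, without
applying smudge filters.

```python
import subprocess


def inspection_commands(url, dest, path, rev="HEAD"):
    """Build (argv, cwd) pairs: clone with no checkout, then print one file."""
    return [
        # --no-checkout leaves the work tree empty after the clone.
        (["git", "clone", "--no-checkout", url, dest], None),
        # cat-file -p prints the blob straight from the object database.
        (["git", "cat-file", "-p", f"{rev}:{path}"], dest),
    ]


def run_all(commands):
    """Run each command in its working directory, stopping on failure."""
    for argv, cwd in commands:
        subprocess.run(argv, cwd=cwd, check=True)
```

Keeping command construction separate from execution makes it easy to
print the commands and look them over before running anything.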
--
Jakub Narebski
Warsaw, Poland
ShadeHawk on #git
* Re: Intricacies of submodules
[not found] ` <32541b130804181128j57d76edcsbbd5fb8d4c782ae7@mail.gmail.com>
@ 2008-04-18 18:30 ` Avery Pennarun
0 siblings, 0 replies; 48+ messages in thread
From: Avery Pennarun @ 2008-04-18 18:30 UTC (permalink / raw)
To: Junio C Hamano, git
On 4/17/08, Junio C Hamano <gitster@pobox.com> wrote:
> Linus Torvalds <torvalds@linux-foundation.org> writes:
> > So if we do end up doing a "tracked config file", I'd personally very much
> > prefer it be limited in some way. For example, we obviously track the
> > .gitignore and .gitattributes files, but they are much more limited in
> > their effects. Maybe we could have a "limited config file" that allows for
> > *some* config options to be set?
>
> Yes, that's all I have been trying to say ;-)
How about this: we know that *most* options are harmless, at least
from a security point of view. AFAIK it's really just the ones where
you specify shell commands that are unsafe.
Why not have a list of "safe" config options in git, and when reading
.gitconfig, error out if any of the options in that file are unsafe?
(Alternatively: silently ignore the unsafe ones, or warn and then
ignore the unsafe ones.) A more advanced variation of the same would
be to have .git/config options that list specific exceptions to the
safe list, so if .gitconfig causes an error, you can *explicitly* use
git config to let .gitconfig override them.
Another possibility would be to have an "unsafe" list instead of a
"safe" list, but that sounds rather error-prone to me.
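The safe-list idea could be sketched roughly as follows. This is a minimal
Python illustration, not a proposed implementation: SAFE_KEYS is a
hypothetical list (which keys belong on it is exactly the open question),
and filter_tracked_config, UnsafeOption and the three mode names are made
up for the example.

```python
import fnmatch

# Hypothetical allow list, for illustration only.
SAFE_KEYS = ["core.autocrlf", "core.whitespace", "apply.whitespace", "color.*"]


class UnsafeOption(Exception):
    """Raised in "error" mode when a tracked option is not on the safe list."""


def filter_tracked_config(options, exceptions=(), mode="error"):
    """Apply the safe list to options read from an in-tree .gitconfig.

    options: dict of key -> value from the tracked file.
    exceptions: extra key patterns the user explicitly allowed locally.
    mode: "error" (refuse), "warn" (drop and report) or "ignore" (drop).
    Returns (accepted options, list of warning strings).
    """
    if mode not in ("error", "warn", "ignore"):
        raise ValueError(f"unknown mode: {mode}")
    allowed = [pat.lower() for pat in list(SAFE_KEYS) + list(exceptions)]
    accepted, warnings = {}, []
    for key, value in options.items():
        if any(fnmatch.fnmatchcase(key.lower(), pat) for pat in allowed):
            accepted[key] = value
        elif mode == "error":
            raise UnsafeOption(key)
        elif mode == "warn":
            warnings.append(f"ignoring unsafe option {key}")
    return accepted, warnings
```

With an allow list, an option nobody has thought about is rejected by
default, which is why it seems less error-prone than an "unsafe" list.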
Have fun,
Avery