Git development
* Re: Figured out how to get Mozilla into git
From: Junio C Hamano @ 2006-06-10  3:55 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git
In-Reply-To: <46a038f90606092041neadcc54n2acb6272d1f71de7@mail.gmail.com>

"Martin Langhoff" <martin.langhoff@gmail.com> writes:

> Yes, most people have -z3, and I agree with you, on paper it sounds
> like the cost is 1/4 of a git clone.
>
> However.
>
> The CVS protocol is very chatty because the client _acts_ extremely
> stupid. It says, ok, I got here an empty directory, and the server
> walks the client through every little step. And all that chatter is
> uncompressed cleartext under pserver.
>
> So the per-file and per-directory overhead are significant. I can do a
> cvs checkout via pserver:localhost but I don't know off-the-cuff how
> to measure the traffic. Hints?

If you have an otherwise unused interface, you can look at
ifconfig output and see RX/TX bytes?  But that sounds very
crude.
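
The interface-counter idea could be sketched roughly like this. It is
only a sketch, assuming a Linux box where the RX/TX byte counters that
ifconfig prints are also readable as text lines in /proc/net/dev; the
helper names parse_netdev_line and counter_delta are made up for
illustration. Sample the counters of the otherwise unused interface
before and after the checkout, then subtract.

```c
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: extract the RX and TX byte counters for
 * `ifname` from one line in Linux /proc/net/dev format, e.g.
 *   "    lo: 1000 10 0 0 0 0 0 0 2000 10 0 0 0 0 0 0"
 * The first field after the colon is received bytes and the ninth
 * is transmitted bytes.  Returns 0 on success, -1 if the line is
 * for another interface or cannot be parsed.
 */
int parse_netdev_line(const char *line, const char *ifname,
		      unsigned long long *rx, unsigned long long *tx)
{
	unsigned long long f[9];
	const char *colon;
	size_t len;

	/* Interface names are right-aligned, so skip leading blanks. */
	while (*line == ' ' || *line == '\t')
		line++;
	colon = strchr(line, ':');
	if (!colon)
		return -1;
	len = (size_t)(colon - line);
	if (len != strlen(ifname) || strncmp(line, ifname, len))
		return -1;
	if (sscanf(colon + 1, "%llu %llu %llu %llu %llu %llu %llu %llu %llu",
		   &f[0], &f[1], &f[2], &f[3], &f[4],
		   &f[5], &f[6], &f[7], &f[8]) != 9)
		return -1;
	*rx = f[0];
	*tx = f[8];
	return 0;
}

/* Bytes that passed between two samples of the same counter. */
unsigned long long counter_delta(unsigned long long before,
				 unsigned long long after)
{
	return after - before;
}
```

This is just as crude as reading ifconfig by hand: it counts everything
on the interface, not only the CVS connection.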

Running it through a proxy perhaps?


* Re: [PATCH] shared repository settings enhancement.
From: Junio C Hamano @ 2006-06-10  3:49 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <Pine.LNX.4.64.0606091743410.5498@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> On Fri, 9 Jun 2006, Junio C Hamano wrote:
>>
>> This lets you say:
>> 
>> 	[core]
>> 		sharedrepository = 075
>
> I really think it's better to express this as some more traditional 
> number.
>
> I had to think about what 075 meant, while saying
>
> 	[core]
> 		sharedrepository = 0644
>
> just makes sense more or less automatically (and yes, for directories, the 
> read bit should obviously be expanded as an execute bit).

Or probably use the umask notation, 007 for traditional shared
repositories and 002 for gitweb exported ones.  With your
notation, people would start wondering what the distinction
between 0755, 0644, and even 0254 is (there isn't any).

Having said that, I do not think the distinction is that
important; I would rather make core.sharedrepository = true
mean the equivalent of "chmod go+rX" (it does "chmod g+rX"
currently).
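
To make the two notations concrete, here is a small sketch of how a
umask-style value would relate to the 075-style mask from the patch,
and how that mask copies owner bits to group and others. The function
names are hypothetical, and the S_ISGID handling for directories in
adjust_shared_perm() is left out.

```c
#include <sys/types.h>
#include <sys/stat.h>

/*
 * umask notation names the bits to take away; the patch's notation
 * names the group/other positions that may mirror the owner's bits.
 * So 007 corresponds to 070 and 002 corresponds to 075.
 */
int shared_mask_from_umask(int um)
{
	return ~um & 077;
}

/*
 * Copy each owner permission bit that is set in `mode` to the group
 * and/or other position, but only where `shared` allows it.
 */
mode_t apply_shared_mask(mode_t mode, int shared)
{
	if (mode & S_IRUSR)
		mode |= ((shared & 040) ? S_IRGRP : 0) |
			((shared & 004) ? S_IROTH : 0);
	if (mode & S_IWUSR)
		mode |= ((shared & 020) ? S_IWGRP : 0) |
			((shared & 002) ? S_IWOTH : 0);
	if (mode & S_IXUSR)
		mode |= ((shared & 010) ? S_IXGRP : 0) |
			((shared & 001) ? S_IXOTH : 0);
	return mode;
}
```

With these definitions a 0600 file becomes 0660 under 070 and 0664
under 075, and a 0700 directory becomes 0775 under 075.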


* Re: Figured out how to get Mozilla into git
From: Martin Langhoff @ 2006-06-10  3:41 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jon Smirl, git
In-Reply-To: <Pine.LNX.4.64.0606091853180.5498@g5.osdl.org>

On 6/10/06, Linus Torvalds <torvalds@osdl.org> wrote:
> On Fri, 9 Jun 2006, Jon Smirl wrote:
> > >
> > > Btw, does anybody know roughly how much data an initial "cvs co" takes on
> > > the mozilla repo? Git will obviously get the whole history, and that will
> > > inevitably be bigger than getting a single check-out, but it's not
> > > necessarily orders of magnitude bigger.
> >
> > 339MB for initial checkout
>
> And I think people run :pserver: with compression by default, so we're
> likely talking about half that in actual download overhead, no?

Yes, most people have -z3, and I agree with you, on paper it sounds
like the cost is 1/4 of a git clone.

However.

The CVS protocol is very chatty because the client _acts_ extremely
stupid. It says, ok, I got here an empty directory, and the server
walks the client through every little step. And all that chatter is
uncompressed cleartext under pserver.

So the per-file and per-directory overhead are significant. I can do a
cvs checkout via pserver:localhost but I don't know off-the-cuff how
to measure the traffic. Hints?

cheers,


martin


* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-10  3:08 UTC (permalink / raw)
  To: Carl Worth; +Cc: Jon Smirl, Martin Langhoff, git
In-Reply-To: <87y7w5lowc.wl%cworth@cworth.org>



On Fri, 9 Jun 2006, Carl Worth wrote:

> On Fri, 9 Jun 2006 22:21:17 -0400, "Jon Smirl" wrote:
> > 
> > Could you clone the repo and delete changesets earlier than 2004? Then
> > I would clone the small repo and work with it. Later I decide I want
> > full history, can I pull from a full repository at that point and get
> > updated? That would need a flag to trigger it since I don't want full
> > history to come over if I am just getting updates from someone else's
> > tree that has a full history.
> 
> This is clearly a desirable feature, and has been requested by several
> people (including myself) looking to switch some large-ish histories
> from an existing system to git.

The thing is, to some degree it's really fundamentally hard.

It's easy for a linear history. What you do for a linear history is to 
just get the top commit, and the tree associated with it, and then you 
cauterize the parent by just grafting it to go away. Boom. You're done.
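
For the linear case, the grafting step amounts to one record in
$GIT_DIR/info/grafts. The sketch below assumes the grafts file format
(one record per line: a commit's 40-character hex object name followed
by the parents it should pretend to have, separated by spaces); a
record with no parents makes the commit look like a root. The function
names are made up for illustration.

```c
#include <ctype.h>
#include <string.h>

/* True if `s` looks like a 40-character hex object name. */
static int is_object_name(const char *s)
{
	size_t i;

	if (strlen(s) != 40)
		return 0;
	for (i = 0; i < 40; i++)
		if (!isxdigit((unsigned char)s[i]))
			return 0;
	return 1;
}

/*
 * Hypothetical helper: write one graft record for `commit` with the
 * given fake parents into `buf`.  With nparents == 0 the commit is
 * cut off from its real parents.  Returns the record length
 * (excluding the terminating NUL), or -1 on bad input or if `buf`
 * is too small.
 */
int format_graft_line(char *buf, size_t size, const char *commit,
		      const char **parents, int nparents)
{
	size_t pos;
	int i;

	if (nparents < 0 || !is_object_name(commit))
		return -1;
	for (i = 0; i < nparents; i++)
		if (!is_object_name(parents[i]))
			return -1;
	/* commit + (" " + parent) per parent + "\n" + NUL */
	if (size < 40 + 41 * (size_t)nparents + 2)
		return -1;
	memcpy(buf, commit, 40);
	pos = 40;
	for (i = 0; i < nparents; i++) {
		buf[pos++] = ' ';
		memcpy(buf + pos, parents[i], 40);
		pos += 40;
	}
	buf[pos++] = '\n';
	buf[pos] = '\0';
	return (int)pos;
}
```

Appending the parentless record for the oldest commit you want to keep
is the easy part; it does nothing about what the other end assumes you
have.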

The problem is that if the preceding history _wasn't_ linear (or, in 
fact, _subsequent_ development refers to it by having branched off at an 
earlier point), and you try to pull your updates, the other end (that 
knows about all the history) will assume you have all the history that you 
don't have, and will send you a pack assuming that.

Which won't even necessarily have all the tree/blob objects (it assumed 
you already had them), but more annoyingly, the history won't be 
cauterized, and you'll have dangling commits. Which you can cauterize by 
hand, of course, but you literally _will_ have to get the objects and 
cauterize the thing by hand.

You're right that it's not "fundamentally impossible" to do: the git 
format certainly _allows_ it. But the git protocol handshake really does 
end up optimizing away all the unnecessary work by knowing that the other 
side will have all the shared history, so lacking the shared history will 
mean that you're a bit screwed.

Using the http protocol actually works. It doesn't do any handshake: it 
will just fetch objects from the other end as it needs them. The downside, 
of course, is that it also doesn't understand packs, so if the source is 
packed (and it pretty much _will_ be, for any big source), you're going to 
end up getting it all _anyway_.

		Linus


* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-10  3:01 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Martin Langhoff, git
In-Reply-To: <9e4733910606091921o1d07826w8292dc22b1872345@mail.gmail.com>



On Fri, 9 Jun 2006, Jon Smirl wrote:
>
> No more cvs diff taking four minutes to finish. I have to do that
> every time I want to generate a 10 line patch. Diffs can run locally.
> No more cvs update to replace files I deleted because I messed up
> edits in them. And I can have local branches, yeah!

More importantly, when the CVS server is down (can you say 
"sourceforge"?), who cares?

> What are we going to do about the BeOS developers on Mozilla? There
> are a couple more obscure OSes.

Well, the git cvsserver exporter apparently works well enough...

			Linus


* Re: Figured out how to get Mozilla into git
From: Carl Worth @ 2006-06-10  2:34 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Linus Torvalds, Martin Langhoff, git
In-Reply-To: <9e4733910606091921o1d07826w8292dc22b1872345@mail.gmail.com>


On Fri, 9 Jun 2006 22:21:17 -0400, "Jon Smirl" wrote:
> 
> Could you clone the repo and delete changesets earlier than 2004? Then
> I would clone the small repo and work with it. Later I decide I want
> full history, can I pull from a full repository at that point and get
> updated? That would need a flag to trigger it since I don't want full
> history to come over if I am just getting updates from someone else's
> tree that has a full history.

This is clearly a desirable feature, and has been requested by several
people (including myself) looking to switch some large-ish histories
from an existing system to git.

If you'd like to look through git archives for some discussion of the
issues that would be involved here, look for "shallow clone".

There's a related proposal termed "lazy clone" for one that would pull
down missing objects as needed over the network.

My impression is that both things will eventually be implemented.
There's certainly nothing fundamental in git that will prevent them,
(though there will be some interesting things to resolve as a real
patch for this stuff is explored).

-Carl



* Re: Figured out how to get Mozilla into git
From: Jon Smirl @ 2006-06-10  2:30 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Langhoff, git
In-Reply-To: <Pine.LNX.4.64.0606091853180.5498@g5.osdl.org>

On 6/9/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Fri, 9 Jun 2006, Jon Smirl wrote:
> > >
> > > Btw, does anybody know roughly how much data an initial "cvs co" takes on
> > > the mozilla repo? Git will obviously get the whole history, and that will
> > > inevitably be bigger than getting a single check-out, but it's not
> > > necessarily orders of magnitude bigger.
> >
> > 339MB for initial checkout

I ran the checkout through bzip and it is 36.4MB, 46.4MB with zip.
So the ratio may be 15 to 1 for the cvs co vs git
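
A quick way to sanity-check the ratio. Note the assumption: the message
above does not say which git size was used, so the figures in the usage
note pair the compressed checkout sizes with the 617MB repacked pack
reported elsewhere in this thread. The function name is hypothetical.

```c
/*
 * How many times larger `a` is than `b`, both in the same unit.
 * Returns 0 when `b` is not positive so the caller never divides
 * by zero.
 */
double size_ratio(double a, double b)
{
	return b > 0 ? a / b : 0.0;
}
```

size_ratio(617, 36.4) comes out near 17 and size_ratio(617, 46.4) near
13.3, so "15 to 1" sits between the bzip and zip cases. Against the
uncompressed 339MB checkout the same pack is less than twice the size.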

> And I think people run :pserver: with compression by default, so we're
> likely talking about half that in actual download overhead, no?
>
> So a git clone would be about (wild handwaving, don't look at all the
> assumptions) four times as expensive - assuming we only look at a poor DSL
> line as the expense - as an initial CVS co, but you'd get the _whole_
> history. Which may or may not make up for it. For some people it will, for
> others it won't.
>
> Of course, to make up for some of the initial costs, I suspect that some
> people who are used to "cvs update" taking 15 minutes to update two files,
> it would be a serious relief to see the git kind of "300 objects in five
> seconds" kinds of pulls.
>
> Although I guess that's one of the CVS things that SVN improved on. At
> least I'd hope so ;/
>
>                         Linus
>


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: Figured out how to get Mozilla into git
From: Jon Smirl @ 2006-06-10  2:21 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Langhoff, git
In-Reply-To: <Pine.LNX.4.64.0606091853180.5498@g5.osdl.org>

On 6/9/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Fri, 9 Jun 2006, Jon Smirl wrote:
> > >
> > > Btw, does anybody know roughly how much data an initial "cvs co" takes on
> > > the mozilla repo? Git will obviously get the whole history, and that will
> > > inevitably be bigger than getting a single check-out, but it's not
> > > necessarily orders of magnitude bigger.
> >
> > 339MB for initial checkout
>
> And I think people run :pserver: with compression by default, so we're
> likely talking about half that in actual download overhead, no?
>
> So a git clone would be about (wild handwaving, don't look at all the
> assumptions) four times as expensive - assuming we only look at a poor DSL
> line as the expense - as an initial CVS co, but you'd get the _whole_
> history. Which may or may not make up for it. For some people it will, for
> others it won't.

Could you clone the repo and delete changesets earlier than 2004? Then
I would clone the small repo and work with it. Later I decide I want
full history, can I pull from a full repository at that point and get
updated? That would need a flag to trigger it since I don't want full
history to come over if I am just getting updates from someone else's
tree that has a full history.

>
> Of course, to make up for some of the initial costs, I suspect that some
> people who are used to "cvs update" taking 15 minutes to update two files,
> it would be a serious relief to see the git kind of "300 objects in five
> seconds" kinds of pulls.

No more cvs diff taking four minutes to finish. I have to do that
every time I want to generate a 10 line patch. Diffs can run locally.
No more cvs update to replace files I deleted because I messed up
edits in them. And I can have local branches, yeah!

What are we going to do about the BeOS developers on Mozilla? There
are a couple more obscure OSes.

> Although I guess that's one of the CVS things that SVN improved on. At
> least I'd hope so ;/
>
>                         Linus
>


-- 
Jon Smirl
jonsmirl@gmail.com


* [PATCH] Fix formatting of Documentation/git-clone.txt
From: Horst H. von Brand @ 2006-06-10  2:15 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Signed-off-by: Horst H. von Brand <vonbrand@inf.utfsm.cl>
---
 Documentation/git-clone.txt |    4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/git-clone.txt b/Documentation/git-clone.txt
index 7572e4b..a90521e 100644
--- a/Documentation/git-clone.txt
+++ b/Documentation/git-clone.txt
@@ -95,8 +95,8 @@ OPTIONS
 	defined default, typically `/usr/share/git-core/templates`.
 
 --use-separate-remote::
-	Save remotes heads under `$GIT_DIR/remotes/origin/' instead
-	of `$GIT_DIR/refs/heads/'.  Only the master branch is saved
+	Save remotes heads under `$GIT_DIR/remotes/origin/` instead
+	of `$GIT_DIR/refs/heads/`.  Only the master branch is saved
 	in the latter.
 
 <repository>::
-- 
1.4.0.rc2.g7612


* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-10  1:59 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Martin Langhoff, git
In-Reply-To: <9e4733910606091848r5fb4d565taabfc5198140daf2@mail.gmail.com>



On Fri, 9 Jun 2006, Jon Smirl wrote:
> > 
> > Btw, does anybody know roughly how much data an initial "cvs co" takes on
> > the mozilla repo? Git will obviously get the whole history, and that will
> > inevitably be bigger than getting a single check-out, but it's not
> > necessarily orders of magnitude bigger.
> 
> 339MB for initial checkout

And I think people run :pserver: with compression by default, so we're 
likely talking about half that in actual download overhead, no?

So a git clone would be about (wild handwaving, don't look at all the 
assumptions) four times as expensive - assuming we only look at a poor DSL 
line as the expense - as an initial CVS co, but you'd get the _whole_ 
history. Which may or may not make up for it. For some people it will, for 
others it won't.

Of course, to make up for some of the initial costs, I suspect that for 
people who are used to "cvs update" taking 15 minutes to update two files, 
it would be a serious relief to see the git kind of "300 objects in five 
seconds" pulls.

Although I guess that's one of the CVS things that SVN improved on. At 
least I'd hope so ;/

			Linus


* Re: Figured out how to get Mozilla into git
From: Jon Smirl @ 2006-06-10  1:48 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Langhoff, git
In-Reply-To: <Pine.LNX.4.64.0606091837040.5498@g5.osdl.org>

On 6/9/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Fri, 9 Jun 2006, Linus Torvalds wrote:
> >
> > That's like 20% of the original, with all the obvious distribution
> > advantages.
>
> Btw, does anybody know roughly how much data an initial "cvs co" takes on
> the mozilla repo? Git will obviously get the whole history, and that will
> inevitably be bigger than getting a single check-out, but it's not
> necessarily orders of magnitude bigger.

339MB for initial checkout

> It could be that getting a whole git archive is not _that_ much more
> expensive than getting a single version, considering how well history
> compresses (eg the kernel git archive isn't orders of magnitude bigger than
> a single compressed tar-ball of the sources).
>
> At that point, it's probably a pretty usable alternative.
>
> (Although, to be fair, we almost certainly have to improve "git-rev-list
> --objects --all" performance on that thing, since that's going to
> otherwise make it totally impossible to do initial clones using the native
> git protocol, and make git look bad).
>
>                         Linus
>


-- 
Jon Smirl
jonsmirl@gmail.com


* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-10  1:43 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Jon Smirl, git
In-Reply-To: <Pine.LNX.4.64.0606091825080.5498@g5.osdl.org>



On Fri, 9 Jun 2006, Linus Torvalds wrote:
> 
> That's like 20% of the original, with all the obvious distribution 
> advantages.

Btw, does anybody know roughly how much data an initial "cvs co" takes on 
the mozilla repo? Git will obviously get the whole history, and that will 
inevitably be bigger than getting a single check-out, but it's not 
necessarily orders of magnitude bigger.

It could be that getting a whole git archive is not _that_ much more 
expensive than getting a single version, considering how well history 
compresses (eg the kernel git archive isn't orders of magnitude bigger than 
a single compressed tar-ball of the sources).

At that point, it's probably a pretty usable alternative.

(Although, to be fair, we almost certainly have to improve "git-rev-list 
--objects --all" performance on that thing, since that's going to 
otherwise make it totally impossible to do initial clones using the native 
git protocol, and make git look bad).

			Linus


* Re: [PATCH] shared repository settings enhancement.
From: Jakub Narebski @ 2006-06-10  1:38 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0606091743410.5498@g5.osdl.org>

Linus Torvalds wrote:

> 
> 
> On Fri, 9 Jun 2006, Junio C Hamano wrote:
>>
>> This lets you say:
>> 
>>      [core]
>>              sharedrepository = 075
> 
> I really think it's better to express this as some more traditional 
> number.
> 
> I had to think about what 075 meant, while saying
> 
>       [core]
>               sharedrepository = 0644

Yet another solution would be to actually set umask.
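
A rough sketch of what "actually set umask" would mean for a newly
created file. The helper name is made up; it relies only on the usual
behaviour that the umask clears bits from the mode passed to open().

```c
#include <fcntl.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Create `path` asking for mode 0666 while `mask` is the process
 * umask, then put the previous umask back.  Returns the permission
 * bits the new file ended up with, or -1 on error (including the
 * case where `path` already exists).
 */
int create_under_umask(const char *path, mode_t mask)
{
	struct stat st;
	mode_t old = umask(mask);
	int fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0666);
	int ret = -1;

	umask(old);
	if (fd < 0)
		return -1;
	if (!fstat(fd, &st))
		ret = (int)(st.st_mode & 0777);
	close(fd);
	return ret;
}
```

Under umask 007 the file comes out as 0660, and under 002 as 0664.
Unlike fixing modes up after the fact, this affects everything the
process creates while the umask is in force.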

-- 
Jakub Narebski
Warsaw, Poland


* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-10  1:33 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Jon Smirl, git
In-Reply-To: <46a038f90606091814n1922bf25l94d913238b260296@mail.gmail.com>



On Sat, 10 Jun 2006, Martin Langhoff wrote:
> 
> Now I don't know how much memory or time this took, but it clearly
> completed ok. And, it's now a single pack, weighing a grand total of
> 617MB

Ok, that's more than reasonable. That should be fairly easily mapped on a 
32-bit architecture without any huge problems, even with some VM 
fragmentation going on. It might be borderline (and you definitely want a 
3:1 VM user:kernel split), but considering that the original CVS archive 
was apparently 3GB, having a single 617M pack-file is still pretty damn 
good.  That's like 20% of the original, with all the obvious distribution 
advantages.

Clearly this whole thing _does_ show that we could improve the process of 
importing things from CVS a whole lot, and I assume your 617MB pack 
doesn't have the nice name/email translations so it needs to be fixed up, 
but it sounds like on the whole the core git design came through with 
shining colors, even if we may want to polish things up a bit ;)

I'm downloading the thing right now.

			Linus


* Re: Figured out how to get Mozilla into git
From: Martin Langhoff @ 2006-06-10  1:23 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Linus Torvalds, git
In-Reply-To: <9e4733910606091317p26d66579mdf93db293f93fb50@mail.gmail.com>

On 6/10/06, Jon Smirl <jonsmirl@gmail.com> wrote:
> The git tree that Martin got from cvsps is much smaller than the git
> tree I got from going to svn then to git.  I don't know why the trees are
> 700KB different, it may be different amounts of packing, or one of the
> conversion tools is losing something.

Don't read too much into that. Packing/repacking points make a _huge_
difference, and even if one of our trees is a bit corrupt, the
packsizes should be about the same.

(With the patches I sent you we _are_ choosing to ignore a few
branches that don't seem to make sense in cvsps output. These will
show up in the error output -- what I saw were very old, possibly
corrupt branches there, stuff I wouldn't shed a tear over, but it is
worth reviewing).

> I haven't come up with anything that is likely to result in Mozilla
> switching over to git. Right now it takes three days to convert the
> tree. The tree will have to be run in parallel for a while to convince
> everyone to switch. I don't have a solution to keeping it in sync in
> near real time (commits would still go to CVS). Most Mozilla
> developers are interested but the infrastructure needs some help.

Don't worry about the initial import time. Once you've done it, you
can run the incremental import (which will take a few minutes) even
hourly to keep 'in sync'.

> Martin has also brought up the problem with needing a partial clone so
> that everyone doesn't have to bring down the entire repository. A
> trunk checkout is 340MB and Martin's git tree is 2GB (mine 2.7GB).  A
> kernel tree is only 680M.

Now that I have managed to repack the repo, it is indeed back in the
600M range. Actually, I just re-repacked, it took under a minute, and
it shrank down to 607MB.

Yay.

I'm sure that if you git-repack -a -d on a machine with plenty of
memory once or twice, we'll have matching packs.

cheers,



martin


* Re: Figured out how to get Mozilla into git
From: Martin Langhoff @ 2006-06-10  1:14 UTC (permalink / raw)
  To: Jon Smirl; +Cc: git
In-Reply-To: <46a038f90606082006t5c6a5623q4b9cf7b036dad1e5@mail.gmail.com>

On 6/9/06, Martin Langhoff <martin.langhoff@gmail.com> wrote:
> mozilla.git$ du -sh .git/
> 2.0G    .git/

Ok -- pushed the repository out to our mirror box. Try:

   git-clone http://mirrors.catalyst.net.nz/pub/mozilla.git/

Now, good news. No, _very_ good news. As I was rsync'ing this out and
looking at the repo, suddenly something looked odd. Apparently, after
git-repack -a -d OOM'd on me and I had posted this message, I re-ran
it.

[As it happens I have been running several imports of gentoo and moz
lately on the box. It is entirely possible that cvsps or a stray
git-cvsimport was sitting on a whole lot of RAM at the time.]

Now I don't know how much memory or time this took, but it clearly
completed ok. And, it's now a single pack, weighing a grand total of
617MB

So my comments about OOM'ing were wrong apparently. Hey, if the whole
history is actually only 617MB, then initial checkouts are back to
something reasonable, I'd say.

cheers,



martin


* Re: [PATCH] shared repository settings enhancement.
From: Linus Torvalds @ 2006-06-10  0:46 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git, Post, Mark K
In-Reply-To: <7vver9lu8g.fsf_-_@assigned-by-dhcp.cox.net>



On Fri, 9 Jun 2006, Junio C Hamano wrote:
>
> This lets you say:
> 
> 	[core]
> 		sharedrepository = 075

I really think it's better to express this as some more traditional 
number.

I had to think about what 075 meant, while saying

	[core]
		sharedrepository = 0644

just makes sense more or less automatically (and yes, for directories, the 
read bit should obviously be expanded as an execute bit).

The difference is just that the latter is how you _usually_ express 
permissions, so people are used to it. 

And "being used to it" is what "ease of use" really all boils down to.

		Linus


* Re: Figured out how to get Mozilla into git
From: Jon Smirl @ 2006-06-10  0:45 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Langhoff, git
In-Reply-To: <9e4733910606091716q67d4c5f9ra807b712d871e562@mail.gmail.com>

They must be running some kind of process accounting at my host. As
soon as I hit 500MB RAM I get killed immediately. It is not from a
signal; I'm catching all of those.

I get this on the console:
[1]+  Killed    CVSROOT=~/jonsmirl.dreamhosters.com/mozilla/ cvsps -x --norc -A mozilla >mozilla.cvsps 2>mozilla.cvspserr

and nothing on stdout or stderr.

kernel string:
 2.4.29-grsec+w+fhs6b+gr0501+nfs+a32+++p4+sata+c4+gr2b-v6.189

-- 
Jon Smirl
jonsmirl@gmail.com


* [PATCH] shared repository settings enhancement.
From: Junio C Hamano @ 2006-06-10  0:39 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Post, Mark K
In-Reply-To: <7virnam435.fsf@assigned-by-dhcp.cox.net>

This lets you say:

	[core]
		sharedrepository = 075

to allow permission bits on files under $GIT_DIR for OTHER users
(not just GROUP users) to be copied from the permission bits of
the owner of the file.  This is useful for publishing a shared
repository over gitweb, when gitweb does not run as a member of
the project group and some members have umask too strict for
others to read what is created by default.  The historical
boolean sharedrepository maps to 070 (i.e. if the owner can read
or write or execute it, group members can, too).

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 * This patch is meant for discussion, not application, as you
   can see there is one big "NEEDSWORK" in builtin-init-db.c.

   Regardless of this enhancement to deal with S_I[RWX]OTH, I
   spotted a couple of places that lack permission adjustment in
   the existing code, which might be worth fixing first.

 builtin-init-db.c |   21 ++++++++++++++++++---
 config.c          |    2 +-
 lockfile.c        |   15 ++++++++++-----
 refs.c            |    5 +++++
 setup.c           |   25 +++++++++++++++++++++++--
 sha1_file.c       |    9 ++++++---
 6 files changed, 63 insertions(+), 14 deletions(-)

diff --git a/builtin-init-db.c b/builtin-init-db.c
index 88b39a4..4826c08 100644
--- a/builtin-init-db.c
+++ b/builtin-init-db.c
@@ -198,6 +198,11 @@ static void create_default_files(const c
 
 	git_config(git_default_config);
 
+	/* NEEDSWORK: we would have created the above under user's
+	 * umask -- under shared-repository settings, we would need
+	 * to fix them up by scanning under $GIT_DIR here.
+	 */
+
 	/*
 	 * Create the default symlink from ".git/HEAD" to the "master"
 	 * branch, if it does not exist yet.
@@ -248,7 +253,14 @@ int cmd_init_db(int argc, const char **a
 		if (!strncmp(arg, "--template=", 11))
 			template_dir = arg+11;
 		else if (!strcmp(arg, "--shared"))
-			shared_repository = 1;
+			shared_repository = 070; /* might want 075 */
+		else if (!strncmp(arg, "--shared=", 9)) {
+			char *end;
+			long val = strtol(arg+9, &end, 8);
+			if (*end || 077 < val)
+				die("bad option for --shared=");
+			shared_repository = val;
+		}
 		else
 			die(init_db_usage);
 	}
@@ -286,8 +298,11 @@ int cmd_init_db(int argc, const char **a
 	strcpy(path+len, "/info");
 	safe_create_dir(path, 1);
 
-	if (shared_repository)
-		git_config_set("core.sharedrepository", "true");
+	if (shared_repository) {
+		char buf[6];
+		sprintf(buf, "%o", shared_repository);
+		git_config_set("core.sharedrepository", buf);
+	}
 
 	return 0;
 }
diff --git a/config.c b/config.c
index 2ae6153..c474970 100644
--- a/config.c
+++ b/config.c
@@ -536,7 +536,7 @@ int git_config_set_multivar(const char* 
 	 * contents of .git/config will be written into it.
 	 */
 	fd = open(lock_file, O_WRONLY | O_CREAT | O_EXCL, 0666);
-	if (fd < 0) {
+	if (fd < 0 || adjust_shared_perm(lock_file)) {
 		fprintf(stderr, "could not lock config file\n");
 		free(store.key);
 		ret = -1;
diff --git a/lockfile.c b/lockfile.c
index 9bc6083..2346e0e 100644
--- a/lockfile.c
+++ b/lockfile.c
@@ -27,11 +27,16 @@ int hold_lock_file_for_update(struct loc
 	int fd;
 	sprintf(lk->filename, "%s.lock", path);
 	fd = open(lk->filename, O_RDWR | O_CREAT | O_EXCL, 0666);
-	if (fd >=0 && !lk->next) {
-		lk->next = lock_file_list;
-		lock_file_list = lk;
-		signal(SIGINT, remove_lock_file_on_signal);
-		atexit(remove_lock_file);
+	if (0 <= fd) {
+		if (!lk->next) {
+			lk->next = lock_file_list;
+			lock_file_list = lk;
+			signal(SIGINT, remove_lock_file_on_signal);
+			atexit(remove_lock_file);
+		}
+		if (adjust_shared_perm(lk->filename))
+			return error("cannot fix permission bits on %s",
+				     lk->filename);
 	}
 	return fd;
 }
diff --git a/refs.c b/refs.c
index f91b771..713ca46 100644
--- a/refs.c
+++ b/refs.c
@@ -104,6 +104,11 @@ #endif
 		error("Unable to create %s", git_HEAD);
 		return -3;
 	}
+	if (adjust_shared_perm(git_HEAD)) {
+		unlink(lockpath);
+		error("Unable to fix permissions on %s", lockpath);
+		return -4;
+	}
 	return 0;
 }
 
diff --git a/setup.c b/setup.c
index fe7f884..213c596 100644
--- a/setup.c
+++ b/setup.c
@@ -223,8 +223,29 @@ int check_repository_format_version(cons
 {
        if (strcmp(var, "core.repositoryformatversion") == 0)
                repository_format_version = git_config_int(var, value);
-	else if (strcmp(var, "core.sharedrepository") == 0)
-		shared_repository = git_config_bool(var, value);
+       else if (strcmp(var, "core.sharedrepository") == 0) {
+	       /* This is unfortunate, but historically this
+		* variable was bool, and it now takes the umask
+		* to say if we want to keep the same access bits for
+		* the user to group members and others.
+		*/
+	       if (!value)
+		       shared_repository = 070; /* "true" - perhaps 075? */
+	       else if (!*value)
+		       ; /* bool "false" */
+	       else if (!strcasecmp(value, "true"))
+		       shared_repository = 070; /* "true" - perhaps 075? */
+	       else if (!strcasecmp(value, "false"))
+		       ; /* bool "false" */
+	       else if (strchr("01234567", *value)) {
+		       char *end;
+		       long val = strtol(value, &end, 8);
+		       if (*end || 077 < val)
+			       die("bad config value '%s' for '%s'",
+				   value, var);
+		       shared_repository = val;
+	       }
+       }
        return 0;
 }
 
diff --git a/sha1_file.c b/sha1_file.c
index aea0f40..fa19835 100644
--- a/sha1_file.c
+++ b/sha1_file.c
@@ -61,11 +61,14 @@ int adjust_shared_perm(const char *path)
 		return -1;
 	mode = st.st_mode;
 	if (mode & S_IRUSR)
-		mode |= S_IRGRP;
+		mode |= ( ((shared_repository & 040) ? S_IRGRP : 0) |
+			  ((shared_repository & 004) ? S_IROTH : 0) );
 	if (mode & S_IWUSR)
-		mode |= S_IWGRP;
+		mode |= ( ((shared_repository & 020) ? S_IWGRP : 0) |
+			  ((shared_repository & 002) ? S_IWOTH : 0) );
 	if (mode & S_IXUSR)
-		mode |= S_IXGRP;
+		mode |= ( ((shared_repository & 010) ? S_IXGRP : 0) |
+			  ((shared_repository & 001) ? S_IXOTH : 0) );
 	if (S_ISDIR(mode))
 		mode |= S_ISGID;
 	if (chmod(path, mode) < 0)
-- 
1.4.0.rc2.g55be
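
The octal parsing used for --shared= and core.sharedrepository in the
patch above can be tried in isolation with something like the sketch
below. The function name is hypothetical, and unlike the patch it also
rejects an empty string and negative values.

```c
#include <stdlib.h>

/*
 * Parse `arg` as an octal group/other mask in the range 0..077.
 * Stores the value in `*mask` and returns 0 on success; returns -1
 * for empty input, trailing garbage, or an out-of-range value.
 */
int parse_shared_mask(const char *arg, int *mask)
{
	char *end;
	long val;

	if (!*arg)
		return -1;	/* stricter than the patch */
	val = strtol(arg, &end, 8);
	if (*end || val < 0 || 077 < val)
		return -1;
	*mask = (int)val;
	return 0;
}
```

Note that with the 077 upper bound a value written in the 0644 style
is rejected, so switching notations would mean changing the range check
as well.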


* [PATCH/RFC] Retire SIMPLE_*** stuff.
From: Junio C Hamano @ 2006-06-10  0:35 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git
In-Reply-To: <7virnam435.fsf@assigned-by-dhcp.cox.net>

It used to be a good idea to keep simple stuff without linking with git
library, but more and more programs are using the config mechanism, and
yet more things are becoming internal.

Signed-off-by: Junio C Hamano <junkio@cox.net>
---

 Makefile |   24 +++++-------------------
 1 files changed, 5 insertions(+), 19 deletions(-)

diff --git a/Makefile b/Makefile
index 5226fa1..d65fc71 100644
--- a/Makefile
+++ b/Makefile
@@ -142,12 +142,7 @@ SCRIPTS = $(patsubst %.sh,%,$(SCRIPT_SH)
 	  $(patsubst %.py,%,$(SCRIPT_PYTHON)) \
 	  git-cherry-pick git-status
 
-# The ones that do not have to link with lcrypto, lz nor xdiff.
-SIMPLE_PROGRAMS = \
-	git-get-tar-commit-id$X git-mailsplit$X \
-	git-stripspace$X git-daemon$X
-
-# ... and all the rest that could be moved out of bindir to gitexecdir
+# Programs could be moved out of bindir to gitexecdir
 PROGRAMS = \
 	git-checkout-index$X git-clone-pack$X \
 	git-convert-objects$X git-fetch-pack$X git-fsck-objects$X \
@@ -162,7 +157,9 @@ PROGRAMS = \
 	git-upload-pack$X git-verify-pack$X git-write-tree$X \
 	git-update-ref$X git-symbolic-ref$X \
 	git-name-rev$X git-pack-redundant$X git-repo-config$X git-var$X \
-	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X
+	git-describe$X git-merge-tree$X git-blame$X git-imap-send$X \
+	git-get-tar-commit-id$X git-mailsplit$X \
+	git-stripspace$X git-daemon$X
 
 BUILT_INS = git-log$X git-whatchanged$X git-show$X \
 	git-count-objects$X git-diff$X git-push$X \
@@ -175,7 +172,7 @@ BUILT_INS = git-log$X git-whatchanged$X 
 	git-diff-index$X git-diff-stages$X git-diff-tree$X git-cat-file$X
 
 # what 'all' will build and 'install' will install, in gitexecdir
-ALL_PROGRAMS = $(PROGRAMS) $(SIMPLE_PROGRAMS) $(SCRIPTS)
+ALL_PROGRAMS = $(PROGRAMS) $(SCRIPTS)
 
 # Backward compatibility -- to be removed after 1.0
 PROGRAMS += git-ssh-pull$X git-ssh-push$X
@@ -386,11 +383,9 @@ else
 endif
 ifdef NEEDS_SOCKET
 	LIBS += -lsocket
-	SIMPLE_LIB += -lsocket
 endif
 ifdef NEEDS_NSL
 	LIBS += -lnsl
-	SIMPLE_LIB += -lnsl
 endif
 ifdef NO_D_TYPE_IN_DIRENT
 	ALL_CFLAGS += -DNO_D_TYPE_IN_DIRENT
@@ -428,6 +428,8 @@ endif
 
 ifdef NO_ICONV
 	ALL_CFLAGS += -DNO_ICONV
+else
+	LIBS += $(LIB_4_ICONV)
 endif
 
 ifdef PPC_SHA1

@@ -559,15 +554,6 @@ endif
 git-%$X: %.o $(GITLIBS)
 	$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) $(LIBS)
 
-$(SIMPLE_PROGRAMS) : $(LIB_FILE)
-$(SIMPLE_PROGRAMS) : git-%$X : %.o
-	$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
-		$(LIB_FILE) $(SIMPLE_LIB)
-
-git-mailinfo$X: mailinfo.o $(LIB_FILE)
-	$(CC) $(ALL_CFLAGS) -o $@ $(ALL_LDFLAGS) $(filter %.o,$^) \
-		$(LIB_FILE) $(SIMPLE_LIB) $(LIB_4_ICONV)
-
 git-local-fetch$X: fetch.o
 git-ssh-fetch$X: rsh.o fetch.o
 git-ssh-upload$X: rsh.o
-- 
1.4.0.rc2.g55be

^ permalink raw reply related

* Re: Figured out how to get Mozilla into git
From: Jon Smirl @ 2006-06-10  0:16 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Langhoff, git
In-Reply-To: <Pine.LNX.4.64.0606091710560.5498@g5.osdl.org>

I'll apply and give it a test.

Most of the warnings look like this:

WARNING: Invalid PatchSet 151492, Tag JSS_4_0_RTM:
    security/coreconf/HP-UX.mk:1.8=after,
security/jss/org/mozilla/jss/crypto/KeyPairAlgorithm.java:1.5=before.
Treated as 'before'
WARNING: Invalid PatchSet 151492, Tag JSS_4_0_RTM:
    security/coreconf/HP-UX.mk:1.8=after,
security/jss/org/mozilla/jss/crypto/KeyPairGenerator.java:1.5=before.
Treated as 'before'
WARNING: Invalid PatchSet 151492, Tag JSS_4_0_RTM:
    security/coreconf/HP-UX.mk:1.8=after,
security/jss/org/mozilla/jss/crypto/KeyPairGeneratorSpi.java:1.3=before.
Treated as 'before'
WARNING: Invalid PatchSet 151492, Tag JSS_4_0_RTM:
    security/coreconf/HP-UX.mk:1.8=after,
security/jss/org/mozilla/jss/crypto/KeyWrapAlgorithm.java:1.8=before.
Treated as 'before'
WARNING: Invalid PatchSet 151492, Tag JSS_4_0_RTM:
    security/coreconf/HP-UX.mk:1.8=after,
security/jss/org/mozilla/jss/crypto/KeyWrapper.java:1.8=before.
Treated as 'before'
WARNING: Invalid PatchSet 151492, Tag JSS_4_0_RTM:
    security/coreconf/HP-UX.mk:1.8=after,
security/jss/org/mozilla/jss/crypto/Makefile:1.2=before. Treated as
'before'
WARNING: Invalid PatchSet 151492, Tag JSS_4_0_RTM:
    security/coreconf/HP-UX.mk:1.8=after,
security/jss/org/mozilla/jss/crypto/NoSuchItemOnTokenException.java:1.3=before.
Treated as 'before'



-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-10  0:11 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Martin Langhoff, git
In-Reply-To: <9e4733910606091700s49018cd5p3b66f8ef51b22d2e@mail.gmail.com>



On Fri, 9 Jun 2006, Jon Smirl wrote:
> 
> Are we sure cvsps is ok? It is generating 500MB of warnings when I run it.

Do they go away with these patches?

		Linus
---
commit 3d1ebcef6b4f9f6c9064efd64da4dd30d93c3c96
Author: Linus Torvalds <torvalds@g5.osdl.org>
Date:   Wed Mar 22 17:20:20 2006 -0800

    Fix branch ancestor calculation
    
    Not having any ancestor at all means that any valid ancestor (even of
    "depth 0") is fine.
    
    Signed-off-by: Linus Torvalds <torvalds@osdl.org>

diff --git a/cvsps.c b/cvsps.c
index c22147e..2695a0f 100644
--- a/cvsps.c
+++ b/cvsps.c
@@ -2599,7 +2599,7 @@ static void determine_branch_ancestor(Pa
 	 * note: rev is the pre-commit revision, not the post-commit
 	 */
 	if (!head_ps->ancestor_branch)
-	    d1 = 0;
+	    d1 = -1;
 	else if (strcmp(ps->branch, rev->branch) == 0)
 	    continue;
 	else if (strcmp(head_ps->ancestor_branch, "HEAD") == 0)

commit 82fcf7e31bbeae3b01a8656549e9b8fd89d598eb
Author: Linus Torvalds <torvalds@g5.osdl.org>
Date:   Wed Mar 22 11:23:37 2006 -0800

    Improve handling of file collisions in the same patchset
    
    Take the file revision into account.

diff --git a/cvsps.c b/cvsps.c
index 1e64e3c..c22147e 100644
--- a/cvsps.c
+++ b/cvsps.c
@@ -2384,8 +2384,31 @@ void patch_set_add_member(PatchSet * ps,
     for (next = ps->members.next; next != &ps->members; next = next->next) 
     {
 	PatchSetMember * m = list_entry(next, PatchSetMember, link);
-	if (m->file == psm->file && ps->collision_link.next == NULL) 
-		list_add(&ps->collision_link, &collisions);
+	if (m->file == psm->file) {
+		int order = compare_rev_strings(psm->post_rev->rev, m->post_rev->rev);
+
+		/*
+		 * Same revision too? Add it to the collision list
+		 * if it isn't already.
+		 */
+		if (!order) {
+			if (ps->collision_link.next == NULL)
+				list_add(&ps->collision_link, &collisions);
+			return;
+		}
+
+		/*
+		 * If this is an older revision than the one we already have
+		 * in this patchset, just ignore it
+		 */
+		if (order < 0)
+			return;
+
+		/*
+		 * This is a newer one, remove the old one
+		 */
+		list_del(&m->link);
+	}
     }
 
     psm->ps = ps;
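
The collision handling above depends on ordering two CVS revision strings
for the same file. As a rough sketch of what such an ordering involves
(this is NOT the compare_rev_strings() from cvsps, whose implementation is
not shown here), a component-wise numeric comparison could look like the
following. The name rev_cmp_sketch() is hypothetical, and treating a
shorter revision as older than a longer one with the same prefix is an
assumption.

```c
#include <stdlib.h>

/*
 * Illustrative only: compare dotted numeric revisions such as "1.8"
 * and "1.10" one component at a time.  Returns <0, 0 or >0, in the
 * style of strcmp().
 */
static int rev_cmp_sketch(const char *a, const char *b)
{
	while (*a || *b) {
		char *end_a, *end_b;
		long na = strtol(a, &end_a, 10);
		long nb = strtol(b, &end_b, 10);

		if (na != nb)
			return na < nb ? -1 : 1;
		if (end_a == a && end_b == b)
			return 0;	/* no digits left on either side */
		/* step over the separating dot, if there is one */
		a = (*end_a == '.') ? end_a + 1 : end_a;
		b = (*end_b == '.') ? end_b + 1 : end_b;
	}
	return 0;
}
```

The point of comparing numerically is that "1.10" sorts after "1.8",
which a plain strcmp() would get wrong.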

commit 534120d9a47062eecd7b53fd7ac0b70d97feb4fd
Author: Linus Torvalds <torvalds@g5.osdl.org>
Date:   Wed Mar 22 11:20:59 2006 -0800

    Increase log-length limit to 64kB
    
    Yeah, it should be dynamic. I'm lazy.

diff --git a/cvsps_types.h b/cvsps_types.h
index b41e2a9..dba145d 100644
--- a/cvsps_types.h
+++ b/cvsps_types.h
@@ -8,7 +8,7 @@ #define CVSPS_TYPES_H
 
 #include <time.h>
 
-#define LOG_STR_MAX 32768
+#define LOG_STR_MAX 65536
 #define AUTH_STR_MAX 64
 #define REV_STR_MAX 64
 #define MIN(a, b) ((a) < (b) ? (a) : (b))
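
The commit message above notes that the log limit should really be
dynamic. A minimal sketch of what that might look like, assuming a
growable buffer in place of a fixed LOG_STR_MAX array; struct log_buf and
log_buf_append() are hypothetical names and do not exist in cvsps:

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical growable log buffer; zero-initialise before first use. */
struct log_buf {
	char *data;	/* NUL-terminated text, or NULL when empty */
	size_t len;	/* bytes used, not counting the NUL */
	size_t cap;	/* bytes allocated */
};

/* Append s, doubling the allocation as needed.  Returns 0 or -1. */
static int log_buf_append(struct log_buf *lb, const char *s)
{
	size_t add = strlen(s);

	if (lb->len + add + 1 > lb->cap) {
		size_t new_cap = lb->cap ? lb->cap : 64;
		char *p;

		while (new_cap < lb->len + add + 1)
			new_cap *= 2;
		p = realloc(lb->data, new_cap);
		if (!p)
			return -1;	/* old buffer is left intact */
		lb->data = p;
		lb->cap = new_cap;
	}
	memcpy(lb->data + lb->len, s, add + 1);	/* copies the NUL too */
	lb->len += add;
	return 0;
}
```

Doubling keeps the number of reallocations logarithmic in the final log
length, so very long log messages do not need a compile-time limit.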

^ permalink raw reply related

* Re: Figured out how to get Mozilla into git
From: Jon Smirl @ 2006-06-10  0:00 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Langhoff, git
In-Reply-To: <Pine.LNX.4.64.0606091640200.5498@g5.osdl.org>

On 6/9/06, Linus Torvalds <torvalds@osdl.org> wrote:
> On Sat, 10 Jun 2006, Martin Langhoff wrote:
> > In any case, and for the record, my cvsps is 2.1 pristine. It handles
> > the mozilla repo alright, as long as I give it a lot of RAM. I _think_
> > it slurped 3GB with the mozilla cvs.
>
> Oh, wow. Every single repo I've seen ends up having tons of complaints
> from pristine cvsps, but maybe that's because I only end up looking at the
> ones with problems ;)

Are we sure cvsps is ok? It is generating 500MB of warnings when I run it.

I have cvsps running at dreamhost currently. I had to modify cvs,
cvsps, git, etc. to not respond to signals to keep them from killing
everything.
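
As a rough illustration of that kind of change (the mail does not say
which signals were involved or how the programs were modified, so SIGHUP
and SIGPIPE here are assumptions, and ignore_session_signals() is a
hypothetical name), a long-running tool could ignore them at startup:

```c
#include <signal.h>

/*
 * Hypothetical startup hook: ignore signals that a hosting environment
 * might deliver when the controlling session goes away.
 * Returns 0 on success, -1 on failure.
 */
static int ignore_session_signals(void)
{
	if (signal(SIGHUP, SIG_IGN) == SIG_ERR)
		return -1;
	if (signal(SIGPIPE, SIG_IGN) == SIG_ERR)
		return -1;
	return 0;
}
```

Running the unmodified programs under nohup is the more usual way to get
a similar effect for SIGHUP without patching anything.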

I can clone a 2GB git tree there. Let me know when it is up.

-- 
Jon Smirl
jonsmirl@gmail.com

^ permalink raw reply

* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-09 23:43 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: Jon Smirl, git
In-Reply-To: <46a038f90606091637o6a0194d5yb413237253a372fc@mail.gmail.com>



On Sat, 10 Jun 2006, Martin Langhoff wrote:
> 
> Exactly. The dog at this time is cvsps -- I also remember vague
> promises from a list regular of publishing a git repo with cvsps2.1 +
> some patches from the list.

Ahh. cvsps doesn't do anything incrementally, does it?

Although it _does_ build up a cache of sorts, I think. That's not the 
parts I actually ever ended up looking at.

But yeah, a cvsps that blows up to a gig of VM and takes half an hour to 
parse things just for an incremental update would be a problem.

> In any case, and for the record, my cvsps is 2.1 pristine. It handles
> the mozilla repo alright, as long as I give it a lot of RAM. I _think_
> it slurped 3GB with the mozilla cvs.

Oh, wow. Every single repo I've seen ends up having tons of complaints 
from pristine cvsps, but maybe that's because I only end up looking at the 
ones with problems ;)

> I'm coming down to the office now to pick up my laptop, and I'll rsync
> it out to our git machine (also NZ kernel mirror, bandwidth should be
> good). That's one of the things I've discovered with these large
> trees: for the initial publish action, I just use rsync or scp.
> Perhaps I'm doing it wrong, but git-push doesn't optimise the
> 'initialise repo', and it takes ages (and in this case, it'd probably
> OOM).
> 
> > So it will take me quite some time to download 2GB+, regardless of how fat
> > a pipe the other end has ;)
> 
> Right-o. Linus, Jon, can you guys then ping me when you have cloned it
> safely so I can take it down again?

Tell me where/when it is, and I'll start slurping. Will let you know when 
I'm done.

		Linus

^ permalink raw reply

* Re: Figured out how to get Mozilla into git
From: Martin Langhoff @ 2006-06-09 23:37 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jon Smirl, git
In-Reply-To: <Pine.LNX.4.64.0606091450180.5498@g5.osdl.org>

Apologies, I dropped out of the conversation -- Friday night drinks
(NZ timezone) took over ;-)

Now, back on track...

On 6/10/06, Linus Torvalds <torvalds@osdl.org> wrote:
> In the meantime, the fact that git-cvsimport can be done incrementally
> means that once we have the silly pack-file-mapping details worked out, it
> should be perfectly fine to run the 3-day import just once, and then work
> on it incrementally afterwards without any real problems.

Exactly. The dog at this time is cvsps -- I also remember vague
promises from a list regular of publishing a git repo with cvsps2.1 +
some patches from the list.

In any case, and for the record, my cvsps is 2.1 pristine. It handles
the mozilla repo alright, as long as I give it a lot of RAM. I _think_
it slurped 3GB with the mozilla cvs.

I want to review that cvs2svn importer, probably to steal the test
cases and perhaps some logic to revamp/replace cvsps. The thing is --
we can't just drop/replace cvsimport because it does incrementals, so
continuity and consistency are key. All the CVS imports have to take
some hard decisions when the data is bad -- however it is we fudge it,
we kind of want to fudge it consistently ;-)

> So people like you who want to work on it off-line using a distributed
> system _can_ do so, realistically. Maybe not practically _today_

Other than "don't run repack -a", it's feasible. In fact, that's how I
use git 99% of the time -- to do DSCM stuff on projects that are using
CVS, like Moodle.

>   The poor python/perl guys may write things more
>   quickly, but when they hit a language wall, they hit it.

Flamebait anyone? ;-) It is a different kind of fun -- let's say that
on top of knowing the performance tricks (or, to be more hip: "design
patterns") for the hardware and OS, you also end up learning the
performance tricks of the interpreter/vm/whatever.

> > It would be better to rsync Martin's copy, he has a lot more bandwidth.

I'm coming down to the office now to pick up my laptop, and I'll rsync
it out to our git machine (also NZ kernel mirror, bandwidth should be
good). That's one of the things I've discovered with these large
trees: for the initial publish action, I just use rsync or scp.
Perhaps I'm doing it wrong, but git-push doesn't optimise the
'initialise repo', and it takes ages (and in this case, it'd probably
OOM).

> So it will take me quite some time to download 2GB+, regardless of how fat
> a pipe the other end has ;)

Right-o. Linus, Jon, can you guys then ping me when you have cloned it
safely so I can take it down again?

cheers,


martin

^ permalink raw reply

