Git development
* Re: Figured out how to get Mozilla into git
From: Jakub Narebski @ 2006-06-09  7:17 UTC (permalink / raw)
  To: git
In-Reply-To: <9e4733910606082028k37f6d915m26009e0d5011808b@mail.gmail.com>

Jon Smirl wrote:


>> git-repack -a -d but it OOMs on my 2GB+2GBswap machine :(
> 
> We are all having problems getting this to run on 32 bit machines with
> the 3-4GB process size limitations.

Is that expected (for 10GB repository if I remember correctly), or is there
some way to avoid this OOM?

-- 
Jakub Narebski
Warsaw, Poland


* [BUG] gitk with few commits
From: Uwe Zeisberger @ 2006-06-09  7:23 UTC (permalink / raw)
  To: git

Hello,

I started a new repository, for now there are only two commits.  In
gitk the area to display the commits is far from being filled.  I can
scroll up all the same.

Then marking commits by clicking on them fails, I have to click where
they are located, when the canvas is scrolled down completely.  Then the
canvas is redrawn to show the commits at the top again.

I tried briefly to fix this, but my Tcl/Tk knowledge ...

git version 1.4.0.rc2.ga95e
tcl/tk version 8.4.12-1

(Tk is patched with the patch from D. Richard Hipp that Paul Mackerras
provided to me.)

Best regards
Uwe

-- 
Uwe Zeisberger

http://www.google.com/search?q=30+hours+and+4+days+in+seconds


* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-09 15:01 UTC (permalink / raw)
  To: Jakub Narebski; +Cc: git
In-Reply-To: <e6b798$td3$1@sea.gmane.org>



On Fri, 9 Jun 2006, Jakub Narebski wrote:
> Jon Smirl wrote:
> 
> >> git-repack -a -d but it OOMs on my 2GB+2GBswap machine :(
> > 
> > We are all having problems getting this to run on 32 bit machines with
> > the 3-4GB process size limitations.
> 
> Is that expected (for 10GB repository if I remember correctly), or is there
> some way to avoid this OOM?

Well, to some degree, the VM limitations are inevitable with huge packs.

The original idea for packs was to avoid making one huge pack, partly 
because it was expected to be really really slow to generate (so 
incremental repacking was a much better strategy), but partly simply 
because trying to map one huge pack is really hard to do.

For various reasons, we ended up mostly using a single pack most of the 
time: it's the most efficient model when the project is reasonably sized, 
and it turns out that with the delta re-use, repacking even moderately 
large projects like the kernel doesn't actually take all that long.

But the fact that we ended up mostly using a single pack for the kernel, 
for example, doesn't mean that the fundamental reasons that git supports 
multiple packs would somehow have gone away. At some point, the project 
gets large enough that one single pack simply isn't reasonable.

So a single 2GB pack is already very much pushing it. It's really really 
hard to map in a 2GB file on a 32-bit platform: your VM is usually 
fragmented enough that it simply isn't practical. In fact, I think the 
limit for _practical_ usage of single packs is probably somewhere in the 
half-gig region, unless you just have 64-bit machines.

And yes, I realize that the "single pack" thing actually ends up having 
become a fact for cloning, for example. Originally, cloning would unpack 
on the receiving end, and leave the repacking to happen there, but that 
obviously sucked. So now when we clone, we always get a single pack. That 
can absolutely be a problem.

I don't know what the right solution is. Single packs _are_ very useful, 
especially after a clone. So it's possible that we should just make the 
pack-reading code be able to map partial packs. But the point is that 
there are certainly ways we can fix this - it's not _really_ fundamental.

It's going to complicate it a bit (damn, how I hate 32-bit VM 
limitations), but the good news is that the whole git model of "everything 
is an individual object" means that it's a very _local_ decision: it will 
probably be painful to re-do some of the pack reading code and have a LRU 
of pack _fragments_ instead of a LRU of packs, but it's only going to 
affect a small part of git, and everything else will never even see it.
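
The "LRU of pack fragments" idea can be sketched roughly as follows. This is only an illustration, not git's code: `WindowCache`, `window_size` and `max_windows` are hypothetical names, and Python's `mmap` module stands in for the C-level mapping calls git would use. The point is that only a bounded number of fixed-size windows of the file are mapped at any time, so address-space use no longer grows with pack size.

```python
import mmap
import os
from collections import OrderedDict


class WindowCache:
    """LRU cache of fixed-size read-only mmap windows over one file (sketch)."""

    def __init__(self, path, window_size=mmap.ALLOCATIONGRANULARITY, max_windows=4):
        # mmap offsets must be multiples of the allocation granularity.
        if window_size % mmap.ALLOCATIONGRANULARITY:
            raise ValueError("window_size must be a multiple of ALLOCATIONGRANULARITY")
        self.fd = os.open(path, os.O_RDONLY)
        self.size = os.fstat(self.fd).st_size
        self.window_size = window_size
        self.max_windows = max_windows
        self.windows = OrderedDict()  # window index -> mmap, least recent first

    def _window(self, index):
        if index in self.windows:
            self.windows.move_to_end(index)  # mark as most recently used
            return self.windows[index]
        start = index * self.window_size
        length = min(self.window_size, self.size - start)
        mapping = mmap.mmap(self.fd, length, access=mmap.ACCESS_READ, offset=start)
        self.windows[index] = mapping
        if len(self.windows) > self.max_windows:
            _, oldest = self.windows.popitem(last=False)  # evict least recent
            oldest.close()
        return mapping

    def read(self, offset, length):
        """Return up to `length` bytes at `offset`, crossing windows as needed."""
        end = min(offset + length, self.size)
        out = bytearray()
        while offset < end:
            index, within = divmod(offset, self.window_size)
            chunk = self._window(index)[within:within + (end - offset)]
            out += chunk
            offset += len(chunk)
        return bytes(out)

    def close(self):
        for mapping in self.windows.values():
            mapping.close()
        self.windows.clear()
        os.close(self.fd)
```

Callers only ever see `read(offset, length)`, which matches the remark that the change would stay local to the pack-reading code.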

So large packs are not really a fundamental problem, but right now we have 
some practical issues with them.

(It's not _just_ packs: running out of memory is also because of 
git-rev-list --objects being pretty memory hungry. I've improved the 
memory usage several times by over 50%, but people keep trying larger 
projects. It used to be that I considered the kernel a large history, now 
we're talking about things that have ten times the number of objects).

Martin - do you have some place to make that big mozilla repo available? 
It would be a good test-case.. 

			Linus


* Re: Figured out how to get Mozilla into git
From: Nicolas Pitre @ 2006-06-09 16:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, git
In-Reply-To: <Pine.LNX.4.64.0606090745390.5498@g5.osdl.org>

On Fri, 9 Jun 2006, Linus Torvalds wrote:

> 
> 
> On Fri, 9 Jun 2006, Jakub Narebski wrote:
> > Jon Smirl wrote:
> > 
> > >> git-repack -a -d but it OOMs on my 2GB+2GBswap machine :(
> > > 
> > > We are all having problems getting this to run on 32 bit machines with
> > > the 3-4GB process size limitations.
> > 
> > Is that expected (for 10GB repository if I remember correctly), or is there
> > some way to avoid this OOM?

What was that 10GB related to, exactly?  The original CVS repo, or the 
unpacked GIT repo?

> So a single 2GB pack is already very much pushing it. It's really really 
> hard to map in a 2GB file on a 32-bit platform: your VM is usually 
> fragmented enough that it simply isn't practical. In fact, I think the 
> limit for _practical_ usage of single packs is probably somewhere in the 
> half-gig region, unless you just have 64-bit machines.

Sure, but have we already reached that size?

The historic Linux repo currently repacks itself into a ~175MB pack for 
63428 commits.

The current Linux repo is ~103MB with a much shorter history (27153 
commits).

Given the above we can estimate the size of the kernel repository after 
x commits as follows:

	slope = (175 - 103) / (63428 - 27153) = approx 2KB per commit

	initial size = 175 - .001985 * 63428 = 49MB

So the initial kernel commit is about 49MB in size which is coherent 
with the corresponding compressed tarball.  Subsequent commits are 2KB 
in size on average.  Given that it will take about 233250 commits before 
the kernel reaches the half gigabyte pack file, and given the current 
commit rate (approx 23700 commits per year), that means we still have 
nearly 9 years to go.  And at that point 64-bit machines are likely to 
be the norm.
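
The estimate above can be reproduced with a few lines of arithmetic. This is just a restatement of the figures quoted in this message; the variable names are made up for the illustration, and the 512MB limit is the "half gigabyte" figure under discussion.

```python
# Figures quoted in the message (pack sizes in MB, commit counts).
historic_mb, historic_commits = 175, 63428
current_mb, current_commits = 103, 27153
commits_per_year = 23700
limit_mb = 512  # the "half gigabyte" practical limit

# Straight line through the two (commits, size) points.
slope = (historic_mb - current_mb) / (historic_commits - current_commits)
initial_mb = historic_mb - slope * historic_commits

# Total commits at which the line reaches the limit, and years until then.
commits_at_limit = (limit_mb - initial_mb) / slope
years_left = (commits_at_limit - current_commits) / commits_per_year

print(f"slope: {slope * 1024:.2f} KB per commit")
print(f"initial size: {initial_mb:.1f} MB")
print(f"commits at {limit_mb} MB: {commits_at_limit:.0f}")
print(f"years to go: {years_left:.1f}")
```

Without the intermediate rounding the commit count comes out a few dozen lower than 233250, which does not change the conclusion.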

So given those numbers I don't think this is really an issue.  The Linux 
kernel is a rather huge and pretty active project to base comparisons 
against.  The Mozilla repository might be difficult to import and 
repack, but once repacked it should still be pretty usable now even on a 
32-bit machine even with a single pack.

Otherwise that should be quite easy to add a batch size argument to 
git-repack so git-rev-list and git-pack-objects are called multiple 
times with sequential commit ranges to create a repo with multiple 
packs.
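
A rough sketch of how such batching could compute its commit ranges is below. No batch size option exists in git-repack at this point, so `batch_ranges` and its parameters are hypothetical, and the sketch assumes the caller already has the commit ids ordered oldest first.

```python
def batch_ranges(commits, batch_size):
    """Split an oldest-first list of commit ids into sequential rev ranges.

    The first entry is a bare commit id (everything reachable from it);
    each later entry is "previous..newest", i.e. what the batch adds.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    ranges = []
    previous = None
    for start in range(0, len(commits), batch_size):
        newest = commits[min(start + batch_size, len(commits)) - 1]
        ranges.append(newest if previous is None else f"{previous}..{newest}")
        previous = newest
    return ranges


# Example with placeholder ids instead of real SHA-1s.
for rev in batch_ranges(["c1", "c2", "c3", "c4", "c5"], 2):
    print(rev)
```

Each resulting range could then be handed to `git-rev-list --objects` and piped into `git-pack-objects`, one pack per range.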


Nicolas


* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-09 16:30 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jakub Narebski, git
In-Reply-To: <Pine.LNX.4.64.0606091127540.19403@localhost.localdomain>



On Fri, 9 Jun 2006, Nicolas Pitre wrote:
> 
> > So a single 2GB pack is already very much pushing it. It's really really 
> > hard to map in a 2GB file on a 32-bit platform: your VM is usually 
> > fragmented enough that it simply isn't practical. In fact, I think the 
> > limit for _practical_ usage of single packs is probably somewhere in the 
> > half-gig region, unless you just have 64-bit machines.
> 
> Sure, but have we already reached that size?

Not for the Linux repos.

But apparently the mozilla repo ends up being 2GB in git. From Martin:

  >> oh, I went back to a cvsimport that I started a couple days ago.
  >> Completed with no problems...
  >> 
  >> Last commit:
  >> commit 5ecb56b9c4566618fad602a8da656477e4c6447a
  >> Author: wtchang%redhat.com <wtchang%redhat.com>
  >> Date:   Fri Jun 2 17:20:37 2006 +0000
  >> 
  >>    Import NSPR 4.6.2 and NSS 3.11.1
  >> 
  >> mozilla.git$ du -sh .git/
  >> 2.0G    .git/

now that was done with _incremental_ repacking (ie his .git directory
won't be just one large pack), but I bet that if you were to clone it
(without using the "-l" flag or rsync/http), you'd end up with serious
trouble because of the single-pack limit.

So we're starting to see archives where single packs are problematic for
a 32-bit architecture. 

			Linus


* Re: Figured out how to get Mozilla into git
From: Jakub Narebski @ 2006-06-09 17:10 UTC (permalink / raw)
  To: git
In-Reply-To: <Pine.LNX.4.64.0606091127540.19403@localhost.localdomain>

Nicolas Pitre wrote:

> What was that 10GB related to, exactly?  The original CVS repo, or the 
> unpacked GIT repo?

Erm, Subversion repository, result of cvs2svn conversion:

Jon Smirl> I wonder how long it will take to start gitk on a 10GB 
Jon Smirl> repository.

(in first post in this thread).

> Otherwise that should be quite easy to add a batch size argument to 
> git-repack so git-rev-list and git-pack-objects are called multiple 
> times with sequential commit ranges to create a repo with multiple
> packs. 

Good idea. In addition to the pack size limits imposed by 32-bit address
space and/or RAM + swap size, there are (rare) limits on the maximum file
size on the filesystem, e.g. FAT28^W FAT32.

-- 
Jakub Narebski
Warsaw, Poland


* Re: Figured out how to get Mozilla into git
From: Nicolas Pitre @ 2006-06-09 17:38 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Jakub Narebski, git
In-Reply-To: <Pine.LNX.4.64.0606090926550.5498@g5.osdl.org>

On Fri, 9 Jun 2006, Linus Torvalds wrote:

> 
> 
> On Fri, 9 Jun 2006, Nicolas Pitre wrote:
> > 
> > > So a single 2GB pack is already very much pushing it. It's really really 
> > > hard to map in a 2GB file on a 32-bit platform: your VM is usually 
> > > fragmented enough that it simply isn't practical. In fact, I think the 
> > > limit for _practical_ usage of single packs is probably somewhere in the 
> > > half-gig region, unless you just have 64-bit machines.
> > 
> > Sure, but have we already reached that size?
> 
> Not for the Linux repos.
> 
> But apparently the mozilla repo ends up being 2GB in git. From Martin:
> 
>   >> oh, I went back to a cvsimport that I started a couple days ago.
>   >> Completed with no problems...
>   >> 
>   >> Last commit:
>   >> commit 5ecb56b9c4566618fad602a8da656477e4c6447a
>   >> Author: wtchang%redhat.com <wtchang%redhat.com>
>   >> Date:   Fri Jun 2 17:20:37 2006 +0000
>   >> 
>   >>    Import NSPR 4.6.2 and NSS 3.11.1
>   >> 
>   >> mozilla.git$ du -sh .git/
>   >> 2.0G    .git/

He also says:

| git-repack -a -d but it OOMs on my 2GB+2GBswap machine :(

> now that was done with _incremental_ repacking (ie his .git directory
> won't be just one large pack),

So given the nature of packs, incrementally packing an imported 
repository _might_ cause worse problems since each pack must be 
self-contained by definition.  That means you may end up with multiple 
revisions of the same file distributed amongst as many packs, hence none 
of those revisions are ever deltified, and to repack that you currently 
have to mmap all those packs at once.

> but I bet that if you were to clone it
> (without using the "-l" flag or rsync/http), you'd end up with serious
> trouble because of the single-pack limit.

Maybe that single pack would instead be under the 512MB limit?  I'd be 
curious to know.

> So we're starting to see archives where single packs are problematic for
> a 32-bit architecture. 

Depending on the operation, the single pack might actually be better, 
especially for a full clone where everything gets mapped.  Multiple 
packs will always take more space, which is fine if you don't need 
access to all objects at once since individual packs are small, but the 
whole of them (when repacking or cloning) isn't.


Nicolas


* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-09 17:49 UTC (permalink / raw)
  To: Nicolas Pitre; +Cc: Jakub Narebski, git
In-Reply-To: <Pine.LNX.4.64.0606091326550.2703@localhost.localdomain>



On Fri, 9 Jun 2006, Nicolas Pitre wrote:
>
> Maybe that single pack would instead be under the 512MB limit?  I'd be 
> curious to know.

Possible, but not likely, and with "git repack -a -d" running out of 
memory, we clearly already have a problem in checking that.

That is most likely git-rev-list, though. Which is why I'd like to just 
rsync the repo, and run git-rev-list on it, and see what else I can shave 
off ;)

> > So we're starting to see archives where single packs are problematic for
> > a 32-bit architecture. 
> 
> Depending on the operation, the single pack might actually be better, 

Absolutely. Which is why I said we probably need to do a LRU on pack 
fragments rather than full packs when we do the pack memory mapping.

		Linus


* Re: Figured out how to get Mozilla into git
From: Jon Smirl @ 2006-06-09 18:13 UTC (permalink / raw)
  To: Martin Langhoff; +Cc: git
In-Reply-To: <46a038f90606082006t5c6a5623q4b9cf7b036dad1e5@mail.gmail.com>

On 6/8/06, Martin Langhoff <martin.langhoff@gmail.com> wrote:
> mozilla.git$ du -sh .git/
> 2.0G    .git/

That looks too small. My svn git import is 2.7GB and the source CVS is
3.0GB. The svn import wasn't finished when I stopped it.

My cvsps process is still running from last night. The error file is
341MB. How big is it when the conversion is finished? My machine is
swapping to death.

I'm still attracted to the cvs2svn tool. It handled everything right
the first time and it only needs 100MB to run. It is also a lot
faster. cvsps and parsecvs both need gigabytes of RAM to run. I'll
look at cvs2svn some more but I still need to figure out more about
low level git and learn Python.

-- 
Jon Smirl
jonsmirl@gmail.com


* Failed git commands and StGIT
From: Petr Baudis @ 2006-06-09 18:36 UTC (permalink / raw)
  To: catalin.marinas; +Cc: git

  Hi,

  a user at #git just came with a problem with stg refresh - it turned
out that he did not have his environment set up properly, but what is
troublesome is that stg refresh just said that "git-commit-tree failed"
and did not show the actual error message - looking at the code, you
probably want to keep fd 3 on the parent process' stderr, that is use
open2, not open3.
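
The general idea (surface the child's own error output rather than a generic failure message) can be sketched with Python's `subprocess` module. This is an illustration under assumptions, not StGIT's code: `run_checked` and `CommandFailed` are hypothetical names, and StGIT 0.9 uses different process APIs.

```python
import subprocess


class CommandFailed(Exception):
    """Raised when a child process exits with a non-zero status."""


def run_checked(argv):
    """Run a command; on failure, raise an error that includes its stderr."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,  # captured so it can go into the error text
        text=True,
    )
    if proc.returncode != 0:
        raise CommandFailed(
            f"{argv[0]} failed (exit {proc.returncode}): {proc.stderr.strip()}"
        )
    return proc.stdout
```

Passing `stderr=None` instead would leave the child's stderr attached to the parent's, which is closer to the suggestion above: the message reaches the user directly and nothing needs to be captured.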

  The user has used StGIT 0.9.

  Thanks,

-- 
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
A person is just about as big as the things that make them angry.


* Git-daemon messing up permissions for gitweb
From: Post, Mark K @ 2006-06-09 18:41 UTC (permalink / raw)
  To: git

I'm trying to set up a git repository for mainframe Linux developers to
use at git390.osdl.marist.edu.  Everything _seemed_ to go well, until
Martin Schwidefsky started actually pushing changes back to the
repository.  When he does that, the projects disappear from the web page
that gitweb.cgi is generating.

As far as I can tell, the problem is happening because these files are
being written out with file permissions of 640, and since Apache is
running as user wwwrun, it can't read them:
-rw-r-----  1 sky git  5490 Jun  9 03:35 ./linux-2.6.git/info/refs
-rw-r-----  1 sky git    54 Jun  9 03:35 ./linux-2.6.git/objects/info/packs
-rw-r-----  1 sky git    41 Jun  9 03:35 ./linux-2.6.git/refs/heads/master
-rw-r-----  1 sky git    41 Jun  9 03:35 ./linux-2.6.git/refs/heads/origin
-rw-r-----  1 sky git  5490 Jun  9 04:00 ./s390-experimental.git/info/refs
-rw-r-----  1 sky git     1 Jun  9 04:00 ./s390-experimental.git/objects/info/packs
-rw-r-----  1 sky git    41 Jun  9 04:00 ./s390-experimental.git/refs/heads/master
-rw-r-----  1 sky git  5490 Jun  9 11:31 ./s390-features.git/info/refs
-rw-r-----  1 sky git     1 Jun  9 11:31 ./s390-features.git/objects/info/packs
-rw-r-----  1 sky git    41 Jun  9 11:31 ./s390-features.git/refs/heads/master
-rw-r-----  1 sky git  5490 Jun  9 03:41 ./s390-fixes.git/info/refs
-rw-r-----  1 sky git     1 Jun  9 03:41 ./s390-fixes.git/objects/info/packs
-rw-r-----  1 sky git    41 Jun  9 03:41 ./s390-fixes.git/refs/heads/master

I know I could brute-force this by adding wwwrun to the git group, but I
first wanted to find out if that is the solution, or if something is
wrong with the way I've set things up.  I tried searching the mailing
list archives, but nothing that appeared to be relevant turned up.
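
For anyone wanting to find every such file in one pass, a small script along these lines might help. It is a sketch only: `not_world_readable` is a made-up helper name, and it checks nothing but the "other" read bit, which is what matters for a web server user outside the owning group.

```python
import os
import stat


def not_world_readable(root):
    """Return paths under root that lack the other-read permission bit."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            mode = os.lstat(path).st_mode
            if stat.S_ISLNK(mode):
                continue  # symlink permissions are not meaningful here
            if not mode & stat.S_IROTH:
                missing.append(path)
    return sorted(missing)


if __name__ == "__main__":
    import sys
    for path in not_world_readable(sys.argv[1] if len(sys.argv) > 1 else "."):
        print(path)
```

Directories additionally need the "other" execute bit to be traversable, which this sketch does not check.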

Thanks in advance for any help,

Mark Post


* Re: Git-daemon messing up permissions for gitweb
From: Jakub Narebski @ 2006-06-09 18:58 UTC (permalink / raw)
  To: git
In-Reply-To: <5A14AF34CFF8AD44A44891F7C9FF41050795782F@usahm236.amer.corp.eds.com>

Post, Mark K wrote:

> I'm trying to set up a git repository for mainframe Linux developers to
> use at git390.osdl.marist.edu.  Everything _seemed_ to go well, until
> Martin Schwidefsky started actually pushing changes back to the
> repository.  When he does that, the projects disappear from the web page
> that gitweb.cgi is generating.
> 
> As far as I can tell, the problem is happening because these files are
> being written out with file permissions of 640, and since Apache is
> running as user wwwrun, it can't read them:
> -rw-r-----  1 sky git  5490 Jun  9 03:35 ./linux-2.6.git/info/refs

One of the possible solutions would be to add sky to the Apache group,
make the directories sticky-group, and make files readable by group
(perhaps by marking the repository as shared, or something).

-- 
Jakub Narebski
Warsaw, Poland


* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-09 19:00 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Martin Langhoff, git
In-Reply-To: <9e4733910606091113vdc6ab06l2d3582cb82b8fd09@mail.gmail.com>



On Fri, 9 Jun 2006, Jon Smirl wrote:
> 
> That looks too small. My svn git import is 2.7GB and the source CVS is
> 3.0GB. The svn import wasn't finished when I stopped it.

Git is much better at packing than either CVS or SVN. Get used to it ;)

> My cvsps process is still running from last night. The error file is
> 341MB. How big is it when the conversion is finished? My machine is
> swapping to death.

Do you have all the cvsps patches? There's a few important ones floating 
around, and David Mansfield never did a 2.2 release..

I'm pretty sure Martin doesn't run plain 2.1.

		Linus


* Re: Git-daemon messing up permissions for gitweb
From: Linus Torvalds @ 2006-06-09 19:05 UTC (permalink / raw)
  To: Post, Mark K; +Cc: git
In-Reply-To: <5A14AF34CFF8AD44A44891F7C9FF41050795782F@usahm236.amer.corp.eds.com>



On Fri, 9 Jun 2006, Post, Mark K wrote:
>
> As far as I can tell, the problem is happening because these files are
> being written out with file permissions of 640, and since Apache is
> running as user wwwrun, it can't read them:

You can either make sure that people have something like

	umask 0022

in their bashrc (or that it's the default umask), so that they do things 
world-readably by default.

Or add
	[core]
		SharedRepository = true

to the repository config file.

		Linus


* RE: Git-daemon messing up permissions for gitweb
From: Post, Mark K @ 2006-06-09 19:11 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git

I just triple-checked, and both those things are already in place.  The
default umask is 0022, and all the config files have:
[core]
        repositoryformatversion = 0
        filemode = true
        sharedrepository = true

I see that the case of the value is different from what you typed:
SharedRepository 
Is that significant (as almost everything is)?


Mark Post

-----Original Message-----
From: Linus Torvalds [mailto:torvalds@osdl.org] 
Sent: Friday, June 09, 2006 3:06 PM
To: Post, Mark K
Cc: git@vger.kernel.org
Subject: Re: Git-daemon messing up permissions for gitweb



On Fri, 9 Jun 2006, Post, Mark K wrote:
>
> As far as I can tell, the problem is happening because these files are
> being written out with file permissions of 640, and since Apache is
> running as user wwwrun, it can't read them:

You can either make sure that people have something like

	umask 0022

in their bashrc (or that it's the default umask), so that they do things
world-readably by default.

Or add
	[core]
		SharedRepository = true

to the repository config file.

		Linus


* RE: Git-daemon messing up permissions for gitweb
From: Jakub Narebski @ 2006-06-09 19:27 UTC (permalink / raw)
  To: git
In-Reply-To: <5A14AF34CFF8AD44A44891F7C9FF410507957855@usahm236.amer.corp.eds.com>

Post, Mark K wrote:

> [core]
>         repositoryformatversion = 0
>         filemode = true
>         sharedrepository = true
> 
> I see that the case of the value is different from what you typed:
>         SharedRepository = true
> Is that significant (as almost everything is)?

No, the keys are case-insensitive.

-- 
Jakub Narebski
Warsaw, Poland


* Re: Git-daemon messing up permissions for gitweb
From: Junio C Hamano @ 2006-06-09 19:51 UTC (permalink / raw)
  To: Post, Mark K; +Cc: git
In-Reply-To: <5A14AF34CFF8AD44A44891F7C9FF41050795782F@usahm236.amer.corp.eds.com>

"Post, Mark K" <mark.post@eds.com> writes:

> I'm trying to set up a git repository for mainframe Linux developers to
> use at git390.osdl.marist.edu.  Everything _seemed_ to go well, until
> Martin Schwidefsky started actually pushing changes back to the
> repository.  When he does that, the projects disappear from the web page
> that gitweb.cgi is generating.

> As far as I can tell, the problem is happening because these files are
> being written out with file permissions of 640, and since Apache is
> running as user wwwrun, it can't read them:
> -rw-r-----  1 sky git  5490 Jun  9 03:35 ./linux-2.6.git/info/refs
> -rw-r-----  1 sky git    54 Jun  9 03:35 ./linux-2.6.git/objects/info/packs
> -rw-r-----  1 sky git    41 Jun  9 03:35

First of all, it is not git-daemon that is updating these refs.
The daemon is a read only facility.

And you have checked the suggestion by Linus to set the umask to
world readable, which brings me to the next question.  

How did Martin actually "push changes back"?

Was it over git protocol over SSH, or the webdav thing over http
push?  The comment by Linus is about the former and I do not
know offhand who the webdav thing runs as or how it handles the
permission bits.

It could be that your ssh daemon installation bypasses .bashrc
and uses its own .ssh/environment, in which case your user
may need to do umask there as well.


* Re: Git-daemon messing up permissions for gitweb
From: Junio C Hamano @ 2006-06-09 20:01 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Post, Mark K
In-Reply-To: <Pine.LNX.4.64.0606091201210.5498@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> On Fri, 9 Jun 2006, Post, Mark K wrote:
>>
>> As far as I can tell, the problem is happening because these files are
>> being written out with file permissions of 640, and since Apache is
>> running as user wwwrun, it can't read them:

> Or add
> 	[core]
> 		SharedRepository = true
>
> to the repository config file.

This is about being able to share among the group, not with
people outside, so if wwwrun is outside git group like Mark's
setting I do not think it would do anything helpful to the
situation.


* RE: Git-daemon messing up permissions for gitweb
From: Post, Mark K @ 2006-06-09 20:08 UTC (permalink / raw)
  To: Junio C Hamano; +Cc: git

Martin is using git over SSH.  I have git-shell in /etc/passwd for his
account.

Mark Post 

-----Original Message-----
From: Junio C Hamano [mailto:junkio@cox.net] 
Sent: Friday, June 09, 2006 3:51 PM
To: Post, Mark K
Cc: git@vger.kernel.org
Subject: Re: Git-daemon messing up permissions for gitweb

"Post, Mark K" <mark.post@eds.com> writes:

> I'm trying to set up a git repository for mainframe Linux developers to
> use at git390.osdl.marist.edu.  Everything _seemed_ to go well, until
> Martin Schwidefsky started actually pushing changes back to the
> repository.  When he does that, the projects disappear from the web page
> that gitweb.cgi is generating.

> As far as I can tell, the problem is happening because these files are
> being written out with file permissions of 640, and since Apache is
> running as user wwwrun, it can't read them:
> -rw-r-----  1 sky git  5490 Jun  9 03:35 ./linux-2.6.git/info/refs
> -rw-r-----  1 sky git    54 Jun  9 03:35 ./linux-2.6.git/objects/info/packs
> -rw-r-----  1 sky git    41 Jun  9 03:35

First of all, it is not git-daemon that is updating these refs.
The daemon is a read only facility.

And you have checked the suggestion by Linus to set the umask to
world readable, which brings me to the next question.  

How did Martin actually "push changes back"?

Was it over git protocol over SSH, or the webdav thing over http
push?  The comment by Linus is about the former and I do not
know offhand who the webdav thing runs as or how it handles the
permission bits.

It could be that your ssh daemon installation bypasses .bashrc
and uses its own .ssh/environment, in which case your user
may need to do umask there as well.


* Re: Figured out how to get Mozilla into git
From: Jon Smirl @ 2006-06-09 20:17 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Martin Langhoff, git
In-Reply-To: <Pine.LNX.4.64.0606091158460.5498@g5.osdl.org>

On 6/9/06, Linus Torvalds <torvalds@osdl.org> wrote:
>
>
> On Fri, 9 Jun 2006, Jon Smirl wrote:
> >
> > That looks too small. My svn git import is 2.7GB and the source CVS is
> > 3.0GB. The svn import wasn't finished when I stopped it.
>
> Git is much better at packing than either CVS or SVN. Get used to it ;)

The git tree that Martin got from cvsps is much smaller than the git
tree I got from going to svn then to git.  I don't know why the trees
are 700MB different, it may be different amounts of packing, or one of
the conversion tools is losing something.

Earlier he said:
>git-repack -a -d but it OOMs on my 2GB+2GBswap machine :(

> > My cvsps process is still running from last night. The error file is
> > 341MB. How big is it when the conversion is finished? My machine is
> > swapping to death.
>
> Do you have all the cvsps patches? There's a few important ones floating
> around, and David Mansfield never did a 2.2 release..

I am running cvsps-2.1-3.fc5 so I may be wasting my time. Error out is
535MB now.
He sent me some git patches, but none for cvsps.

> I'm pretty sure Martin doesn't run plain 2.1.

I haven't come up with anything that is likely to result in Mozilla
switching over to git. Right now it takes three days to convert the
tree. The tree will have to be run in parallel for a while to convince
everyone to switch. I don't have a solution to keeping it in sync in
near real time (commits would still go to CVS). Most Mozilla
developers are interested but the infrastructure needs some help.

Martin has also brought up the problem with needing a partial clone so
that everyone doesn't have to bring down the entire repository. A
trunk checkout is 340MB and Martin's git tree is 2GB (mine 2.7GB).  A
kernel tree is only 680M.

-- 
Jon Smirl
jonsmirl@gmail.com


* Re: Git-daemon messing up permissions for gitweb
From: Junio C Hamano @ 2006-06-09 20:18 UTC (permalink / raw)
  To: Post, Mark K; +Cc: git
In-Reply-To: <5A14AF34CFF8AD44A44891F7C9FF41050795787F@usahm236.amer.corp.eds.com>

"Post, Mark K" <mark.post@eds.com> writes:

> Martin is using git over SSH.  I have git-shell in /etc/passwd for his
> account.

Ah, then the umask git-shell gets from sshd is too restrictive.
Loosen it and you will be fine.


* RE: Git-daemon messing up permissions for gitweb
From: Linus Torvalds @ 2006-06-09 20:22 UTC (permalink / raw)
  To: Post, Mark K; +Cc: Junio C Hamano, git
In-Reply-To: <5A14AF34CFF8AD44A44891F7C9FF41050795787F@usahm236.amer.corp.eds.com>



On Fri, 9 Jun 2006, Post, Mark K wrote:
>
> Martin is using git over SSH.  I have git-shell in /etc/passwd for his
> account.

Ahh. git-shell doesn't read .bashrc or anything like that.

Does adding a "umask(0022)" to the beginning of main() in shell.c fix it 
for you?
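
The effect of the umask on newly created files can be checked with a short script. This is a sketch for POSIX systems only; `mode_with_umask` is a made-up helper, and the 027 value is just one example of a umask that would yield the 640 files reported earlier in the thread.

```python
import os
import stat
import tempfile


def mode_with_umask(mask):
    """Create a file under the given umask and return its permission bits."""
    old = os.umask(mask)
    try:
        with tempfile.TemporaryDirectory() as tmp:
            path = os.path.join(tmp, "ref")
            # Ask for rw-rw-rw-; the kernel clears the bits set in the umask.
            fd = os.open(path, os.O_CREAT | os.O_WRONLY, 0o666)
            os.close(fd)
            return stat.S_IMODE(os.stat(path).st_mode)
    finally:
        os.umask(old)  # always restore the caller's umask


print(oct(mode_with_umask(0o027)))  # 0o640: not readable by "other"
print(oct(mode_with_umask(0o022)))  # 0o644: world-readable
```

With a umask of 022 the refs would come out as 644, which is what a web server running as a user outside the git group needs.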

		Linus


* RE: Git-daemon messing up permissions for gitweb
From: Post, Mark K @ 2006-06-09 20:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Junio C Hamano, git

I'm not a C programmer, so I'm not sure exactly how to do what you want.
Is this right (it compiles)?
--- shell.c.orig        2006-05-15 16:01:37.000000000 -0400
+++ shell.c     2006-06-09 16:26:24.619808905 -0400
@@ -31,7 +31,7 @@
 {
        char *prog;
        struct commands *cmd;
-
+       umask(0022);
        /* We want to see "-c cmd args", and nothing else */
        if (argc != 3 || strcmp(argv[1], "-c"))
                die("What do you think I am? A shell?"); 

I won't be able to report success or failure today.  Martin's in Germany
and I think he has a life.


Mark Post

-----Original Message-----
From: Linus Torvalds [mailto:torvalds@osdl.org] 
Sent: Friday, June 09, 2006 4:22 PM
To: Post, Mark K
Cc: Junio C Hamano; git@vger.kernel.org
Subject: RE: Git-daemon messing up permissions for gitweb



On Fri, 9 Jun 2006, Post, Mark K wrote:
>
> Martin is using git over SSH.  I have git-shell in /etc/passwd for his
> account.

Ahh. git-shell doesn't read .bashrc or anything like that.

Does adding a "umask(0022)" to the beginning of main() in shell.c fix it
for you?

		Linus


* Re: Git-daemon messing up permissions for gitweb
From: Junio C Hamano @ 2006-06-09 20:34 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: git, Post, Mark K
In-Reply-To: <Pine.LNX.4.64.0606091321100.5498@g5.osdl.org>

Linus Torvalds <torvalds@osdl.org> writes:

> On Fri, 9 Jun 2006, Post, Mark K wrote:
>>
>> Martin is using git over SSH.  I have git-shell in /etc/passwd for his
>> account.
>
> Ahh. git-shell doesn't read .bashrc or anything like that.

But that should be tweakable by configuring what sshd does for
the user, shouldn't it?  The "LOGIN PROCESS" section from man
sshd(8) seems to talk about $HOME/.ssh/environment, for example.


* Re: Figured out how to get Mozilla into git
From: Linus Torvalds @ 2006-06-09 20:40 UTC (permalink / raw)
  To: Jon Smirl; +Cc: Martin Langhoff, git
In-Reply-To: <9e4733910606091317p26d66579mdf93db293f93fb50@mail.gmail.com>



On Fri, 9 Jun 2006, Jon Smirl wrote:
>
> > Git is much better at packing than either CVS or SVN. Get used to it ;)
> 
> The git tree that Martin got from cvsps is much smaller than the git
> tree I got from going to svn then to git.  I don't know why the trees
> are 700MB different, it may be different amounts of packing, or one of
> the conversion tools is losing something.

.. or one of them is adding something.

For example, it may well be that cvs2svn does a lot more commits or 
something like that.

That said, I don't even see where git-svn packs anything at all, and 
you're absolutely right that when/how you repack can make a huge 
difference to disk usage, much more so than any importer details.

> > Do you have all the cvsps patches? There's a few important ones floating
> > around, and David Mansfield never did a 2.2 release..
> 
> I am running cvsps-2.1-3.fc5 so I may be wasting my time. Error out is
> 535MB now.
> He sent me some git patches, but none for cvsps.

I've got a couple, but I was hoping David would do a cvsps-2.2. I have 
this dim memory of him saying he had done some other improvements too.

> I haven't come up with anything that is likely to result in Mozilla
> switching over to git. Right now it takes three days to convert the
> tree. The tree will have to be run in parallel for a while to convince
> everyone to switch. I don't have a solution to keeping it in sync in
> near real time (commits would still go to CVS). Most Mozilla
> developers are interested but the infrastructure needs some help.

Sure. That said, I pretty much guarantee that the size issues will be much 
much worse for any other distributed SCM. 

If Mozilla doesn't need the distributed thing, then SVN is probably the 
best choice. It's still a total piece of crap, but hey, if crap (== 
centralized) is what people are used to, a few billion flies can't be 
wrong ;)

If you got your import done, is there some place I can rsync it from, and 
at least I can make sure that everything works fine for a repo that size.. 
One day the Mozilla people will notice that they really _really_ want the 
distribution, and they'll figure out quickly enough that SVK doesn't cut 
it, I suspect.

		Linus


