* git-daemon memory usage, disconnection.
@ 2006-04-19 13:22 David Woodhouse
2006-04-19 14:59 ` Linus Torvalds
0 siblings, 1 reply; 4+ messages in thread
From: David Woodhouse @ 2006-04-19 13:22 UTC (permalink / raw)
To: git
I'm running git-daemon from xinetd and it seems a little greedy...
Cpu(s): 2.7% us, 6.4% sy, 0.0% ni, 1.7% id, 87.7% wa, 1.4% hi, 0.0% si
Mem: 253680k total, 250076k used, 3604k free, 568k buffers
Swap: 500960k total, 500864k used, 96k free, 24696k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
31232 nobody 18 0 155m 29m 7224 D 1.3 11.9 0:25.56 git-rev-list
30743 nobody 18 0 179m 29m 9480 D 0.7 11.9 0:42.60 git-rev-list
31277 nobody 18 0 147m 28m 7476 D 2.6 11.4 0:20.90 git-rev-list
30314 nobody 18 0 233m 26m 7696 D 0.0 10.6 1:20.24 git-rev-list
30612 nobody 18 0 204m 23m 7432 D 1.3 9.4 0:59.19 git-rev-list
30574 nobody 18 0 190m 20m 7608 D 0.3 8.3 0:50.77 git-rev-list
30208 nobody 18 0 140m 14m 7632 D 0.3 5.9 0:15.23 git-pack-object
Now, this wouldn't be _so_ bad if there were only two of them running.
The clients for the other four have actually given up and disconnected
long ago, but git-daemon doesn't seem to have reacted to that.
--
dwmw2
* Re: git-daemon memory usage, disconnection.
2006-04-19 13:22 git-daemon memory usage, disconnection David Woodhouse
@ 2006-04-19 14:59 ` Linus Torvalds
2006-04-19 15:27 ` David Woodhouse
0 siblings, 1 reply; 4+ messages in thread
From: Linus Torvalds @ 2006-04-19 14:59 UTC (permalink / raw)
To: David Woodhouse; +Cc: git
On Wed, 19 Apr 2006, David Woodhouse wrote:
>
> I'm running git-daemon from xinetd and it seems a little greedy...
>
> Cpu(s): 2.7% us, 6.4% sy, 0.0% ni, 1.7% id, 87.7% wa, 1.4% hi, 0.0% si
> Mem: 253680k total, 250076k used, 3604k free, 568k buffers
> Swap: 500960k total, 500864k used, 96k free, 24696k cached
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 31232 nobody 18 0 155m 29m 7224 D 1.3 11.9 0:25.56 git-rev-list
> 30743 nobody 18 0 179m 29m 9480 D 0.7 11.9 0:42.60 git-rev-list
> 31277 nobody 18 0 147m 28m 7476 D 2.6 11.4 0:20.90 git-rev-list
> 30314 nobody 18 0 233m 26m 7696 D 0.0 10.6 1:20.24 git-rev-list
> 30612 nobody 18 0 204m 23m 7432 D 1.3 9.4 0:59.19 git-rev-list
> 30574 nobody 18 0 190m 20m 7608 D 0.3 8.3 0:50.77 git-rev-list
> 30208 nobody 18 0 140m 14m 7632 D 0.3 5.9 0:15.23 git-pack-object
Well, you've probably got two issues:
- it looks like you aren't packing your archives (which explains why the
disk accesses are horrid, which in turn explains the "D" part).
For a git server, you _really_ want all trees to be mostly packed, or
you want absolutely tons of memory (and 256MB is definitely not "tons"
as far as git is concerned).
- git-rev-list won't notice that there is nobody listening until it gets
an EPIPE, and it won't get an EPIPE until it actually outputs something,
and it won't output anything until it is largely done traversing the
tree..
> Now, this wouldn't be _so_ bad if there were only two of them running.
> The clients for the other four have actually given up and disconnected
> long ago, but git-daemon doesn't seem to have reacted to that.
Well, the way things work under UNIX, you normally don't notice that the
other end isn't interested until you try to write, and you get a "nobody
is listening". And sadly, the packing stuff does most (not all) of the
heavy lifting before it can even start to write things out.
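To illustrate the point about writes, here is a minimal Python sketch (not code from git). It assumes a plain pipe standing in for the connection, and the function name `write_after_reader_left` is made up for this example. Closing the read end does not notify the writer; the error only shows up on the next write. Python ignores SIGPIPE by default, so the failure appears as `BrokenPipeError` (errno `EPIPE`); a C program with default signal handling would be killed by SIGPIPE instead.

```python
import errno
import os


def write_after_reader_left(payload: bytes = b"x"):
    """Return the errno seen when writing to a pipe whose reader has closed."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the "client" goes away; the writer is not told
    try:
        # Only this write reveals that nobody is listening.
        os.write(write_fd, payload)
    except BrokenPipeError as exc:
        return exc.errno
    finally:
        os.close(write_fd)
    return None


if __name__ == "__main__":
    print(write_after_reader_left() == errno.EPIPE)
```

A process that spends minutes computing before its first write therefore has no chance to see this error in the meantime.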
That said, I should probably take a look at git-rev-list --objects memory
usage once again. It's never been exactly "lean" (and it can't really be:
it does end up needing the total object list in memory for a full clone,
and with something like the kernel, that's about 250 _thousand_ objects).
We should probably also make send-pack.c use the nice revision library,
because right now it's doing that pipe to git-rev-list for no good reason.
Linus
* Re: git-daemon memory usage, disconnection.
2006-04-19 14:59 ` Linus Torvalds
@ 2006-04-19 15:27 ` David Woodhouse
2006-04-19 15:49 ` Linus Torvalds
0 siblings, 1 reply; 4+ messages in thread
From: David Woodhouse @ 2006-04-19 15:27 UTC (permalink / raw)
To: Linus Torvalds; +Cc: git
On Wed, 2006-04-19 at 07:59 -0700, Linus Torvalds wrote:
> Well, you've probably got two issues:
>
> - it looks like you aren't packing your archives (which explains why the
> disk accesses are horrid, which in turn explains the "D" part).
Hm, good point. They're fairly new trees -- I had foolishly assumed that
they would at least start off packed. That isn't the case though --
perhaps it should be? Did the original clone receive a pack on the wire
and then _split_ it?
If the tools would automatically pack when the number of unpacked
objects reaches a threshold, that would be useful.
Since this repo is only available through git:// and git+ssh:// URLs, I
can safely use git-repack's '-a -d' options, right?
I'll do 'git-repack -l' nightly and 'git-repack -a -d -l' weekly -- does
that seem sane?
> For a git server, you _really_ want all trees to be mostly packed, or
> you want absolutely tons of memory (and 256MB is definitely not "tons"
> as far as git is concerned).
>
> Well, the way things work under UNIX, you normally don't notice that the
> other end isn't interested until you try to write, and you get a "nobody
> is listening". And sadly, the packing stuff does most (not all) of the
> heavy lifting before it can even start to write things out.
Well, it does that with SIGALRM happening periodically, theoretically
for the purpose of providing progress output. Perhaps we could do a
getpeername() or something else to check on the output fd each time?
--
dwmw2
* Re: git-daemon memory usage, disconnection.
2006-04-19 15:27 ` David Woodhouse
@ 2006-04-19 15:49 ` Linus Torvalds
0 siblings, 0 replies; 4+ messages in thread
From: Linus Torvalds @ 2006-04-19 15:49 UTC (permalink / raw)
To: David Woodhouse; +Cc: git
On Wed, 19 Apr 2006, David Woodhouse wrote:
> On Wed, 2006-04-19 at 07:59 -0700, Linus Torvalds wrote:
> > Well, you've probably got two issues:
> >
> > - it looks like you aren't packing your archives (which explains why the
> > disk accesses are horrid, which in turn explains the "D" part).
>
> Hm, good point. They're fairly new trees -- I had foolishly assumed that
> they would at least start off packed. That isn't the case though --
> perhaps it should be? Did the original clone receive a pack on the wire
> and then _split_ it?
For old versions of git, yes.
> If the tools would automatically pack when the number of unpacked
> objects reaches a threshold, that would be useful.
Well, packing is still best done in the background: you don't generally
want the tools to just stop for a minute to repack while you're doing
something. You'd normally want to do a cron run at 4AM or something, see
if there is lots to pack, and repack that.
The one exception is probably a large conversion process (from CVS, SVN,
whatever). The conversion process itself probably takes ages, and it will
be even slower if it were to keep the potentially huge result unpacked all
the time.
But for normal ops, you really don't want to repack synchronously.
> Since this repo is only available through git:// and git+ssh:// URLs, I
> can safely use git-repack's '-a -d' options, right?
Yes.
> I'll do 'git-repack -l' nightly and 'git-repack -a -d -l' weekly -- does
> that seem sane?
Absolutely. The one exception might be trees that really don't change very
much (which is quite common), so you might make it conditional on seeing
if there are _any_ objects at all in .git/objects/00/, for example. Not
that repack will be very expensive, but still..
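The conditional check suggested above could look roughly like this Python sketch. It is only an illustration under assumptions: the helper name `has_loose_objects` is hypothetical, and it treats any regular file under `objects/00/` as a sign that loose objects exist, relying on object names being spread evenly over the 256 fan-out directories.

```python
import os


def has_loose_objects(git_dir: str, fanout: str = "00") -> bool:
    """Return True if the sampled fan-out directory holds any loose object."""
    sample_dir = os.path.join(git_dir, "objects", fanout)
    try:
        with os.scandir(sample_dir) as entries:
            return any(entry.is_file() for entry in entries)
    except FileNotFoundError:
        # The directory does not exist, so nothing unpacked has landed in it.
        return False


if __name__ == "__main__":
    # Example: decide whether a nightly job should bother repacking.
    print(has_loose_objects(".git"))
```

A cron job could call this first and run the repack only when it returns True. For a bare repository, pass the repository path itself as `git_dir`.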
> Well, it does that with SIGALRM happening periodically, theoretically
> for the purpose of providing progress output. Perhaps we could do a
> getpeername() or something else to check on the output fd each time?
Yes, that's possibly a good idea. Of course, for git-rev-list, it's just a
pipe, and it's hard to do that check at least portably. On Linux, doing a
"poll()" on a pipe for writing, with newer kernels you'll get a POLLERR if
the other side has hung up, but that's by no means portable.
(On some other systems, doing a zero-sized write() _might_ do it, but at
least Linux will happily say "ok, wrote 0 bytes" even if the other end
isn't listening).
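As a rough illustration of the Linux behaviour described above, the following Python sketch polls the write end of a pipe without writing anything. The helper name `reader_hung_up` is invented for this example, and the result is Linux-specific: other systems may report nothing useful here.

```python
import os
import select


def reader_hung_up(write_fd: int) -> bool:
    """Return True if poll() reports POLLERR on the write end of a pipe."""
    poller = select.poll()
    poller.register(write_fd, select.POLLOUT)
    # Timeout 0: just sample the current state, never block.
    for _fd, events in poller.poll(0):
        if events & select.POLLERR:
            return True
    return False


if __name__ == "__main__":
    read_fd, write_fd = os.pipe()
    print(reader_hung_up(write_fd))  # reader still present
    os.close(read_fd)
    print(reader_hung_up(write_fd))  # reader gone
    os.close(write_fd)
```

A check like this could be run from a periodic timer during the long computation, so that the process can stop early once the other side has gone.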
And git-rev-list isn't doing the SIGALRM anyway.
In other words, to do this, we'd have to change send-pack to use the
revision library. Which, as mentioned, is worth-while anyway, but it's not
totally trivial.
Linus