* git-daemon memory usage, disconnection.
From: David Woodhouse @ 2006-04-19 13:22 UTC
To: git

I'm running git-daemon from xinetd and it seems a little greedy...

Cpu(s):  2.7% us,  6.4% sy,  0.0% ni,  1.7% id, 87.7% wa,  1.4% hi,  0.0% si
Mem:    253680k total,   250076k used,     3604k free,      568k buffers
Swap:   500960k total,   500864k used,       96k free,    24696k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
31232 nobody    18   0  155m  29m 7224 D  1.3 11.9   0:25.56 git-rev-list
30743 nobody    18   0  179m  29m 9480 D  0.7 11.9   0:42.60 git-rev-list
31277 nobody    18   0  147m  28m 7476 D  2.6 11.4   0:20.90 git-rev-list
30314 nobody    18   0  233m  26m 7696 D  0.0 10.6   1:20.24 git-rev-list
30612 nobody    18   0  204m  23m 7432 D  1.3  9.4   0:59.19 git-rev-list
30574 nobody    18   0  190m  20m 7608 D  0.3  8.3   0:50.77 git-rev-list
30208 nobody    18   0  140m  14m 7632 D  0.3  5.9   0:15.23 git-pack-object

Now, this wouldn't be _so_ bad if there were only two of them running.
The clients for the other four have actually given up and disconnected
long ago, but git-daemon doesn't seem to have reacted to that.

-- 
dwmw2

* Re: git-daemon memory usage, disconnection.
From: Linus Torvalds @ 2006-04-19 14:59 UTC
To: David Woodhouse; +Cc: git

On Wed, 19 Apr 2006, David Woodhouse wrote:
>
> I'm running git-daemon from xinetd and it seems a little greedy...
>
> Cpu(s):  2.7% us,  6.4% sy,  0.0% ni,  1.7% id, 87.7% wa,  1.4% hi,  0.0% si
> Mem:    253680k total,   250076k used,     3604k free,      568k buffers
> Swap:   500960k total,   500864k used,       96k free,    24696k cached
>
>   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
> 31232 nobody    18   0  155m  29m 7224 D  1.3 11.9   0:25.56 git-rev-list
> 30743 nobody    18   0  179m  29m 9480 D  0.7 11.9   0:42.60 git-rev-list
> 31277 nobody    18   0  147m  28m 7476 D  2.6 11.4   0:20.90 git-rev-list
> 30314 nobody    18   0  233m  26m 7696 D  0.0 10.6   1:20.24 git-rev-list
> 30612 nobody    18   0  204m  23m 7432 D  1.3  9.4   0:59.19 git-rev-list
> 30574 nobody    18   0  190m  20m 7608 D  0.3  8.3   0:50.77 git-rev-list
> 30208 nobody    18   0  140m  14m 7632 D  0.3  5.9   0:15.23 git-pack-object

Well, you've probably got two issues:

 - it looks like you aren't packing your archives (which explains why the
   disk accesses are horrid, which in turn explains the "D" part).

   For a git server, you _really_ want all trees to be mostly packed, or
   you want absolutely tons of memory (and 256MB is definitely not "tons"
   as far as git is concerned).

 - git-rev-list won't notice that there is nobody listening until it gets
   an EPIPE, and it won't get an EPIPE until it actually outputs something,
   and it won't output anything until it is largely done traversing the
   tree..

> Now, this wouldn't be _so_ bad if there were only two of them running.
> The clients for the other four have actually given up and disconnected
> long ago, but git-daemon doesn't seem to have reacted to that.

Well, the way things work under UNIX, you normally don't notice that the
other end isn't interested until you try to write, and you get a "nobody
is listening". And sadly, the packing stuff does most (not all) of the
heavy lifting before it can even start to write things out.

That said, I should probably take a look at git-rev-list --objects memory
usage once again. It's never been exactly "lean" (and it can't really be:
it does end up needing the total object list in memory for a full clone,
and with something like the kernel, that's about 250 _thousand_ objects).

We should probably also make send-pack.c use the nice revision library,
because right now it's doing that pipe to git-rev-list for no good reason.

		Linus

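A minimal sketch of the "you don't notice until you write" behaviour
described above, in Python rather than git's own C. The function name is
hypothetical, and the sketch assumes the interpreter's default of ignoring
SIGPIPE, so that the failed write surfaces as an EPIPE error instead of
killing the process:

```python
import errno
import os


def write_hits_epipe(payload: bytes = b"x") -> bool:
    """Return True if writing to a pipe whose reader has gone fails with EPIPE."""
    read_fd, write_fd = os.pipe()
    os.close(read_fd)  # the "client" disconnects
    try:
        # Nothing has told the writer about the disconnect so far; it only
        # finds out when it finally tries to produce some output.
        os.write(write_fd, payload)
    except OSError as exc:
        return exc.errno == errno.EPIPE
    finally:
        os.close(write_fd)
    return False
```

A process that spends a long time computing before its first write, as
git-rev-list does here, sits on the far side of that os.write() call for
the whole traversal.
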
* Re: git-daemon memory usage, disconnection.
From: David Woodhouse @ 2006-04-19 15:27 UTC
To: Linus Torvalds; +Cc: git

On Wed, 2006-04-19 at 07:59 -0700, Linus Torvalds wrote:
> Well, you've probably got two issues:
>
>  - it looks like you aren't packing your archives (which explains why the
>    disk accesses are horrid, which in turn explains the "D" part).

Hm, good point. They're fairly new trees -- I had foolishly assumed that
they would at least start off packed. That isn't the case though --
perhaps it should be? Did the original clone receive a pack on the wire
and then _split_ it?

If the tools would automatically pack when the number of unpacked
objects reaches a threshold, that would be useful.

Since this repo is only available through git:// and git+ssh:// URLs, I
can safely use git-repack's '-a -d' options, right?

I'll do 'git-repack -l' nightly and 'git-repack -a -d -l' weekly -- does
that seem sane?

>    For a git server, you _really_ want all trees to be mostly packed, or
>    you want absolutely tons of memory (and 256MB is definitely not "tons"
>    as far as git is concerned).
>
> Well, the way things work under UNIX, you normally don't notice that the
> other end isn't interested until you try to write, and you get a "nobody
> is listening". And sadly, the packing stuff does most (not all) of the
> heavy lifting before it can even start to write things out.

Well, it does that with SIGALRM happening periodically, theoretically
for the purpose of providing progress output. Perhaps we could do a
getpeername() or something else to check on the output fd each time?

-- 
dwmw2

* Re: git-daemon memory usage, disconnection.
From: Linus Torvalds @ 2006-04-19 15:49 UTC
To: David Woodhouse; +Cc: git

On Wed, 19 Apr 2006, David Woodhouse wrote:
> On Wed, 2006-04-19 at 07:59 -0700, Linus Torvalds wrote:
> > Well, you've probably got two issues:
> >
> >  - it looks like you aren't packing your archives (which explains why the
> >    disk accesses are horrid, which in turn explains the "D" part).
>
> Hm, good point. They're fairly new trees -- I had foolishly assumed that
> they would at least start off packed. That isn't the case though --
> perhaps it should be? Did the original clone receive a pack on the wire
> and then _split_ it?

For old versions of git, yes.

> If the tools would automatically pack when the number of unpacked
> objects reaches a threshold, that would be useful.

Well, packing is still best done in the background: you don't generally
want the tools to just stop for a minute to repack while you're doing
something. You'd normally want to do a cron run at 4AM or something, see
if there is lots to pack, and repack that.

The one exception is probably a large conversion process (from CVS, SVN,
whatever). The conversion process itself probably takes ages, and it will
be even slower if it were to keep the potentially huge result unpacked all
the time.

But for normal ops, you really don't want to repack synchronously.

> Since this repo is only available through git:// and git+ssh:// URLs, I
> can safely use git-repack's '-a -d' options, right?

Yes.

> I'll do 'git-repack -l' nightly and 'git-repack -a -d -l' weekly -- does
> that seem sane?

Absolutely. The one exception might be trees that really don't change very
much (which is quite common), so you might make it conditional on seeing
if there are _any_ objects at all in .git/objects/00/, for example. Not
that repack will be very expensive, but still..

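The conditional check suggested above could be sketched roughly as
follows. This is only an illustration in Python: the function names and
the threshold parameter are hypothetical, and the sketch assumes the usual
layout in which loose objects sit under objects/<two hex digits>/ inside
the repository directory, with "00" used as a sample bucket.

```python
import os


def loose_objects_in(git_dir: str, fanout: str = "00") -> int:
    """Count the loose object files in one fan-out directory of a repository."""
    path = os.path.join(git_dir, "objects", fanout)
    try:
        return len(os.listdir(path))
    except FileNotFoundError:
        # No such bucket means nothing unpacked has landed there.
        return 0


def worth_repacking(git_dir: str, threshold: int = 1) -> bool:
    """Decide whether a scheduled repack has anything to do.

    With the default threshold of 1, any loose object at all in the
    sample bucket is enough to say yes.
    """
    return loose_objects_in(git_dir) >= threshold
```

A nightly cron job could call worth_repacking() on each repository and
only then run the repack, so that trees which rarely change are skipped.
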
> Well, it does that with SIGALRM happening periodically, theoretically
> for the purpose of providing progress output. Perhaps we could do a
> getpeername() or something else to check on the output fd each time?

Yes, that's possibly a good idea. Of course, for git-rev-list, it's just a
pipe, and it's hard to do that check at least portably. On Linux, if you
do a "poll()" on a pipe for writing, newer kernels will give you a POLLERR
if the other side has hung up, but that's by no means portable.

(On some other systems, doing a zero-sized write() _might_ do it, but at
least Linux will happily say "ok, wrote 0 bytes" even if the other end
isn't listening).

And git-rev-list isn't doing the SIGALRM anyway.

In other words, to do this, we'd have to change send-pack to use the
revision library. Which, as mentioned, is worthwhile anyway, but it's not
totally trivial.

		Linus

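A rough sketch of the poll() check mentioned above, written in Python
purely for illustration (the function name is hypothetical, and none of
this is git code). It assumes a Linux kernel recent enough to report
POLLERR on the write end of a pipe once the reader has closed; as noted,
other systems may behave differently, and select.poll() is not available
on every platform.

```python
import os
import select


def reader_has_hung_up(write_fd: int) -> bool:
    """Check, without writing anything, whether the pipe's reader is gone.

    Relies on Linux flagging POLLERR on the write end of a pipe that has
    no readers left; this is not portable behaviour.
    """
    poller = select.poll()
    poller.register(write_fd, select.POLLOUT)
    # A timeout of 0 makes this a non-blocking probe.
    events = poller.poll(0)
    return any(ev & (select.POLLERR | select.POLLHUP) for _, ev in events)
```

A long-running producer could call a check like this from its periodic
progress handler and bail out early instead of finishing work that nobody
will read.
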