* Re: [PATCH] fetch-pack: avoid fixing thin packs when unnecessary [not found] ` <Pine.LNX.4.64.0612181638220.18171@xanadu.home> @ 2006-12-18 21:55 ` Johannes Schindelin 2006-12-18 22:17 ` Nicolas Pitre 0 siblings, 1 reply; 52+ messages in thread From: Johannes Schindelin @ 2006-12-18 21:55 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Randal L. Schwartz, git Hi, On Mon, 18 Dec 2006, Nicolas Pitre wrote: > On Mon, 18 Dec 2006, Johannes Schindelin wrote: > > > > > When we know that there are no common commits, the pack must be > > closed (i.e. non-thin) already. Avoid "fixing" it in that case. > > > > Signed-off-by: Johannes Schindelin <Johannes.Schindelin@gmx.de> > > > > --- > > > > On Mon, 18 Dec 2006, Johannes Schindelin wrote: > > > > > On Mon, 18 Dec 2006, Randal L. Schwartz wrote: > > > > > > > But then it took nearly an *hour* at the next phase: > > > > > > > > Resolving 313037 deltas. > > > > 100% (313037/313037) done > > > > > > Ouch. > > > > > > We try to avoid unpacking the thin packs received by git-fetch. > > > This means completing that pack (since it can contain deltas > > > against objects which are part of another pack). > > > > > > However, for the clone this is utter overkill. We really should > > > try to avoid resolving unnecessarily. This is really for the > > > clone case, since we do not have _any_ objects in the local > > > repository. > > > > > > It happens that the other case -- fetching an independent branch > > > -- is easy enough: we already have the check for it in > > > fetch-pack.c:586. > > > > ... and here is a lightly tested fix. > > NAK. > > This fixes nothing. See previous message. You're completely right. My patch does what I say, but it does not fix the problem. Ciao, Dscho ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] fetch-pack: avoid fixing thin packs when unnecessary 2006-12-18 21:55 ` [PATCH] fetch-pack: avoid fixing thin packs when unnecessary Johannes Schindelin @ 2006-12-18 22:17 ` Nicolas Pitre 0 siblings, 0 replies; 52+ messages in thread From: Nicolas Pitre @ 2006-12-18 22:17 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Randal L. Schwartz, git On Mon, 18 Dec 2006, Johannes Schindelin wrote: > Hi, > > On Mon, 18 Dec 2006, Nicolas Pitre wrote: > > > On Mon, 18 Dec 2006, Johannes Schindelin wrote: > > > > > ... and here is a lightly tested fix. > > > > NAK. > > > > This fixes nothing. See previous message. > > You're completely right. My patch does what I say, but it does not fix the > problem. Your patch is also unnecessary. For use_thin_pack to be true, you must provide --thin to git-fetch-pack which is not the case when cloning. ^ permalink raw reply [flat|nested] 52+ messages in thread
[parent not found: <Pine.LNX.4.64.0612181251020.3479@woody.osdl.org>]
[parent not found: <86r6uw9azn.fsf@blue.stonehenge.com>]
[parent not found: <Pine.LNX.4.64.0612181625140.18171@xanadu.home>]
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" [not found] ` <Pine.LNX.4.64.0612181625140.18171@xanadu.home> @ 2006-12-18 22:01 ` Randal L. Schwartz 2006-12-18 22:09 ` Nicolas Pitre 2006-12-18 22:22 ` Linus Torvalds 0 siblings, 2 replies; 52+ messages in thread From: Randal L. Schwartz @ 2006-12-18 22:01 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Linus Torvalds, git >>>>> "Nicolas" == Nicolas Pitre <nico@cam.org> writes: Nicolas> Could you try the following please: Nicolas> time git-index-pack -v -o /dev/null .git/objects/pack/*.pack Nicolas> and provide us with the time it took (or an estimate if it is going to Nicolas> be on hour long)? Nicolas> Performing the above on my kernel repository (after it was repacked into Nicolas> a single ~150MB pack) takes only 37 seconds. I have a single pack. "Indexing" took about 30 seconds. "Resolving 313037 deltas" looks like it's going to take an hour. So that *was* a local delay. -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-18 22:01 ` cloning the kernel - why long time in "Resolving 313037 deltas" Randal L. Schwartz @ 2006-12-18 22:09 ` Nicolas Pitre 2006-12-18 22:21 ` Randal L. Schwartz 2006-12-18 22:22 ` Linus Torvalds 1 sibling, 1 reply; 52+ messages in thread From: Nicolas Pitre @ 2006-12-18 22:09 UTC (permalink / raw) To: Randal L. Schwartz; +Cc: Linus Torvalds, git On Mon, 18 Dec 2006, Randal L. Schwartz wrote: > >>>>> "Nicolas" == Nicolas Pitre <nico@cam.org> writes: > > Nicolas> Could you try the following please: > > Nicolas> time git-index-pack -v -o /dev/null .git/objects/pack/*.pack > > Nicolas> and provide us with the time it took (or an estimate if it is going to > Nicolas> be on hour long)? > > Nicolas> Performing the above on my kernel repository (after it was repacked into > Nicolas> a single ~150MB pack) takes only 37 seconds. > > I have a single pack. > > "Indexing" took about 30 seconds. > "Resolving 313037 deltas" looks like it's going to take an hour. > > So that *was* a local delay. What CPU and amount of ram do you have? Are you on Windows? ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-18 22:09 ` Nicolas Pitre @ 2006-12-18 22:21 ` Randal L. Schwartz 2006-12-18 22:50 ` Nicolas Pitre 0 siblings, 1 reply; 52+ messages in thread From: Randal L. Schwartz @ 2006-12-18 22:21 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Linus Torvalds, git >>>>> "Nicolas" == Nicolas Pitre <nico@cam.org> writes: >> "Indexing" took about 30 seconds. >> "Resolving 313037 deltas" looks like it's going to take an hour. >> >> So that *was* a local delay. Nicolas> What CPU and amount of ram do you have? 2.2 Ghz Intel Core 2 Duo (Macbook Pro high end) I can compile and install GNU Emacs from source in 11 minutes. :) Nicolas> Are you on Windows? Gawd no! -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-18 22:21 ` Randal L. Schwartz @ 2006-12-18 22:50 ` Nicolas Pitre 0 siblings, 0 replies; 52+ messages in thread From: Nicolas Pitre @ 2006-12-18 22:50 UTC (permalink / raw) To: Randal L. Schwartz; +Cc: Linus Torvalds, git On Mon, 18 Dec 2006, Randal L. Schwartz wrote: > Nicolas> What CPU and amount of ram do you have? > > 2.2 Ghz Intel Core 2 Duo (Macbook Pro high end) > I can compile and install GNU Emacs from source in 11 minutes. :) So it shouldn't be a lack of resource. > Nicolas> Are you on Windows? > > Gawd no! ;-) I asked because it could have had something with the mmap() usage recently reported to be dreadfully slow on Windows ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-18 22:01 ` cloning the kernel - why long time in "Resolving 313037 deltas" Randal L. Schwartz 2006-12-18 22:09 ` Nicolas Pitre @ 2006-12-18 22:22 ` Linus Torvalds 2006-12-18 22:26 ` Randal L. Schwartz 1 sibling, 1 reply; 52+ messages in thread From: Linus Torvalds @ 2006-12-18 22:22 UTC (permalink / raw) To: Randal L. Schwartz; +Cc: Nicolas Pitre, git On Mon, 18 Dec 2006, Randal L. Schwartz wrote: > > I have a single pack. > > "Indexing" took about 30 seconds. > "Resolving 313037 deltas" looks like it's going to take an hour. > > So that *was* a local delay. Ok, interesting. Two questions: - what does "top" say (is it CPU-bound? Is it perhaps blowing out your disk cache? Is it swapping?) - do you have "oprofile" (or even just pgprof) to see where the *hell* that time is spent, if it's actually CPU? For me, it takes 25 seconds. Not an hour: [torvalds@woody linux]$ time git-index-pack -v -o /dev/null .git/objects/pack/*.pack Indexing 393507 objects. 100% (393507/393507) done Resolving 316071 deltas. 100% (316071/316071) done fatal: unable to create /dev/null: File exists real 0m24.619s user 0m22.569s sys 0m1.316s (I admit that 25 seconds is already "too much", but it does actually end up exploding a lot of objects and doing quite a bit of work, so I guess it's fair). And the process grew to 33MB in RSS at it's biggest, so it wasn't even using all that much memory (33MB isn't _tiny_, but considering that the pack in question is 155MB and has hundreds of thousands of objects, 33MB isn't really all that bad. You're running this under OS X, aren't you? It's a pig of an OS, but "almost one hour" vs "25 seconds" is still unreasonable. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-18 22:22 ` Linus Torvalds @ 2006-12-18 22:26 ` Randal L. Schwartz 2006-12-18 23:02 ` Martin Langhoff 2006-12-18 23:28 ` Linus Torvalds 0 siblings, 2 replies; 52+ messages in thread From: Randal L. Schwartz @ 2006-12-18 22:26 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, git >>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes: >> So that *was* a local delay. Linus> Ok, interesting. Two questions: Linus> - what does "top" say (is it CPU-bound? Is it perhaps blowing out your Linus> disk cache? Is it swapping?) Not swapping, but CPU bound. Linus> - do you have "oprofile" (or even just pgprof) to see where the *hell* Linus> that time is spent, if it's actually CPU? I'm a "bear of very little brane" regarding code development on OSX. I can ask around to see if there's someway to profile this. Linus> You're running this under OS X, aren't you? It's a pig of an OS, but Linus> "almost one hour" vs "25 seconds" is still unreasonable. I agree! -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-18 22:26 ` Randal L. Schwartz @ 2006-12-18 23:02 ` Martin Langhoff 2006-12-22 1:44 ` Kyle Moffett 2007-01-03 13:55 ` Andreas Ericsson 2006-12-18 23:28 ` Linus Torvalds 1 sibling, 2 replies; 52+ messages in thread From: Martin Langhoff @ 2006-12-18 23:02 UTC (permalink / raw) To: Randal L. Schwartz; +Cc: Linus Torvalds, Nicolas Pitre, git On 18 Dec 2006 14:26:36 -0800, Randal L. Schwartz <merlyn@stonehenge.com> wrote: > Linus> You're running this under OS X, aren't you? It's a pig of an OS, but > Linus> "almost one hour" vs "25 seconds" is still unreasonable. > > I agree! Me too -- but entirely possible. Disk IO is specially painful on OSX. Stat calls are horrid. Using Arch (which abused stat calls to no end) many ops would take 50x-100x longer on OSX than on Linux. A large unpacked repo with git is a real pain -- and packing it can take hours. Once you are packed it's sweet, but large repos are a pain to deal with. You won't impress anyone with performance over a linux kernel repo -- starting up gitk can take a long time. Stat-heavy stuff like git-diff is noticeably slower under OSX. Have you got a linux partition you can boot into to get comparative timings? [This is part of the reason I am migrating my OSX machine to Linux fulltime, now that it seems that mergedfb+randr will let me switch to dual monitors "hot".] cheers, ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-18 23:02 ` Martin Langhoff @ 2006-12-22 1:44 ` Kyle Moffett 2006-12-22 1:56 ` Shawn Pearce 2006-12-22 8:04 ` Marco Roeland 2007-01-03 13:55 ` Andreas Ericsson 1 sibling, 2 replies; 52+ messages in thread From: Kyle Moffett @ 2006-12-22 1:44 UTC (permalink / raw) To: Martin Langhoff; +Cc: Randal L. Schwartz, Linus Torvalds, Nicolas Pitre, git On Dec 18, 2006, at 18:02:07, Martin Langhoff wrote: > On 18 Dec 2006 14:26:36 -0800, Randal L. Schwartz > <merlyn@stonehenge.com> wrote: >> Linus Torvalds wrote: >>> You're running this under OS X, aren't you? It's a pig of an OS, >>> but "almost one hour" vs "25 seconds" is still unreasonable. >> >> I agree! > > Me too -- but entirely possible. Disk IO is specially painful on > OSX. Stat calls are horrid. Using Arch (which abused stat calls to > no end) many ops would take 50x-100x longer on OSX than on Linux. A > large unpacked repo with git is a real pain -- and packing it can > take hours. I've actually also seen filesystem operation latency double or triple if you start trying to do operations from multiple threads at once. Suddenly the already dog-slow single-CPU operations start bouncing caches and the Mac OS X mostly-whole-of-BSD-BKL across CPUs and it just crawls. I can definitely see the local disk IO taking 100x longer than the network I/O, especially with an 8-megabit internet link. > Once you are packed it's sweet, but large repos are a pain to deal > with. You won't impress anyone with performance over a linux kernel > repo -- starting up gitk can take a long time. Stat-heavy stuff > like git-diff is noticeably slower under OSX. Just as an example, it takes my OS-X-running Quad-2.5GHz G5 ten times as long to do a "grep -rl foo linux/" as my Linux-running dual-1GHz G4 with 400MHz system bus. This is disk-cache-hot too. And that's not even a stat-heavy workload. There's more than one reason I'm trying to make a Mac OS X ABI emulation layer on top of Linux :-D. Cheers, Kyle Moffett ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-22 1:44 ` Kyle Moffett @ 2006-12-22 1:56 ` Shawn Pearce 2006-12-22 8:04 ` Marco Roeland 1 sibling, 0 replies; 52+ messages in thread From: Shawn Pearce @ 2006-12-22 1:56 UTC (permalink / raw) To: Kyle Moffett Cc: Martin Langhoff, Randal L. Schwartz, Linus Torvalds, Nicolas Pitre, git Kyle Moffett <mrmacman_g4@mac.com> wrote: > Just as an example, it takes my OS-X-running Quad-2.5GHz G5 ten times > as long to do a "grep -rl foo linux/" as my Linux-running dual-1GHz > G4 with 400MHz system bus. This is disk-cache-hot too. And that's > not even a stat-heavy workload. There's more than one reason I'm > trying to make a Mac OS X ABI emulation layer on top of Linux :-D. Try 'git grep foo' instead; its very fast if you are working with a fully packed Git repository. Doesn't help any other application on Mac OS X however. :-( -- Shawn. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-22 1:44 ` Kyle Moffett 2006-12-22 1:56 ` Shawn Pearce @ 2006-12-22 8:04 ` Marco Roeland 1 sibling, 0 replies; 52+ messages in thread From: Marco Roeland @ 2006-12-22 8:04 UTC (permalink / raw) To: Kyle Moffett Cc: Martin Langhoff, Randal L. Schwartz, Linus Torvalds, Nicolas Pitre, git On Thursday December 21st 2006 at 20:44 Kyle Moffett wrote: > I've actually also seen filesystem operation latency double or triple > if you start trying to do operations from multiple threads at once. > Suddenly the already dog-slow single-CPU operations start bouncing > caches and the Mac OS X mostly-whole-of-BSD-BKL across CPUs and it > just crawls. I can definitely see the local disk IO taking 100x > longer than the network I/O, especially with an 8-megabit internet link. The mmap() implementation on Mac OS X ate 85% percent of system time on the old version for git-index-pack. Of that time nearly all was spent on some Mach locking function. So yes the BKL like locking inside the Mach message passing seems to be the big culprit, perhaps Andy Tanenbaum can explain this on LCA 2007. <ducks> As most developers now run multicore CPU's we notice these differences even better now! ;-) -- Marco Roeland ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-18 23:02 ` Martin Langhoff 2006-12-22 1:44 ` Kyle Moffett @ 2007-01-03 13:55 ` Andreas Ericsson 1 sibling, 0 replies; 52+ messages in thread From: Andreas Ericsson @ 2007-01-03 13:55 UTC (permalink / raw) To: Martin Langhoff; +Cc: Randal L. Schwartz, Linus Torvalds, Nicolas Pitre, git Martin Langhoff wrote: > > Once you are packed it's sweet, but large repos are a pain to deal > with. You won't impress anyone with performance over a linux kernel > repo -- starting up gitk can take a long time. I'd suggest running qgit instead, and make sure you've run it once on the repo you're demoing. qgit has a nice little cache where it tucks away most of the data it uses, which makes it draw things a lot faster. -- Andreas Ericsson andreas.ericsson@op5.se OP5 AB www.op5.se Tel: +46 8-230225 Fax: +46 8-230231 ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-18 22:26 ` Randal L. Schwartz 2006-12-18 23:02 ` Martin Langhoff @ 2006-12-18 23:28 ` Linus Torvalds 2006-12-19 0:13 ` Nicolas Pitre 1 sibling, 1 reply; 52+ messages in thread From: Linus Torvalds @ 2006-12-18 23:28 UTC (permalink / raw) To: Randal L. Schwartz; +Cc: Nicolas Pitre, git On Mon, 18 Dec 2006, Randal L. Schwartz wrote: > > Linus> You're running this under OS X, aren't you? It's a pig of an OS, but > Linus> "almost one hour" vs "25 seconds" is still unreasonable. > > I agree! The only thing that comes to mind is - some actual difference in how the pack looks. Although I don't really see that. I re-pack both my own private kernel repos and the kernel.org one fairly often, and usually with a pretty recent git. I don't think any of the packing heuristics have changed in a long time. And I just checked: the kernel.org pack-file (the big one) unpacks quickly too. - SHA1 library differences. Maybe OpenSSL on OS X sucks? I still don't see the difference being _that_ big, but whatever.. - OS X "mmap()" performance really really sucks. We do end up doing a _log_ of small mmaps in the "Resolving .." stage. On Linux, the strace looks like write(2, "Resolving 316071 deltas.\n", 25) = 25 mmap(NULL, 2619, PROT_READ, MAP_PRIVATE, 3, 0x4c000) = 0x2aaf13be2000 munmap(0x2aaf13be2000, 2619) = 0 mmap(NULL, 2848, PROT_READ, MAP_PRIVATE, 3, 0x4c000) = 0x2aaf13be2000 munmap(0x2aaf13be2000, 2848) = 0 mmap(NULL, 2988, PROT_READ, MAP_PRIVATE, 3, 0x4c000) = 0x2aaf13be2000 munmap(0x2aaf13be2000, 2988) = 0 write(2, " 0% (2/316071) done\r", 22) = 22 mmap(NULL, 3336, PROT_READ, MAP_PRIVATE, 3, 0x4c000) = 0x2aaf13be2000 munmap(0x2aaf13be2000, 3336) = 0 mmap(NULL, 3471, PROT_READ, MAP_PRIVATE, 3, 0x4c000) = 0x2aaf13be2000 munmap(0x2aaf13be2000, 3471) = 0 mmap(NULL, 3611, PROT_READ, MAP_PRIVATE, 3, 0x4c000) = 0x2aaf13be2000 munmap(0x2aaf13be2000, 3611) = 0 ... and it's entirely possible that mmap() really does suck *ss on OS X. Those small mmap's go on for a _loong_ time. I've heard of other operations being slow on OS X - and two orders of magnitude really isn't unthinkable. I don't think people always seem to really understand how _good_ Linux is, and how much faster it can be. It's not just "Windows XP" sucks. Quite often it's literally "Linux is just damn fast". Sadly, that causes problems when the main developers don't even see any issues, just because the Linux kernel environment makes things look really really cheap. Even when it isn't always cheap on other platforms. Nico - have you looked at perhaps making the index-pack.c "mmap()" usage do chunking? Or just mmap the whole damn thing once? Linux is fast, but even Linux will be faster if you just mmap it once ;) ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-18 23:28 ` Linus Torvalds @ 2006-12-19 0:13 ` Nicolas Pitre 2006-12-19 5:11 ` Theodore Tso 0 siblings, 1 reply; 52+ messages in thread From: Nicolas Pitre @ 2006-12-19 0:13 UTC (permalink / raw) To: Linus Torvalds; +Cc: Randal L. Schwartz, git On Mon, 18 Dec 2006, Linus Torvalds wrote: > I've heard of other operations being slow on OS X - and two orders of > magnitude really isn't unthinkable. I don't think people always seem to > really understand how _good_ Linux is, and how much faster it can be. It's > not just "Windows XP" sucks. Quite often it's literally "Linux is just > damn fast". > > Sadly, that causes problems when the main developers don't even see any > issues, just because the Linux kernel environment makes things look really > really cheap. Even when it isn't always cheap on other platforms. > > Nico - have you looked at perhaps making the index-pack.c "mmap()" usage > do chunking? Or just mmap the whole damn thing once? Linux is fast, but > even Linux will be faster if you just mmap it once ;) Maybe. However the mmap() may occur on section of the pack file which has just been written to in order to write even more, always to the same file. On Linux this is fast because the mmap'd data is likely to still be in the cache. I guess this could be turned into a malloc()/read()/free() with no trouble. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 0:13 ` Nicolas Pitre @ 2006-12-19 5:11 ` Theodore Tso 2006-12-19 6:39 ` Shawn Pearce 2006-12-19 6:47 ` Linus Torvalds 0 siblings, 2 replies; 52+ messages in thread From: Theodore Tso @ 2006-12-19 5:11 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Linus Torvalds, Randal L. Schwartz, git On Mon, Dec 18, 2006 at 07:13:40PM -0500, Nicolas Pitre wrote: > Maybe. However the mmap() may occur on section of the pack file which > has just been written to in order to write even more, always to the same > file. On Linux this is fast because the mmap'd data is likely to still > be in the cache. > > I guess this could be turned into a malloc()/read()/free() with no > trouble. Actually, depending on the size of the chunk, even on Linux malloc/read/free can be faster than the mmap/munmap, because mmap/munmap calls involve page table manipulations, and even on Linux that is often slower or dead even with the memory copy involved with using malloc/read. Even when reading huge chunks of Canon Raw File data at a time, I found (experimentally) that it was no faster to use mmap() compared to read(). And for small chunks of data, malloc/read will definitely win out over mmap(), since the page table operations and resulting page faults completely trump the cost of copying the bytes from the page cache to the read() buffer. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 5:11 ` Theodore Tso @ 2006-12-19 6:39 ` Shawn Pearce 2006-12-19 6:51 ` Linus Torvalds 2006-12-19 16:19 ` Theodore Tso 2006-12-19 6:47 ` Linus Torvalds 1 sibling, 2 replies; 52+ messages in thread From: Shawn Pearce @ 2006-12-19 6:39 UTC (permalink / raw) To: Theodore Tso; +Cc: Nicolas Pitre, Linus Torvalds, Randal L. Schwartz, git Theodore Tso <tytso@mit.edu> wrote: > On Mon, Dec 18, 2006 at 07:13:40PM -0500, Nicolas Pitre wrote: > > Maybe. However the mmap() may occur on section of the pack file which > > has just been written to in order to write even more, always to the same > > file. On Linux this is fast because the mmap'd data is likely to still > > be in the cache. > > > > I guess this could be turned into a malloc()/read()/free() with no > > trouble. > > Actually, depending on the size of the chunk, even on Linux > malloc/read/free can be faster than the mmap/munmap, because > mmap/munmap calls involve page table manipulations, and even on Linux > that is often slower or dead even with the memory copy involved with > using malloc/read. Even when reading huge chunks of Canon Raw File > data at a time, I found (experimentally) that it was no faster to use > mmap() compared to read(). And for small chunks of data, malloc/read > will definitely win out over mmap(), since the page table operations > and resulting page faults completely trump the cost of copying the > bytes from the page cache to the read() buffer. This is why git-fast-import mmaps 128 MiB blocks from the file at a time. The mmap region is usually much larger than the file itself; the application appends to the file via write() then goes back and rereads data when necessary via the already established mmap. Its rare for the application to need to unmap/remap a different block so there really isn't very much page table manipulation overhead. Why isn't git-index-pack doing the same? Is there some hidden glitch in some OS somewhere that has a problem with overmapping a file and appending into it via write()? I've done that on Mac OS X, Linux, BSDi, Solaris... never had a problem. -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 6:39 ` Shawn Pearce @ 2006-12-19 6:51 ` Linus Torvalds 2006-12-19 7:26 ` Shawn Pearce 2006-12-19 8:32 ` Shawn Pearce 2006-12-19 16:19 ` Theodore Tso 1 sibling, 2 replies; 52+ messages in thread From: Linus Torvalds @ 2006-12-19 6:51 UTC (permalink / raw) To: Shawn Pearce; +Cc: Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git On Tue, 19 Dec 2006, Shawn Pearce wrote: > > Why isn't git-index-pack doing the same? Is there some hidden glitch > in some OS somewhere that has a problem with overmapping a file and > appending into it via write()? I've done that on Mac OS X, Linux, > BSDi, Solaris... never had a problem. It works on modern systems, but at least old HPUX versions had non-coherent mmap() and write(), and POSIX does not guarantee it. And if you ever want to port to Windows, I don't think you should do it. Anyway, try the pread() version first, see if that fixes the OS X problem. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 6:51 ` Linus Torvalds @ 2006-12-19 7:26 ` Shawn Pearce 2006-12-19 7:52 ` Marco Roeland 2006-12-19 8:32 ` Shawn Pearce 1 sibling, 1 reply; 52+ messages in thread From: Shawn Pearce @ 2006-12-19 7:26 UTC (permalink / raw) To: Linus Torvalds; +Cc: Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git Linus Torvalds <torvalds@osdl.org> wrote: > > > On Tue, 19 Dec 2006, Shawn Pearce wrote: > > > > Why isn't git-index-pack doing the same? Is there some hidden glitch > > in some OS somewhere that has a problem with overmapping a file and > > appending into it via write()? I've done that on Mac OS X, Linux, > > BSDi, Solaris... never had a problem. > > It works on modern systems, but at least old HPUX versions had > non-coherent mmap() and write(), and POSIX does not guarantee it. And if > you ever want to port to Windows, I don't think you should do it. > > Anyway, try the pread() version first, see if that fixes the OS X problem. I'll give your pread() version a shot. But right now I'm in the middle of cloning your linux-2.6 git repository. It is done downloading the pack and my system is pegged at 100% CPU while resolving deltas. ActivityMonitor is showing that I'm spending 94% CPU in the kernel, which is just insane. Clearly Mac OS X's kernel cannot gracefully handle what git-index-pack is currently doing. -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 7:26 ` Shawn Pearce @ 2006-12-19 7:52 ` Marco Roeland 2006-12-19 7:58 ` Shawn Pearce 0 siblings, 1 reply; 52+ messages in thread From: Marco Roeland @ 2006-12-19 7:52 UTC (permalink / raw) To: Shawn Pearce Cc: Linus Torvalds, Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git On Tuesday December 19th 2006 at 02:26 Shawn Pearce wrote: > [git-index-pack limping along on Mac OS X] > > ActivityMonitor is showing that I'm spending 94% CPU in the kernel, > which is just insane. Clearly Mac OS X's kernel cannot gracefully > handle what git-index-pack is currently doing. A bit off topic but it might help people diagnose and test different strategies. The equivalent of "oprofile" on Mac OS X is a tool called "shark" from Apple itself. It's very nice actually. It's in the CHUD (Computer Hardware Understanding Developer Tools) package. More information at http://developer.apple.com/tools/performance and it is free as in beer. ;-) Also the OpenSSL version on Mac OS X is rather old and compiled as 32-bit application. OpenSSL is one of the few userspace packages that _really_ benefits tremendously from being compiled as 64-bit. It might explain a bit of the enormous performance difference in this case. But only profiling (perhaps with the help of "shark") will tell of course... The kernel git repository isn't very handy on the Mac HFS+ filesystem, due to it being case-insensitive, but I suppose it won't influence git-index-pack. -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 7:52 ` Marco Roeland @ 2006-12-19 7:58 ` Shawn Pearce 0 siblings, 0 replies; 52+ messages in thread From: Shawn Pearce @ 2006-12-19 7:58 UTC (permalink / raw) To: Marco Roeland Cc: Linus Torvalds, Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git Marco Roeland <marco.roeland@xs4all.nl> wrote: > The kernel git repository isn't very handy on the Mac HFS+ filesystem, > due to it being case-insensitive, but I suppose it won't influence > git-index-pack. Yea, I just play with it in packed format on this system. :-) -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 6:51 ` Linus Torvalds 2006-12-19 7:26 ` Shawn Pearce @ 2006-12-19 8:32 ` Shawn Pearce 2006-12-19 8:40 ` Marco Roeland 1 sibling, 1 reply; 52+ messages in thread From: Shawn Pearce @ 2006-12-19 8:32 UTC (permalink / raw) To: Linus Torvalds; +Cc: Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git Linus Torvalds <torvalds@osdl.org> wrote: > On Tue, 19 Dec 2006, Shawn Pearce wrote: > > > > Why isn't git-index-pack doing the same? Is there some hidden glitch > > in some OS somewhere that has a problem with overmapping a file and > > appending into it via write()? I've done that on Mac OS X, Linux, > > BSDi, Solaris... never had a problem. > > It works on modern systems, but at least old HPUX versions had > non-coherent mmap() and write(), and POSIX does not guarantee it. And if > you ever want to port to Windows, I don't think you should do it. > > Anyway, try the pread() version first, see if that fixes the OS X problem. It does. Without pread() (aka stock 'next') it takes me over an hour to index a pack of linux-2.6. With pread() its 1m6s to run index-pack on the same pack file. The indexes are (of course) identically produced. -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 8:32 ` Shawn Pearce @ 2006-12-19 8:40 ` Marco Roeland 2006-12-19 8:49 ` Shawn Pearce 0 siblings, 1 reply; 52+ messages in thread From: Marco Roeland @ 2006-12-19 8:40 UTC (permalink / raw) To: Shawn Pearce Cc: Linus Torvalds, Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git On Tuesday December 19th 2006 at 03:32 Shawn Pearce wrote: > > Anyway, try the pread() version first, see if that fixes the OS X problem. > > It does. Without pread() (aka stock 'next') it takes me over an > hour to index a pack of linux-2.6. With pread() its 1m6s to run > index-pack on the same pack file. The indexes are (of course) > identically produced. I see the same here. From an (estimated) time of 37 minutes down to 52 seconds with the pread() patch. Running the profiler (shark) on the old version showed that 85% of the time was spent in the Mac OS X mmap() system call. -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 8:40 ` Marco Roeland @ 2006-12-19 8:49 ` Shawn Pearce 2006-12-19 9:13 ` Marco Roeland 0 siblings, 1 reply; 52+ messages in thread From: Shawn Pearce @ 2006-12-19 8:49 UTC (permalink / raw) To: Marco Roeland Cc: Linus Torvalds, Theodore Tso, Nicolas Pitre, Johannes Schindelin, Randal L. Schwartz, git Marco Roeland <marco.roeland@xs4all.nl> wrote: > On Tuesday December 19th 2006 at 03:32 Shawn Pearce wrote: > > > > Anyway, try the pread() version first, see if that fixes the OS X problem. > > > > It does. Without pread() (aka stock 'next') it takes me over an > > hour to index a pack of linux-2.6. With pread() its 1m6s to run > > index-pack on the same pack file. The indexes are (of course) > > identically produced. > > I see the same here. From an (estimated) time of 37 minutes down to 52 > seconds with the pread() patch. Running the profiler (shark) on the old > version showed that 85% of the time was spent in the Mac OS X mmap() > system call. More testing on Linux is probably needed, but if using pread() on Linux is breakeven or slightly faster (as suggested by Johannes' LilyPond test) this 60x performance improvement on initial clone of largish projects on Mac OS X would be nice to have. Not that I personally frequently clone large projects on Mac OS X. But new users to Git might. :-) -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 8:49 ` Shawn Pearce @ 2006-12-19 9:13 ` Marco Roeland 2006-12-19 20:28 ` Alex Riesen 0 siblings, 1 reply; 52+ messages in thread From: Marco Roeland @ 2006-12-19 9:13 UTC (permalink / raw) To: Shawn Pearce Cc: Marco Roeland, Linus Torvalds, Theodore Tso, Nicolas Pitre, Johannes Schindelin, Randal L. Schwartz, git On Tuesday December 19th 2006 at 03:49 Shawn Pearce wrote: > [pread patch to speed up git-index-pack on Mac OS X] > > More testing on Linux is probably needed, but if using pread() > on Linux is breakeven or slightly faster (as suggested by Johannes' > LilyPond test) this 60x performance improvement on initial clone > of largish projects on Mac OS X would be nice to have. I see a decrease in total time (so an improvement in performance) going from 37.3 to 35.2 seconds with the pread patch on Linux x86-64. Note that both my testing on Linux and Mac OS X was done on dual core processors (Athlon 3800+ XP on Linux and Intel Core 2 Duo 2.16GHz on an iMac). Git is only single threaded and thus uses only core, but the system can use the other core. The minor page faults also decreased from 734866 to 370690. Nice. > Not that I personally frequently clone large projects on Mac OS X. > But new users to Git might. :-) And perhaps the Cygwin version might benefit too. -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 9:13 ` Marco Roeland @ 2006-12-19 20:28 ` Alex Riesen 2006-12-21 20:35 ` Juergen Ruehle 0 siblings, 1 reply; 52+ messages in thread From: Alex Riesen @ 2006-12-19 20:28 UTC (permalink / raw) To: Marco Roeland Cc: Shawn Pearce, Linus Torvalds, Theodore Tso, Nicolas Pitre, Johannes Schindelin, Randal L. Schwartz, git Marco Roeland, Tue, Dec 19, 2006 10:13:19 +0100: > > Not that I personally frequently clone large projects on Mac OS X. > > But new users to Git might. :-) > > And perhaps the Cygwin version might benefit too. Does not work there at all. Even errno is not set (0). ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 20:28 ` Alex Riesen @ 2006-12-21 20:35 ` Juergen Ruehle 0 siblings, 0 replies; 52+ messages in thread From: Juergen Ruehle @ 2006-12-21 20:35 UTC (permalink / raw) To: Alex Riesen Cc: Marco Roeland, Shawn Pearce, Linus Torvalds, Theodore Tso, Nicolas Pitre, Johannes Schindelin, Randal L. Schwartz, git Alex Riesen writes: > Marco Roeland, Tue, Dec 19, 2006 10:13:19 +0100: > > > Not that I personally frequently clone large projects on Mac OS X. > > > But new users to Git might. :-) > > > > And perhaps the Cygwin version might benefit too. > > Does not work there at all. Even errno is not set (0). Haven't seen a reply to this yet. Upgrade cygwin.dll. The fix is pretty recent (1.5.22 or something). Perhaps we now need a note in INSTALL. Light testing seems to indicate that this change removes most of the system times while leaving user time mostly unchanged (though I don't know what cygwin time really reports in these categories). In the git repository this is about 30% of the total time. For another (slightly larger but with fewer objects) pack this still saves 20%. I haven't compared it with the NO_MMAP version (too lazy), but overall it's a win for cygwin as well (though we should note that cygwin's mmap seems to have reasonable performance). ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 6:39 ` Shawn Pearce 2006-12-19 6:51 ` Linus Torvalds @ 2006-12-19 16:19 ` Theodore Tso 2006-12-19 16:57 ` Linus Torvalds 2006-12-20 1:58 ` Shawn Pearce 1 sibling, 2 replies; 52+ messages in thread From: Theodore Tso @ 2006-12-19 16:19 UTC (permalink / raw) To: Shawn Pearce; +Cc: Nicolas Pitre, Linus Torvalds, Randal L. Schwartz, git On Tue, Dec 19, 2006 at 01:39:30AM -0500, Shawn Pearce wrote: > This is why git-fast-import mmaps 128 MiB blocks from the file at > a time. The mmap region is usually much larger than the file itself; > the application appends to the file via write() then goes back > and rereads data when necessary via the already established mmap. > Its rare for the application to need to unmap/remap a different block > so there really isn't very much page table manipulation overhead. Yes, but unless you are using the (non-portable, Linux specific) MAP_POPULATE flag to mmap, each time you touch a new page, you end up taking a page fault; and so malloc/read/free might *still* be faster. I'd encourage you to make the change and benchmark it; the results may be surprising. I played with this with dcraw, the Canon Raw File converter a while back (before MAP_POPULATE was added), where I found that with a linear access pattern, if you are reading the entire file, it's stil marginally faster to use read() over mmap(), because with dcraw taking a page fault every 4k of raw file, the system time was significantly higher. So the main reason to use mamp, as Linus puts it, is if the management overhead of needing to read lots of small bits of the file makes the use of malloc/read to be a pain in the *ss, then go for it. But don't assume that you'll get better performance; in my experience, even on the hyper-performant Linus kernel, mmap() in general only barely breaks even with read(). On other systems, things are probably going to be even worse. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 16:19 ` Theodore Tso @ 2006-12-19 16:57 ` Linus Torvalds 2006-12-20 1:54 ` Shawn Pearce 2006-12-20 1:58 ` Shawn Pearce 1 sibling, 1 reply; 52+ messages in thread From: Linus Torvalds @ 2006-12-19 16:57 UTC (permalink / raw) To: Theodore Tso; +Cc: Shawn Pearce, Nicolas Pitre, Randal L. Schwartz, git On Tue, 19 Dec 2006, Theodore Tso wrote: > > So the main reason to use mamp, as Linus puts it, is if the management > overhead of needing to read lots of small bits of the file makes the > use of malloc/read to be a pain in the *ss, then go for it. An example of this in git is the regular pack-file accesses. We're MUCH better off just mmap'ing the whole pack-file (or at least big chunks of it) and not having to maintain difficult structures of "this is where I read that part of the file into memory", or read _big_ chunks when quite often we just use a few kB of it. So mmap for pack-files does make sense, but probably only when you can mmap big chunks, and are going to access much smaller (random) parts of it. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 16:57 ` Linus Torvalds @ 2006-12-20 1:54 ` Shawn Pearce 0 siblings, 0 replies; 52+ messages in thread From: Shawn Pearce @ 2006-12-20 1:54 UTC (permalink / raw) To: Linus Torvalds; +Cc: Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git Linus Torvalds <torvalds@osdl.org> wrote: > On Tue, 19 Dec 2006, Theodore Tso wrote: > > > > So the main reason to use mamp, as Linus puts it, is if the management > > overhead of needing to read lots of small bits of the file makes the > > use of malloc/read to be a pain in the *ss, then go for it. > > An example of this in git is the regular pack-file accesses. We're MUCH > better off just mmap'ing the whole pack-file (or at least big chunks of > it) and not having to maintain difficult structures of "this is where I > read that part of the file into memory", or read _big_ chunks when > quite often we just use a few kB of it. > > So mmap for pack-files does make sense, but probably only when you can > mmap big chunks, and are going to access much smaller (random) parts of > it. Yes, exactly. git-fast-import mmaps the pack file for this very reason. It every so often needs to go back and reread a tree object which has expired from its own in-memory LRU cache. This usually doesn't happen very often, but when it does we don't know where we are going to jump to get data from. mmaping a huge segment of the pack file (or the whole thing if its reasonably small) works for this case as the OS buffer cache can just take care of it for us. But as Linus pointed out mmap and write() aren't safe on some systems. Arrrgh. However git-fast-import would probably work just as well (or maybe slightly better) with pread(). I really should port that code forward to current Git, use pread() instead, and submit the patch to Junio. But nobody really showed a lot of interest. My sliding window pack-file access implementation (that I'm currently rewriting on top of current Git) tries to work in very large chunks, by default its 32 MiB per chunk, but its user/repository configurable so kernel hackers may just set it to 256 MiB and continue to get one large mmap for quite some time to come. Of course I would also like to get that to autoselect the window size rather than just hardcode it. :-) The implementation would prefer a very small number (<8) of very large chunks (>32 MiB), but is designed to more gracefully degrade on huge packs on limited address space systems (e.g. Windows 32 bit) then the current code does. -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 16:19 ` Theodore Tso 2006-12-19 16:57 ` Linus Torvalds @ 2006-12-20 1:58 ` Shawn Pearce 1 sibling, 0 replies; 52+ messages in thread From: Shawn Pearce @ 2006-12-20 1:58 UTC (permalink / raw) To: Theodore Tso; +Cc: Nicolas Pitre, Linus Torvalds, Randal L. Schwartz, git Theodore Tso <tytso@mit.edu> wrote: > On Tue, Dec 19, 2006 at 01:39:30AM -0500, Shawn Pearce wrote: > > This is why git-fast-import mmaps 128 MiB blocks from the file at > > a time. The mmap region is usually much larger than the file itself; > > the application appends to the file via write() then goes back > > and rereads data when necessary via the already established mmap. > > Its rare for the application to need to unmap/remap a different block > > so there really isn't very much page table manipulation overhead. > > Yes, but unless you are using the (non-portable, Linux specific) > MAP_POPULATE flag to mmap, each time you touch a new page, you end up > taking a page fault; and so malloc/read/free might *still* be faster. > I'd encourage you to make the change and benchmark it; the results may > be surprising. I played with this with dcraw, the Canon Raw File > converter a while back (before MAP_POPULATE was added), where I found > that with a linear access pattern, if you are reading the entire file, > it's stil marginally faster to use read() over mmap(), because with > dcraw taking a page fault every 4k of raw file, the system time was > significantly higher. Interesting. Lots of good reasons to probably just use pread() in there instead of mmap. For one thing git-fast-import doesn't go back and hit the already written pack data very often. Its own in memory caches usually perform very well. -- ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 5:11 ` Theodore Tso 2006-12-19 6:39 ` Shawn Pearce @ 2006-12-19 6:47 ` Linus Torvalds 2006-12-19 8:32 ` Johannes Schindelin 1 sibling, 1 reply; 52+ messages in thread From: Linus Torvalds @ 2006-12-19 6:47 UTC (permalink / raw) To: Theodore Tso; +Cc: Nicolas Pitre, Randal L. Schwartz, git On Tue, 19 Dec 2006, Theodore Tso wrote: > > Actually, depending on the size of the chunk, even on Linux > malloc/read/free can be faster than the mmap/munmap Yes. In general, mmap/munmap is faster only if: - you access the same data multiple times within one page (ie a single page-fault will actually result in more than one access) OR - you can use it to avoid management overhead (ie you know your data is going to accessed very sparsely, but you don't know the patterns, and trying to keep track of it is painful as hell) That said, under Linux, mmap is almost never really _slower_ either, which is why this issue never made any real difference. The overhead of doing page table manipulation is pretty much balanced out by the overhead of doing a memcpy. But that "mmap is fast" is _not_ true on many other operating systems, which is why it might be worthwhile to try something like the appended on OS X, which uses pread() instead of mmap(). This is _not_ very much tested. It seems to work. Caveat emptor. It would be interesting to hear if many small "pread()" calls are faster than many mmap/munmap calls on OS X. I bet they are. Under Linux, there should be almost no difference. Linus --- diff --git a/index-pack.c b/index-pack.c index 6d6c92b..094f8b2 100644 --- a/index-pack.c +++ b/index-pack.c @@ -8,6 +8,7 @@ #include "tree.h" #include <sys/time.h> #include <signal.h> +#include <unistd.h> static const char index_pack_usage[] = "git-index-pack [-v] [-o <index-file>] [{ ---keep | --keep=<msg> }] { <pack-file> | --stdin [--fix-thin] [<pack-file>] }"; @@ -279,27 +280,25 @@ static void *get_data_from_pack(struct object_entry *obj) { unsigned long from = obj[0].offset + obj[0].hdr_size; unsigned long len = obj[1].offset - from; - unsigned pg_offset = from % getpagesize(); - unsigned char *map, *data; + unsigned char *src, *data; z_stream stream; int st; - map = mmap(NULL, len + pg_offset, PROT_READ, MAP_PRIVATE, - mmap_fd, from - pg_offset); - if (map == MAP_FAILED) - die("cannot mmap pack file: %s", strerror(errno)); + src = xmalloc(len); + if (pread(mmap_fd, src, len, from) != len) + die("cannot pread pack file: %s", strerror(errno)); data = xmalloc(obj->size); memset(&stream, 0, sizeof(stream)); stream.next_out = data; stream.avail_out = obj->size; - stream.next_in = map + pg_offset; + stream.next_in = src; stream.avail_in = len; inflateInit(&stream); while ((st = inflate(&stream, Z_FINISH)) == Z_OK); inflateEnd(&stream); if (st != Z_STREAM_END || stream.total_out != obj->size) die("serious inflate inconsistency"); - munmap(map, len + pg_offset); + free(src); return data; } ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 6:47 ` Linus Torvalds @ 2006-12-19 8:32 ` Johannes Schindelin 2006-12-19 9:10 ` Junio C Hamano 0 siblings, 1 reply; 52+ messages in thread From: Johannes Schindelin @ 2006-12-19 8:32 UTC (permalink / raw) To: Linus Torvalds; +Cc: Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git Hi, in a very unscientific test, without your patch local cloning of the LilyPond repo takes 1m33s (user), and with your patch (pread() instead of mmap()) it takes 1m13s (user). The real times are somewhat bogus, but still in favour of pread(), but only by 8 seconds instead of 20. This is on Linux 2.4.32. Ciao, Dscho ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 8:32 ` Johannes Schindelin @ 2006-12-19 9:10 ` Junio C Hamano 2006-12-19 9:47 ` Jeff King ` (2 more replies) 0 siblings, 3 replies; 52+ messages in thread From: Junio C Hamano @ 2006-12-19 9:10 UTC (permalink / raw) To: Johannes Schindelin; +Cc: Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Hi, > > in a very unscientific test, without your patch local cloning of the > LilyPond repo takes 1m33s (user), and with your patch (pread() instead of > mmap()) it takes 1m13s (user). The real times are somewhat bogus, but > still in favour of pread(), but only by 8 seconds instead of 20. > > This is on Linux 2.4.32. Interesting. Anybody have numbers from 2.6? ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 9:10 ` Junio C Hamano @ 2006-12-19 9:47 ` Jeff King 2006-12-19 10:24 ` Andy Whitcroft 2006-12-19 15:53 ` [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux Nicolas Pitre 2 siblings, 0 replies; 52+ messages in thread From: Jeff King @ 2006-12-19 9:47 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin, Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git On Tue, Dec 19, 2006 at 01:10:59AM -0800, Junio C Hamano wrote: > > in a very unscientific test, without your patch local cloning of the > > LilyPond repo takes 1m33s (user), and with your patch (pread() instead of > > mmap()) it takes 1m13s (user). The real times are somewhat bogus, but > > still in favour of pread(), but only by 8 seconds instead of 20. > > > > This is on Linux 2.4.32. > > Interesting. Anybody have numbers from 2.6? Similar results to those posted by Marco Roeland (numbers below are index-pack of linux-2.6 on 2.6.18 i386 smp): mmap 46.78user 4.18system 0:51.08elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+725627minor)pagefaults 0swaps pread 45.70user 3.12system 0:48.96elapsed 99%CPU (0avgtext+0avgdata 0maxresident)k 0inputs+0outputs (0major+363794minor)pagefaults 0swaps ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" 2006-12-19 9:10 ` Junio C Hamano 2006-12-19 9:47 ` Jeff King @ 2006-12-19 10:24 ` Andy Whitcroft 2006-12-19 15:53 ` [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux Nicolas Pitre 2 siblings, 0 replies; 52+ messages in thread From: Andy Whitcroft @ 2006-12-19 10:24 UTC (permalink / raw) To: Junio C Hamano Cc: Johannes Schindelin, Theodore Tso, Nicolas Pitre, Randal L. Schwartz, git Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > >> Hi, >> >> in a very unscientific test, without your patch local cloning of the >> LilyPond repo takes 1m33s (user), and with your patch (pread() instead of >> mmap()) it takes 1m13s (user). The real times are somewhat bogus, but >> still in favour of pread(), but only by 8 seconds instead of 20. >> >> This is on Linux 2.4.32. > > Interesting. Anybody have numbers from 2.6? On my debian etch system: Linux version 2.6.17-2-686 (Debian 2.6.17-9) (waldi@debian.org) (gcc version 4.1.2 20060901 (prerelease) (Debian 4.1.1-13)) #1 SMP Wed Sep 13 16:34:10 UTC 2006 I did one run first (not included) to get things nice and warm, then three runs of each. Overall the same as reported elsewhere marginally better with pread(). I guess you could say a 20-30% improvement in system time which isn't to be sniffed at. -apw mmap(): real 1m5.187s user 1m0.844s sys 0m2.900s real 1m6.748s user 1m0.868s sys 0m3.064s real 1m5.604s user 1m0.760s sys 0m3.124s pread(): real 1m4.676s user 1m0.168s sys 0m2.340s real 1m3.563s user 0m59.796s sys 0m2.248s real 1m4.066s user 1m0.156s sys 0m2.304s ^ permalink raw reply [flat|nested] 52+ messages in thread
* [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-19 9:10 ` Junio C Hamano 2006-12-19 9:47 ` Jeff King 2006-12-19 10:24 ` Andy Whitcroft @ 2006-12-19 15:53 ` Nicolas Pitre 2006-12-19 19:00 ` Junio C Hamano 2 siblings, 1 reply; 52+ messages in thread From: Nicolas Pitre @ 2006-12-19 15:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Randal L. Schwartz, git It was reported by Randal L. Schwartz <merlyn@stonehenge.com> that indexing the Linux repository ~150MB pack takes about an hour on OS x while it's a minute on Linux. It seems that the OS X mmap() implementation is more than 2 orders of magnitude slower than the Linux one. Linus proposed a patch replacing mmap() with pread() bringing index-pack performance on OS X in line with the Linux one. The performances on Linux also improved by a small margin. Signed-off-by: Nicolas Pitre <nico@cam.org> --- OK looks like this has been sorted out while I was away. Good! On Tue, 19 Dec 2006, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > > Hi, > > > > in a very unscientific test, without your patch local cloning of the > > LilyPond repo takes 1m33s (user), and with your patch (pread() instead of > > mmap()) it takes 1m13s (user). The real times are somewhat bogus, but > > still in favour of pread(), but only by 8 seconds instead of 20. > > > > This is on Linux 2.4.32. > > Interesting. Anybody have numbers from 2.6? My 37 seconds of yesterday dropped to 32. This is Linus's patch plus a few cosmetic changes. diff --git a/index-pack.c b/index-pack.c index 6d6c92b..e08a687 100644 --- a/index-pack.c +++ b/index-pack.c @@ -1,3 +1,8 @@ +#define _XOPEN_SOURCE 500 +#include <unistd.h> +#include <sys/time.h> +#include <signal.h> + #include "cache.h" #include "delta.h" #include "pack.h" @@ -6,8 +11,6 @@ #include "commit.h" #include "tag.h" #include "tree.h" -#include <sys/time.h> -#include <signal.h> static const char index_pack_usage[] = "git-index-pack [-v] [-o <index-file>] [{ ---keep | --keep=<msg> }] { <pack-file> | --stdin [--fix-thin] [<pack-file>] }"; @@ -87,7 +90,7 @@ static unsigned display_progress(unsigned n, unsigned total, unsigned last_pc) static unsigned char input_buffer[4096]; static unsigned long input_offset, input_len, consumed_bytes; static SHA_CTX input_ctx; -static int input_fd, output_fd, mmap_fd; +static int input_fd, output_fd, pack_fd; /* Discard current buffer used content. */ static void flush(void) @@ -148,14 +151,14 @@ static const char *open_pack_file(const char *pack_name) output_fd = open(pack_name, O_CREAT|O_EXCL|O_RDWR, 0600); if (output_fd < 0) die("unable to create %s: %s\n", pack_name, strerror(errno)); - mmap_fd = output_fd; + pack_fd = output_fd; } else { input_fd = open(pack_name, O_RDONLY); if (input_fd < 0) die("cannot open packfile '%s': %s", pack_name, strerror(errno)); output_fd = -1; - mmap_fd = input_fd; + pack_fd = input_fd; } SHA1_Init(&input_ctx); return pack_name; @@ -279,27 +282,25 @@ static void *get_data_from_pack(struct object_entry *obj) { unsigned long from = obj[0].offset + obj[0].hdr_size; unsigned long len = obj[1].offset - from; - unsigned pg_offset = from % getpagesize(); - unsigned char *map, *data; + unsigned char *src, *data; z_stream stream; int st; - map = mmap(NULL, len + pg_offset, PROT_READ, MAP_PRIVATE, - mmap_fd, from - pg_offset); - if (map == MAP_FAILED) - die("cannot mmap pack file: %s", strerror(errno)); + src = xmalloc(len); + if (pread(pack_fd, src, len, from) != len) + die("cannot pread pack file: %s", strerror(errno)); data = xmalloc(obj->size); memset(&stream, 0, sizeof(stream)); stream.next_out = data; stream.avail_out = obj->size; - stream.next_in = map + pg_offset; + stream.next_in = src; stream.avail_in = len; inflateInit(&stream); while ((st = inflate(&stream, Z_FINISH)) == Z_OK); inflateEnd(&stream); if (st != Z_STREAM_END || stream.total_out != obj->size) die("serious inflate inconsistency"); - munmap(map, len + pg_offset); + free(src); return data; } ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-19 15:53 ` [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux Nicolas Pitre @ 2006-12-19 19:00 ` Junio C Hamano 2006-12-19 19:14 ` Nicolas Pitre 0 siblings, 1 reply; 52+ messages in thread From: Junio C Hamano @ 2006-12-19 19:00 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Linus Torvalds, Randal L. Schwartz, git Nicolas Pitre <nico@cam.org> writes: > OK looks like this has been sorted out while I was away. Good! > > This is Linus's patch plus a few cosmetic changes. Not a complaint but rather a request for free education ;-). > diff --git a/index-pack.c b/index-pack.c > index 6d6c92b..e08a687 100644 > --- a/index-pack.c > +++ b/index-pack.c > @@ -1,3 +1,8 @@ > +#define _XOPEN_SOURCE 500 > +#include <unistd.h> > +#include <sys/time.h> > +#include <signal.h> > + > #include "cache.h" > #include "delta.h" > #include "pack.h" > @@ -6,8 +11,6 @@ > #include "commit.h" > #include "tag.h" > #include "tree.h" > -#include <sys/time.h> > -#include <signal.h> Most of the rest of the sources seem to do our includes first and source-file specific system includes at the end. What's the rationale for this change? Do we need _XOPEN_SOURCE=500 because pread() is XSI? Also nobody other than convert-objects.c has _XOPEN_SOURCE level specified. If _XOPEN_SOURCE matters I wonder if we should do so in some central place to make it consistent across source files? ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-19 19:00 ` Junio C Hamano @ 2006-12-19 19:14 ` Nicolas Pitre 2006-12-19 19:55 ` Linus Torvalds 0 siblings, 1 reply; 52+ messages in thread From: Nicolas Pitre @ 2006-12-19 19:14 UTC (permalink / raw) To: Junio C Hamano; +Cc: Linus Torvalds, Randal L. Schwartz, git On Tue, 19 Dec 2006, Junio C Hamano wrote: > > diff --git a/index-pack.c b/index-pack.c > > index 6d6c92b..e08a687 100644 > > --- a/index-pack.c > > +++ b/index-pack.c > > @@ -1,3 +1,8 @@ > > +#define _XOPEN_SOURCE 500 > > +#include <unistd.h> > > +#include <sys/time.h> > > +#include <signal.h> > > + > > #include "cache.h" > > #include "delta.h" > > #include "pack.h" > > @@ -6,8 +11,6 @@ > > #include "commit.h" > > #include "tag.h" > > #include "tree.h" > > -#include <sys/time.h> > > -#include <signal.h> > > Most of the rest of the sources seem to do our includes first > and source-file specific system includes at the end. What's the > rationale for this change? Because _XOPEN_SOURCE must be defined before including unistd.h otherwise pread is not declared and a warning is issued. > Do we need _XOPEN_SOURCE=500 because pread() is XSI? The pread man page says Unix98. > Also nobody other than convert-objects.c has _XOPEN_SOURCE level > specified. If _XOPEN_SOURCE matters I wonder if we should do so > in some central place to make it consistent across source files? Your call I guess. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-19 19:14 ` Nicolas Pitre @ 2006-12-19 19:55 ` Linus Torvalds 2006-12-19 19:57 ` Randal L. Schwartz ` (2 more replies) 0 siblings, 3 replies; 52+ messages in thread From: Linus Torvalds @ 2006-12-19 19:55 UTC (permalink / raw) To: Nicolas Pitre; +Cc: Junio C Hamano, Randal L. Schwartz, git On Tue, 19 Dec 2006, Nicolas Pitre wrote: > > Because _XOPEN_SOURCE must be defined before including unistd.h > otherwise pread is not declared and a warning is issued. May I actually suggest we handle _all_ of these issues in one central place, namely "git-compat-util.h" It's nice to have just one single file that tries to hide the details of all the differences between systems. Sure, that file ends up having to include a lot of standard header files that some of the .c files don't actually _need_, but git compiles reasonably quickly, so I don't think we need to try to optimize compile speed much. It's the C++ people who tend to have sucky compile times. So how about something like the appended? And then just have the rule that we try to include "cache.h" early, because that brings in ALL the really basic system header files? Linus --- diff --git a/convert-objects.c b/convert-objects.c index 8812583..a630132 100644 --- a/convert-objects.c +++ b/convert-objects.c @@ -1,7 +1,3 @@ -#define _XOPEN_SOURCE 600 /* glibc2 and AIX 5.3L need 500, OpenBSD needs 600 for S_ISLNK() */ -#define _XOPEN_SOURCE_EXTENDED 1 /* AIX 5.3L needs this */ -#define _GNU_SOURCE -#include <time.h> #include "cache.h" #include "blob.h" #include "commit.h" diff --git a/git-compat-util.h b/git-compat-util.h index 0272d04..e619e29 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -11,6 +11,10 @@ #define ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0])) +#define _XOPEN_SOURCE 600 /* glibc2 and AIX 5.3L need 500, OpenBSD needs 600 for S_ISLNK() */ +#define _XOPEN_SOURCE_EXTENDED 1 /* AIX 5.3L needs this */ +#define _GNU_SOURCE + #include <unistd.h> #include <stdio.h> #include <sys/stat.h> @@ -25,6 +29,10 @@ #include <netinet/in.h> #include <sys/types.h> #include <dirent.h> +#include <sys/time.h> +#include <time.h> +#include <signal.h> +#include <sys/wait.h> /* On most systems <limits.h> would have given us this, but ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-19 19:55 ` Linus Torvalds @ 2006-12-19 19:57 ` Randal L. Schwartz 2006-12-19 20:03 ` Randal L. Schwartz 2006-12-19 20:02 ` Jeff Garzik 2006-12-20 22:13 ` Nikolai Weibull 2 siblings, 1 reply; 52+ messages in thread From: Randal L. Schwartz @ 2006-12-19 19:57 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git >>>>> "Linus" == Linus Torvalds <torvalds@osdl.org> writes: Linus> May I actually suggest we handle _all_ of these issues in one central Linus> place, namely "git-compat-util.h" I can't remember now, but a couple of patches I had to submit were because sys/types.h was included either too early or too late on OSX, so let's be sure to get that right. Surely, my patch can be observed somewhere, perhaps in a git repository. :) -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-19 19:57 ` Randal L. Schwartz @ 2006-12-19 20:03 ` Randal L. Schwartz 0 siblings, 0 replies; 52+ messages in thread From: Randal L. Schwartz @ 2006-12-19 20:03 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, git >>>>> "Randal" == Randal L Schwartz <merlyn@stonehenge.com> writes: Randal> I can't remember now, but a couple of patches I had to submit were Randal> because sys/types.h was included either too early or too late on OSX, Randal> so let's be sure to get that right. Surely, my patch can be observed Randal> somewhere, perhaps in a git repository. :) Here's one that might be relevant for OSX: 979e32fa1483a32faa4ec331e29b357e5eb5ef25 And this is an ordering issue for OpenBSD: ed1795fcc5f2aa3f105630429bcbed49c50053fa -- Randal L. Schwartz - Stonehenge Consulting Services, Inc. - +1 503 777 0095 <merlyn@stonehenge.com> <URL:http://www.stonehenge.com/merlyn/> Perl/Unix/security consulting, Technical writing, Comedy, etc. etc. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-19 19:55 ` Linus Torvalds 2006-12-19 19:57 ` Randal L. Schwartz @ 2006-12-19 20:02 ` Jeff Garzik 2006-12-20 0:30 ` Junio C Hamano 2006-12-20 22:13 ` Nikolai Weibull 2 siblings, 1 reply; 52+ messages in thread From: Jeff Garzik @ 2006-12-19 20:02 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, Randal L. Schwartz, git Linus Torvalds wrote: > diff --git a/convert-objects.c b/convert-objects.c > index 8812583..a630132 100644 > --- a/convert-objects.c > +++ b/convert-objects.c > @@ -1,7 +1,3 @@ > -#define _XOPEN_SOURCE 600 /* glibc2 and AIX 5.3L need 500, OpenBSD needs 600 for S_ISLNK() */ > -#define _XOPEN_SOURCE_EXTENDED 1 /* AIX 5.3L needs this */ > -#define _GNU_SOURCE > -#include <time.h> > #include "cache.h" > #include "blob.h" > #include "commit.h" > diff --git a/git-compat-util.h b/git-compat-util.h > index 0272d04..e619e29 100644 > --- a/git-compat-util.h > +++ b/git-compat-util.h > @@ -11,6 +11,10 @@ > > #define ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0])) > > +#define _XOPEN_SOURCE 600 /* glibc2 and AIX 5.3L need 500, OpenBSD needs 600 for S_ISLNK() */ > +#define _XOPEN_SOURCE_EXTENDED 1 /* AIX 5.3L needs this */ > +#define _GNU_SOURCE > + > #include <unistd.h> > #include <stdio.h> > #include <sys/stat.h> > @@ -25,6 +29,10 @@ > #include <netinet/in.h> > #include <sys/types.h> > #include <dirent.h> > +#include <sys/time.h> > +#include <time.h> > +#include <signal.h> > +#include <sys/wait.h> If you are going to do this, you have to audit -every- file, to make sure git-compat-util.h is -always- the first header. For example, builtin-mailinfo.c includes git-compat-util.h after ctype.h and iconv.h, which renders your #define _XOPEN_SOURCE 600 useless. /usr/include/features.h has already been included at that point. Jeff ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-19 20:02 ` Jeff Garzik @ 2006-12-20 0:30 ` Junio C Hamano 2006-12-20 0:40 ` Linus Torvalds 0 siblings, 1 reply; 52+ messages in thread From: Junio C Hamano @ 2006-12-20 0:30 UTC (permalink / raw) To: Jeff Garzik; +Cc: Nicolas Pitre, Linus Torvalds, Randal L. Schwartz, git Jeff Garzik <jeff@garzik.org> writes: > If you are going to do this, you have to audit -every- file, to make > sure git-compat-util.h is -always- the first header. Will do. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-20 0:30 ` Junio C Hamano @ 2006-12-20 0:40 ` Linus Torvalds 2006-12-20 0:50 ` Jeff Garzik 2006-12-20 1:12 ` Junio C Hamano 0 siblings, 2 replies; 52+ messages in thread From: Linus Torvalds @ 2006-12-20 0:40 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff Garzik, Nicolas Pitre, Randal L. Schwartz, git On Tue, 19 Dec 2006, Junio C Hamano wrote: > Jeff Garzik <jeff@garzik.org> writes: > > > If you are going to do this, you have to audit -every- file, to make > > sure git-compat-util.h is -always- the first header. > > Will do. Well, since any cases where it isn't (and where we'd care) will show up as just a compiler warning, I doubt we really even need to. We can fix things up as they get reported too.. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-20 0:40 ` Linus Torvalds @ 2006-12-20 0:50 ` Jeff Garzik 2006-12-20 1:12 ` Junio C Hamano 1 sibling, 0 replies; 52+ messages in thread From: Jeff Garzik @ 2006-12-20 0:50 UTC (permalink / raw) To: Linus Torvalds; +Cc: Junio C Hamano, Nicolas Pitre, Randal L. Schwartz, git Linus Torvalds wrote: > > On Tue, 19 Dec 2006, Junio C Hamano wrote: > >> Jeff Garzik <jeff@garzik.org> writes: >> >>> If you are going to do this, you have to audit -every- file, to make >>> sure git-compat-util.h is -always- the first header. >> Will do. > > Well, since any cases where it isn't (and where we'd care) will show up > as just a compiler warning, I doubt we really even need to. We can fix > things up as they get reported too.. I wouldn't assume that is always the case. Mostly true, I would suppose. But some of those defines turn on and off 64-bit file size/offset gadgets for example. Jeff ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-20 0:40 ` Linus Torvalds 2006-12-20 0:50 ` Jeff Garzik @ 2006-12-20 1:12 ` Junio C Hamano 2006-12-20 20:17 ` Junio C Hamano 1 sibling, 1 reply; 52+ messages in thread From: Junio C Hamano @ 2006-12-20 1:12 UTC (permalink / raw) To: Jeff Garzik; +Cc: Linus Torvalds, Nicolas Pitre, Randal L. Schwartz, git Linus Torvalds <torvalds@osdl.org> writes: > On Tue, 19 Dec 2006, Junio C Hamano wrote: > >> Jeff Garzik <jeff@garzik.org> writes: >> >> > If you are going to do this, you have to audit -every- file, to make >> > sure git-compat-util.h is -always- the first header. >> >> Will do. > > Well, since any cases where it isn't (and where we'd care) will show up > as just a compiler warning, I doubt we really even need to. We can fix > things up as they get reported too.. True, but I've done it already, so... Test compile especially on non Linux boxes are appreciated (I'll do one on an OpenBSD bochs tomorrow myself anyway, though). -- >8 -- [PATCH] simplify inclusion of system header files. This is a mechanical clean-up of the way *.c files include system header files. (1) sources under compat/, platform sha-1 implementations, and xdelta code are exempt from the following rules; (2) the first #include must be "git-compat-util.h" or one of our own header file that includes it first (e.g. config.h, builtin.h, pkt-line.h); (3) system headers that are included in "git-compat-util.h" need not be included in individual C source files. (4) "git-compat-util.h" does not have to include subsystem specific header files (e.g. expat.h). Signed-off-by: Junio C Hamano <junkio@cox.net> --- archive-tar.c | 1 - archive-zip.c | 1 - blob.c | 1 - builtin-add.c | 2 -- builtin-apply.c | 1 - builtin-archive.c | 1 - builtin-blame.c | 4 ---- builtin-branch.c | 2 +- builtin-for-each-ref.c | 1 - builtin-grep.c | 3 --- builtin-log.c | 2 -- builtin-ls-files.c | 2 -- builtin-mailinfo.c | 9 --------- builtin-mailsplit.c | 7 ------- builtin-mv.c | 2 -- builtin-name-rev.c | 1 - builtin-pack-objects.c | 2 -- builtin-repo-config.c | 1 - builtin-runstatus.c | 2 +- builtin-shortlog.c | 1 - builtin-show-branch.c | 2 -- builtin-stripspace.c | 3 --- builtin-tar-tree.c | 1 - builtin-unpack-objects.c | 2 -- builtin-upload-archive.c | 3 --- color.c | 5 +---- compat/mmap.c | 4 ---- compat/setenv.c | 3 +-- compat/strlcpy.c | 2 +- compat/unsetenv.c | 3 +-- config.c | 1 - connect.c | 6 ------ convert-objects.c | 4 ---- daemon.c | 16 +++------------- date.c | 3 --- diff-delta.c | 5 +---- diff.c | 3 --- diffcore-order.c | 1 - diffcore-pickaxe.c | 2 -- dir.c | 3 --- entry.c | 2 -- fetch-pack.c | 1 - fetch.c | 3 +-- fsck-objects.c | 3 --- git-compat-util.h | 23 +++++++++++++++++++++++ git.c | 14 +------------- grep.c | 1 - help.c | 3 +-- ident.c | 3 --- imap-send.c | 7 ------- interpolate.c | 2 -- lockfile.c | 1 - merge-base.c | 1 - merge-index.c | 4 ---- merge-recursive.c | 7 ------- path-list.c | 1 - path.c | 1 - receive-pack.c | 1 - refs.c | 4 +--- revision.c | 1 - rsh.c | 6 +----- run-command.c | 1 - ssh-upload.c | 2 -- strbuf.c | 4 +--- test-date.c | 3 --- test-delta.c | 8 +------- tree.c | 1 - unpack-trees.c | 2 -- upload-pack.c | 3 --- var.c | 3 --- wt-status.c | 2 +- 71 files changed, 41 insertions(+), 195 deletions(-) diff --git a/archive-tar.c b/archive-tar.c index ff0f6e2..af47fdc 100644 --- a/archive-tar.c +++ b/archive-tar.c @@ -1,7 +1,6 @@ /* * Copyright (c) 2005, 2006 Rene Scharfe */ -#include <time.h> #include "cache.h" #include "commit.h" #include "strbuf.h" diff --git a/archive-zip.c b/archive-zip.c index 36e922a..f31b8ed 100644 --- a/archive-zip.c +++ b/archive-zip.c @@ -1,7 +1,6 @@ /* * Copyright (c) 2006 Rene Scharfe */ -#include <time.h> #include "cache.h" #include "commit.h" #include "blob.h" diff --git a/blob.c b/blob.c index d1af2e6..9776bea 100644 --- a/blob.c +++ b/blob.c @@ -1,6 +1,5 @@ #include "cache.h" #include "blob.h" -#include <stdlib.h> const char *blob_type = "blob"; diff --git a/builtin-add.c b/builtin-add.c index b3f9206..c8a114f 100644 --- a/builtin-add.c +++ b/builtin-add.c @@ -3,8 +3,6 @@ * * Copyright (C) 2006 Linus Torvalds */ -#include <fnmatch.h> - #include "cache.h" #include "builtin.h" #include "dir.h" diff --git a/builtin-apply.c b/builtin-apply.c index 436d9e1..1c35837 100644 --- a/builtin-apply.c +++ b/builtin-apply.c @@ -6,7 +6,6 @@ * This applies patches on top of some (arbitrary) version of the SCM. * */ -#include <fnmatch.h> #include "cache.h" #include "cache-tree.h" #include "quote.h" diff --git a/builtin-archive.c b/builtin-archive.c index a8a1f07..391cf43 100644 --- a/builtin-archive.c +++ b/builtin-archive.c @@ -2,7 +2,6 @@ * Copyright (c) 2006 Franck Bui-Huu * Copyright (c) 2006 Rene Scharfe */ -#include <time.h> #include "cache.h" #include "builtin.h" #include "archive.h" diff --git a/builtin-blame.c b/builtin-blame.c index a250724..9bf6ec9 100644 --- a/builtin-blame.c +++ b/builtin-blame.c @@ -15,10 +15,6 @@ #include "revision.h" #include "xdiff-interface.h" -#include <time.h> -#include <sys/time.h> -#include <regex.h> - static char blame_usage[] = "git-blame [-c] [-l] [-t] [-f] [-n] [-p] [-L n,m] [-S <revs-file>] [-M] [-C] [-C] [commit] [--] file\n" " -c, --compatibility Use the same output mode as git-annotate (Default: off)\n" diff --git a/builtin-branch.c b/builtin-branch.c index 560309c..515330c 100644 --- a/builtin-branch.c +++ b/builtin-branch.c @@ -5,8 +5,8 @@ * Based on git-branch.sh by Junio C Hamano. */ -#include "color.h" #include "cache.h" +#include "color.h" #include "refs.h" #include "commit.h" #include "builtin.h" diff --git a/builtin-for-each-ref.c b/builtin-for-each-ref.c index 227aa3c..af72a12 100644 --- a/builtin-for-each-ref.c +++ b/builtin-for-each-ref.c @@ -6,7 +6,6 @@ #include "tree.h" #include "blob.h" #include "quote.h" -#include <fnmatch.h> /* Quoting styles */ #define QUOTE_NONE 0 diff --git a/builtin-grep.c b/builtin-grep.c index 9873e3d..3b1b1cb 100644 --- a/builtin-grep.c +++ b/builtin-grep.c @@ -10,10 +10,7 @@ #include "tag.h" #include "tree-walk.h" #include "builtin.h" -#include <regex.h> #include "grep.h" -#include <fnmatch.h> -#include <sys/wait.h> /* * git grep pathspecs are somewhat different from diff-tree pathspecs; diff --git a/builtin-log.c b/builtin-log.c index 17014f7..8df3c13 100644 --- a/builtin-log.c +++ b/builtin-log.c @@ -11,8 +11,6 @@ #include "log-tree.h" #include "builtin.h" #include "tag.h" -#include <time.h> -#include <sys/time.h> static int default_show_root = 1; diff --git a/builtin-ls-files.c b/builtin-ls-files.c index bc79ce4..21c2a6e 100644 --- a/builtin-ls-files.c +++ b/builtin-ls-files.c @@ -5,8 +5,6 @@ * * Copyright (C) Linus Torvalds, 2005 */ -#include <fnmatch.h> - #include "cache.h" #include "quote.h" #include "dir.h" diff --git a/builtin-mailinfo.c b/builtin-mailinfo.c index b8d7dbc..e647229 100644 --- a/builtin-mailinfo.c +++ b/builtin-mailinfo.c @@ -2,15 +2,6 @@ * Another stupid program, this one parsing the headers of an * email to figure out authorship and subject */ -#define _GNU_SOURCE -#include <stdio.h> -#include <stdlib.h> -#include <string.h> -#include <ctype.h> -#ifndef NO_ICONV -#include <iconv.h> -#endif -#include "git-compat-util.h" #include "cache.h" #include "builtin.h" diff --git a/builtin-mailsplit.c b/builtin-mailsplit.c index 91a699d..3bca855 100644 --- a/builtin-mailsplit.c +++ b/builtin-mailsplit.c @@ -4,13 +4,6 @@ * It just splits a mbox into a list of files: "0001" "0002" .. * so you can process them further from there. */ -#include <unistd.h> -#include <stdlib.h> -#include <fcntl.h> -#include <sys/types.h> -#include <sys/stat.h> -#include <string.h> -#include <stdio.h> #include "cache.h" #include "builtin.h" diff --git a/builtin-mv.c b/builtin-mv.c index d14a4a7..737af35 100644 --- a/builtin-mv.c +++ b/builtin-mv.c @@ -3,8 +3,6 @@ * * Copyright (C) 2006 Johannes Schindelin */ -#include <fnmatch.h> - #include "cache.h" #include "builtin.h" #include "dir.h" diff --git a/builtin-name-rev.c b/builtin-name-rev.c index 618aa31..b4f15cc 100644 --- a/builtin-name-rev.c +++ b/builtin-name-rev.c @@ -1,4 +1,3 @@ -#include <stdlib.h> #include "builtin.h" #include "cache.h" #include "commit.h" diff --git a/builtin-pack-objects.c b/builtin-pack-objects.c index a2dc7d1..807be8c 100644 --- a/builtin-pack-objects.c +++ b/builtin-pack-objects.c @@ -12,8 +12,6 @@ #include "diff.h" #include "revision.h" #include "list-objects.h" -#include <sys/time.h> -#include <signal.h> static const char pack_usage[] = "\ git-pack-objects [{ -q | --progress | --all-progress }] \n\ diff --git a/builtin-repo-config.c b/builtin-repo-config.c index a38099a..a7ab4ce 100644 --- a/builtin-repo-config.c +++ b/builtin-repo-config.c @@ -1,6 +1,5 @@ #include "builtin.h" #include "cache.h" -#include <regex.h> static const char git_config_set_usage[] = "git-repo-config [ --global ] [ --bool | --int ] [--get | --get-all | --get-regexp | --replace-all | --add | --unset | --unset-all] name [value [value_regex]] | --rename-section old_name new_name | --list"; diff --git a/builtin-runstatus.c b/builtin-runstatus.c index 0b63037..4b489b1 100644 --- a/builtin-runstatus.c +++ b/builtin-runstatus.c @@ -1,5 +1,5 @@ -#include "wt-status.h" #include "cache.h" +#include "wt-status.h" extern int wt_status_use_color; diff --git a/builtin-shortlog.c b/builtin-shortlog.c index 3fc43dd..edb4042 100644 --- a/builtin-shortlog.c +++ b/builtin-shortlog.c @@ -4,7 +4,6 @@ #include "diff.h" #include "path-list.h" #include "revision.h" -#include <string.h> static const char shortlog_usage[] = "git-shortlog [-n] [-s] [<commit-id>... ]"; diff --git a/builtin-show-branch.c b/builtin-show-branch.c index a38ac34..b9d9781 100644 --- a/builtin-show-branch.c +++ b/builtin-show-branch.c @@ -1,5 +1,3 @@ -#include <stdlib.h> -#include <fnmatch.h> #include "cache.h" #include "commit.h" #include "refs.h" diff --git a/builtin-stripspace.c b/builtin-stripspace.c index 09cc910..f0d4d9e 100644 --- a/builtin-stripspace.c +++ b/builtin-stripspace.c @@ -1,6 +1,3 @@ -#include <stdio.h> -#include <string.h> -#include <ctype.h> #include "builtin.h" /* diff --git a/builtin-tar-tree.c b/builtin-tar-tree.c index 4d4cfec..11e62fc 100644 --- a/builtin-tar-tree.c +++ b/builtin-tar-tree.c @@ -1,7 +1,6 @@ /* * Copyright (c) 2005, 2006 Rene Scharfe */ -#include <time.h> #include "cache.h" #include "commit.h" #include "tar.h" diff --git a/builtin-unpack-objects.c b/builtin-unpack-objects.c index e6d7574..d351e02 100644 --- a/builtin-unpack-objects.c +++ b/builtin-unpack-objects.c @@ -8,8 +8,6 @@ #include "tag.h" #include "tree.h" -#include <sys/time.h> - static int dry_run, quiet, recover, has_errors; static const char unpack_usage[] = "git-unpack-objects [-n] [-q] [-r] < pack-file"; diff --git a/builtin-upload-archive.c b/builtin-upload-archive.c index 45c92e1..e4156f8 100644 --- a/builtin-upload-archive.c +++ b/builtin-upload-archive.c @@ -1,9 +1,6 @@ /* * Copyright (c) 2006 Franck Bui-Huu */ -#include <time.h> -#include <sys/wait.h> -#include <sys/poll.h> #include "cache.h" #include "builtin.h" #include "archive.h" diff --git a/color.c b/color.c index d8c8399..09d82ee 100644 --- a/color.c +++ b/color.c @@ -1,8 +1,5 @@ -#include "color.h" #include "cache.h" -#include "git-compat-util.h" - -#include <stdarg.h> +#include "color.h" #define COLOR_RESET "\033[m" diff --git a/compat/mmap.c b/compat/mmap.c index a4d2e50..0fd46e7 100644 --- a/compat/mmap.c +++ b/compat/mmap.c @@ -1,7 +1,3 @@ -#include <stdio.h> -#include <stdlib.h> -#include <unistd.h> -#include <errno.h> #include "../git-compat-util.h" void *gitfakemmap(void *start, size_t length, int prot , int flags, int fd, off_t offset) diff --git a/compat/setenv.c b/compat/setenv.c index b7d7678..3a22ea7 100644 --- a/compat/setenv.c +++ b/compat/setenv.c @@ -1,5 +1,4 @@ -#include <stdlib.h> -#include <string.h> +#include "../git-compat-util.h" int gitsetenv(const char *name, const char *value, int replace) { diff --git a/compat/strlcpy.c b/compat/strlcpy.c index b66856a..4024c36 100644 --- a/compat/strlcpy.c +++ b/compat/strlcpy.c @@ -1,4 +1,4 @@ -#include <string.h> +#include "../git-compat-util.h" size_t gitstrlcpy(char *dest, const char *src, size_t size) { diff --git a/compat/unsetenv.c b/compat/unsetenv.c index 3a5e4ec..eb29f5e 100644 --- a/compat/unsetenv.c +++ b/compat/unsetenv.c @@ -1,5 +1,4 @@ -#include <stdlib.h> -#include <string.h> +#include "../git-compat-util.h" void gitunsetenv (const char *name) { diff --git a/config.c b/config.c index 663993f..913509f 100644 --- a/config.c +++ b/config.c @@ -6,7 +6,6 @@ * */ #include "cache.h" -#include <regex.h> #define MAXNAME (256) diff --git a/connect.c b/connect.c index f7edba8..66daa11 100644 --- a/connect.c +++ b/connect.c @@ -3,12 +3,6 @@ #include "pkt-line.h" #include "quote.h" #include "refs.h" -#include <sys/wait.h> -#include <sys/socket.h> -#include <netinet/in.h> -#include <arpa/inet.h> -#include <netdb.h> -#include <signal.h> static char *server_capabilities; diff --git a/convert-objects.c b/convert-objects.c index 8812583..a630132 100644 --- a/convert-objects.c +++ b/convert-objects.c @@ -1,7 +1,3 @@ -#define _XOPEN_SOURCE 600 /* glibc2 and AIX 5.3L need 500, OpenBSD needs 600 for S_ISLNK() */ -#define _XOPEN_SOURCE_EXTENDED 1 /* AIX 5.3L needs this */ -#define _GNU_SOURCE -#include <time.h> #include "cache.h" #include "blob.h" #include "commit.h" diff --git a/daemon.c b/daemon.c index e66bb80..b129b83 100644 --- a/daemon.c +++ b/daemon.c @@ -1,20 +1,10 @@ -#include <signal.h> -#include <sys/wait.h> -#include <sys/socket.h> -#include <sys/time.h> -#include <sys/poll.h> -#include <netdb.h> -#include <netinet/in.h> -#include <arpa/inet.h> -#include <syslog.h> -#include <pwd.h> -#include <grp.h> -#include <limits.h> -#include "pkt-line.h" #include "cache.h" +#include "pkt-line.h" #include "exec_cmd.h" #include "interpolate.h" +#include <syslog.h> + #ifndef HOST_NAME_MAX #define HOST_NAME_MAX 256 #endif diff --git a/date.c b/date.c index 1825922..7acb8cb 100644 --- a/date.c +++ b/date.c @@ -4,9 +4,6 @@ * Copyright (C) Linus Torvalds, 2005 */ -#include <time.h> -#include <sys/time.h> - #include "cache.h" static time_t my_mktime(struct tm *tm) diff --git a/diff-delta.c b/diff-delta.c index fa16d06..9f998d0 100644 --- a/diff-delta.c +++ b/diff-delta.c @@ -18,11 +18,8 @@ * licensing gets turned into GPLv2 within this project. */ -#include <stdlib.h> -#include <string.h> -#include "delta.h" - #include "git-compat-util.h" +#include "delta.h" /* maximum hash entry list for the same hash bucket */ #define HASH_LIMIT 64 diff --git a/diff.c b/diff.c index 9974435..61889d7 100644 --- a/diff.c +++ b/diff.c @@ -1,9 +1,6 @@ /* * Copyright (C) 2005 Junio C Hamano */ -#include <sys/types.h> -#include <sys/wait.h> -#include <signal.h> #include "cache.h" #include "quote.h" #include "diff.h" diff --git a/diffcore-order.c b/diffcore-order.c index aef6da6..7ad0946 100644 --- a/diffcore-order.c +++ b/diffcore-order.c @@ -4,7 +4,6 @@ #include "cache.h" #include "diff.h" #include "diffcore.h" -#include <fnmatch.h> static char **order; static int order_cnt; diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c index cfcce31..de44ada 100644 --- a/diffcore-pickaxe.c +++ b/diffcore-pickaxe.c @@ -5,8 +5,6 @@ #include "diff.h" #include "diffcore.h" -#include <regex.h> - static unsigned int contains(struct diff_filespec *one, const char *needle, unsigned long len, regex_t *regexp) diff --git a/dir.c b/dir.c index e6a61ee..16401d8 100644 --- a/dir.c +++ b/dir.c @@ -5,9 +5,6 @@ * Copyright (C) Linus Torvalds, 2005-2006 * Junio Hamano, 2005-2006 */ -#include <dirent.h> -#include <fnmatch.h> - #include "cache.h" #include "dir.h" diff --git a/entry.c b/entry.c index b2ea0ef..88df713 100644 --- a/entry.c +++ b/entry.c @@ -1,5 +1,3 @@ -#include <sys/types.h> -#include <dirent.h> #include "cache.h" #include "blob.h" diff --git a/fetch-pack.c b/fetch-pack.c index 743eab7..92322cf 100644 --- a/fetch-pack.c +++ b/fetch-pack.c @@ -5,7 +5,6 @@ #include "tag.h" #include "exec_cmd.h" #include "sideband.h" -#include <sys/wait.h> static int keep_pack; static int quiet; diff --git a/fetch.c b/fetch.c index 663b4b2..f69be82 100644 --- a/fetch.c +++ b/fetch.c @@ -1,6 +1,5 @@ -#include "fetch.h" - #include "cache.h" +#include "fetch.h" #include "commit.h" #include "tree.h" #include "tree-walk.h" diff --git a/fsck-objects.c b/fsck-objects.c index 46b628c..409aea0 100644 --- a/fsck-objects.c +++ b/fsck-objects.c @@ -1,6 +1,3 @@ -#include <sys/types.h> -#include <dirent.h> - #include "cache.h" #include "commit.h" #include "tree.h" diff --git a/git-compat-util.h b/git-compat-util.h index 0272d04..87fde99 100644 --- a/git-compat-util.h +++ b/git-compat-util.h @@ -11,6 +11,10 @@ #define ARRAY_SIZE(x) (sizeof(x)/sizeof(x[0])) +#define _XOPEN_SOURCE 600 /* glibc2 and AIX 5.3L need 500, OpenBSD needs 600 for S_ISLNK() */ +#define _XOPEN_SOURCE_EXTENDED 1 /* AIX 5.3L needs this */ +#define _GNU_SOURCE + #include <unistd.h> #include <stdio.h> #include <sys/stat.h> @@ -25,6 +29,25 @@ #include <netinet/in.h> #include <sys/types.h> #include <dirent.h> +#include <sys/time.h> +#include <time.h> +#include <signal.h> +#include <sys/wait.h> +#include <fnmatch.h> +#include <sys/poll.h> +#include <sys/socket.h> +#include <assert.h> +#include <regex.h> +#include <netinet/in.h> +#include <netinet/tcp.h> +#include <arpa/inet.h> +#include <netdb.h> +#include <pwd.h> +#include <grp.h> + +#ifndef NO_ICONV +#include <iconv.h> +#endif /* On most systems <limits.h> would have given us this, but * not on some systems (e.g. GNU/Hurd). diff --git a/git.c b/git.c index 016ee8a..73cf4d4 100644 --- a/git.c +++ b/git.c @@ -1,20 +1,8 @@ -#include <stdio.h> -#include <sys/types.h> -#include <sys/stat.h> -#include <dirent.h> -#include <unistd.h> -#include <stdlib.h> -#include <string.h> -#include <errno.h> -#include <limits.h> -#include <stdarg.h> -#include "git-compat-util.h" +#include "builtin.h" #include "exec_cmd.h" #include "cache.h" #include "quote.h" -#include "builtin.h" - const char git_usage_string[] = "git [--version] [--exec-path[=GIT_EXEC_PATH]] [-p|--paginate] [--bare] [--git-dir=GIT_DIR] [--help] COMMAND [ARGS]"; diff --git a/grep.c b/grep.c index 0fc078e..fcc6762 100644 --- a/grep.c +++ b/grep.c @@ -1,5 +1,4 @@ #include "cache.h" -#include <regex.h> #include "grep.h" void append_grep_pattern(struct grep_opt *opt, const char *pat, diff --git a/help.c b/help.c index 0824c25..341b9e3 100644 --- a/help.c +++ b/help.c @@ -3,12 +3,11 @@ * * Builtin help-related commands (help, usage, version) */ -#include <sys/ioctl.h> #include "cache.h" #include "builtin.h" #include "exec_cmd.h" #include "common-cmds.h" - +#include <sys/ioctl.h> /* most GUI terminals set COLUMNS (although some don't export it) */ static int term_columns(void) diff --git a/ident.c b/ident.c index d7faba6..6ad8fed 100644 --- a/ident.c +++ b/ident.c @@ -7,9 +7,6 @@ */ #include "cache.h" -#include <pwd.h> -#include <netdb.h> - static char git_default_date[50]; static void copy_gecos(struct passwd *w, char *name, int sz) diff --git a/imap-send.c b/imap-send.c index a6a6568..894cbbd 100644 --- a/imap-send.c +++ b/imap-send.c @@ -24,13 +24,6 @@ #include "cache.h" -#include <assert.h> -#include <netinet/in.h> -#include <netinet/tcp.h> -#include <arpa/inet.h> -#include <sys/socket.h> -#include <netdb.h> - typedef struct store_conf { char *name; const char *path; /* should this be here? its interpretation is driver-specific */ diff --git a/interpolate.c b/interpolate.c index 5d9d188..f992ef7 100644 --- a/interpolate.c +++ b/interpolate.c @@ -2,8 +2,6 @@ * Copyright 2006 Jon Loeliger */ -#include <string.h> - #include "git-compat-util.h" #include "interpolate.h" diff --git a/lockfile.c b/lockfile.c index 2a2fea3..261baff 100644 --- a/lockfile.c +++ b/lockfile.c @@ -1,7 +1,6 @@ /* * Copyright (c) 2005, Junio C Hamano */ -#include <signal.h> #include "cache.h" static struct lock_file *lock_file_list; diff --git a/merge-base.c b/merge-base.c index 009caf8..385f4ba 100644 --- a/merge-base.c +++ b/merge-base.c @@ -1,4 +1,3 @@ -#include <stdlib.h> #include "cache.h" #include "commit.h" diff --git a/merge-index.c b/merge-index.c index 646d090..a9983dd 100644 --- a/merge-index.c +++ b/merge-index.c @@ -1,7 +1,3 @@ -#include <sys/types.h> -#include <sys/wait.h> -#include <signal.h> - #include "cache.h" static const char *pgm; diff --git a/merge-recursive.c b/merge-recursive.c index 6dd6e2e..1de273e 100644 --- a/merge-recursive.c +++ b/merge-recursive.c @@ -3,13 +3,6 @@ * Fredrik Kuivinen. * The thieves were Alex Riesen and Johannes Schindelin, in June/July 2006 */ -#include <stdarg.h> -#include <string.h> -#include <assert.h> -#include <sys/wait.h> -#include <sys/types.h> -#include <sys/stat.h> -#include <time.h> #include "cache.h" #include "cache-tree.h" #include "commit.h" diff --git a/path-list.c b/path-list.c index f8800f8..caaa5cc 100644 --- a/path-list.c +++ b/path-list.c @@ -1,4 +1,3 @@ -#include <stdio.h> #include "cache.h" #include "path-list.h" diff --git a/path.c b/path.c index d2c076d..066f621 100644 --- a/path.c +++ b/path.c @@ -11,7 +11,6 @@ * which is what it's designed for. */ #include "cache.h" -#include <pwd.h> static char bad_path[] = "/bad-path/"; diff --git a/receive-pack.c b/receive-pack.c index 5e5510b..59b682c 100644 --- a/receive-pack.c +++ b/receive-pack.c @@ -6,7 +6,6 @@ #include "exec_cmd.h" #include "commit.h" #include "object.h" -#include <sys/wait.h> static const char receive_pack_usage[] = "git-receive-pack <git-dir>"; diff --git a/refs.c b/refs.c index d911b9e..a101ff3 100644 --- a/refs.c +++ b/refs.c @@ -1,10 +1,8 @@ -#include "refs.h" #include "cache.h" +#include "refs.h" #include "object.h" #include "tag.h" -#include <errno.h> - /* ISSYMREF=01 and ISPACKED=02 are public interfaces */ #define REF_KNOWS_PEELED 04 diff --git a/revision.c b/revision.c index 993bb66..7b6f91f 100644 --- a/revision.c +++ b/revision.c @@ -6,7 +6,6 @@ #include "diff.h" #include "refs.h" #include "revision.h" -#include <regex.h> #include "grep.h" static char *path_name(struct name_path *path, const char *name) diff --git a/rsh.c b/rsh.c index f34409e..5754a23 100644 --- a/rsh.c +++ b/rsh.c @@ -1,10 +1,6 @@ -#include <string.h> -#include <sys/types.h> -#include <sys/socket.h> - +#include "cache.h" #include "rsh.h" #include "quote.h" -#include "cache.h" #define COMMAND_SIZE 4096 diff --git a/run-command.c b/run-command.c index 6190868..492ad3e 100644 --- a/run-command.c +++ b/run-command.c @@ -1,6 +1,5 @@ #include "cache.h" #include "run-command.h" -#include <sys/wait.h> #include "exec_cmd.h" int run_command_v_opt(int argc, const char **argv, int flags) diff --git a/ssh-upload.c b/ssh-upload.c index 20b15ea..0b52ae1 100644 --- a/ssh-upload.c +++ b/ssh-upload.c @@ -12,8 +12,6 @@ #include "rsh.h" #include "refs.h" -#include <string.h> - static unsigned char local_version = 1; static unsigned char remote_version; diff --git a/strbuf.c b/strbuf.c index 9d9d8be..7f14b0f 100644 --- a/strbuf.c +++ b/strbuf.c @@ -1,7 +1,5 @@ -#include <stdio.h> -#include <stdlib.h> -#include "strbuf.h" #include "cache.h" +#include "strbuf.h" void strbuf_init(struct strbuf *sb) { sb->buf = NULL; diff --git a/test-date.c b/test-date.c index 93e8027..62e8f23 100644 --- a/test-date.c +++ b/test-date.c @@ -1,6 +1,3 @@ -#include <stdio.h> -#include <time.h> - #include "cache.h" int main(int argc, char **argv) diff --git a/test-delta.c b/test-delta.c index 1be8ee0..795aa08 100644 --- a/test-delta.c +++ b/test-delta.c @@ -8,13 +8,7 @@ * published by the Free Software Foundation. */ -#include <stdio.h> -#include <unistd.h> -#include <string.h> -#include <fcntl.h> -#include <sys/types.h> -#include <sys/stat.h> -#include <sys/mman.h> +#include "git-compat-util.h" #include "delta.h" static const char usage[] = diff --git a/tree.c b/tree.c index ea386e5..b6f02fe 100644 --- a/tree.c +++ b/tree.c @@ -4,7 +4,6 @@ #include "commit.h" #include "tag.h" #include "tree-walk.h" -#include <stdlib.h> const char *tree_type = "tree"; diff --git a/unpack-trees.c b/unpack-trees.c index b8689eb..2e2232c 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -1,5 +1,3 @@ -#include <signal.h> -#include <sys/time.h> #include "cache.h" #include "dir.h" #include "tree.h" diff --git a/upload-pack.c b/upload-pack.c index 4572fff..32b06b2 100644 --- a/upload-pack.c +++ b/upload-pack.c @@ -1,6 +1,3 @@ -#include <signal.h> -#include <sys/wait.h> -#include <sys/poll.h> #include "cache.h" #include "refs.h" #include "pkt-line.h" diff --git a/var.c b/var.c index a57a33b..39977b9 100644 --- a/var.c +++ b/var.c @@ -4,9 +4,6 @@ * Copyright (C) Eric Biederman, 2005 */ #include "cache.h" -#include <stdio.h> -#include <errno.h> -#include <string.h> static const char var_usage[] = "git-var [-l | <variable>]"; diff --git a/wt-status.c b/wt-status.c index cface6c..db42738 100644 --- a/wt-status.c +++ b/wt-status.c @@ -1,6 +1,6 @@ +#include "cache.h" #include "wt-status.h" #include "color.h" -#include "cache.h" #include "object.h" #include "dir.h" #include "commit.h" -- 1.4.4.2.g205bf ^ permalink raw reply related [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-20 1:12 ` Junio C Hamano @ 2006-12-20 20:17 ` Junio C Hamano 2006-12-20 20:53 ` Linus Torvalds 0 siblings, 1 reply; 52+ messages in thread From: Junio C Hamano @ 2006-12-20 20:17 UTC (permalink / raw) To: Jeff Garzik; +Cc: Linus Torvalds, Nicolas Pitre, Randal L. Schwartz, git Junio C Hamano <junkio@cox.net> writes: > Linus Torvalds <torvalds@osdl.org> writes: > >> On Tue, 19 Dec 2006, Junio C Hamano wrote: >> >>> Jeff Garzik <jeff@garzik.org> writes: >>> >>> > If you are going to do this, you have to audit -every- file, to make >>> > sure git-compat-util.h is -always- the first header. >>> >>> Will do. >> >> Well, since any cases where it isn't (and where we'd care) will show up >> as just a compiler warning, I doubt we really even need to. We can fix >> things up as they get reported too.. > > True, but I've done it already, so... > > Test compile especially on non Linux boxes are appreciated (I'll > do one on an OpenBSD bochs tomorrow myself anyway, though). I've pushed the results out, along with the index-pack updates from Linus/Nico. I needed to fix the changes to git-compat-util.h a bit from the version I sent earlier to make OpenBSD happy (sys/types.h there did not expose u_int unless _BSD_SOURCE was set, and netinet/in.h was duplicated by mistake). ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-20 20:17 ` Junio C Hamano @ 2006-12-20 20:53 ` Linus Torvalds 2006-12-20 21:52 ` Junio C Hamano 0 siblings, 1 reply; 52+ messages in thread From: Linus Torvalds @ 2006-12-20 20:53 UTC (permalink / raw) To: Junio C Hamano; +Cc: Jeff Garzik, Nicolas Pitre, Randal L. Schwartz, git On Wed, 20 Dec 2006, Junio C Hamano wrote: > > I needed to fix the changes to git-compat-util.h a bit from the > version I sent earlier to make OpenBSD happy (sys/types.h there > did not expose u_int unless _BSD_SOURCE was set, and > netinet/in.h was duplicated by mistake). Please don't use "u_int" in the first place. It's an abomination of a type. It should never be used. There's simply no point. It's "unsigned int", and that doesn't need any header files at all. I don't understand why people are lazy, and can't write "unsigned", but then introduce a type that requires you to have all kinds of magic. The lazyness just results in more work, and is totally nonportable. So "u_int" (along with its idiotic brethren "u_long", "u_short" and "u_char") is just silly. The only user in git seems to have been copied from a source that is insane. It does u_int words[NS_IN6ADDRSZ / NS_INT16SZ]; which is just insane. It actually seems to want to use "uint16_t", which at least would make sense, and be a type that has some _point_ to it. So please change the "u_int" to either "unsigned int" or "uint16_t". Either is better. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-20 20:53 ` Linus Torvalds @ 2006-12-20 21:52 ` Junio C Hamano 0 siblings, 0 replies; 52+ messages in thread From: Junio C Hamano @ 2006-12-20 21:52 UTC (permalink / raw) To: Linus Torvalds; +Cc: Jeff Garzik, Nicolas Pitre, Randal L. Schwartz, git Linus Torvalds <torvalds@osdl.org> writes: > On Wed, 20 Dec 2006, Junio C Hamano wrote: >> >> I needed to fix the changes to git-compat-util.h a bit from the >> version I sent earlier to make OpenBSD happy (sys/types.h there >> did not expose u_int unless _BSD_SOURCE was set, and >> netinet/in.h was duplicated by mistake). > > Please don't use "u_int" in the first place. It's an abomination of a > type. It should never be used. I did not want to touch imported sources in compat/, but > The only user in git seems to have been copied from a source that is > insane. It does > > u_int words[NS_IN6ADDRSZ / NS_INT16SZ]; > > which is just insane. It actually seems to want to use "uint16_t", which > at least would make sense, and be a type that has some _point_ to it. the above argument makes 100% sense. Will fix. However, on sane platforms we do not even compile that file. The problem I observed was that <include/netinet/tcp.h> on OpenBSD uses u_int wants the source code that uses that header file to first include <sys/types.h> to get u_int, which in turn requires __BSD_VISIBLE to be in effect. Unfortunately I think _BSD_SOURCE needs to stay for this reason. ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux 2006-12-19 19:55 ` Linus Torvalds 2006-12-19 19:57 ` Randal L. Schwartz 2006-12-19 20:02 ` Jeff Garzik @ 2006-12-20 22:13 ` Nikolai Weibull 2 siblings, 0 replies; 52+ messages in thread From: Nikolai Weibull @ 2006-12-20 22:13 UTC (permalink / raw) To: Linus Torvalds; +Cc: Nicolas Pitre, Junio C Hamano, Randal L. Schwartz, git On 12/19/06, Linus Torvalds <torvalds@osdl.org> wrote: > It's the C++ people who tend to have sucky compile times. Always looking for a way to bash on C++, eh? ^ permalink raw reply [flat|nested] 52+ messages in thread
* Re: cloning the kernel - why long time in "Resolving 313037 deltas" @ 2006-12-21 8:41 linux 0 siblings, 0 replies; 52+ messages in thread From: linux @ 2006-12-21 8:41 UTC (permalink / raw) To: git Remember, OS X is FreeBSD somehow layered on top of Mach 3. In particular, it calls though to Mach for virtual memory. And Mach is at its heart an academic design aiming for flexibility in its VM system rather tha performance. In particular, its provisions for user-space pagers. I can easily believe that its mmap is 10x slower that Linux. 60x is.. a bit extreme. A Darwin kernel hacker should probably get a nudge about it. ^ permalink raw reply [flat|nested] 52+ messages in thread
end of thread, other threads:[~2007-01-03 13:55 UTC | newest]
Thread overview: 52+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <86y7p57y05.fsf@blue.stonehenge.com>
[not found] ` <Pine.LNX.4.63.0612182154170.19693@wbgn013.biozentrum.uni-wuerzburg.de>
[not found] ` <Pine.LNX.4.63.0612182213020.19693@wbgn013.biozentrum.uni-wuerzburg.de>
[not found] ` <Pine.LNX.4.64.0612181638220.18171@xanadu.home>
2006-12-18 21:55 ` [PATCH] fetch-pack: avoid fixing thin packs when unnecessary Johannes Schindelin
2006-12-18 22:17 ` Nicolas Pitre
[not found] ` <Pine.LNX.4.64.0612181251020.3479@woody.osdl.org>
[not found] ` <86r6uw9azn.fsf@blue.stonehenge.com>
[not found] ` <Pine.LNX.4.64.0612181625140.18171@xanadu.home>
2006-12-18 22:01 ` cloning the kernel - why long time in "Resolving 313037 deltas" Randal L. Schwartz
2006-12-18 22:09 ` Nicolas Pitre
2006-12-18 22:21 ` Randal L. Schwartz
2006-12-18 22:50 ` Nicolas Pitre
2006-12-18 22:22 ` Linus Torvalds
2006-12-18 22:26 ` Randal L. Schwartz
2006-12-18 23:02 ` Martin Langhoff
2006-12-22 1:44 ` Kyle Moffett
2006-12-22 1:56 ` Shawn Pearce
2006-12-22 8:04 ` Marco Roeland
2007-01-03 13:55 ` Andreas Ericsson
2006-12-18 23:28 ` Linus Torvalds
2006-12-19 0:13 ` Nicolas Pitre
2006-12-19 5:11 ` Theodore Tso
2006-12-19 6:39 ` Shawn Pearce
2006-12-19 6:51 ` Linus Torvalds
2006-12-19 7:26 ` Shawn Pearce
2006-12-19 7:52 ` Marco Roeland
2006-12-19 7:58 ` Shawn Pearce
2006-12-19 8:32 ` Shawn Pearce
2006-12-19 8:40 ` Marco Roeland
2006-12-19 8:49 ` Shawn Pearce
2006-12-19 9:13 ` Marco Roeland
2006-12-19 20:28 ` Alex Riesen
2006-12-21 20:35 ` Juergen Ruehle
2006-12-19 16:19 ` Theodore Tso
2006-12-19 16:57 ` Linus Torvalds
2006-12-20 1:54 ` Shawn Pearce
2006-12-20 1:58 ` Shawn Pearce
2006-12-19 6:47 ` Linus Torvalds
2006-12-19 8:32 ` Johannes Schindelin
2006-12-19 9:10 ` Junio C Hamano
2006-12-19 9:47 ` Jeff King
2006-12-19 10:24 ` Andy Whitcroft
2006-12-19 15:53 ` [PATCH] index-pack usage of mmap() is unacceptably slower on many OSes other than Linux Nicolas Pitre
2006-12-19 19:00 ` Junio C Hamano
2006-12-19 19:14 ` Nicolas Pitre
2006-12-19 19:55 ` Linus Torvalds
2006-12-19 19:57 ` Randal L. Schwartz
2006-12-19 20:03 ` Randal L. Schwartz
2006-12-19 20:02 ` Jeff Garzik
2006-12-20 0:30 ` Junio C Hamano
2006-12-20 0:40 ` Linus Torvalds
2006-12-20 0:50 ` Jeff Garzik
2006-12-20 1:12 ` Junio C Hamano
2006-12-20 20:17 ` Junio C Hamano
2006-12-20 20:53 ` Linus Torvalds
2006-12-20 21:52 ` Junio C Hamano
2006-12-20 22:13 ` Nikolai Weibull
2006-12-21 8:41 cloning the kernel - why long time in "Resolving 313037 deltas" linux
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).