* Is cogito really this inefficient
From: Russell King @ 2005-07-13 12:50 UTC
To: git

This says it all. 1min 22secs to generate a patch from a locally
modified but uncommitted file. cp, edit, diff would be several orders of
magnitude faster.

What's going on?

$ /usr/bin/time cg-diff drivers/serial/8250.c > o
Command exited with non-zero status 1
14.40user 17.47system 1:22.96elapsed 38%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (180major+14692minor)pagefaults 0swaps

diff --git a/drivers/serial/8250.c b/drivers/serial/8250.c
--- a/drivers/serial/8250.c
+++ b/drivers/serial/8250.c
@@ -2333,6 +2333,7 @@ static int __devinit serial8250_probe(st
 			dev_err(dev, "unable to register port at index %d "
 				"(IO%lx MEM%lx IRQ%d): %d\n", i,
 				p->iobase, p->mapbase, p->irq, ret);
+			printk(KERN_ERR "uartclk was %d\n", port.uartclk);
 		}
 	}
 	return 0;

--
Russell King
* Re: Is cogito really this inefficient
From: Matthias Urlichs @ 2005-07-13 16:51 UTC
To: git

Hi, Russell King wrote:

> This says it all. 1min 22secs to generate a patch from a locally
> modified but uncommitted file.

I only get that when the index is out-of-date WRT the file modification
dates, so cg-diff has to examine every file.

The good news is that the index is being updated as it finds that the
files are in sync, so expect this to be significantly faster the next time
around.

--
Matthias Urlichs | {M:U} IT Design @ m-u-it.de | smurf@smurf.noris.de
Disclaimer: The quote was selected randomly. Really. | http://smurf.noris.de
 - -
Praise the sea; on shore remain.
		-- John Florio
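[Editor's sketch] The behaviour Matthias describes (files are only read and re-hashed when their cached modification data no longer matches, and the index is re-synced once the content turns out to be unchanged) can be illustrated roughly as follows. This is a minimal model, not git's actual code: `IndexEntry`, `refresh` and the choice of comparing only mtime and size are assumptions made for illustration; only the "blob <size>\0" header used for hashing is standard git behaviour.

```python
import hashlib
import os
from dataclasses import dataclass


@dataclass
class IndexEntry:
    """Hypothetical, simplified index entry: cached stat data plus content hash."""
    mtime_ns: int
    size: int
    sha1: str


def blob_sha1(path):
    """Hash a file the way git names a blob: 'blob <size>\\0' header plus raw bytes."""
    with open(path, "rb") as f:
        data = f.read()
    return hashlib.sha1(b"blob %d\0" % len(data) + data).hexdigest()


def refresh(index):
    """Re-validate every entry; return (modified_paths, number_of_files_hashed)."""
    modified, hashed = [], 0
    for path, entry in index.items():
        st = os.stat(path)
        if (entry.mtime_ns, entry.size) == (st.st_mtime_ns, st.st_size):
            continue  # stat data matches: trust the cached hash, no read needed
        hashed += 1  # stat data is stale: the file has to be read and hashed
        if blob_sha1(path) == entry.sha1:
            # Content is unchanged, only the timestamp moved: re-sync the
            # cached stat data so the next run can skip this file again.
            entry.mtime_ns, entry.size = st.st_mtime_ns, st.st_size
        else:
            modified.append(path)
    return modified, hashed
```

In this model, touching every file in a tree makes one refresh expensive (every file is hashed once), after which the cost drops back to one `stat` call per file. A run that stays slow every time, as Russell reports below, would therefore point at something other than stale stat data.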
* Re: Is cogito really this inefficient
From: Russell King @ 2005-07-14 7:38 UTC
To: Matthias Urlichs; +Cc: git

On Wed, Jul 13, 2005 at 06:51:30PM +0200, Matthias Urlichs wrote:
> Hi, Russell King wrote:
>
> > This says it all. 1min 22secs to generate a patch from a locally
> > modified but uncommitted file.
>
> I only get that when the index is out-of-date WRT the file modification
> dates, so cg-diff has to examine every file.
>
> The good news is that the index is being updated as it finds that the
> files are in sync, so expect this to be significantly faster the next time
> around.

It isn't. First time it was 1min11, second time _immediately_ after it
was 1min22. See my reply to Linus.

Oddly, show-diff seemed to be a lot more efficient in previous git
revisions.

--
Russell King
* Re: Is cogito really this inefficient
From: Linus Torvalds @ 2005-07-13 20:28 UTC
To: Russell King; +Cc: git

On Wed, 13 Jul 2005, Russell King wrote:
>
> This says it all. 1min 22secs to generate a patch from a locally
> modified but uncommitted file.

No, there's something else going on.

Most likely that something forced a total index file re-validation, and
the time you see is every single checked out file having its SHA1
re-computed.

Was this a recently cloned tree, or what was the last operation you did on
that tree before that command? Something must have invalidated the index.

		Linus
* Re: Is cogito really this inefficient
From: Russell King @ 2005-07-14 7:37 UTC
To: Linus Torvalds; +Cc: git

On Wed, Jul 13, 2005 at 01:28:18PM -0700, Linus Torvalds wrote:
> On Wed, 13 Jul 2005, Russell King wrote:
> > This says it all. 1min 22secs to generate a patch from a locally
> > modified but uncommitted file.
>
> No, there's something else going on.
>
> Most likely that something forced a total index file re-validation, and
> the time you see is every single checked out file having its SHA1
> re-computed.
>
> Was this a recently cloned tree, or what was the last operation you did on
> that tree before that command? Something must have invalidated the index.

cg-update origin

and then I edited drivers/serial/8250.c

As discovered using:

  sh -x /usr/bin/cg-diff drivers/serial/8250.c

it appears that cg-diff does a

  git-update-cache --refresh >/dev/null

each time it's run, which is taking the bulk of the time. Also note
that curiously, it exits with status 1.

--
Russell King
* Re: Is cogito really this inefficient
From: Catalin Marinas @ 2005-07-14 9:08 UTC
To: Russell King; +Cc: Linus Torvalds, git

Russell King <rmk@arm.linux.org.uk> wrote:
> it appears that cg-diff does a
>
>   git-update-cache --refresh >/dev/null
>
> each time it's run, which is taking the bulk of the time. Also note
> that curiously, it exits with status 1.

Does git-ls-files --unmerged show any files?

--
Catalin
* Re: Is cogito really this inefficient
From: Russell King @ 2005-07-14 9:59 UTC
To: Catalin Marinas; +Cc: Linus Torvalds, git

On Thu, Jul 14, 2005 at 10:08:31AM +0100, Catalin Marinas wrote:
> Russell King <rmk@arm.linux.org.uk> wrote:
> > it appears that cg-diff does a
> >
> >   git-update-cache --refresh >/dev/null
> >
> > each time it's run, which is taking the bulk of the time. Also note
> > that curiously, it exits with status 1.
>
> Does git-ls-files --unmerged show any files?

No, and it returns fairly quickly:

$ /usr/bin/time git-ls-files --unmerged
0.29user 0.03system 0:00.43elapsed 73%CPU (0avgtext+0avgdata 0maxresident)k
0inputs+0outputs (0major+655minor)pagefaults 0swaps

Actually, I should've left the sh -x /usr/bin/cg-diff drivers/serial/8250.c
running a little longer. It's not the git-update-cache command which
is taking the time, it's git-diff-cache.

Running the diff several times, both with and without changes to
drivers/serial/8250.c, it seems that sometimes it's faster. I guess it
has to do with dentry invalidation...

However, the point is - I've only asked for _one_ file. Why do we need
to look at _every_ file in the tree? I could understand this behaviour
if I'd asked for a diff across the whole tree, but I didn't.

Internally, the sha1 of the unmodified drivers/serial/8250.c should be
known, so should be trivial to unpack that and generate a diff. Given
the cache, this should be something which should be lightning fast when
the requested fileset to diff is already known.

--
Russell King
* Re: Is cogito really this inefficient
From: Linus Torvalds @ 2005-07-14 15:51 UTC
To: Russell King; +Cc: Catalin Marinas, git

On Thu, 14 Jul 2005, Russell King wrote:
>
> Actually, I should've left the sh -x /usr/bin/cg-diff drivers/serial/8250.c
> running a little longer. It's not the git-update-cache command which
> is taking the time, it's git-diff-cache.

Ok. git-diff-cache actually ends up reading your HEAD tree, and that, in
turn, is 1000+ tree objects. So it can take a while for the whole tree,
especially in the nonpacked and uncached case.

git-diff-tree (comparing two trees) is smart enough to limit itself to
just the sub-trees that have been named, and would have compared the two
trees by looking up just eight objects (three subdirectories from each
tree, and then the file itself from both trees).

But git-diff-cache isn't - because it's comparing the tree against the
index file, and the index is inevitably the whole tree.

And I now think I know what makes it slow. Not only are you basically
opening 1100 files (the tree objects - there's really that many
subdirectories in the kernel. Scary), but because you have alternate
object directories, and almost all of the objects are in the alternate
(not your primary), you'll basically always end up _first_ looking in the
primary, failing, and then looking in the alternate.

Together with the hashing, you'll be looking all over the place, in other
words ;)

Which means that you'll be needing a fair amount of memory to keep all of
those negative dentries etc cached (and the directory tree too).

This is something the pack-files will just help enormously with, but it
was only recently that we turned git around to check the pack-files
_first_, and the object directories second, so you probably won't see it
(not to mention that you probably don't have big pack-files at all ;)

I'll look into making diff-cache be more efficient. I normally don't use
it myself, so I didn't bother (I use git-diff-files, which is way more
efficient, but doesn't show the difference against the _tree_, it shows
the difference against the index. Since cogito tries to hide the index
from you, cogito can't very well use that).

		Linus
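[Editor's sketch] The lookup pattern Linus describes, where each loose object is first sought in the primary object directory and only then in the alternate, might be modelled as below. The two-hex-digit fan-out directory is git's standard loose-object layout; the function names and the `misses` counter are hypothetical and exist only to make the cost of the failed lookups visible.

```python
import os


def loose_object_path(objdir, sha1_hex):
    """Loose objects live at <objdir>/<first 2 hex digits>/<remaining 38>."""
    return os.path.join(objdir, sha1_hex[:2], sha1_hex[2:])


def find_object(sha1_hex, object_dirs):
    """Search the object directories in order (primary first, then alternates).

    Returns (path_or_None, misses), where misses counts the failed
    filesystem lookups that happened before the object was found.
    """
    misses = 0
    for objdir in object_dirs:
        path = loose_object_path(objdir, sha1_hex)
        if os.path.exists(path):
            return path, misses
        misses += 1  # a negative lookup the kernel still has to resolve
    return None, misses
```

With almost every object in the alternate, each of the roughly 1100 tree objects costs one failed lookup in the primary before the successful one, and because the hash prefix picks the fan-out directory, consecutive lookups land in unrelated directories.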
* Re: Is cogito really this inefficient
From: Linus Torvalds @ 2005-07-15 0:29 UTC
To: Russell King; +Cc: Catalin Marinas, git

On Thu, 14 Jul 2005, Linus Torvalds wrote:
>
> I'll look into making diff-cache be more efficient. I normally don't use
> it myself, so I didn't bother (I use git-diff-files, which is way more
> efficient, but doesn't show the difference against the _tree_, it shows
> the difference against the index. Since cogito tries to hide the index
> from you, cogito can't very well use that).

Ok, done.

I made git-diff-cache _and_ git-diff-files limit the pathnames early, so
that they don't even bother expanding the tree objects that are
irrelevant, and don't bother even validating index objects that don't
match the pathnames given.

Junio - I think this makes gitcore-pathspec pretty pointless, but I didn't
actually remove it. I guess "git-diff-helper" still uses it.

		Linus
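[Editor's sketch] Limiting pathnames early, so that irrelevant tree objects are never expanded, can be pictured with the toy model below. It is not the code Linus committed: trees are represented as nested dicts, blobs as strings, and `limited_walk` and `matches_or_leads_to` are hypothetical names chosen for this illustration.

```python
def matches_or_leads_to(path, pathspec, is_tree):
    """True if path is at or below pathspec, or is a directory on the way to it."""
    if path == pathspec or path.startswith(pathspec + "/"):
        return True
    return is_tree and pathspec.startswith(path + "/")


def limited_walk(tree, pathspec, prefix="", opened=None):
    """Collect (path, blob) pairs under pathspec, expanding only relevant trees.

    Returns (results, opened), where opened lists every tree that had to be
    read; subtrees that cannot contain the pathspec are never entered.
    """
    if opened is None:
        opened = []
    opened.append(prefix or "/")
    results = []
    for name, node in sorted(tree.items()):
        path = prefix + name
        is_tree = isinstance(node, dict)
        if not matches_or_leads_to(path, pathspec, is_tree):
            continue  # pruned early: this entry is never expanded or compared
        if is_tree:
            sub, _ = limited_walk(node, pathspec, path + "/", opened)
            results.extend(sub)
        else:
            results.append((path, node))
    return results, opened
```

For a pathspec of `drivers/serial/8250.c` this opens three trees (the root, `drivers/` and `drivers/serial/`) plus the file itself, which lines up with the eight objects for two trees mentioned earlier in the thread.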
* Re: Is cogito really this inefficient
From: Junio C Hamano @ 2005-07-15 2:10 UTC
To: Linus Torvalds; +Cc: git

Linus Torvalds <torvalds@osdl.org> writes:

> On Thu, 14 Jul 2005, Linus Torvalds wrote:
>>
>> I'll look into making diff-cache be more efficient. I normally don't use
>> it myself, so I didn't bother (I use git-diff-files, which is way more
>> efficient, but doesn't show the difference against the _tree_, it shows
>> the difference against the index. Since cogito tries to hide the index
>> from you, cogito can't very well use that).
>
> Ok, done.

Wonderful.

> Junio - I think this makes gitcore-pathspec pretty pointless, but I didn't
> actually remove it. I guess "git-diff-helper" still uses it.

And probably it shouldn't; diff-helper should be raw-to-patch
converter, nothing more.

Usually I'd volunteer to clean up the remaining mess (which was
originally my mess anyway) myself, but since I'd already asked
smurf to help cleaning up the diff option parsing, and recently
I've suddenly got quite busy in the day job, so ...
* Re: Is cogito really this inefficient
From: Russell King @ 2005-07-15 9:48 UTC
To: Linus Torvalds; +Cc: Catalin Marinas, git

On Thu, Jul 14, 2005 at 05:29:09PM -0700, Linus Torvalds wrote:
> On Thu, 14 Jul 2005, Linus Torvalds wrote:
> > I'll look into making diff-cache be more efficient. I normally don't use
> > it myself, so I didn't bother (I use git-diff-files, which is way more
> > efficient, but doesn't show the difference against the _tree_, it shows
> > the difference against the index. Since cogito tries to hide the index
> > from you, cogito can't very well use that).
>
> Ok, done.

Thanks Linus. I'll look forward to trying this out.

--
Russell King
* Re: Is cogito really this inefficient
From: Linus Torvalds @ 2005-07-14 15:26 UTC
To: Russell King; +Cc: git

On Thu, 14 Jul 2005, Russell King wrote:
>
> cg-update origin
> and then I edited drivers/serial/8250.c

Hmm..

> it appears that cg-diff does a
>
>   git-update-cache --refresh >/dev/null
>
> each time it's run, which is taking the bulk of the time. Also note
> that curiously, it exits with status 1.

That part is normal - an update-cache is fast (it takes me 0.08 sec for the
kernel) if the cache is already mostly up-to-date, and the non-zero exit
status just means that some file was different (ie it's telling the caller
that there are edits in your tree - drivers/serial/8250.c).

The update-cache is slow only if the index isn't up-to-date, which can
happen either if somebody plays games with the index, or if somebody
touches all the files in the tree.

It's quite possible that some path in cg-update ends up not updating the
index properly. For example, I notice that the "fast-forward" uses
"git-checkout-cache -f -a", which can do so (lack of "-u" flag), but then
it does do a "git-update-cache --refresh" later, so that doesn't seem to
be it either.

If you do a "git-diff-files" every once in a while, it will _scream_ at
you whenever you have files that aren't up-to-date in the cache. That's
normal in small doses, of course (eg your edit of drivers/serial/8250.c
would make that one not up-to-date), but if you get a _lot_ of files
listed, that's usually a sign that something screwed up your index.

		Linus
* Re: Is cogito really this inefficient
From: Petr Baudis @ 2005-07-19 23:54 UTC
To: Linus Torvalds; +Cc: Russell King, git

Dear diary, on Thu, Jul 14, 2005 at 05:26:07PM CEST, I got a letter
where Linus Torvalds <torvalds@osdl.org> told me that...
> It's quite possible that some path in cg-update ends up not updating the
> index properly. For example, I notice that the "fast-forward" uses
> "git-checkout-cache -f -a", which can do so (lack of "-u" flag), but then
> it does do a "git-update-cache --refresh" later, so that doesn't seem to
> be it either.

Just a side note for casual readers: Cogito could use a cleanup here. In
large part it ignores things like git-checkout-cache -u simply because
there was no such option at the time that part of Cogito was written. I
myself am not even too familiar with those gazillions of funny new
options, and as long as it works, I prefer not to touch that code, but if
someone is bored and wants to get familiar with core git usage as well as
Cogito internals...

--
				Petr "Pasky" Baudis
Stuff: http://pasky.or.cz/
If you want the holes in your knowledge showing up try teaching
someone.  -- Alan Cox
end of thread, other threads:[~2005-07-19 23:55 UTC | newest]

Thread overview: 13+ messages
2005-07-13 12:50 Is cogito really this inefficient Russell King
2005-07-13 16:51 ` Matthias Urlichs
2005-07-14  7:38   ` Russell King
2005-07-13 20:28 ` Linus Torvalds
2005-07-14  7:37   ` Russell King
2005-07-14  9:08     ` Catalin Marinas
2005-07-14  9:59       ` Russell King
2005-07-14 15:51         ` Linus Torvalds
2005-07-15  0:29           ` Linus Torvalds
2005-07-15  2:10             ` Junio C Hamano
2005-07-15  9:48             ` Russell King
2005-07-14 15:26     ` Linus Torvalds
2005-07-19 23:54       ` Petr Baudis