* git log with ordering option and --first-parent is unnecessarily slow @ 2016-04-09 3:55 Josiah Boning 2016-04-09 5:01 ` Jeff King 0 siblings, 1 reply; 3+ messages in thread From: Josiah Boning @ 2016-04-09 3:55 UTC (permalink / raw) To: git As measured on linux.git, adding --date-order to a log command can result in a significant slowdown (~25x here; I've seen ~100x on other repositories): $ time git log --first-parent --max-count=51 master > /dev/null real 0m0.024s user 0m0.006s sys 0m0.016s $ time git log --date-order --first-parent --max-count=51 master > /dev/null real 0m0.652s user 0m0.570s sys 0m0.074s In combination with --first-parent, --date-order (or any other ordering option) should be a no-op, since --first-parent selects a linear history. So it seems like there's a significant performance win available by ignoring ordering options when --first-parent is present. Would this change be desirable? If so, I'll see about submitting a patch. More generally, it seems like it might be possible to identify when the selected commits are linear and avoid the cost of sorting. Josiah ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: git log with ordering option and --first-parent is unnecessarily slow 2016-04-09 3:55 git log with ordering option and --first-parent is unnecessarily slow Josiah Boning @ 2016-04-09 5:01 ` Jeff King 2016-04-09 5:42 ` Josiah Boning 0 siblings, 1 reply; 3+ messages in thread From: Jeff King @ 2016-04-09 5:01 UTC (permalink / raw) To: Josiah Boning; +Cc: git On Fri, Apr 08, 2016 at 08:55:54PM -0700, Josiah Boning wrote: > As measured on linux.git, adding --date-order to a log command can > result in a significant slowdown (~25x here; I've seen ~100x on other > repositories): > > $ time git log --first-parent --max-count=51 master > /dev/null > real 0m0.024s > user 0m0.006s > sys 0m0.016s > $ time git log --date-order --first-parent --max-count=51 master > /dev/null > real 0m0.652s > user 0m0.570s > sys 0m0.074s Try timing "git log --first-parent master >/dev/null". It should be about the same as the latter, which gives a hint about what is going on. It's the "--max-count" which is interesting here. It is applied _after_ the sorting. So in the non-sorted case, git can stream out commits and stop after printing 51. In the sorted case, we have to access every commit to get its date (well, every commit on the first-parent path), then sort them, and then return the first 51. > In combination with --first-parent, --date-order (or any other > ordering option) should be a no-op, since --first-parent selects a > linear history. So it seems like there's a significant performance win > available by ignoring ordering options when --first-parent is present. > Would this change be desirable? If so, I'll see about submitting a > patch. I do agree that --date-order on a linear parent walk cannot change the ordering (which guarantees child-before-parent), and is a noop. But note that not all first-parent invocations are strictly linear. For example: git log --first-parent --date-order master next which starts from two tips. We'd still want to order commits from the two lists according to --date-order. I suppose we could catch the single-tip, first-parent case and ignore any ordering options which imply child-before-parent (which is currently all of them). But I did not think too hard if there are any other corner cases. This sounds like a case of "doctor, it hurts when I do this". Why do you want to add --date-order in such a case, when it is a noop? > More generally, it seems like it might be possible to identify when > the selected commits are linear and avoid the cost of sorting. It's not the cost of sorting. It's the cost of accessing the commits (if you profile, you should see most time spent in zlib). So figuring out that the case is linear will require roughly the same expense. -Peff ^ permalink raw reply [flat|nested] 3+ messages in thread
* Re: git log with ordering option and --first-parent is unnecessarily slow 2016-04-09 5:01 ` Jeff King @ 2016-04-09 5:42 ` Josiah Boning 0 siblings, 0 replies; 3+ messages in thread From: Josiah Boning @ 2016-04-09 5:42 UTC (permalink / raw) To: Jeff King; +Cc: git On Fri, Apr 8, 2016 at 10:01 PM, Jeff King wrote: > I do agree that --date-order on a linear parent walk cannot change the > ordering (which guarantees child-before-parent), and is a noop. But > note that not all first-parent invocations are strictly linear. For > example: > > git log --first-parent --date-order master next > > which starts from two tips. We'd still want to order commits from the > two lists according to --date-order. Ah, indeed; multiple tips certainly count as a nonlinearity. > I suppose we could catch the single-tip, first-parent case and ignore > any ordering options which imply child-before-parent (which is currently > all of them). But I did not think too hard if there are any other corner > cases. This sounds like a case of "doctor, it hurts when I do this". Why > do you want to add --date-order in such a case, when it is a noop? It hurt when I did that, so I stopped. In my case, the git log command was written when the repositories involved were much smaller, so the delay was not so noticeable. Today I found and fixed the invocation--but it still seems nice to fix at the git level, if that's possible without adding much complexity. > It's not the cost of sorting. It's the cost of accessing the commits (if > you profile, you should see most time spent in zlib). So figuring out > that the case is linear will require roughly the same expense. I'm imagining that while traversing the commit graph, one could keep an is_linear flag, which is set initially if starting from a single tip, and becomes unset when a merge commit is encountered. If is_linear is still true when the entire requested range or the desired number of commit sare accumulated, then there's no need to sort. This is, as a disclaimer, the guesswork of someone not familiar with the codebase. I'll go read the source before imagining further :) Josiah ^ permalink raw reply [flat|nested] 3+ messages in thread
end of thread, other threads:[~2016-04-09 5:42 UTC | newest] Thread overview: 3+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-04-09 3:55 git log with ordering option and --first-parent is unnecessarily slow Josiah Boning 2016-04-09 5:01 ` Jeff King 2016-04-09 5:42 ` Josiah Boning
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).