From: Linus Torvalds <torvalds@linux-foundation.org>
To: Git Mailing List <git@vger.kernel.org>,
Junio C Hamano <gitster@pobox.com>
Subject: three-way diff performance problem
Date: Tue, 21 Jul 2009 11:10:01 -0700 (PDT) [thread overview]
Message-ID: <alpine.LFD.2.01.0907211038120.19335@localhost.localdomain> (raw)
Ok, so I decided to abuse git today, because I actually had a problem
where git could potentially solve it. Sadly, it's now 55 minutes later,
and my single git command hasn't completed yet :(
What did I do?
We have a really odd bug-report for the kernel where switching the
compiler from using '-fwrapv' to using '-fno-strict-overflow' causes a
non-working kernel.
So the exact same kernel sources work fine with either 'fwrapv' or,
indeed, with neither option, but with '-fno-strict-overflow' it just hangs
at boot.
Now, the problem is that I can't actually see the issue, so I just have
three kernel binaries, two of which work, and one which does not. I can
disassemble them, and I did, and the end result are three files with
roughly 1.14 million lines.
The diffs between them are in the 40- to 90-thousand line area, and it's
really hard to read diffs of assembly code and try to figure out what the
problem with the generated code is. Yeah, I'm pretty good at reading
assembler, but I'm not _that_ good. Just scanning 40 thousand lines takes
a while, looking at it _closely_ is just incredibly hard and isn't going
to ever really work out.
But there's a saving grace: I have two working kernels. And the
differences are often silly things like different register allocation, and
they are _different_. Many functions end up being the same between one
working kernel and the failing one, while being different in another
working kernel.
So what I want is the differences that are different from _both_ working
kernels, since being the same as _one_ of them means that it's not a bug.
Now, being a git person, what does that say? Right: just check in the
working kernels as two branches under the same filename, then merge the
branches, and force the merge result to be the non-working kernel, and do
a three-way combined context diff! So I did exactly that.
An hour later, it's still thinking.
So I used our nice new 'perf' tool from the kernel on a run that I killed
after five minuets, and lookie here:
#
# (17566764 samples)
#
# Overhead Command Shared Object Symbol
# ........ ................ ......................... ......
#
96.26% git /home/torvalds/bin/git [.] consume_line
...
it says we're spending pretty much all our time in a single function.
In fact, on the instruction-level profile, it looks like it's all spent on
four instructions:
3.51 : 45aec0: 4c 85 72 10 test %r14,0x10(%rdx)
86.27 : 45aec4: 48 0f 45 ca cmovne %rdx,%rcx
: if (sline->lost_head) {
: struct lline *last_one = NULL;
: /* We cannot squash it with earlier one */
: for (lline = sline->lost_head;
: lline;
: lline = lline->next)
6.69 : 45aec8: 48 8b 12 mov (%rdx),%rdx
:
: /* Check to see if we can squash things */
: if (sline->lost_head) {
: struct lline *last_one = NULL;
: /* We cannot squash it with earlier one */
: for (lline = sline->lost_head;
3.51 : 45aecb: 48 85 d2 test %rdx,%rdx
0.01 : 45aece: 75 f0 jne 45aec0 <consume_line+0xc0>
(ok, five, I included the 0.01% of the branch instruction that finishes
the loop - the way Nehalem works, it will merge those test+jne
instructions into one macro-instruction internally, which is why the
branch that is obviously part of the loop doesn't get a lot of hits)
Just FYI, those four instructions come from inlining 'append_lost()'. It's
compiled that first for-loop into some very good assembly language, but
the problem is that this all looks very much quadratic in number of
lost_lines. The quality of the compiler thus sadly in no way helps make up
for a really really expensive algorithm.
Any ideas?
For any git people who want to look at the horror that caused this, feel
free to download the (fairly small) git repository from
git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/git-merge-test.git
and try to do a
git show --color fno-strict-overflow
just for fun. Do "gitk --all" to see the simple nature of the repository,
but if you do, you'll need to kill the "git diff-tree -r -p --textconv .."
process that gitk will start (and which will take forever).
The code in question is all Junio's, and is over three years old. So it's
obviously been working fine. And the load is obviously not really
appropriate, but at the same time I think it's an interesting thing to
try. An it doesn't seem to be about memory use - It's now been running for
85 minutes (ok, this email took long to write, and I uploaded the thing
etc in the meantime etc), and it's holding steady at roughly 175MB RSS.
Linus
next reply other threads:[~2009-07-21 18:12 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2009-07-21 18:10 Linus Torvalds [this message]
2009-07-21 18:16 ` three-way diff performance problem Linus Torvalds
2009-07-21 19:21 ` Junio C Hamano
2009-07-21 19:31 ` Linus Torvalds
2009-07-21 19:47 ` Junio C Hamano
2009-07-21 20:34 ` Linus Torvalds
2009-07-21 20:46 ` Junio C Hamano
2009-07-21 21:01 ` Linus Torvalds
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=alpine.LFD.2.01.0907211038120.19335@localhost.localdomain \
--to=torvalds@linux-foundation.org \
--cc=git@vger.kernel.org \
--cc=gitster@pobox.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).