From: Austin S Hemmelgarn <ahferroin7@gmail.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
Linux-Kernel mailing list <linux-kernel@vger.kernel.org>,
Alan Cox <alan@redhat.com>
Subject: Re: [PATCH 1/1] x86_64: add config options to optimize for newer AMD processors
Date: Thu, 03 Oct 2013 14:12:07 -0400
Message-ID: <524DB377.2070909@gmail.com>
In-Reply-To: <20131003165740.GA17417@pd.tnic>
On 2013-10-03 12:57, Borislav Petkov wrote:
> On Thu, Oct 03, 2013 at 09:27:45AM -0700, Linus Torvalds wrote:
>> On Thu, Oct 3, 2013 at 5:06 AM, Austin S Hemmelgarn
>> <ahferroin7@gmail.com> wrote:
>>> improved. Building kernel 3.12-rc2 with allmodconfig using 8 jobs on a FX-8320 takes
>>>
>>> 22 minutes and 57 seconds on a kernel with CONFIG_MK8,
>>> 21 minutes and 35 seconds on a kernel with CONFIG_GENERIC, and
>>> 19 minutes and 11 seconds on a kernel with CONFIG_PILEDRIVER.
>>
>> That's certainly noticeable. Surprisingly so. What makes MK8 so bad in
>> particular, I wonder?
>>
>> Just out of interest, have you done any profiles on the kernel cost
>> here to see what it is that makes such a big difference. Because
>> normally on a kernel build, I see most of the overhead in path lookup.
>> But that's only true for otherwise optimized builds that don't have
>> system call auditing etc debugging that spreads the costs out over
>> everything..
>
> Yeah, I was having some doubts about the numbers above so I ran my own
> benchmarking, machine is a Piledriver box:
>
> vendor_id : AuthenticAMD
> cpu family : 21
> model : 2
> model name : AMD FX(tm)-8350 Eight-Core Processor
> stepping : 0
>
> and I don't really see any of those improvements above. Actually,
> -march=bdver2 is even slightly worse in comparison to mk8.
>
> And the workload is of building a config specific to that machine but
> allmodconfig looks very similar, the numbers being simply higher.
>
> $ zgrep MK8 /proc/config.gz
> CONFIG_MK8=y
>
> /home/boris/bin/perf stat --repeat 10 -a --sync --pre /home/boris/kernel/pre-build-kernel.sh make -s -j64 bzImage
>
> Performance counter stats for 'make -s -j64 bzImage' (10 runs):
>
> 1081808.628840 task-clock # 7.996 CPUs utilized ( +- 0.06% ) [100.00%]
> 1,203,753 context-switches # 0.001 M/sec ( +- 0.04% ) [100.00%]
> 48,748 cpu-migrations # 0.045 K/sec ( +- 0.59% ) [100.00%]
> 31,145,439 page-faults # 0.029 M/sec ( +- 0.00% )
> 3,836,736,801,500 cycles # 3.547 GHz ( +- 0.03% ) [100.00%]
> 957,386,966,493 stalled-cycles-frontend # 24.95% frontend cycles idle ( +- 0.06% ) [100.00%]
> 218,581,249,251 stalled-cycles-backend # 5.70% backend cycles idle ( +- 0.06% ) [100.00%]
> 2,466,632,641,972 instructions # 0.64 insns per cycle
> # 0.39 stalled cycles per insn ( +- 0.00% ) [100.00%]
> 537,749,333,838 branches # 497.084 M/sec ( +- 0.00% ) [100.00%]
> 27,802,940,176 branch-misses # 5.17% of all branches ( +- 0.00% )
>
> 135.292843025 seconds time elapsed ( +- 0.06% )
>
>
> $ zgrep PILEDRIVER /proc/config.gz
> CONFIG_MPILEDRIVER=y
>
> /home/boris/bin/perf stat --repeat 10 -a --sync --pre /home/boris/kernel/pre-build-kernel.sh make -s -j64 bzImage
>
> Performance counter stats for 'make -s -j64 bzImage' (10 runs):
>
> 1085723.230470 task-clock # 7.996 CPUs utilized ( +- 0.10% ) [100.00%]
> 1,204,355 context-switches # 0.001 M/sec ( +- 0.10% ) [100.00%]
> 49,143 cpu-migrations # 0.045 K/sec ( +- 0.76% ) [100.00%]
> 31,196,575 page-faults # 0.029 M/sec ( +- 0.00% )
> 3,851,255,065,133 cycles # 3.547 GHz ( +- 0.02% ) [100.00%]
> 958,840,197,117 stalled-cycles-frontend # 24.90% frontend cycles idle ( +- 0.09% ) [100.00%]
> 220,260,399,411 stalled-cycles-backend # 5.72% backend cycles idle ( +- 0.04% ) [100.00%]
> 2,466,701,295,156 instructions # 0.64 insns per cycle
> # 0.39 stalled cycles per insn ( +- 0.00% ) [100.00%]
> 537,992,040,195 branches # 495.515 M/sec ( +- 0.00% ) [100.00%]
> 27,860,290,286 branch-misses # 5.18% of all branches ( +- 0.00% )
>
> 135.784111961 seconds time elapsed ( +- 0.10% )
>
Part of the difference between our results may be that my entire userspace is built with -mtune=bdver2, so less of the time is spent in userspace. Also, my remark about using many more threads than CPU cores referred to sysbench, not the kernel build; for the build I just used 8 jobs in make.

As for the differences you show above relative to CONFIG_MK8, they do actually make sense. With CONFIG_MK8, gcc makes very minimal use of extension instructions (AFAIK only MMX, SSE, and 3DNow!). That improves performance slightly on Bulldozer derivatives, because they have only half as many SSE and FP units as CPU cores. The scheduler isn't as smart as it could be about that, but that is something for another patch as far as I am concerned.
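
To put the two sets of numbers side by side, here is a minimal Python sketch that computes the change relative to the CONFIG_MK8 build for each measurement quoted in this thread. The helper names (to_seconds, percent_change) and the dictionary layout are purely illustrative assumptions; only the timings themselves come from the mails above.

```python
# Illustrative sketch: relative differences between the timings quoted in
# this thread. Helper names are hypothetical, not from any existing tool.

def to_seconds(minutes, seconds):
    """Convert a minutes/seconds wall-clock time to seconds."""
    return 60 * minutes + seconds

def percent_change(baseline, other):
    """Signed change of `other` relative to `baseline`, in percent."""
    return 100.0 * (other - baseline) / baseline

# allmodconfig build of 3.12-rc2, make -j8, FX-8320 (wall-clock time)
allmod = {
    "MK8": to_seconds(22, 57),
    "GENERIC": to_seconds(21, 35),
    "PILEDRIVER": to_seconds(19, 11),
}

# bzImage build, make -j64, FX-8350 (perf stat elapsed time, mean of 10 runs)
bzimage = {
    "MK8": 135.292843025,
    "PILEDRIVER": 135.784111961,
}

for name, runs in (("allmodconfig", allmod), ("bzImage", bzimage)):
    base = runs["MK8"]
    for cfg, secs in runs.items():
        delta = percent_change(base, secs)
        print(f"{name:12s} {cfg:10s} {secs:10.2f} s  {delta:+6.2f}% vs MK8")
```

On these figures the allmodconfig build is roughly 16% faster with CONFIG_PILEDRIVER than with CONFIG_MK8, while the bzImage build is about 0.4% slower with CONFIG_MPILEDRIVER. The two measurements differ in machine, job count, config, and userspace, so the sketch only quantifies the gap; it does not explain it.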