From: Alex Shi <alex.shi@intel.com>
To: Borislav Petkov <bp@alien8.de>
Cc: Ilari Stenroth <ilari.stenroth@gmail.com>,
linux-kernel@vger.kernel.org,
"H. Peter Anvin" <h.peter.anvin@intel.com>
Subject: Re: arch/x86/kernel/cpu/intel.c needs an update for Haswell?
Date: Thu, 01 Aug 2013 16:53:31 +0800
Message-ID: <51FA220B.5070307@intel.com>
In-Reply-To: <20130730195402.GD23299@pd.tnic>
On 07/31/2013 03:54 AM, Borislav Petkov wrote:
> On Tue, Jul 30, 2013 at 10:44:02PM +0300, Ilari Stenroth wrote:
>> On 30.7.2013 22.35, Borislav Petkov wrote:
>>> On Tue, Jul 30, 2013 at 09:50:49PM +0300, Ilari Stenroth wrote:
>>>> Does somebody know why arch/x86/kernel/cpu/intel.c has
>>>> tlb_flushall_shift detection logic for Ivy Bridge CPU family but not
>>>> for Haswell? Maybe intel_cacheinfo.c needs to be checked for Haswell
>>>> updates too.
>>>
>>> Because someone needs to sit down and write it. Oh, and more
>>> importantly, test it on real hardware.
>>>
>>> :-)
>>>
>> Right :-) Can volunteer to test, only once I get a motherboard bug
>> fixed. It runs only one core. Poor Supermicro X10SLH-F thinks Xeon
>> E3-1265Lv3 has 1C2T :-/
>
> Yeah, if I had to guess, I'd say the highest probability is for patches
> about it to be coming from Alex. :)
>
I just borrowed a Haswell laptop and ran the munmap test case on it. :)
The CPU has 2 cores with HT. The test shows that tlb_flushall_shift = 1
gives the best performance.
tlb_flushall_shift is 1
=============== t = 2
munmap use 243ms 14889ns/time, memory access uses 336949 times/thread/ms, cost 2ns/time
munmap use 152ms 18662ns/time, memory access uses 336561 times/thread/ms, cost 2ns/time
munmap use 60ms 14835ns/time, memory access uses 198710 times/thread/ms, cost 5ns/time
munmap use 41ms 20030ns/time, memory access uses 208748 times/thread/ms, cost 4ns/time
munmap use 21ms 20995ns/time, memory access uses 191849 times/thread/ms, cost 5ns/time
munmap use 21ms 41909ns/time, memory access uses 296545 times/thread/ms, cost 3ns/time
=============== t = 4
munmap use 468ms 14287ns/time, memory access uses 72088 times/thread/ms, cost 13ns/time
munmap use 286ms 17488ns/time, memory access uses 65232 times/thread/ms, cost 15ns/time
munmap use 210ms 25746ns/time, memory access uses 97080 times/thread/ms, cost 10ns/time
munmap use 66ms 16138ns/time, memory access uses 56450 times/thread/ms, cost 17ns/time
munmap use 51ms 25323ns/time, memory access uses 41930 times/thread/ms, cost 23ns/time
munmap use 44ms 43599ns/time, memory access uses 53031 times/thread/ms, cost 18ns/time
munmap use 28ms 56011ns/time, memory access uses 36889 times/thread/ms, cost 27ns/time
=============== t = 8
munmap use 2429ms 74138ns/time, memory access uses 42202 times/thread/ms, cost 23ns/time
munmap use 1079ms 65880ns/time, memory access uses 41497 times/thread/ms, cost 24ns/time
munmap use 623ms 76108ns/time, memory access uses 47844 times/thread/ms, cost 20ns/time
munmap use 387ms 94619ns/time, memory access uses 34652 times/thread/ms, cost 28ns/time
munmap use 90ms 44180ns/time, memory access uses 26498 times/thread/ms, cost 37ns/time
munmap use 49ms 47903ns/time, memory access uses 33863 times/thread/ms, cost 29ns/time
munmap use 26ms 51164ns/time, memory access uses 31491 times/thread/ms, cost 31ns/time
tlb_flushall_shift is -1
=============== t = 2
munmap use 418ms 12766ns/time, memory access uses 124215 times/thread/ms, cost 8ns/time
munmap use 184ms 11271ns/time, memory access uses 36519 times/thread/ms, cost 27ns/time
munmap use 116ms 14177ns/time, memory access uses 112472 times/thread/ms, cost 8ns/time
munmap use 66ms 16347ns/time, memory access uses 137546 times/thread/ms, cost 7ns/time
munmap use 43ms 21087ns/time, memory access uses 47053 times/thread/ms, cost 21ns/time
munmap use 31ms 30787ns/time, memory access uses 202638 times/thread/ms, cost 4ns/time
munmap use 22ms 43187ns/time, memory access uses 255272 times/thread/ms, cost 3ns/time
=============== t = 4
munmap use 572ms 17483ns/time, memory access uses 54936 times/thread/ms, cost 18ns/time
munmap use 481ms 29360ns/time, memory access uses 71397 times/thread/ms, cost 14ns/time
munmap use 168ms 20575ns/time, memory access uses 59827 times/thread/ms, cost 16ns/time
munmap use 73ms 18062ns/time, memory access uses 34687 times/thread/ms, cost 28ns/time
munmap use 42ms 20581ns/time, memory access uses 48571 times/thread/ms, cost 20ns/time
munmap use 46ms 45261ns/time, memory access uses 43408 times/thread/ms, cost 23ns/time
munmap use 21ms 41828ns/time, memory access uses 49751 times/thread/ms, cost 20ns/time
=============== t = 8
munmap use 1761ms 53756ns/time, memory access uses 40636 times/thread/ms, cost 24ns/time
munmap use 238ms 14541ns/time, memory access uses 19968 times/thread/ms, cost 50ns/time
munmap use 262ms 31988ns/time, memory access uses 31964 times/thread/ms, cost 31ns/time
munmap use 127ms 31086ns/time, memory access uses 35674 times/thread/ms, cost 28ns/time
munmap use 73ms 35764ns/time, memory access uses 23482 times/thread/ms, cost 42ns/time
munmap use 59ms 58406ns/time, memory access uses 36680 times/thread/ms, cost 27ns/time
munmap use 20ms 40608ns/time, memory access uses 26733 times/thread/ms, cost 37ns/time
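As background for reading the two runs above: tlb_flushall_shift tunes the point at which the kernel stops invalidating pages one at a time and flushes the whole TLB instead, and -1 turns the per-page path off entirely. The sketch below models that decision under stated assumptions. `should_flush_all` is a hypothetical helper, not the kernel's own function, and the 512-entry TLB size mentioned in the comments is assumed only because 512 >> 1 yields a 256-entry balance point; this mail does not state the TLB size. The in-kernel code of this era also appears to cap the threshold by the size of the task's address space, which is left out here.

```c
#include <stdbool.h>

/*
 * Hypothetical model of the "flush everything or flush page by page"
 * decision. Names and structure are illustrative assumptions.
 *
 *   nr_pages           - number of 4K pages in the range being unmapped
 *   tlb_entries        - last-level TLB size (assumed, e.g. 512)
 *   tlb_flushall_shift - the tunable discussed in this thread
 */
static inline bool should_flush_all(unsigned long nr_pages,
				    unsigned long tlb_entries,
				    int tlb_flushall_shift)
{
	unsigned long balance_point;

	/* -1 disables the heuristic: always flush the whole TLB. */
	if (tlb_flushall_shift < 0)
		return true;

	/* With 512 entries and shift 1 the balance point is 256 pages. */
	balance_point = tlb_entries >> tlb_flushall_shift;

	/* Above the balance point a full flush is assumed to be cheaper. */
	return nr_pages > balance_point;
}
```

With these assumed numbers, a 256-page munmap would still be invalidated page by page at shift 1, while a 257-page munmap would trigger a full flush; a larger shift lowers the balance point and makes full flushes more frequent.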
------
From 1322ea9e17ad4d9e49e2d93cfc04805368e28273 Mon Sep 17 00:00:00 2001
From: Alex Shi <alex.shi@intel.com>
Date: Thu, 1 Aug 2013 16:30:23 +0800
Subject: [PATCH 2/2] tlb/tlb_flushall_shift: add haswell tlb_flush_shift
Tested on an i5-4350U with the munmap test case from
https://lkml.org/lkml/2012/5/17/59
The best performance is with tlb_flushall_shift = 1.
The balance point is 256 entries.
Signed-off-by: Alex Shi <alex.shi@intel.com>
---
arch/x86/kernel/cpu/intel.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/x86/kernel/cpu/intel.c b/arch/x86/kernel/cpu/intel.c
index 9a4bc51..ac9b83a 100644
--- a/arch/x86/kernel/cpu/intel.c
+++ b/arch/x86/kernel/cpu/intel.c
@@ -627,6 +627,7 @@ static void intel_tlb_flushall_shift_set(struct cpuinfo_x86 *c)
 		tlb_flushall_shift = 5;
 		break;
 	case 0x63a: /* Ivybridge */
+	case 0x645: /* Haswell */
 		tlb_flushall_shift = 1;
 		break;
 	case 0x63e: /* Ivybridge EP */
--
1.7.12
--
Thanks
Alex
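A note on the case labels in the patch: they read as the CPU family in the upper byte and the model number in the lower byte, so 0x645 would be family 6, model 0x45. The sketch below spells that out, assuming the switch key is built as (family << 8) + model. `cpu_key` and `lookup_flushall_shift` are hypothetical stand-alone helpers, not kernel functions, and only the two models whose value is visible in the diff are included; everything else is reported as "not covered" rather than guessed.

```c
#include <stdbool.h>

/* Assumed key layout: family in bits 8 and up, model in the low byte. */
static inline unsigned int cpu_key(unsigned int family, unsigned int model)
{
	return (family << 8) + model;
}

/*
 * Hypothetical lookup mirroring the two case labels shown in the diff.
 * Returns true and stores the shift if the model is covered here,
 * false otherwise (the diff does not show values for other models).
 */
static inline bool lookup_flushall_shift(unsigned int family,
					 unsigned int model, int *shift)
{
	switch (cpu_key(family, model)) {
	case 0x63a: /* Ivybridge */
	case 0x645: /* Haswell, the label added by the patch */
		*shift = 1;
		return true;
	default:
		return false;
	}
}
```

Under that assumption, `lookup_flushall_shift(6, 0x45, &shift)` succeeds with shift set to 1, sharing the Ivy Bridge value as the patch intends.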