Re: [PATCH 2/3] x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range


From: Alex Shi <alex.shi@intel.com>
To: Borislav Petkov <bp@amd64.org>
Cc: andi.kleen@intel.com, tim.c.chen@linux.intel.com,
	jeremy@goop.org, chrisw@sous-sol.org, akataria@vmware.com,
	tglx@linutronix.de, mingo@redhat.com, hpa@zytor.com,
	rostedt@goodmis.org, fweisbec@gmail.com, riel@redhat.com,
	luto@mit.edu, avi@redhat.com, len.brown@intel.com,
	paul.gortmaker@windriver.com, dhowells@redhat.com,
	fenghua.yu@intel.com, borislav.petkov@amd.com,
	yinghai@kernel.org, cpw@sgi.com, steiner@sgi.com,
	linux-kernel@vger.kernel.org, yongjie.ren@intel.com
Subject: Re: [PATCH 2/3] x86/flush_tlb: try flush_tlb_single one by one in flush_tlb_range
Date: Wed, 02 May 2012 17:24:09 +0800
Message-ID: <4FA0FD39.9060908@intel.com>
In-Reply-To: <20120430105440.GC9303@aftab.osrc.amd.com>

On 04/30/2012 06:54 PM, Borislav Petkov wrote:

> On Sat, Apr 28, 2012 at 04:51:38PM +0800, Alex Shi wrote:
>> x86 has no instruction-level support for flushing a TLB range. Currently
>> flush_tlb_range is implemented by flushing the entire TLB. That is not
>> the best solution for all scenarios. In fact, if we use 'invlpg' to
>> flush just a few entries from the TLB, we gain performance because later
>> accesses still hit the entries that remain.
>>
>> But the 'invlpg' instruction is expensive. Its execution time is
>> comparable to rewriting cr3, and even a bit higher on SNB CPUs.
>>
>> So, on a CPU with 512 4KB TLB entries, the balance point is at:
>> 512 * 100ns(assumed TLB refill cost) =
>> x(TLB flush entries) * 140ns(assumed invlpg cost)
>>
>> Here, x is about 365, which is about 5/7 of the 512 entries.
>>
>> But with the mysterious CPU prefetcher and page miss handler unit, the
>> assumed TLB refill cost is far lower than 100ns for sequential access. And
>> two HT siblings in one core make memory access faster if they are
>> accessing the same memory. So, in the patch, I only make the change when
>> the number of target entries is less than 1/16 of all active TLB entries.
>> Actually, I have no data to support the fraction '1/16', so any
>> suggestions are welcome.
> 
> You could find the proper value empirically here by replacing the
> FLUSHALL_BAR thing with a variable and exporting it through procfs or
> sysfs or whatever, only for testing purposes, and letting mprotect.c
> set it to a different value each time. Then run a bunch of times with
> different thread counts and invalidation entries count and see which
> combination performs best.



In some scenarios, the equation above can be modified as:
(512 - X) * 100ns(assumed TLB refill cost) = X * 140ns(assumed invlpg cost)

When the number of threads is less than the number of CPUs, the balance
point can be up to 1/2 of the TLB entries.

When the number of threads equals the number of CPUs with HT, the balance
point on our SNB EP machine is 1/16 of the TLB entries; on the NHM EP
machine it is 1/32. So FLUSHALL_BAR needs to change to 32.

When the number of threads is bigger than the number of CPUs, context
switches eat all the improvement, and memory access latency is the same
as on the unpatched kernel.
