Re: [PATCH] x86,mm: delay TLB flush after clearing accessed bit

From: Rik van Riel <riel@redhat.com>
To: Ingo Molnar <mingo@kernel.org>
Cc: linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	shli@kernel.org, akpm@linux-foundation.org, hughd@google.com,
	mgorman@suse.de, Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Thomas Gleixner <tglx@linutronix.de>,
	"H. Peter Anvin" <hpa@zytor.com>
Subject: Re: [PATCH] x86,mm: delay TLB flush after clearing accessed bit
Date: Tue, 01 Apr 2014 08:55:29 -0400
Message-ID: <533AB741.5080508@redhat.com>
In-Reply-To: <20140401105318.GA2823@gmail.com>

On 04/01/2014 06:53 AM, Ingo Molnar wrote:
> 
> The speedup looks good to me!
> 
> I have one major concern (see the last item), plus a few minor nits:

I will address all the minor issues. Let me explain the major one :)

>> @@ -196,6 +201,13 @@ static inline void reset_lazy_tlbstate(void)
>>  	this_cpu_write(cpu_tlbstate.active_mm, &init_mm);
>>  }
>>  
>> +static inline void tlb_set_force_flush(int cpu)
>> +{
>> +	struct tlb_state *percputlb= &per_cpu(cpu_tlbstate, cpu);
> 
> s/b= /b = /
> 
>> +	if (percputlb->force_flush == false)
>> +		percputlb->force_flush = true;
>> +}
>> +
>>  #endif	/* SMP */

This code tests the flag before setting it, so under heavy pageout
scanning activity each CPU's cache line is only grabbed exclusively
once; later calls just read the flag that is already set.
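
To illustrate the test-before-set idea outside the kernel, here is a
minimal userspace sketch. The names (tlb_state_sketch, cpu_state,
set_force_flush_sketch, scan_sketch) and the writes counter are
assumptions made up for this example; they are not part of the patch,
and the counter only exists to make the number of flag stores visible.

```c
#include <stdbool.h>

#define NR_CPUS_SKETCH 4	/* hypothetical CPU count for this sketch */

/* Userspace stand-in for the per-CPU TLB state; not the kernel structure. */
struct tlb_state_sketch {
	bool force_flush;
	unsigned long writes;	/* counts flag stores, for illustration only */
};

static struct tlb_state_sketch cpu_state[NR_CPUS_SKETCH];

/* Test before set: the flag is stored only while it is still clear. */
static void set_force_flush_sketch(int cpu)
{
	struct tlb_state_sketch *st = &cpu_state[cpu];

	if (!st->force_flush) {
		st->force_flush = true;
		st->writes++;
	}
}

/* Simulate a pageout run that finds young_pages referenced pages. */
static unsigned long scan_sketch(int cpu, int young_pages)
{
	for (int i = 0; i < young_pages; i++)
		set_force_flush_sketch(cpu);
	return cpu_state[cpu].writes;
}
```

However many young pages the scan finds, the flag for a given CPU is
stored once; every later call takes the read-only path.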

>> @@ -399,11 +400,13 @@ int pmdp_test_and_clear_young(struct vm_area_struct *vma,
>>  int ptep_clear_flush_young(struct vm_area_struct *vma,
>>  			   unsigned long address, pte_t *ptep)
>>  {
>> -	int young;
>> +	int young, cpu;
>>  
>>  	young = ptep_test_and_clear_young(vma, address, ptep);
>> -	if (young)
>> -		flush_tlb_page(vma, address);
>> +	if (young) {
>> +		for_each_cpu(cpu, vma->vm_mm->cpu_vm_mask_var)
>> +			tlb_set_force_flush(cpu);
> 
> Hm, just to play the devil's advocate - what happens when we have a va 
> that is used on a few dozen, a few hundred or a few thousand CPUs? 
> Will the savings be dwarfed by the O(nr_cpus_used) loop overhead?
> 
> Especially as this is touching cachelines on other CPUs and likely 
> creating the worst kind of cachemisses. That can really kill 
> performance.

flush_tlb_page does the same O(nr_cpus_used) loop, but it sends an
IPI to each CPU every time, instead of dirtying a cache line once
per pageout run (or until the next context switch).
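
As a rough way to see the difference, the sketch below counts events
in a toy model of one pageout run. It is only a counting model under
stated assumptions: the names (flush_cost, eager_cost, deferred_cost,
MAX_CPUS_SKETCH) are hypothetical, no context switch happens during
the run, and the relative cost of an IPI versus a remote cache line
write is not modeled at all.

```c
#include <stdbool.h>

#define MAX_CPUS_SKETCH 64	/* hypothetical upper bound for this model */

struct flush_cost {
	unsigned long ipis;		/* cross-CPU interrupts sent */
	unsigned long line_writes;	/* remote cache lines dirtied */
};

/* Eager model: every young page sends an IPI to each CPU in the mask. */
static struct flush_cost eager_cost(int ncpus, int young_pages)
{
	struct flush_cost c = { 0, 0 };

	for (int page = 0; page < young_pages; page++)
		for (int cpu = 0; cpu < ncpus; cpu++)
			c.ipis++;
	return c;
}

/* Deferred model: each CPU's flag is written once, then only read. */
static struct flush_cost deferred_cost(int ncpus, int young_pages)
{
	bool force_flush[MAX_CPUS_SKETCH] = { false };
	struct flush_cost c = { 0, 0 };

	if (ncpus > MAX_CPUS_SKETCH)
		ncpus = MAX_CPUS_SKETCH;

	for (int page = 0; page < young_pages; page++) {
		for (int cpu = 0; cpu < ncpus; cpu++) {
			if (!force_flush[cpu]) {
				force_flush[cpu] = true;
				c.line_writes++;
			}
		}
	}
	return c;
}
```

Both models walk the same O(nr_cpus_used) loop per young page; what
differs is whether each iteration ends in an IPI or, after the first
pass, in a read of a flag that is already set.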

Does that address your concern?

-- 
All rights reversed
