Re: [PATCH] x86/mm: Don't reenter flush_tlb_func_common()

All of lore.kernel.org
 help / color / mirror / Atom feed

From: zhong jiang <zhongjiang@huawei.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: x86@kernel.org, linux-kernel@vger.kernel.org,
	Borislav Petkov <bp@alien8.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Nadav Amit <nadav.amit@gmail.com>, Rik van Riel <riel@redhat.com>,
	Dave Hansen <dave.hansen@intel.com>,
	Arjan van de Ven <arjan@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH] x86/mm: Don't reenter flush_tlb_func_common()
Date: Mon, 19 Jun 2017 21:33:34 +0800	[thread overview]
Message-ID: <5947D2AE.6080609@huawei.com> (raw)
In-Reply-To: <b13eee98a0e5322fbdc450f234a01006ec374e2c.1497847645.git.luto@kernel.org>

On 2017/6/19 12:48, Andy Lutomirski wrote:
> It was historically possible to have two concurrent TLB flushes
> targeting the same CPU: one initiated locally and one initiated
> remotely.  This can now cause an OOPS in leave_mm() at
> arch/x86/mm/tlb.c:47:
>
>         if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
>                 BUG();
>
> with this call trace:
>  flush_tlb_func_local arch/x86/mm/tlb.c:239 [inline]
>  flush_tlb_mm_range+0x26d/0x370 arch/x86/mm/tlb.c:317
>
> Without reentrancy, this OOPS is impossible: leave_mm() is only
> called if we're not in TLBSTATE_OK, but then we're unexpectedly
> in TLBSTATE_OK in leave_mm().
>
> This can be caused by flush_tlb_func_remote() happening between
> the two checks and calling leave_mm(), resulting in two consecutive
> leave_mm() calls on the same CPU with no intervening switch_mm()
> calls.
>
> We never saw this OOPS before because the old leave_mm()
> implementation didn't put us back in TLBSTATE_OK, so the assertion
> didn't fire.
  HI, Andy

  Today, I see same OOPS in linux 3.4 stable. It prove that it indeed has fired.
   but It is rarely to appear.  I review the code. I found the a  issue.
  when current->mm is NULL,  leave_mm will be called. but  it maybe in
  TLBSTATE_OK,  eg: unuse_mm call after task->mm = NULL , but before enter_lazy_tlb.

   therefore,  it will fire. is it right?

  Thanks
  zhongjiang
> Nadav noticed the reentrancy issue in a different context, but
> neither of us realized that it caused a problem yet.
>
> Cc: Nadav Amit <nadav.amit@gmail.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Reported-by: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
> Fixes: 3d28ebceaffa ("x86/mm: Rework lazy TLB to track the actual loaded mm")
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/mm/tlb.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 2a5e851f2035..f06239c6919f 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -208,6 +208,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
>  static void flush_tlb_func_common(const struct flush_tlb_info *f,
>  				  bool local, enum tlb_flush_reason reason)
>  {
> +	/* This code cannot presently handle being reentered. */
> +	VM_WARN_ON(!irqs_disabled());
> +
>  	if (this_cpu_read(cpu_tlbstate.state) != TLBSTATE_OK) {
>  		leave_mm(smp_processor_id());
>  		return;
> @@ -313,8 +316,12 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
>  		info.end = TLB_FLUSH_ALL;
>  	}
>  
> -	if (mm == this_cpu_read(cpu_tlbstate.loaded_mm))
> +	if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
> +		local_irq_disable();
>  		flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
> +		local_irq_enable();
> +	}
> +
>  	if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
>  		flush_tlb_others(mm_cpumask(mm), &info);
>  	put_cpu();
> @@ -370,8 +377,12 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>  
>  	int cpu = get_cpu();
>  
> -	if (cpumask_test_cpu(cpu, &batch->cpumask))
> +	if (cpumask_test_cpu(cpu, &batch->cpumask)) {
> +		local_irq_disable();
>  		flush_tlb_func_local(&info, TLB_LOCAL_SHOOTDOWN);
> +		local_irq_enable();
> +	}
> +
>  	if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids)
>  		flush_tlb_others(&batch->cpumask, &info);
>  	cpumask_clear(&batch->cpumask);


--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org.  For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>

WARNING: multiple messages have this Message-ID (diff)

From: zhong jiang <zhongjiang@huawei.com>
To: Andy Lutomirski <luto@kernel.org>
Cc: <x86@kernel.org>, <linux-kernel@vger.kernel.org>,
	Borislav Petkov <bp@alien8.de>,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	Mel Gorman <mgorman@suse.de>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	Nadav Amit <nadav.amit@gmail.com>, Rik van Riel <riel@redhat.com>,
	Dave Hansen <dave.hansen@intel.com>,
	"Arjan van de Ven" <arjan@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>
Subject: Re: [PATCH] x86/mm: Don't reenter flush_tlb_func_common()
Date: Mon, 19 Jun 2017 21:33:34 +0800	[thread overview]
Message-ID: <5947D2AE.6080609@huawei.com> (raw)
In-Reply-To: <b13eee98a0e5322fbdc450f234a01006ec374e2c.1497847645.git.luto@kernel.org>

On 2017/6/19 12:48, Andy Lutomirski wrote:
> It was historically possible to have two concurrent TLB flushes
> targeting the same CPU: one initiated locally and one initiated
> remotely.  This can now cause an OOPS in leave_mm() at
> arch/x86/mm/tlb.c:47:
>
>         if (this_cpu_read(cpu_tlbstate.state) == TLBSTATE_OK)
>                 BUG();
>
> with this call trace:
>  flush_tlb_func_local arch/x86/mm/tlb.c:239 [inline]
>  flush_tlb_mm_range+0x26d/0x370 arch/x86/mm/tlb.c:317
>
> Without reentrancy, this OOPS is impossible: leave_mm() is only
> called if we're not in TLBSTATE_OK, but then we're unexpectedly
> in TLBSTATE_OK in leave_mm().
>
> This can be caused by flush_tlb_func_remote() happening between
> the two checks and calling leave_mm(), resulting in two consecutive
> leave_mm() calls on the same CPU with no intervening switch_mm()
> calls.
>
> We never saw this OOPS before because the old leave_mm()
> implementation didn't put us back in TLBSTATE_OK, so the assertion
> didn't fire.
  HI, Andy

  Today, I see same OOPS in linux 3.4 stable. It prove that it indeed has fired.
   but It is rarely to appear.  I review the code. I found the a  issue.
  when current->mm is NULL,  leave_mm will be called. but  it maybe in
  TLBSTATE_OK,  eg: unuse_mm call after task->mm = NULL , but before enter_lazy_tlb.

   therefore,  it will fire. is it right?

  Thanks
  zhongjiang
> Nadav noticed the reentrancy issue in a different context, but
> neither of us realized that it caused a problem yet.
>
> Cc: Nadav Amit <nadav.amit@gmail.com>
> Cc: Dave Hansen <dave.hansen@intel.com>
> Reported-by: "Levin, Alexander (Sasha Levin)" <alexander.levin@verizon.com>
> Fixes: 3d28ebceaffa ("x86/mm: Rework lazy TLB to track the actual loaded mm")
> Signed-off-by: Andy Lutomirski <luto@kernel.org>
> ---
>  arch/x86/mm/tlb.c | 15 +++++++++++++--
>  1 file changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c
> index 2a5e851f2035..f06239c6919f 100644
> --- a/arch/x86/mm/tlb.c
> +++ b/arch/x86/mm/tlb.c
> @@ -208,6 +208,9 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next,
>  static void flush_tlb_func_common(const struct flush_tlb_info *f,
>  				  bool local, enum tlb_flush_reason reason)
>  {
> +	/* This code cannot presently handle being reentered. */
> +	VM_WARN_ON(!irqs_disabled());
> +
>  	if (this_cpu_read(cpu_tlbstate.state) != TLBSTATE_OK) {
>  		leave_mm(smp_processor_id());
>  		return;
> @@ -313,8 +316,12 @@ void flush_tlb_mm_range(struct mm_struct *mm, unsigned long start,
>  		info.end = TLB_FLUSH_ALL;
>  	}
>  
> -	if (mm == this_cpu_read(cpu_tlbstate.loaded_mm))
> +	if (mm == this_cpu_read(cpu_tlbstate.loaded_mm)) {
> +		local_irq_disable();
>  		flush_tlb_func_local(&info, TLB_LOCAL_MM_SHOOTDOWN);
> +		local_irq_enable();
> +	}
> +
>  	if (cpumask_any_but(mm_cpumask(mm), cpu) < nr_cpu_ids)
>  		flush_tlb_others(mm_cpumask(mm), &info);
>  	put_cpu();
> @@ -370,8 +377,12 @@ void arch_tlbbatch_flush(struct arch_tlbflush_unmap_batch *batch)
>  
>  	int cpu = get_cpu();
>  
> -	if (cpumask_test_cpu(cpu, &batch->cpumask))
> +	if (cpumask_test_cpu(cpu, &batch->cpumask)) {
> +		local_irq_disable();
>  		flush_tlb_func_local(&info, TLB_LOCAL_SHOOTDOWN);
> +		local_irq_enable();
> +	}
> +
>  	if (cpumask_any_but(&batch->cpumask, cpu) < nr_cpu_ids)
>  		flush_tlb_others(&batch->cpumask, &info);
>  	cpumask_clear(&batch->cpumask);

next prev parent reply	other threads:[~2017-06-19 13:36 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2017-06-19  4:48 [PATCH] x86/mm: Don't reenter flush_tlb_func_common() Andy Lutomirski
2017-06-19  4:48 ` Andy Lutomirski
2017-06-19 13:33 ` zhong jiang [this message]
2017-06-19 13:33   ` zhong jiang
2017-06-19 15:05   ` Andy Lutomirski
2017-06-19 15:05     ` Andy Lutomirski
2017-06-20  2:56     ` zhong jiang
2017-06-20  2:56       ` zhong jiang

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=5947D2AE.6080609@huawei.com \
    --to=zhongjiang@huawei.com \
    --cc=akpm@linux-foundation.org \
    --cc=arjan@linux.intel.com \
    --cc=bp@alien8.de \
    --cc=dave.hansen@intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=luto@kernel.org \
    --cc=mgorman@suse.de \
    --cc=nadav.amit@gmail.com \
    --cc=peterz@infradead.org \
    --cc=riel@redhat.com \
    --cc=torvalds@linux-foundation.org \
    --cc=x86@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.