All of lore.kernel.org
 help / color / mirror / Atom feed
From: Sergey Matyukevich <geomatsi@gmail.com>
To: guoren@kernel.org
Cc: anup@brainfault.org, paul.walmsley@sifive.com,
	palmer@dabbelt.com, conor.dooley@microchip.com, heiko@sntech.de,
	philipp.tomsich@vrull.eu, alex@ghiti.fr, hch@lst.de,
	ajones@ventanamicro.com, gary@garyguo.net, jszhang@kernel.org,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	Guo Ren <guoren@linux.alibaba.com>,
	Anup Patel <apatel@ventanamicro.com>,
	Palmer Dabbelt <palmer@rivosinc.com>
Subject: Re: [PATCH V3] riscv: asid: Fixup stale TLB entry cause application crash
Date: Fri, 18 Nov 2022 23:57:21 +0300	[thread overview]
Message-ID: <Y3fxsWPLPlKGfMBj@curiosity> (raw)
In-Reply-To: <20221111075902.798571-1-guoren@kernel.org>

Hi Guo Ren,


> After use_asid_allocator is enabled, the userspace application will
> crash by stale TLB entries. Because only using cpumask_clear_cpu without
> local_flush_tlb_all couldn't guarantee CPU's TLB entries were fresh.
> Then set_mm_asid would cause the user space application to get a stale
> value by stale TLB entry, but set_mm_noasid is okay.

... [snip]

> +	/*
> +	 * The mm_cpumask indicates which harts' TLBs contain the virtual
> +	 * address mapping of the mm. Compared to noasid, using asid
> +	 * can't guarantee that stale TLB entries are invalidated because
> +	 * the asid mechanism wouldn't flush TLB for every switch_mm for
> +	 * performance. So when using asid, keep all CPUs footmarks in
> +	 * cpumask() until mm reset.
> +	 */
> +	cpumask_set_cpu(cpu, mm_cpumask(next));
> +	if (static_branch_unlikely(&use_asid_allocator)) {
> +		set_mm_asid(next, cpu);
> +	} else {
> +		cpumask_clear_cpu(cpu, mm_cpumask(prev));
> +		set_mm_noasid(next);
> +	}
>  }

I observe similar user-space crashes on my SMP systems with enabled ASID.
My attempt to fix the issue was a bit different, see the following patch:

https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/

In brief, the idea was borrowed from flush_icache_mm handling:
- keep track of CPUs not running the task
- perform per-ASID TLB flush on such CPUs only if the task is switched there

Your patch also works fine in my tests fixing those crashes. I have a
question though, regarding removed cpumask_clear_cpu. How CPUs no more
running the task are removed from its mm_cpumask ? If they are not
removed, then flush_tlb_mm/flush_tlb_page will broadcast unnecessary
TLB flushes to those CPUs when ASID is enabled.

Regards,
Sergey

_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

WARNING: multiple messages have this Message-ID (diff)
From: Sergey Matyukevich <geomatsi@gmail.com>
To: guoren@kernel.org
Cc: anup@brainfault.org, paul.walmsley@sifive.com,
	palmer@dabbelt.com, conor.dooley@microchip.com, heiko@sntech.de,
	philipp.tomsich@vrull.eu, alex@ghiti.fr, hch@lst.de,
	ajones@ventanamicro.com, gary@garyguo.net, jszhang@kernel.org,
	linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
	Guo Ren <guoren@linux.alibaba.com>,
	Anup Patel <apatel@ventanamicro.com>,
	Palmer Dabbelt <palmer@rivosinc.com>
Subject: Re: [PATCH V3] riscv: asid: Fixup stale TLB entry cause application crash
Date: Fri, 18 Nov 2022 23:57:21 +0300	[thread overview]
Message-ID: <Y3fxsWPLPlKGfMBj@curiosity> (raw)
In-Reply-To: <20221111075902.798571-1-guoren@kernel.org>

Hi Guo Ren,


> After use_asid_allocator is enabled, the userspace application will
> crash by stale TLB entries. Because only using cpumask_clear_cpu without
> local_flush_tlb_all couldn't guarantee CPU's TLB entries were fresh.
> Then set_mm_asid would cause the user space application to get a stale
> value by stale TLB entry, but set_mm_noasid is okay.

... [snip]

> +	/*
> +	 * The mm_cpumask indicates which harts' TLBs contain the virtual
> +	 * address mapping of the mm. Compared to noasid, using asid
> +	 * can't guarantee that stale TLB entries are invalidated because
> +	 * the asid mechanism wouldn't flush TLB for every switch_mm for
> +	 * performance. So when using asid, keep all CPUs footmarks in
> +	 * cpumask() until mm reset.
> +	 */
> +	cpumask_set_cpu(cpu, mm_cpumask(next));
> +	if (static_branch_unlikely(&use_asid_allocator)) {
> +		set_mm_asid(next, cpu);
> +	} else {
> +		cpumask_clear_cpu(cpu, mm_cpumask(prev));
> +		set_mm_noasid(next);
> +	}
>  }

I observe similar user-space crashes on my SMP systems with enabled ASID.
My attempt to fix the issue was a bit different, see the following patch:

https://lore.kernel.org/linux-riscv/20220829205219.283543-1-geomatsi@gmail.com/

In brief, the idea was borrowed from flush_icache_mm handling:
- keep track of CPUs not running the task
- perform per-ASID TLB flush on such CPUs only if the task is switched there

Your patch also works fine in my tests fixing those crashes. I have a
question though, regarding removed cpumask_clear_cpu. How CPUs no more
running the task are removed from its mm_cpumask ? If they are not
removed, then flush_tlb_mm/flush_tlb_page will broadcast unnecessary
TLB flushes to those CPUs when ASID is enabled.

Regards,
Sergey

  parent reply	other threads:[~2022-11-18 20:57 UTC|newest]

Thread overview: 32+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-11-11  7:59 [PATCH V3] riscv: asid: Fixup stale TLB entry cause application crash guoren
2022-11-11  7:59 ` guoren
2022-11-11  8:29 ` Andrew Jones
2022-11-11  8:29   ` Andrew Jones
2022-11-18 20:57 ` Sergey Matyukevich [this message]
2022-11-18 20:57   ` Sergey Matyukevich
2022-11-19  3:37   ` Guo Ren
2022-11-19  3:37     ` Guo Ren
2022-11-21 20:03     ` Sergey Matyukevich
2022-11-21 20:03       ` Sergey Matyukevich
2022-12-08 23:30   ` Palmer Dabbelt
2022-12-08 23:30     ` Palmer Dabbelt
2022-12-09  3:13     ` Guo Ren
2022-12-09  3:13       ` Guo Ren
2022-12-09  3:15       ` Guo Ren
2022-12-09  3:15         ` Guo Ren
2022-11-21 19:59 ` Sergey Matyukevich
2022-11-21 19:59   ` Sergey Matyukevich
2022-12-23 12:53 ` Lad, Prabhakar
2022-12-23 12:53   ` Lad, Prabhakar
2023-02-23 17:57   ` Zong Li
2023-02-23 17:57     ` Zong Li
2023-02-23 23:20     ` Guo Ren
2023-02-23 23:20       ` Guo Ren
2023-02-25 19:29     ` Sergey Matyukevich
2023-02-25 19:29       ` Sergey Matyukevich
2023-02-26  4:24       ` Guo Ren
2023-02-26  4:24         ` Guo Ren
2023-02-27 22:40         ` Gary Guo
2023-02-27 22:40           ` Gary Guo
2023-02-28  3:12           ` Guo Ren
2023-02-28  3:12             ` Guo Ren

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Y3fxsWPLPlKGfMBj@curiosity \
    --to=geomatsi@gmail.com \
    --cc=ajones@ventanamicro.com \
    --cc=alex@ghiti.fr \
    --cc=anup@brainfault.org \
    --cc=apatel@ventanamicro.com \
    --cc=conor.dooley@microchip.com \
    --cc=gary@garyguo.net \
    --cc=guoren@kernel.org \
    --cc=guoren@linux.alibaba.com \
    --cc=hch@lst.de \
    --cc=heiko@sntech.de \
    --cc=jszhang@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=palmer@dabbelt.com \
    --cc=palmer@rivosinc.com \
    --cc=paul.walmsley@sifive.com \
    --cc=philipp.tomsich@vrull.eu \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.