From: "Edgecombe, Rick P" <rick.p.edgecombe@intel.com>
To: "debug@rivosinc.com" <debug@rivosinc.com>
Cc: "nathan@kernel.org" <nathan@kernel.org>,
	"kito.cheng@sifive.com" <kito.cheng@sifive.com>,
	"jeffreyalaw@gmail.com" <jeffreyalaw@gmail.com>,
	"lorenzo.stoakes@oracle.com" <lorenzo.stoakes@oracle.com>,
	"mhocko@suse.com" <mhocko@suse.com>,
	"charlie@rivosinc.com" <charlie@rivosinc.com>,
	"david@redhat.com" <david@redhat.com>,
	"masahiroy@kernel.org" <masahiroy@kernel.org>,
	"samitolvanen@google.com" <samitolvanen@google.com>,
	"conor.dooley@microchip.com" <conor.dooley@microchip.com>,
	"bjorn@rivosinc.com" <bjorn@rivosinc.com>,
	"linux-riscv@lists.infradead.org"
	<linux-riscv@lists.infradead.org>,
	"nicolas.schier@linux.dev" <nicolas.schier@linux.dev>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"andrew@sifive.com" <andrew@sifive.com>,
	"monk.chiang@sifive.com" <monk.chiang@sifive.com>,
	"justinstitt@google.com" <justinstitt@google.com>,
	"palmer@dabbelt.com" <palmer@dabbelt.com>,
	"morbo@google.com" <morbo@google.com>,
	"aou@eecs.berkeley.edu" <aou@eecs.berkeley.edu>,
	"nick.desaulniers+lkml@gmail.com"
	<nick.desaulniers+lkml@gmail.com>,
	"rppt@kernel.org" <rppt@kernel.org>,
	"broonie@kernel.org" <broonie@kernel.org>,
	"ved@rivosinc.com" <ved@rivosinc.com>,
	"heinrich.schuchardt@canonical.com"
	<heinrich.schuchardt@canonical.com>,
	"vbabka@suse.cz" <vbabka@suse.cz>,
	"Liam.Howlett@oracle.com" <Liam.Howlett@oracle.com>,
	"alex@ghiti.fr" <alex@ghiti.fr>,
	"fweimer@redhat.com" <fweimer@redhat.com>,
	"surenb@google.com" <surenb@google.com>,
	"linux-kbuild@vger.kernel.org" <linux-kbuild@vger.kernel.org>,
	"cleger@rivosinc.com" <cleger@rivosinc.com>,
	"samuel.holland@sifive.com" <samuel.holland@sifive.com>,
	"llvm@lists.linux.dev" <llvm@lists.linux.dev>,
	"paul.walmsley@sifive.com" <paul.walmsley@sifive.com>,
	"ajones@ventanamicro.com" <ajones@ventanamicro.com>,
	"linux-mm@kvack.org" <linux-mm@kvack.org>,
	"apatel@ventanamicro.com" <apatel@ventanamicro.com>,
	"akpm@linux-foundation.org" <akpm@linux-foundation.org>
Subject: Re: [PATCH 10/11] scs: generic scs code updated to leverage hw assisted shadow stack
Date: Fri, 25 Jul 2025 18:05:22 +0000
Message-ID: <a19c1338f2fa4cb19a4f8b7552ff54ded20b403a.camel@intel.com>
In-Reply-To: <aIO8uSqiplnyyNOd@debug.ba.rivosinc.com>

On Fri, 2025-07-25 at 10:19 -0700, Deepak Gupta wrote:
> > This doesn't update the direct map alias I think. Do you want to protect it?
> 
> Yes any alternate address mapping which is writeable is a problem and dilutes
> the mechanism. How do I go about updating direct map ? (I pretty new to linux
> kernel and have limited understanding on which kernel api's to use here to
> unmap direct map)

Here is some info on how it works:

The set_memory_foo() variants should update both the target addresses passed in
*and* the direct map alias, and then flush the TLB. (That is how it works on
x86; I didn't check the riscv implementation.)

vmalloc_node_range() will just set the permission on the vmalloc alias and not
touch the direct map alias.
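
To make the difference concrete, here is a tiny user-space model (not kernel
code). It assumes a page with exactly two kernel mappings and tracks only
whether each one is writable. The struct and the model_*() helpers are
hypothetical names made up for this illustration; they only mimic the behavior
described above for the vmalloc permission path and for an x86-style
set_memory_ro().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy model: one physical page reachable through two kernel mappings. */
struct page_model {
	bool vmalloc_alias_writable;	/* mapping in the vmalloc area */
	bool direct_map_writable;	/* mapping in the direct (linear) map */
};

/* Models a vmalloc-side permission change: only the vmalloc alias changes. */
void model_vmalloc_readonly(struct page_model *p)
{
	assert(p != NULL);
	p->vmalloc_alias_writable = false;
	/* The direct map alias is deliberately left untouched. */
}

/* Models an x86-style set_memory_ro(): both aliases are updated. */
void model_set_memory_ro(struct page_model *p)
{
	assert(p != NULL);
	p->vmalloc_alias_writable = false;
	p->direct_map_writable = false;
}

/* True if any mapping of the page can still be written with normal stores. */
bool has_writable_alias(const struct page_model *p)
{
	assert(p != NULL);
	return p->vmalloc_alias_writable || p->direct_map_writable;
}
```

In this model a page that only went through model_vmalloc_readonly() still
reports a writable alias, which is the gap being discussed.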

vfree() tries to batch the TLB flushing for unmap operations to avoid flushing
the TLB too often. When memory is unmapped in userspace, the flush only has to
reach the CPUs using that MM (process address space). But kernel mappings are
shared between all CPUs, so on something like a big server a flush requires far
more work, IPIs to distant CPUs, etc. So vmalloc tries to be efficient and keeps
zapped mappings unflushed until it has enough of them to clean up in bulk. In
the meantime it won't reuse that vmalloc address space.

But this means there can also be other vmalloc aliases still in the TLB for any
page that gets allocated from the page allocator. If you want to be fully sure
there are no writable aliases, you need to call vm_unmap_aliases() each time you
change kernel permissions, which will do the vmalloc TLB flush immediately. Many
set_memory() implementations call this automatically, but it looks like riscv's
does not.
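
The lazy scheme and the forced flush can be sketched as a user-space counter
model (again, not kernel code). LAZY_MAX is a made-up threshold and the
model_*() functions are hypothetical; the real vmalloc code decides when to
purge using its own heuristics. The only point is that frees accumulate without
a flush, and that a vm_unmap_aliases()-style call pays for the flush right away.

```c
#include <assert.h>
#include <stddef.h>

#define LAZY_MAX 4	/* hypothetical purge threshold, for illustration only */

struct vmalloc_model {
	size_t pending;	/* unmapped ranges whose TLB entries may still be live */
	size_t flushes;	/* global TLB flushes performed so far */
};

/* Flush everything that is pending; costs one global flush if non-empty. */
void model_flush_pending(struct vmalloc_model *vm)
{
	assert(vm != NULL);
	if (vm->pending == 0)
		return;
	vm->pending = 0;
	vm->flushes++;
}

/* Models vfree(): defer the flush until enough ranges have piled up. */
void model_vfree(struct vmalloc_model *vm)
{
	assert(vm != NULL);
	vm->pending++;
	if (vm->pending >= LAZY_MAX)
		model_flush_pending(vm);
}

/* Models vm_unmap_aliases(): force the deferred flush now. */
void model_vm_unmap_aliases(struct vmalloc_model *vm)
{
	model_flush_pending(vm);
}
```

Until the flush happens, anything counted in `pending` stands for a stale
mapping that may still be usable through the TLB.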


So doing something like vmalloc() plus set_memory_shadow_stack() on alloc, and
set_memory_rw() plus vfree() on free, puts the expensive flush (how expensive
depends on the device) into a previously fast path. Ignoring the direct map
alias is faster. A middle ground would be to allocate/convert and free a bunch
of stacks at once, and recycle them.
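
A rough user-space model of that middle ground, counting only how many
expensive conversions are needed. POOL_BATCH and the pool_*() names are
hypothetical. The model assumes one flush covers a whole batch, and it leaves
out shrinking the pool and resetting stack contents before reuse, both of which
a real implementation would have to handle.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_BATCH 8	/* hypothetical number of stacks converted per refill */

struct scs_pool_model {
	size_t available;		/* converted stacks ready for reuse */
	size_t expensive_flushes;	/* batch conversions that needed a flush */
};

/* Convert a whole batch of stacks, paying for a single flush. */
void pool_refill(struct scs_pool_model *pool)
{
	assert(pool != NULL);
	pool->available += POOL_BATCH;
	pool->expensive_flushes++;
}

/* Hand out one stack; only an empty pool triggers the expensive path. */
void pool_get(struct scs_pool_model *pool)
{
	assert(pool != NULL);
	if (pool->available == 0)
		pool_refill(pool);
	pool->available--;
}

/* Return a stack to the pool without converting it back. */
void pool_put(struct scs_pool_model *pool)
{
	assert(pool != NULL);
	pool->available++;
}
```

With this shape, a steady alloc/free pattern keeps hitting the pool and the
flush count stays flat after the first refill.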


You could make it tidy first and then optimize it later, or make it faster first
and maximally secure later. Or try to do it all at once. But there have long
been discussions on batching-type solutions for kernel memory permissions. So it
could be a whole project in itself.

> 
> > 
> > > 
> > >   out:
> > > @@ -59,7 +72,7 @@ void *scs_alloc(int node)
> > >   	if (!s)
> > >   		return NULL;
> > > 
> > > -	*__scs_magic(s) = SCS_END_MAGIC;
> > > +	__scs_store_magic(__scs_magic(s), SCS_END_MAGIC);
> > > 
> > >   	/*
> > >   	 * Poison the allocation to catch unintentional accesses to
> > > @@ -87,6 +100,16 @@ void scs_free(void *s)
> > >   			return;
> > > 
> > >   	kasan_unpoison_vmalloc(s, SCS_SIZE, KASAN_VMALLOC_PROT_NORMAL);
> > > +	/*
> > > +	 * Hardware protected shadow stack is not writeable by regular stores
> > > +	 * Thus adding this back to free list will raise faults by vmalloc
> > > +	 * It needs to be writeable again. It's good sanity as well because
> > > +	 * then it can't be inadvertently accesses and if done, it will fault.
> > > +	 */
> > > +#ifdef CONFIG_ARCH_HAS_KERNEL_SHADOW_STACK
> > > +	set_memory_rw((unsigned long)s, (SCS_SIZE/PAGE_SIZE));
> > 
> > Above you don't update the direct map permissions. So I don't think you need
> > this. vmalloc should flush the permissioned mapping before re-using it with
> > the lazy cleanup scheme.
> 
> If I didn't do this, I was getting a page fault on this vmalloc address. It
> directly uses first 8 bytes to add it into some list and that was the location
> of fault.

Ah right! Because it is using the vfree atomic variant.

If you ignore the direct map alias and want to avoid the set_memory_rw() on
free, you could create your own WQ in SCS and call vfree() in non-atomic
context.
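
A user-space sketch of the deferral idea, with malloc()/free() standing in for
the stack allocation and for vfree(). All names here are hypothetical. The
assumption worth calling out is that the pending list lives outside the stacks
themselves, so queueing a stack never stores into memory that is still
shadow-stack protected. Locking and the actual workqueue plumbing are left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEFER_SLOTS 16	/* hypothetical fixed capacity of the pending list */

struct deferred_free_model {
	void *slot[DEFER_SLOTS];	/* pointers only; stacks are not written */
	size_t count;
};

/*
 * Models the free path in a context that cannot sleep: just record the
 * pointer. Returns 0 on success, -1 if the pending list is full.
 */
int defer_free(struct deferred_free_model *q, void *stack)
{
	assert(q != NULL && stack != NULL);
	if (q->count == DEFER_SLOTS)
		return -1;
	q->slot[q->count++] = stack;
	return 0;
}

/*
 * Models the work item running later in a sleepable context: release every
 * queued stack (free() stands in for vfree()). Returns how many were freed.
 */
size_t drain_deferred(struct deferred_free_model *q)
{
	size_t freed = 0;

	assert(q != NULL);
	while (q->count > 0) {
		free(q->slot[--q->count]);
		freed++;
	}
	return freed;
}
```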
