From: Dan Williams <dan.j.williams@intel.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>,
Dan Williams <dan.j.williams@intel.com>
Cc: <dave.hansen@linux.intel.com>, Arnd Bergmann <arnd@arndb.de>,
Ingo Molnar <mingo@kernel.org>, Kees Cook <kees@kernel.org>,
Kirill Shutemov <kirill.shutemov@linux.intel.com>,
Michael Ellerman <mpe@ellerman.id.au>,
Naveen N Rao <naveen@kernel.org>,
Nikolay Borisov <nik.borisov@suse.com>, <stable@vger.kernel.org>,
Suzuki K Poulose <suzuki.poulose@arm.com>,
"Vishal Annapurve" <vannapurve@google.com>, <x86@kernel.org>,
<linux-coco@lists.linux.dev>
Subject: Re: [PATCH v5] x86/devmem: Drop /dev/mem access for confidential guests
Date: Wed, 30 Apr 2025 16:03:54 -0700 [thread overview]
Message-ID: <6812ac5a4d8b2_10d8f62942d@dwillia2-xfh.jf.intel.com.notmuch> (raw)
In-Reply-To: <2025043025-cathouse-headlamp-7bde@gregkh>
Greg Kroah-Hartman wrote:
> On Tue, Apr 29, 2025 at 07:46:22PM -0700, Dan Williams wrote:
> > Nikolay reports that accessing BIOS data (first 1MB of the physical
> > address space) via /dev/mem results in a "crash" / terminated VM
> > (unhandled SEPT violation). See report [1] for details.
> >
> > The cause is ioremap() (via xlate_dev_mem_ptr()) establishes an
> > unencrypted mapping where the kernel had established an encrypted
> > mapping previously. The CPU enforces mapping consistency with a fault
> > upon detecting a mismatch. A similar situation arises with devmem access
> > to "unaccepted" confidential memory. In summary, it is fraught to allow
> > uncoordinated userspace mapping of confidential memory.
> >
> > While there is an existing mitigation to simulate and redirect access to
> > the BIOS data area with STRICT_DEVMEM=y, it is insufficient.
> > Specifically, STRICT_DEVMEM=y traps read(2) access to the BIOS data
> > area, and returns a zeroed buffer. However, it turns out the kernel
> > fails to enforce the same via mmap(2), and a direct mapping is
> > established. This is a hole, and unfortunately userspace has learned to
> > exploit it [2].
> >
> > This means the kernel either needs: a mechanism to ensure consistent
> > plus accepted "encrypted" mappings of this /dev/mem mmap() hole, close
> > the hole by mapping the zero page in the mmap(2) path, block only BIOS
> > data access and let typical STRICT_DEVMEM protect the rest, or disable
> > /dev/mem altogether.
> >
> > The simplest option for now is arrange for /dev/mem to always behave as
> > if lockdown is enabled for confidential guests. Require confidential
> > guest userspace to jettison legacy dependencies on /dev/mem similar to
> > how other legacy mechanisms are jettisoned for confidential operation.
> > Recall that modern methods for BIOS data access are available like
> > /sys/firmware/dmi/tables.
> >
> > Now, this begs the question what to do with PCI sysfs which allows
> > userspace mappings of confidential MMIO with similar mapping consistency
> > and acceptance expectations? Here, the existing mitigation of
> > IO_STRICT_DEVMEM is likely sufficient. The kernel is expected to use
> > request_mem_region() when toggling the state of MMIO. With
> > IO_STRICT_DEVMEM that enforces kernel-exclusive access until
> > release_mem_region(), i.e. mapping conflicts are prevented.
> >
> > Cc: <x86@kernel.org>
> > Cc: Kees Cook <kees@kernel.org>
> > Cc: Ingo Molnar <mingo@kernel.org>
> > Cc: Vishal Annapurve <vannapurve@google.com>
> > Cc: Kirill Shutemov <kirill.shutemov@linux.intel.com>
> > Cc: Arnd Bergmann <arnd@arndb.de>
> > Cc: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > Cc: Michael Ellerman <mpe@ellerman.id.au>
> > Cc: Suzuki K Poulose <suzuki.poulose@arm.com>
> > Link: https://sources.debian.org/src/libdebian-installer/0.125/src/system/subarch-x86-linux.c/?hl=113#L93 [2]
> > Reported-by: Nikolay Borisov <nik.borisov@suse.com>
> > Closes: http://lore.kernel.org/20250318113604.297726-1-nik.borisov@suse.com [1]
> > Fixes: 9aa6ea69852c ("x86/tdx: Make pages shared in ioremap()")
> > Cc: <stable@vger.kernel.org>
> > Acked-by: Dave Hansen <dave.hansen@linux.intel.com>
> > Reviewed-by: Nikolay Borisov <nik.borisov@suse.com>
> > Acked-by: Naveen N Rao (AMD) <naveen@kernel.org>
> > Tested-by: Naveen N Rao (AMD) <naveen@kernel.org>
> > Signed-off-by: Dan Williams <dan.j.williams@intel.com>
> > ---
> > arch/x86/Kconfig | 4 ++++
> > drivers/char/mem.c | 10 ++++++++++
> > 2 files changed, 14 insertions(+)
> >
> > diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
> > index 4b9f378e05f6..36f11aad1ae5 100644
> > --- a/arch/x86/Kconfig
> > +++ b/arch/x86/Kconfig
> > @@ -891,6 +891,8 @@ config INTEL_TDX_GUEST
> > depends on X86_X2APIC
> > depends on EFI_STUB
> > depends on PARAVIRT
> > + select STRICT_DEVMEM
> > + select IO_STRICT_DEVMEM
> > select ARCH_HAS_CC_PLATFORM
> > select X86_MEM_ENCRYPT
> > select X86_MCE
> > @@ -1510,6 +1512,8 @@ config AMD_MEM_ENCRYPT
> > bool "AMD Secure Memory Encryption (SME) support"
> > depends on X86_64 && CPU_SUP_AMD
> > depends on EFI_STUB
> > + select STRICT_DEVMEM
> > + select IO_STRICT_DEVMEM
> > select DMA_COHERENT_POOL
> > select ARCH_USE_MEMREMAP_PROT
> > select INSTRUCTION_DECODER
> > diff --git a/drivers/char/mem.c b/drivers/char/mem.c
> > index 48839958b0b1..47729606b817 100644
> > --- a/drivers/char/mem.c
> > +++ b/drivers/char/mem.c
> > @@ -30,6 +30,7 @@
> > #include <linux/uio.h>
> > #include <linux/uaccess.h>
> > #include <linux/security.h>
> > +#include <linux/cc_platform.h>
> >
> > #define DEVMEM_MINOR 1
> > #define DEVPORT_MINOR 4
> > @@ -595,6 +596,15 @@ static int open_port(struct inode *inode, struct file *filp)
> > if (rc)
> > return rc;
> >
> > + /*
> > + * Enforce encrypted mapping consistency and avoid unaccepted
> > + * memory conflicts, "lockdown" /dev/mem for confidential
> > + * guests.
> > + */
> > + if (IS_ENABLED(CONFIG_STRICT_DEVMEM) &&
> > + cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT))
> > + return -EPERM;
>
> I hate to ask, but why not force the whole "confidential computing"
> stuff to enable IO_STRICT_DEVMEM as well? I don't see why you would
> want a cc guest raw access to devmem, do you?
This patch at least forces it for x86 CC guests. I was not quite ready
to say any platform that has "select ARCH_HAS_CC_PLATFORM" should get
the same treatment. At the same time, no strong reason *not* to do that.
> You kind of mention it above in the last paragraph, but forcing that on
> for these guests feels like the best, and simplest, solution overall as
> the number of different "is this secure, no really is this secure, no
> what about this option" chain of tests that we have in this driver is
> getting kind of silly.
True, and this matches what Arnd is saying.
> OR, why not just force all cc guests to enable a security module that
> implements this for them? :)
Let me take a look at that option as that seems to be where Arnd's
questions are leading me as well.
next prev parent reply other threads:[~2025-04-30 23:04 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-04-30 2:46 [PATCH v5] Restrict devmem for confidential VMs Dan Williams
2025-04-30 2:46 ` [PATCH v5] x86/devmem: Remove duplicate range_is_allowed() definition Dan Williams
2025-04-30 7:19 ` Greg Kroah-Hartman
2025-04-30 2:46 ` [PATCH v5] x86/devmem: Drop /dev/mem access for confidential guests Dan Williams
2025-04-30 7:19 ` Greg Kroah-Hartman
2025-04-30 23:03 ` Dan Williams [this message]
2025-04-30 17:31 ` Arnd Bergmann
2025-05-01 0:56 ` Dan Williams
2025-05-01 8:12 ` Arnd Bergmann
2025-05-01 20:01 ` Arnd Bergmann
2025-05-01 20:18 ` Dave Hansen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6812ac5a4d8b2_10d8f62942d@dwillia2-xfh.jf.intel.com.notmuch \
--to=dan.j.williams@intel.com \
--cc=arnd@arndb.de \
--cc=dave.hansen@linux.intel.com \
--cc=gregkh@linuxfoundation.org \
--cc=kees@kernel.org \
--cc=kirill.shutemov@linux.intel.com \
--cc=linux-coco@lists.linux.dev \
--cc=mingo@kernel.org \
--cc=mpe@ellerman.id.au \
--cc=naveen@kernel.org \
--cc=nik.borisov@suse.com \
--cc=stable@vger.kernel.org \
--cc=suzuki.poulose@arm.com \
--cc=vannapurve@google.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox