From: "Dan Williams (nvidia)" <djbw@kernel.org>
To: Ravi Kumar Bandi <ravib@amazon.com>, hch@infradead.org
Cc: akpm@linux-foundation.org, bhelgaas@google.com,
david@kernel.org, djbw@kernel.org,
ilpo.jarvinen@linux.intel.com, linux-kernel@vger.kernel.org,
ravib@amazon.com
Subject: Re: [PATCH] resource: export iomem_get_mapping() for loadable modules
Date: Wed, 13 May 2026 11:45:14 -0700
Message-ID: <6a04c6ba958b2_107b100f9@djbw-dev.notmuch>
In-Reply-To: <20260512075022.48671-1-ravib@amazon.com>

Ravi Kumar Bandi wrote:
> On 5/11/26, Bjorn Helgaas wrote:
> > Which endpoint driver?
>
> Thank you for checking.
>
> It is Marvell PCIe driver for a device without a hot-plug pin routed
> to the CPU. The link-down event is detected by the in-tree Xilinx DMA PL
> PCIe controller driver (drivers/pci/controller/pcie-xilinx-dma-pl.c).
>
> > Why would the endpoint driver map iomem to userspace?
>
> sysfs exposed BAR resource files are mmap'd directly by a third-party
> userspace driver.
>
> > Well, we need to see the user(s) first, but once they materialize this
> > begs for a higher level API. Until then it is moot anyway.
>
> Understood. Thank you for the feedback. We will work on a higher level
> API proposal and get back.

I took a look and there are some problems here.

My first thought was, "why does the endpoint driver need to do this? The
PCI core device removal should be responsible for zapping mappings."

Two things defeat this:

1/ For sysfs BAR mappings, the unmap_mapping_range() in
   kernfs_drain_open_files() misses mappings established against
   the shared iomem_get_mapping() address space.

2/ procfs access to BAR space has never unmapped on device removal.

The practical implication of this is that userspace mappings of BARs can
survive past device removal. As for mitigations: with IO_STRICT_DEVMEM
the kernel will zap them before use, with LOCKDOWN the mappings cannot
be established, and CAP_SYS_RAWIO is required (for procfs) to create
these mappings.
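
To make item 1/ concrete, here is a minimal userspace toy model (not kernel
code) of how a drain that walks a file's own inode mapping can miss a mapping
that was registered against a shared address space. All toy_* names are
hypothetical, and the premise that the drain path zaps only the file's own
inode mapping is an assumption taken from the description above, not a quote
of the kernfs source.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct address_space: only counts live user mappings. */
struct toy_mapping {
	int nr_vmas;
};

/* Toy inode: owns a private mapping, as a sysfs file's inode would. */
struct toy_inode {
	struct toy_mapping i_data;
};

/* Toy open file: f_mapping may be redirected to a shared mapping at open. */
struct toy_file {
	struct toy_inode *inode;
	struct toy_mapping *f_mapping;
};

/* Open: use the shared mapping when one is provided, else the inode's own. */
static void toy_open(struct toy_file *f, struct toy_inode *inode,
		     struct toy_mapping *shared)
{
	f->inode = inode;
	f->f_mapping = shared ? shared : &inode->i_data;
}

/* mmap: the new mapping is tracked by whatever f_mapping points at. */
static void toy_mmap(struct toy_file *f)
{
	f->f_mapping->nr_vmas++;
}

/* Drain on removal: ASSUMED to zap only the file's own inode mapping. */
static void toy_drain(struct toy_file *f)
{
	f->inode->i_data.nr_vmas = 0;
}

/* Returns how many user mappings survive a drain in this model. */
int toy_surviving_mappings(bool use_shared)
{
	struct toy_mapping iomem = { 0 };
	struct toy_inode sysfs_inode = { { 0 } };
	struct toy_file f;

	toy_open(&f, &sysfs_inode, use_shared ? &iomem : NULL);
	toy_mmap(&f);
	toy_drain(&f);
	return f.f_mapping->nr_vmas;
}
```

In this model toy_surviving_mappings(true) leaves one mapping behind, while
toy_surviving_mappings(false) leaves none, which is the distinction between
the shared-inode case and the per-file case discussed above.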

I recall that Sima added ioport mmap revoke support in:

    636b21b50152 PCI: Revoke mappings like devmem

...but given revoke_iomem() only evacuates at request_mem_region() time,
I do not see how that ever worked.

We could do something like the following for the kernfs regression in the
near term, or just proceed with making revoke_iomem() something that the
PCI core does unconditionally by physical address on device removal.
That would also fix the procfs gap.
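
As a rough illustration of what "revoke by physical address on device
removal" amounts to, the sketch below walks a set of user mappings and zaps
every one that overlaps the removed device's resource range. It is a
userspace toy model, not kernel code; struct toy_vma, toy_overlaps(),
toy_revoke_range() and any addresses passed to them are hypothetical.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy record of one user mapping of MMIO, by inclusive physical range. */
struct toy_vma {
	uint64_t start;
	uint64_t end;	/* inclusive, in the style of struct resource */
	bool mapped;
};

/* True when inclusive ranges [a_start, a_end] and [b_start, b_end] overlap. */
static bool toy_overlaps(uint64_t a_start, uint64_t a_end,
			 uint64_t b_start, uint64_t b_end)
{
	return a_start <= b_end && b_start <= a_end;
}

/*
 * Zap every live mapping that overlaps [start, end] and return how many
 * were zapped. Mappings that are already gone are skipped.
 */
size_t toy_revoke_range(struct toy_vma *vmas, size_t nr,
			uint64_t start, uint64_t end)
{
	size_t zapped = 0;

	for (size_t i = 0; i < nr; i++) {
		if (!vmas[i].mapped)
			continue;
		if (!toy_overlaps(vmas[i].start, vmas[i].end, start, end))
			continue;
		vmas[i].mapped = false;
		zapped++;
	}
	return zapped;
}
```

Because the walk in this model is keyed on the physical range rather than on
which file was opened, it would treat mappings created through sysfs and
through procfs the same way.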

Ravi, I think your time is best spent getting the PCI core to handle the
unmap on device removal.

-- >8 --
Subject: resource: Fix PCI/sysfs mmap revocation vs CONFIG_IO_STRICT_DEVMEM=n
From: Dan Williams <djbw@kernel.org>

CONFIG_IO_STRICT_DEVMEM wants to arrange for exclusive kernel mappings of
device MMIO resources. It prevents scenarios where the kernel is confused
by a user-triggered device effect while the driver is running. It uses a
shared inode address_space such that request_mem_region() can unmap user
mappings by physical resource address.

This shared inode, however, is not considered by kernfs_drain_open_files().
It is also the case that the procfs method for mapping PCI resources has
historically never cleaned up user mappings on device removal.

Fix at least the regression of unmap behavior for the
CONFIG_IO_STRICT_DEVMEM=n case by disabling the shared inode when
revoke_iomem() is disabled.

Signed-off-by: Dan Williams <djbw@kernel.org>
---
 drivers/char/mem.c |    3 ++-
 fs/sysfs/file.c    |    2 +-
 kernel/resource.c  |   11 ++++++++---
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 5fd421e48c04..d9a79b0e4e4f 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -632,7 +632,8 @@ static int open_port(struct inode *inode, struct file *filp)
 	 * revocations when drivers want to take over a /dev/mem mapped
 	 * range.
 	 */
-	filp->f_mapping = iomem_get_mapping();
+	if (iomem_get_mapping())
+		filp->f_mapping = iomem_get_mapping();
 
 	return 0;
 }
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 5709cede1d75..60947fa4a402 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -189,7 +189,7 @@ static int sysfs_kf_bin_open(struct kernfs_open_file *of)
 {
 	const struct bin_attribute *battr = of->kn->priv;
 
-	if (battr->f_mapping)
+	if (battr->f_mapping && battr->f_mapping())
 		of->file->f_mapping = battr->f_mapping();
 
 	return 0;
diff --git a/kernel/resource.c b/kernel/resource.c
index d02a53fb95d8..816bcdb9fa05 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1297,9 +1297,6 @@ static void revoke_iomem(struct resource *res)
 
 	unmap_mapping_range(inode->i_mapping, res->start, resource_size(res), 1);
 }
-#else
-static void revoke_iomem(struct resource *res) {}
-#endif
 
 struct address_space *iomem_get_mapping(void)
 {
@@ -1311,6 +1308,14 @@ struct address_space *iomem_get_mapping(void)
 	 */
 	return smp_load_acquire(&iomem_inode)->i_mapping;
 }
+#else
+static void revoke_iomem(struct resource *res) {}
+
+struct address_space *iomem_get_mapping(void)
+{
+	return NULL;
+}
+#endif
 
 static int __request_region_locked(struct resource *res, struct resource *parent,
 				   resource_size_t start, resource_size_t n,