* [PATCH] resource: export iomem_get_mapping() for loadable modules
@ 2026-05-11 6:56 Ravi Kumar Bandi
2026-05-11 7:14 ` David Hildenbrand (Arm)
0 siblings, 1 reply; 9+ messages in thread
From: Ravi Kumar Bandi @ 2026-05-11 6:56 UTC (permalink / raw)
To: Bjorn Helgaas, Ilpo Järvinen, Andrew Morton, Dan Williams
Cc: David Hildenbrand, Ravi Kumar Bandi, linux-kernel
Loadable PCIe driver modules that handle surprise removal or link-down
events need to zap userspace mappings of PCI BAR resources to deliver
SIGBUS on next access, rather than leaving stale mappings to a dead
device.
The correct way to do this is via unmap_mapping_range() on the iomem
address space, which is already exported via EXPORT_SYMBOL.
However, iomem_get_mapping() which returns the iomem address space
is not exported, making it impossible to use unmap_mapping_range()
correctly from a loadable module without resorting to workarounds
such as walking all process VMAs or opening /dev/mem.
Export iomem_get_mapping() via EXPORT_SYMBOL_GPL to complete the
existing exported API and allow loadable modules to properly zap
PCI BAR mappings on device removal.
Signed-off-by: Ravi Kumar Bandi <ravib@amazon.com>
---
kernel/resource.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/kernel/resource.c b/kernel/resource.c
index d02a53fb95d8..8801e390fe2e 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1311,6 +1311,7 @@ struct address_space *iomem_get_mapping(void)
*/
return smp_load_acquire(&iomem_inode)->i_mapping;
}
+EXPORT_SYMBOL_GPL(iomem_get_mapping);
static int __request_region_locked(struct resource *res, struct resource *parent,
resource_size_t start, resource_size_t n,
--
2.47.3
^ permalink raw reply related [flat|nested] 9+ messages in thread
* Re: [PATCH] resource: export iomem_get_mapping() for loadable modules
2026-05-11 6:56 [PATCH] resource: export iomem_get_mapping() for loadable modules Ravi Kumar Bandi
@ 2026-05-11 7:14 ` David Hildenbrand (Arm)
2026-05-11 7:17 ` Christoph Hellwig
0 siblings, 1 reply; 9+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-11 7:14 UTC (permalink / raw)
To: Ravi Kumar Bandi, Bjorn Helgaas, Ilpo Järvinen,
Andrew Morton, Dan Williams
Cc: linux-kernel
On 5/11/26 08:56, Ravi Kumar Bandi wrote:
> Loadable PCIe driver modules that handle surprise removal or link-down
> events need to zap userspace mappings of PCI BAR resources to deliver
> SIGBUS on next access, rather than leaving stale mappings to a dead
> device.
>
> The correct way to do this is via unmap_mapping_range() on the iomem
> address space, which is already exported via EXPORT_SYMBOL.
> However, iomem_get_mapping() which returns the iomem address space
> is not exported, making it impossible to use unmap_mapping_range()
> correctly from a loadable module without resorting to workarounds
> such as walking all process VMAs or opening /dev/mem.
>
> Export iomem_get_mapping() via EXPORT_SYMBOL_GPL to complete the
> existing exported API and allow loadable modules to properly zap
> PCI BAR mappings on device removal.
>
> Signed-off-by: Ravi Kumar Bandi <ravib@amazon.com>
> ---
> kernel/resource.c | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/kernel/resource.c b/kernel/resource.c
> index d02a53fb95d8..8801e390fe2e 100644
> --- a/kernel/resource.c
> +++ b/kernel/resource.c
> @@ -1311,6 +1311,7 @@ struct address_space *iomem_get_mapping(void)
> */
> return smp_load_acquire(&iomem_inode)->i_mapping;
> }
> +EXPORT_SYMBOL_GPL(iomem_get_mapping);
>
> static int __request_region_locked(struct resource *res, struct resource *parent,
> resource_size_t start, resource_size_t n,
Which in-tree driver wants to make use of this?
--
Cheers,
David
* Re: [PATCH] resource: export iomem_get_mapping() for loadable modules
2026-05-11 7:14 ` David Hildenbrand (Arm)
@ 2026-05-11 7:17 ` Christoph Hellwig
2026-05-11 16:16 ` Ravi Kumar Bandi
0 siblings, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2026-05-11 7:17 UTC (permalink / raw)
To: David Hildenbrand (Arm)
Cc: Ravi Kumar Bandi, Bjorn Helgaas, Ilpo Järvinen,
Andrew Morton, Dan Williams, linux-kernel
On Mon, May 11, 2026 at 09:14:04AM +0200, David Hildenbrand (Arm) wrote:
> >
> > static int __request_region_locked(struct resource *res, struct resource *parent,
> > resource_size_t start, resource_size_t n,
>
> Which in-tree driver wants to make use of this?
And why?
* Re: [PATCH] resource: export iomem_get_mapping() for loadable modules
2026-05-11 7:17 ` Christoph Hellwig
@ 2026-05-11 16:16 ` Ravi Kumar Bandi
2026-05-12 7:10 ` David Hildenbrand (Arm)
2026-05-12 7:22 ` Christoph Hellwig
0 siblings, 2 replies; 9+ messages in thread
From: Ravi Kumar Bandi @ 2026-05-11 16:16 UTC (permalink / raw)
To: hch, david; +Cc: akpm, bhelgaas, djbw, ilpo.jarvinen, linux-kernel, ravib
On Mon, May 11, 2026 at 00:17:45 -0700, Christoph Hellwig wrote:
> And why?
On Mon, May 11, 2026 at 09:14:04 +0200, David Hildenbrand wrote:
> Which in-tree driver wants to make use of this?
Thank you both for reviewing the patch.
Currently this is needed by a PCIe endpoint driver that handles surprise
removal on a device without a hot-plug pin routed to the CPU. The link-down
event is detected by the in-tree Xilinx DMA PL PCIe controller driver
(drivers/pci/controller/pcie-xilinx-dma-pl.c), which notifies the endpoint
driver via the link-down callback.
When link-down is detected, the endpoint driver needs to zap existing
userspace BAR mappings (mmap'd via sysfs resource files) to deliver SIGBUS
rather than leaving stale mappings to a dead device, preventing kernel
crashes on subsequent accesses.
unmap_mapping_range() is already EXPORT_SYMBOL'd for this purpose
(mm/memory.c), but without iomem_get_mapping() being exported, loadable
modules cannot use it correctly for PCI BAR mappings, forcing ugly
workarounds such as walking all process VMAs.
Regards,
Ravi Kumar Bandi
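For readers less familiar with what a zap looks like from userspace, here
is a small analogy. It is not the PCI path discussed in this thread. It
assumes a Linux system with a writable /tmp, and the function name
access_after_zap_raises_sigbus() is made up for illustration. Truncating a
file makes the kernel drop the page-table entries of shared mappings of
that file (the truncate path calls unmap_mapping_range() internally), so
the next access through the still-existing VMA faults and receives SIGBUS.
That is the behavior the endpoint driver wants for stale BAR mappings.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <setjmp.h>
#include <signal.h>
#include <stdlib.h>
#include <string.h>
#include <sys/mman.h>
#include <unistd.h>

static sigjmp_buf sigbus_env;

static void on_sigbus(int sig)
{
	(void)sig;
	siglongjmp(sigbus_env, 1);	/* resume after the faulting access */
}

/*
 * Hypothetical demo helper (userspace analogy only).
 * Returns 1 if touching the zapped mapping raised SIGBUS,
 * 0 if the access succeeded, -1 on a setup error.
 */
int access_after_zap_raises_sigbus(void)
{
	char path[] = "/tmp/zap-demo-XXXXXX";
	long page = sysconf(_SC_PAGESIZE);
	struct sigaction sa, old;
	volatile char *map;
	int fd, result = -1;

	assert(page > 0);
	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);
	if (ftruncate(fd, page) != 0)
		goto out_close;
	map = mmap(NULL, (size_t)page, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
	if (map == MAP_FAILED)
		goto out_close;
	map[0] = 42;	/* fault the page in so a PTE exists */

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_sigbus;
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGBUS, &sa, &old) != 0)
		goto out_unmap;

	/* Shrinking the file zaps the PTE; the VMA itself stays in place. */
	if (ftruncate(fd, 0) == 0) {
		if (sigsetjmp(sigbus_env, 1) == 0) {
			(void)map[0];	/* access through the stale mapping */
			result = 0;
		} else {
			result = 1;	/* arrived here via the SIGBUS handler */
		}
	}
	sigaction(SIGBUS, &old, NULL);
out_unmap:
	munmap((void *)map, (size_t)page);
out_close:
	close(fd);
	return result;
}
```

Calling access_after_zap_raises_sigbus() from a small main() and checking
for a return value of 1 is enough to try this out. The point of the
analogy is only that the process gets a catchable signal instead of
reading from whatever now sits behind the old mapping.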
* Re: [PATCH] resource: export iomem_get_mapping() for loadable modules
2026-05-11 16:16 ` Ravi Kumar Bandi
@ 2026-05-12 7:10 ` David Hildenbrand (Arm)
2026-05-12 7:31 ` Ravi Kumar Bandi
2026-05-12 7:22 ` Christoph Hellwig
1 sibling, 1 reply; 9+ messages in thread
From: David Hildenbrand (Arm) @ 2026-05-12 7:10 UTC (permalink / raw)
To: Ravi Kumar Bandi, hch; +Cc: akpm, bhelgaas, djbw, ilpo.jarvinen, linux-kernel
On 5/11/26 18:16, Ravi Kumar Bandi wrote:
> On Mon, May 11, 2026 at 00:17:45 -0700, Christoph Hellwig wrote:
>> And why?
>
> On Mon, May 11, 2026 at 09:14:04 +0200, David Hildenbrand wrote:
>> Which in-tree driver wants to make use of this?
>
> Thank you both for reviewing the patch.
>
> Currently this is needed by a PCIe endpoint driver that handles surprise
> removal on a device without a hot-plug pin routed to the CPU. The link-down
> event is detected by the in-tree Xilinx DMA PL PCIe controller driver
> (drivers/pci/controller/pcie-xilinx-dma-pl.c), which notifies the endpoint
> driver via the link-down callback.
>
> When link-down is detected, the endpoint driver needs to zap existing
> userspace BAR mappings (mmap'd via sysfs resource files) to deliver SIGBUS
> rather than leaving stale mappings to a dead device, preventing kernel
> crashes on subsequent accesses.
>
> unmap_mapping_range() is already EXPORT_SYMBOL'd for this purpose
> (mm/memory.c), but without iomem_get_mapping() being exported, loadable
> modules cannot use it correctly for PCI BAR mappings, forcing ugly
> workarounds such as walking all process VMAs.
Okay, so you essentially want to implement something similar to revoke_iomem()?
--
Cheers,
David
* Re: [PATCH] resource: export iomem_get_mapping() for loadable modules
2026-05-11 16:16 ` Ravi Kumar Bandi
2026-05-12 7:10 ` David Hildenbrand (Arm)
@ 2026-05-12 7:22 ` Christoph Hellwig
2026-05-12 7:50 ` Ravi Kumar Bandi
1 sibling, 1 reply; 9+ messages in thread
From: Christoph Hellwig @ 2026-05-12 7:22 UTC (permalink / raw)
To: Ravi Kumar Bandi
Cc: hch, david, akpm, bhelgaas, djbw, ilpo.jarvinen, linux-kernel
On Mon, May 11, 2026 at 04:16:43PM +0000, Ravi Kumar Bandi wrote:
> On Mon, May 11, 2026 at 00:17:45 -0700, Christoph Hellwig wrote:
> > And why?
>
> On Mon, May 11, 2026 at 09:14:04 +0200, David Hildenbrand wrote:
> > Which in-tree driver wants to make use of this?
>
> Thank you both for reviewing the patch.
>
> Currently this is needed by a PCIe endpoint driver that handles surprise
> removal on a device without a hot-plug pin routed to the CPU.
Which endpoint driver?
> When link-down is detected, the endpoint driver needs to zap existing
> userspace BAR mappings (mmap'd via sysfs resource files) to deliver SIGBUS
> rather than leaving stale mappings to a dead device, preventing kernel
> crashes on subsequent accesses.
Why would the endpoint driver map iomem to userspace?
> unmap_mapping_range() is already EXPORT_SYMBOL'd for this purpose
> (mm/memory.c), but without iomem_get_mapping() being exported, loadable
> modules cannot use it correctly for PCI BAR mappings, forcing ugly
> workarounds such as walking all process VMAs.
Well, we need to see the user(s) first, but once they materialize this
begs for a higher level API. Until then it is moot anyway.
* Re: [PATCH] resource: export iomem_get_mapping() for loadable modules
2026-05-12 7:10 ` David Hildenbrand (Arm)
@ 2026-05-12 7:31 ` Ravi Kumar Bandi
0 siblings, 0 replies; 9+ messages in thread
From: Ravi Kumar Bandi @ 2026-05-12 7:31 UTC (permalink / raw)
To: david; +Cc: akpm, bhelgaas, djbw, hch, ilpo.jarvinen, linux-kernel, ravib
On 5/12/26, David Hildenbrand wrote:
> Okay, so you essentially want to implement something similar to revoke_iomem()?
Thank you for checking.
Yes, that's correct.
Regards,
Ravi Kumar Bandi
* Re: [PATCH] resource: export iomem_get_mapping() for loadable modules
2026-05-12 7:22 ` Christoph Hellwig
@ 2026-05-12 7:50 ` Ravi Kumar Bandi
2026-05-13 18:45 ` Dan Williams (nvidia)
0 siblings, 1 reply; 9+ messages in thread
From: Ravi Kumar Bandi @ 2026-05-12 7:50 UTC (permalink / raw)
To: hch; +Cc: akpm, bhelgaas, david, djbw, ilpo.jarvinen, linux-kernel, ravib
On 5/12/26, Christoph Hellwig wrote:
> Which endpoint driver?
Thank you for checking.
It is a Marvell PCIe driver for a device without a hot-plug pin routed
to the CPU. The link-down event is detected by the in-tree Xilinx DMA PL
PCIe controller driver (drivers/pci/controller/pcie-xilinx-dma-pl.c).
> Why would the endpoint driver map iomem to userspace?
sysfs exposed BAR resource files are mmap'd directly by a third-party
userspace driver.
> Well, we need to see the user(s) first, but once they materialize this
> begs for a higher level API. Until then it is moot anyway.
Understood. Thank you for the feedback. We will work on a higher level
API proposal and get back.
Regards,
Ravi Kumar Bandi
* Re: [PATCH] resource: export iomem_get_mapping() for loadable modules
2026-05-12 7:50 ` Ravi Kumar Bandi
@ 2026-05-13 18:45 ` Dan Williams (nvidia)
0 siblings, 0 replies; 9+ messages in thread
From: Dan Williams (nvidia) @ 2026-05-13 18:45 UTC (permalink / raw)
To: Ravi Kumar Bandi, hch
Cc: akpm, bhelgaas, david, djbw, ilpo.jarvinen, linux-kernel, ravib
Ravi Kumar Bandi wrote:
> On 5/12/26, Christoph Hellwig wrote:
> > Which endpoint driver?
>
> Thank you for checking.
>
> It is a Marvell PCIe driver for a device without a hot-plug pin routed
> to the CPU. The link-down event is detected by the in-tree Xilinx DMA PL
> PCIe controller driver (drivers/pci/controller/pcie-xilinx-dma-pl.c).
>
> > Why would the endpoint driver map iomem to userspace?
>
> sysfs exposed BAR resource files are mmap'd directly by a third-party
> userspace driver.
>
> > Well, we need to see the user(s) first, but once they materialize this
> > begs for a higher level API. Until then it is moot anyway.
>
> Understood. Thank you for the feedback. We will work on a higher level
> API proposal and get back.
I took a look and there are some problems here.
My first thought was, "why does the endpoint driver need to do this? The
PCI core device removal should be responsible for zapping mappings."
Two things defeat this:
1/ For sysfs BAR mappings, the unmap_mapping_range() in
kernfs_drain_open_files() misses mappings established against
the shared iomem_get_mapping().
2/ procfs access to BAR space has never unmapped on device removal.
The practical implication of this is that userspace mappings of BARs can
survive past device removal. As for mitigations, with IO_STRICT_DEVMEM
the kernel will zap them before use, with LOCKDOWN the mappings cannot
be established, and CAP_SYS_RAWIO is required (for procfs) to create
these mappings.
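To make the first gap above concrete, here is a toy userspace model. It is
a sketch under simplifying assumptions, not kernel code: all model_* types
and functions are hypothetical stand-ins, a mapping is just an array of
VMAs, and ranges are treated as plain physical addresses. It shows why a
drain that targets the sysfs inode's own mapping finds nothing once the
open file has been redirected to the shared iomem mapping, while a zap by
physical range on the shared mapping does find the VMA.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_VMAS 8

/* Hypothetical stand-ins for kernel objects; not the real structures. */
struct model_vma { unsigned long start, len; bool mapped; };
struct model_mapping { struct model_vma *vmas[MODEL_MAX_VMAS]; size_t nr; };
struct model_file { struct model_mapping *f_mapping; };

/* mmap() registers the VMA with whatever mapping the open file points at. */
static void model_mmap(struct model_file *f, struct model_vma *vma)
{
	struct model_mapping *m = f->f_mapping;

	assert(m->nr < MODEL_MAX_VMAS);
	vma->mapped = true;
	m->vmas[m->nr++] = vma;
}

/* Zap every VMA tracked by @m that overlaps [start, start + len). */
static size_t model_unmap_mapping_range(struct model_mapping *m,
					unsigned long start, unsigned long len)
{
	size_t zapped = 0;

	for (size_t i = 0; i < m->nr; i++) {
		struct model_vma *v = m->vmas[i];

		if (v->mapped && v->start < start + len &&
		    start < v->start + v->len) {
			v->mapped = false;
			zapped++;
		}
	}
	return zapped;
}

/* Returns true when the model reproduces the gap described above. */
bool drain_misses_shared_mapping(void)
{
	struct model_mapping sysfs_inode_mapping = { .nr = 0 };
	struct model_mapping shared_iomem_mapping = { .nr = 0 };
	/* open() redirected f_mapping to the shared iomem mapping */
	struct model_file bar_file = { .f_mapping = &shared_iomem_mapping };
	struct model_vma user_vma = { .start = 0xf0000000UL, .len = 0x1000 };

	model_mmap(&bar_file, &user_vma);

	/* Removal-time drain looks at the sysfs inode's own mapping. */
	size_t by_inode = model_unmap_mapping_range(&sysfs_inode_mapping,
						    0, ~0UL);
	bool survived = user_vma.mapped;

	/* A revoke by physical range on the shared mapping finds it. */
	size_t by_phys = model_unmap_mapping_range(&shared_iomem_mapping,
						   user_vma.start, user_vma.len);

	return by_inode == 0 && survived && by_phys == 1 && !user_vma.mapped;
}
```

In this model drain_misses_shared_mapping() returns true: the first zap
touches zero VMAs and the mapping survives until the second zap, which
goes through the shared mapping.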
I recall that Sima added ioport mmap revoke support in:
636b21b50152 PCI: Revoke mappings like devmem
...but given revoke_iomem() only evacuates at request_mem_region() time,
I do not see how that ever worked.
We could do something like the following for the kernfs regression in the
near term, or just proceed with making revoke_iomem() something that the
PCI core does unconditionally by physical address on device removal.
That would also fix the procfs gap.
Ravi, I think your time is best spent getting the PCI core to handle the
unmap on device removal.
-- >8 --
Subject: resource: Fix PCI/sysfs mmap revocation vs CONFIG_IO_STRICT_DEVMEM=n
From: Dan Williams <djbw@kernel.org>
CONFIG_IO_STRICT_DEVMEM wants to arrange for exclusive kernel mappings of
device MMIO resources. It prevents scenarios where the kernel is confused
by a user triggered device effect while the driver is running. It uses a
shared inode address_space such that request_mem_region() can unmap user
mappings by physical resource address.
This shared inode, however, is not considered by kernfs_drain_open_files().
It is also the case that the procfs method for mapping PCI resources has
historically never cleaned up user mappings on device removal.
Fix at least the regression of unmap behavior for the
CONFIG_IO_STRICT_DEVMEM=n case by disabling the shared inode when
revoke_iomem() is disabled.
Signed-off-by: Dan Williams <djbw@kernel.org>
---
drivers/char/mem.c | 3 ++-
fs/sysfs/file.c | 2 +-
kernel/resource.c | 11 ++++++++---
3 files changed, 11 insertions(+), 5 deletions(-)
diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index 5fd421e48c04..d9a79b0e4e4f 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -632,7 +632,8 @@ static int open_port(struct inode *inode, struct file *filp)
* revocations when drivers want to take over a /dev/mem mapped
* range.
*/
- filp->f_mapping = iomem_get_mapping();
+ if (iomem_get_mapping())
+ filp->f_mapping = iomem_get_mapping();
return 0;
}
diff --git a/fs/sysfs/file.c b/fs/sysfs/file.c
index 5709cede1d75..60947fa4a402 100644
--- a/fs/sysfs/file.c
+++ b/fs/sysfs/file.c
@@ -189,7 +189,7 @@ static int sysfs_kf_bin_open(struct kernfs_open_file *of)
{
const struct bin_attribute *battr = of->kn->priv;
- if (battr->f_mapping)
+ if (battr->f_mapping && battr->f_mapping())
of->file->f_mapping = battr->f_mapping();
return 0;
diff --git a/kernel/resource.c b/kernel/resource.c
index d02a53fb95d8..816bcdb9fa05 100644
--- a/kernel/resource.c
+++ b/kernel/resource.c
@@ -1297,9 +1297,6 @@ static void revoke_iomem(struct resource *res)
unmap_mapping_range(inode->i_mapping, res->start, resource_size(res), 1);
}
-#else
-static void revoke_iomem(struct resource *res) {}
-#endif
struct address_space *iomem_get_mapping(void)
{
@@ -1311,6 +1308,14 @@ struct address_space *iomem_get_mapping(void)
*/
return smp_load_acquire(&iomem_inode)->i_mapping;
}
+#else
+static void revoke_iomem(struct resource *res) {}
+
+struct address_space *iomem_get_mapping(void)
+{
+ return NULL;
+}
+#endif
static int __request_region_locked(struct resource *res, struct resource *parent,
resource_size_t start, resource_size_t n,
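The open-time behavior of the diff above can be modelled in userspace as
follows. This is only an illustration of the NULL-fallback logic; the
model_* names and the strict_devmem flag are hypothetical stand-ins for
the kernel structures and for CONFIG_IO_STRICT_DEVMEM, which in the real
code is a build-time option rather than a runtime switch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for kernel objects; not the real structures. */
struct model_mapping { const char *name; };
struct model_file { struct model_mapping *f_mapping; };
struct model_bin_attr { struct model_mapping *(*f_mapping)(void); };

static struct model_mapping shared_iomem = { "shared iomem" };
static bool strict_devmem;	/* models CONFIG_IO_STRICT_DEVMEM=y/n */

/* Models iomem_get_mapping() after the patch: NULL when revoke is off. */
static struct model_mapping *model_iomem_get_mapping(void)
{
	return strict_devmem ? &shared_iomem : NULL;
}

/* Models the patched check in sysfs_kf_bin_open(). */
static void model_bin_open(const struct model_bin_attr *battr,
			   struct model_file *file)
{
	if (battr->f_mapping && battr->f_mapping())
		file->f_mapping = battr->f_mapping();
}

/* Returns true if the opened file ends up on the shared iomem mapping. */
bool open_uses_shared_mapping(bool strict)
{
	struct model_mapping own = { "sysfs inode" };
	struct model_file file = { .f_mapping = &own };
	struct model_bin_attr battr = { .f_mapping = model_iomem_get_mapping };

	strict_devmem = strict;
	model_bin_open(&battr, &file);
	return file.f_mapping == &shared_iomem;
}
```

With strict set to false the file keeps its own inode mapping, so a drain
against that inode sees the user VMAs again. With strict set to true the
file is redirected to the shared mapping as before.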