* [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices
@ 2026-03-13 14:40 Julian Ruess
2026-03-13 14:40 ` [PATCH v4 1/3] vfio/pci: Rename vfio_config_do_rw() to vfio_pci_config_rw_single() and export it Julian Ruess
` (3 more replies)
0 siblings, 4 replies; 8+ messages in thread
From: Julian Ruess @ 2026-03-13 14:40 UTC
To: schnelle, wintera, ts, oberpar, gbayer, Alex Williamson,
Jason Gunthorpe, Yishai Hadas, Shameer Kolothum, Kevin Tian
Cc: mjrosato, alifm, raspl, hca, agordeev, gor, julianr, kvm,
linux-kernel, linux-s390, linux-pci
Hi all,
This series adds a vfio_pci variant driver for the s390-specific
Internal Shared Memory (ISM) devices used for inter-VM communication
including SMC-D.
This is a prerequisite for an in-development open-source user space
driver stack that will allow ISM devices to be used to provide remote
console and block device functionality. This stack will be part of
s390-tools.
This driver would also allow QEMU to mediate access to an ISM device,
enabling a form of PCI pass-through even for guests whose hardware
cannot directly execute PCI accesses, such as nested guests.
On s390, kernel primitives such as ioread() and iowrite() are switched
over from function-handle-based PCI load/store instructions to PCI
memory-I/O (MIO) loads/stores when these are available and not
explicitly disabled. Since MIO instructions cannot be used with ISM
devices, ensure that classic function-handle-based PCI instructions are
used instead.
The driver is still required even when MIO instructions are disabled, as
the ISM device relies on the PCI store block (PCISTB) instruction to
perform write operations.
Thank you,
Julian
Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
---
Changes in v4:
- Fix bug with < 8 byte reads. For code simplicity, only support 8 byte reads.
- Fix leak of ivpcd.
- Fix cache replacement by implementing a per-device kmem_cache.
- Link to v3: https://lore.kernel.org/r/20260305-vfio_pci_ism-v3-0-1217076c81d9@linux.ibm.com
Changes in v3:
- Add comments to ism_vfio_pci_do_io_r() and ism_vfio_pci_do_io_w().
- Format Kconfig.
- Add 4k boundary check to ism_vfio_pci_do_io_w().
- Use kmem_cache instead of kzalloc in ism_vfio_pci_do_io_w().
- Add error handler to struct ism_vfio_pci_driver.
- Link to v2: https://lore.kernel.org/r/20260224-vfio_pci_ism-v2-0-f010945373fa@linux.ibm.com
Changes in v2:
- Remove common code patch that sets VFIO_PCI_OFFSET_SHIFT to 48.
- Implement ism_vfio_pci_ioctl_get_region_info() to have own region
offsets.
- For config space accesses, rename vfio_config_do_rw() to
vfio_pci_config_rw_single() and export it.
- Use zdev->maxstbl instead of ZPCI_BOUNDARY_SIZE.
- Add comment that zPCI must not use MIO instructions for config space
access.
- Rework patch descriptions.
- Update license info.
- Link to v1: https://lore.kernel.org/r/20260212-vfio_pci_ism-v1-0-333262ade074@linux.ibm.com
---
Julian Ruess (3):
vfio/pci: Rename vfio_config_do_rw() to vfio_pci_config_rw_single() and export it
vfio/ism: Implement vfio_pci driver for ISM devices
MAINTAINERS: add VFIO ISM PCI DRIVER section
MAINTAINERS | 6 +
drivers/vfio/pci/Kconfig | 2 +
drivers/vfio/pci/Makefile | 2 +
drivers/vfio/pci/ism/Kconfig | 10 ++
drivers/vfio/pci/ism/Makefile | 3 +
drivers/vfio/pci/ism/main.c | 345 +++++++++++++++++++++++++++++++++++++
drivers/vfio/pci/vfio_pci_config.c | 8 +-
drivers/vfio/pci/vfio_pci_priv.h | 4 +
8 files changed, 377 insertions(+), 3 deletions(-)
---
base-commit: 0257f64bdac7fdca30fa3cae0df8b9ecbec7733a
change-id: 20250227-vfio_pci_ism-0ccc2e472247
Best regards,
--
Julian Ruess <julianr@linux.ibm.com>
* [PATCH v4 1/3] vfio/pci: Rename vfio_config_do_rw() to vfio_pci_config_rw_single() and export it
2026-03-13 14:40 [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Julian Ruess
@ 2026-03-13 14:40 ` Julian Ruess
2026-03-13 14:40 ` [PATCH v4 2/3] vfio/ism: Implement vfio_pci driver for ISM devices Julian Ruess
` (2 subsequent siblings)
3 siblings, 0 replies; 8+ messages in thread
From: Julian Ruess @ 2026-03-13 14:40 UTC
To: schnelle, wintera, ts, oberpar, gbayer, Alex Williamson,
Jason Gunthorpe, Yishai Hadas, Shameer Kolothum, Kevin Tian
Cc: mjrosato, alifm, raspl, hca, agordeev, gor, julianr, kvm,
linux-kernel, linux-s390, linux-pci
A follow-up patch adds a new variant driver for s390 ISM devices. Since
this device uses a 256 TiB BAR 0 that is never mapped, the variant
driver needs its own ISM_VFIO_PCI_OFFSET_MASK. To minimally mirror the
functionality of vfio_pci_config_rw() with such a custom mask, export
vfio_config_do_rw(). To better distinguish the now-exported function
from vfio_pci_config_rw(), rename it to vfio_pci_config_rw_single(),
emphasizing that it does a single config space read or write.
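To illustrate why a 40-bit region offset field is too small for this BAR, the
following user-space sketch encodes and decodes region offsets with both
shifts. The value 40 for the generic vfio-pci shift and all helper names are
assumptions made for this example; only the shift of 48 is taken from this
series.

```c
#include <assert.h>
#include <stdint.h>

#define GENERIC_OFFSET_SHIFT 40 /* assumed generic vfio-pci value */
#define ISM_OFFSET_SHIFT     48 /* ISM_VFIO_PCI_OFFSET_SHIFT from this series */

/* Combine a region index and an offset within that region (hypothetical helper). */
uint64_t region_pos(unsigned int index, uint64_t off, unsigned int shift)
{
	return ((uint64_t)index << shift) + off;
}

/* Extract the region index from a combined file position. */
unsigned int pos_to_index(uint64_t pos, unsigned int shift)
{
	return (unsigned int)(pos >> shift);
}

/* Extract the offset within the region from a combined file position. */
uint64_t pos_to_offset(uint64_t pos, unsigned int shift)
{
	return pos & ((UINT64_C(1) << shift) - 1);
}

void region_pos_example(void)
{
	/* Last 8-byte slot of a 256 TiB (2^48 byte) BAR 0. */
	uint64_t off = (UINT64_C(1) << 48) - 8;
	uint64_t pos;

	/* With a 48-bit offset field, index and offset round-trip. */
	pos = region_pos(0, off, ISM_OFFSET_SHIFT);
	assert(pos_to_index(pos, ISM_OFFSET_SHIFT) == 0);
	assert(pos_to_offset(pos, ISM_OFFSET_SHIFT) == off);

	/* With a 40-bit field, the upper offset bits are decoded as index 255. */
	pos = region_pos(0, off, GENERIC_OFFSET_SHIFT);
	assert(pos_to_index(pos, GENERIC_OFFSET_SHIFT) == 255);
	assert(pos_to_offset(pos, GENERIC_OFFSET_SHIFT) != off);
}
```

In this model, a BAR 0 offset above 1 TiB would be misread as a different
region index under the smaller shift, which is the situation the custom mask
is meant to avoid.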
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
---
drivers/vfio/pci/vfio_pci_config.c | 8 +++++---
drivers/vfio/pci/vfio_pci_priv.h | 4 ++++
2 files changed, 9 insertions(+), 3 deletions(-)
diff --git a/drivers/vfio/pci/vfio_pci_config.c b/drivers/vfio/pci/vfio_pci_config.c
index b4e39253f98da61a5e2b6dd0089b2f6aef4b85a0..fbb47b4ddb43d42b758b16778e6e701379d7e7db 100644
--- a/drivers/vfio/pci/vfio_pci_config.c
+++ b/drivers/vfio/pci/vfio_pci_config.c
@@ -1880,8 +1880,9 @@ static size_t vfio_pci_cap_remaining_dword(struct vfio_pci_core_device *vdev,
return i;
}
-static ssize_t vfio_config_do_rw(struct vfio_pci_core_device *vdev, char __user *buf,
- size_t count, loff_t *ppos, bool iswrite)
+ssize_t vfio_pci_config_rw_single(struct vfio_pci_core_device *vdev,
+ char __user *buf, size_t count, loff_t *ppos,
+ bool iswrite)
{
struct pci_dev *pdev = vdev->pdev;
struct perm_bits *perm;
@@ -1970,6 +1971,7 @@ static ssize_t vfio_config_do_rw(struct vfio_pci_core_device *vdev, char __user
return ret;
}
+EXPORT_SYMBOL_GPL(vfio_pci_config_rw_single);
ssize_t vfio_pci_config_rw(struct vfio_pci_core_device *vdev, char __user *buf,
size_t count, loff_t *ppos, bool iswrite)
@@ -1981,7 +1983,7 @@ ssize_t vfio_pci_config_rw(struct vfio_pci_core_device *vdev, char __user *buf,
pos &= VFIO_PCI_OFFSET_MASK;
while (count) {
- ret = vfio_config_do_rw(vdev, buf, count, &pos, iswrite);
+ ret = vfio_pci_config_rw_single(vdev, buf, count, &pos, iswrite);
if (ret < 0)
return ret;
diff --git a/drivers/vfio/pci/vfio_pci_priv.h b/drivers/vfio/pci/vfio_pci_priv.h
index 27ac280f00b975989f6cbc02c11aaca01f9badf3..28a3edf65aeecfa06cd1856637cd33eec1fa3006 100644
--- a/drivers/vfio/pci/vfio_pci_priv.h
+++ b/drivers/vfio/pci/vfio_pci_priv.h
@@ -37,6 +37,10 @@ int vfio_pci_set_irqs_ioctl(struct vfio_pci_core_device *vdev, uint32_t flags,
ssize_t vfio_pci_config_rw(struct vfio_pci_core_device *vdev, char __user *buf,
size_t count, loff_t *ppos, bool iswrite);
+ssize_t vfio_pci_config_rw_single(struct vfio_pci_core_device *vdev,
+ char __user *buf, size_t count, loff_t *ppos,
+ bool iswrite);
+
ssize_t vfio_pci_bar_rw(struct vfio_pci_core_device *vdev, char __user *buf,
size_t count, loff_t *ppos, bool iswrite);
--
2.51.0
* [PATCH v4 2/3] vfio/ism: Implement vfio_pci driver for ISM devices
2026-03-13 14:40 [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Julian Ruess
2026-03-13 14:40 ` [PATCH v4 1/3] vfio/pci: Rename vfio_config_do_rw() to vfio_pci_config_rw_single() and export it Julian Ruess
@ 2026-03-13 14:40 ` Julian Ruess
2026-03-13 14:40 ` [PATCH v4 3/3] MAINTAINERS: add VFIO ISM PCI DRIVER section Julian Ruess
2026-03-13 15:41 ` [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Alex Williamson
3 siblings, 0 replies; 8+ messages in thread
From: Julian Ruess @ 2026-03-13 14:40 UTC
To: schnelle, wintera, ts, oberpar, gbayer, Alex Williamson,
Jason Gunthorpe, Yishai Hadas, Shameer Kolothum, Kevin Tian
Cc: mjrosato, alifm, raspl, hca, agordeev, gor, julianr, kvm,
linux-kernel, linux-s390, linux-pci
Add a vfio_pci variant driver for the s390-specific Internal Shared
Memory (ISM) devices used for inter-VM communication.
This enables the development of vfio-pci-based user space drivers for
ISM devices.
On s390, kernel primitives such as ioread() and iowrite() are switched
over from function-handle-based PCI load/store instructions to PCI
memory-I/O (MIO) loads/stores when these are available and not
explicitly disabled. Since MIO instructions cannot be used with ISM
devices, ensure that classic function-handle-based PCI instructions are
used instead.
The driver is still required even when MIO instructions are disabled, as
the ISM device relies on the PCI store block (PCISTB) instruction to
perform write operations.
Stores are not fragmented, so one write to the device corresponds to
exactly one PCISTB instruction. User space must ensure that it does not
write more than 4096 bytes at once to an ISM BAR, which is the maximum
payload of the PCISTB instruction.
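As a rough guide for user space, the two size checks that
ism_vfio_pci_do_io_w() applies before issuing the store can be modelled as
below. This is only a sketch: the function names, the 4 KiB page size constant
and the max_store_block parameter (standing in for zdev->maxstbl) are
assumptions of this example, and the kernel additionally bounds the access by
the BAR length.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define ISM_PAGE_SIZE 4096u /* assumption: 4 KiB pages */

/*
 * Return true if a write of count bytes at BAR offset off would pass the
 * size and 4 KiB boundary checks modelled on ism_vfio_pci_do_io_w().
 * max_store_block is a hypothetical stand-in for zdev->maxstbl.
 */
bool ism_write_fits(uint64_t off, size_t count, size_t max_store_block)
{
	/* A single PCISTB cannot carry more than the maximum store block. */
	if (count > max_store_block)
		return false;
	/* The store must not cross a 4 KiB boundary. */
	if ((off % ISM_PAGE_SIZE) + count > ISM_PAGE_SIZE)
		return false;
	return true;
}

void ism_write_fits_example(void)
{
	assert(ism_write_fits(0, 4096, 4096));    /* full block at a page start */
	assert(ism_write_fits(4088, 8, 4096));    /* ends exactly on the boundary */
	assert(!ism_write_fits(8, 4096, 4096));   /* crosses the 4 KiB boundary */
	assert(!ism_write_fits(0, 4097, 4096));   /* exceeds the maximum payload */
}
```

A user space driver could run such a check before each write and split larger
transfers itself, since the kernel side rejects them with -EINVAL instead of
fragmenting.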
Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
---
drivers/vfio/pci/Kconfig | 2 +
drivers/vfio/pci/Makefile | 2 +
drivers/vfio/pci/ism/Kconfig | 10 ++
drivers/vfio/pci/ism/Makefile | 3 +
drivers/vfio/pci/ism/main.c | 345 ++++++++++++++++++++++++++++++++++++++++++
5 files changed, 362 insertions(+)
diff --git a/drivers/vfio/pci/Kconfig b/drivers/vfio/pci/Kconfig
index 1e82b44bda1a0a544e1add7f4b06edecf35aaf81..296bf01e185ecacc388ebc69e92706c99e47c814 100644
--- a/drivers/vfio/pci/Kconfig
+++ b/drivers/vfio/pci/Kconfig
@@ -60,6 +60,8 @@ config VFIO_PCI_DMABUF
source "drivers/vfio/pci/mlx5/Kconfig"
+source "drivers/vfio/pci/ism/Kconfig"
+
source "drivers/vfio/pci/hisilicon/Kconfig"
source "drivers/vfio/pci/pds/Kconfig"
diff --git a/drivers/vfio/pci/Makefile b/drivers/vfio/pci/Makefile
index e0a0757dd1d2b0bc69b7e4d79441d5cacf4e1cd8..6138f1bf241df04e7419f196b404abdf9b194050 100644
--- a/drivers/vfio/pci/Makefile
+++ b/drivers/vfio/pci/Makefile
@@ -11,6 +11,8 @@ obj-$(CONFIG_VFIO_PCI) += vfio-pci.o
obj-$(CONFIG_MLX5_VFIO_PCI) += mlx5/
+obj-$(CONFIG_ISM_VFIO_PCI) += ism/
+
obj-$(CONFIG_HISI_ACC_VFIO_PCI) += hisilicon/
obj-$(CONFIG_PDS_VFIO_PCI) += pds/
diff --git a/drivers/vfio/pci/ism/Kconfig b/drivers/vfio/pci/ism/Kconfig
new file mode 100644
index 0000000000000000000000000000000000000000..02f47d25fed2d34c732b67b3a3655b64a7625467
--- /dev/null
+++ b/drivers/vfio/pci/ism/Kconfig
@@ -0,0 +1,10 @@
+# SPDX-License-Identifier: GPL-2.0
+config ISM_VFIO_PCI
+ tristate "VFIO support for ISM devices"
+ depends on S390
+ select VFIO_PCI_CORE
+ help
+ This provides user space support for IBM Internal Shared Memory (ISM)
+ Adapter devices using the VFIO framework.
+
+ If you don't know what to do here, say N.
diff --git a/drivers/vfio/pci/ism/Makefile b/drivers/vfio/pci/ism/Makefile
new file mode 100644
index 0000000000000000000000000000000000000000..32cc3c66dd11395da85a2b6f05b3d97036ed8a35
--- /dev/null
+++ b/drivers/vfio/pci/ism/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+obj-$(CONFIG_ISM_VFIO_PCI) += ism-vfio-pci.o
+ism-vfio-pci-y := main.o
diff --git a/drivers/vfio/pci/ism/main.c b/drivers/vfio/pci/ism/main.c
new file mode 100644
index 0000000000000000000000000000000000000000..34951ace4f041255d9b46cc9c5af66b181e72dfd
--- /dev/null
+++ b/drivers/vfio/pci/ism/main.c
@@ -0,0 +1,345 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * vfio-ISM driver for s390
+ *
+ * Copyright IBM Corp.
+ */
+
+#include "../vfio_pci_priv.h"
+#include <linux/slab.h>
+
+#define ISM_VFIO_PCI_OFFSET_SHIFT 48
+#define ISM_VFIO_PCI_OFFSET_TO_INDEX(off) ((off) >> ISM_VFIO_PCI_OFFSET_SHIFT)
+#define ISM_VFIO_PCI_INDEX_TO_OFFSET(index) ((u64)(index) << ISM_VFIO_PCI_OFFSET_SHIFT)
+#define ISM_VFIO_PCI_OFFSET_MASK (((u64)(1) << ISM_VFIO_PCI_OFFSET_SHIFT) - 1)
+
+struct ism_vfio_pci_core_device {
+ struct vfio_pci_core_device core_device;
+ struct kmem_cache *store_block_cache;
+};
+
+static int ism_pci_open_device(struct vfio_device *core_vdev)
+{
+ struct ism_vfio_pci_core_device *ivdev;
+ struct vfio_pci_core_device *vdev;
+ int ret;
+
+ ivdev = container_of(core_vdev, struct ism_vfio_pci_core_device,
+ core_device.vdev);
+ vdev = &ivdev->core_device;
+
+ ret = vfio_pci_core_enable(vdev);
+ if (ret)
+ return ret;
+
+ vfio_pci_core_finish_enable(vdev);
+ return 0;
+}
+
+/*
+ * ism_vfio_pci_do_io_r()
+ *
+ * On s390, kernel primitives such as ioread() and iowrite() are switched over
+ * from function-handle-based PCI load/store instructions to PCI memory-I/O (MIO)
+ * loads/stores when these are available and not explicitly disabled. Since MIO
+ * instructions cannot be used with ISM devices, ensure that classic function
+ * handle-based PCI instructions are used instead. The z/Architecture and ISM
+ * allow reads smaller than 8 bytes. Restrict reads to 8 bytes for code simplicity.
+ */
+static ssize_t ism_vfio_pci_do_io_r(struct vfio_pci_core_device *vdev,
+ char __user *buf, loff_t off, size_t count,
+ int bar)
+{
+ struct zpci_dev *zdev = to_zpci(vdev->pdev);
+ ssize_t ret, done = 0;
+ u64 req, val;
+
+ if (!IS_ALIGNED(off, sizeof(val)) || !IS_ALIGNED(count, sizeof(val)))
+ return -EINVAL;
+
+ while (count) {
+ req = ZPCI_CREATE_REQ(READ_ONCE(zdev->fh), bar, sizeof(val));
+ /*
+ * Use __zpci_load() to bypass automatic use of PCI MIO instructions
+ * which are not supported on ISM devices
+ */
+ ret = __zpci_load(&val, req, off);
+ if (ret)
+ return ret;
+ if (copy_to_user(buf, &val, sizeof(val)))
+ return -EFAULT;
+ count -= sizeof(val);
+ done += sizeof(val);
+ off += sizeof(val);
+ buf += sizeof(val);
+ }
+ return done;
+}
+
+/*
+ * ism_vfio_pci_do_io_w()
+ *
+ * Ensure that the PCI store block (PCISTB) instruction is used as required by the
+ * ISM device. The ISM device also uses a 256 TiB BAR 0 for write operations,
+ * which requires a 48bit region address space (ISM_VFIO_PCI_OFFSET_SHIFT).
+ */
+static ssize_t ism_vfio_pci_do_io_w(struct vfio_pci_core_device *vdev,
+ char __user *buf, loff_t off, size_t count,
+ int bar)
+{
+ struct zpci_dev *zdev = to_zpci(vdev->pdev);
+ struct ism_vfio_pci_core_device *ivpcd;
+ ssize_t ret;
+ void *data;
+ u64 req;
+
+ if (count > zdev->maxstbl)
+ return -EINVAL;
+ if (((off % PAGE_SIZE) + count) > PAGE_SIZE)
+ return -EINVAL;
+
+ ivpcd = container_of(vdev, struct ism_vfio_pci_core_device,
+ core_device);
+ data = kmem_cache_zalloc(ivpcd->store_block_cache, GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+
+ if (copy_from_user(data, buf, count)) {
+ ret = -EFAULT;
+ goto out_free;
+ }
+
+ req = ZPCI_CREATE_REQ(READ_ONCE(zdev->fh), bar, count);
+ ret = __zpci_store_block(data, req, off);
+ if (ret)
+ goto out_free;
+
+ ret = count;
+
+out_free:
+ kmem_cache_free(ivpcd->store_block_cache, data);
+ return ret;
+}
+
+static ssize_t ism_vfio_pci_bar_rw(struct vfio_pci_core_device *vdev,
+ char __user *buf, size_t count, loff_t *ppos,
+ bool iswrite)
+{
+ int bar = ISM_VFIO_PCI_OFFSET_TO_INDEX(*ppos);
+ loff_t pos = *ppos & ISM_VFIO_PCI_OFFSET_MASK;
+ resource_size_t end;
+ ssize_t done = 0;
+
+ if (pci_resource_start(vdev->pdev, bar))
+ end = pci_resource_len(vdev->pdev, bar);
+ else
+ return -EINVAL;
+
+ if (pos >= end)
+ return -EINVAL;
+
+ count = min(count, (size_t)(end - pos));
+
+ if (iswrite)
+ done = ism_vfio_pci_do_io_w(vdev, buf, pos, count, bar);
+ else
+ done = ism_vfio_pci_do_io_r(vdev, buf, pos, count, bar);
+
+ if (done >= 0)
+ *ppos += done;
+
+ return done;
+}
+
+static ssize_t ism_vfio_pci_config_rw(struct vfio_pci_core_device *vdev,
+ char __user *buf, size_t count,
+ loff_t *ppos, bool iswrite)
+{
+ loff_t pos = *ppos;
+ size_t done = 0;
+ int ret = 0;
+
+ pos &= ISM_VFIO_PCI_OFFSET_MASK;
+
+ while (count) {
+ /*
+ * zPCI must not use MIO instructions for config space access,
+ * so we can use common code path here.
+ */
+ ret = vfio_pci_config_rw_single(vdev, buf, count, &pos, iswrite);
+ if (ret < 0)
+ return ret;
+
+ count -= ret;
+ done += ret;
+ buf += ret;
+ pos += ret;
+ }
+
+ *ppos += done;
+
+ return done;
+}
+
+static ssize_t ism_vfio_pci_rw(struct vfio_device *core_vdev, char __user *buf,
+ size_t count, loff_t *ppos, bool iswrite)
+{
+ unsigned int index = ISM_VFIO_PCI_OFFSET_TO_INDEX(*ppos);
+ struct vfio_pci_core_device *vdev;
+ int ret;
+
+ vdev = container_of(core_vdev, struct vfio_pci_core_device, vdev);
+
+ if (!count)
+ return 0;
+
+ switch (index) {
+ case VFIO_PCI_CONFIG_REGION_INDEX:
+ ret = ism_vfio_pci_config_rw(vdev, buf, count, ppos, iswrite);
+ break;
+
+ case VFIO_PCI_BAR0_REGION_INDEX ... VFIO_PCI_BAR5_REGION_INDEX:
+ ret = ism_vfio_pci_bar_rw(vdev, buf, count, ppos, iswrite);
+ break;
+
+ default:
+ return -EINVAL;
+ }
+
+ return ret;
+}
+
+static ssize_t ism_vfio_pci_read(struct vfio_device *core_vdev,
+ char __user *buf, size_t count, loff_t *ppos)
+{
+ return ism_vfio_pci_rw(core_vdev, buf, count, ppos, false);
+}
+
+static ssize_t ism_vfio_pci_write(struct vfio_device *core_vdev,
+ const char __user *buf, size_t count,
+ loff_t *ppos)
+{
+ return ism_vfio_pci_rw(core_vdev, (char __user *)buf, count, ppos,
+ true);
+}
+
+static int ism_vfio_pci_ioctl_get_region_info(struct vfio_device *core_vdev,
+ struct vfio_region_info *info,
+ struct vfio_info_cap *caps)
+{
+ struct vfio_pci_core_device *vdev =
+ container_of(core_vdev, struct vfio_pci_core_device, vdev);
+ struct pci_dev *pdev = vdev->pdev;
+
+ switch (info->index) {
+ case VFIO_PCI_CONFIG_REGION_INDEX:
+ info->offset = ISM_VFIO_PCI_INDEX_TO_OFFSET(info->index);
+ info->size = pdev->cfg_size;
+ info->flags = VFIO_REGION_INFO_FLAG_READ |
+ VFIO_REGION_INFO_FLAG_WRITE;
+ break;
+ case VFIO_PCI_BAR0_REGION_INDEX ... VFIO_PCI_BAR5_REGION_INDEX:
+ info->offset = ISM_VFIO_PCI_INDEX_TO_OFFSET(info->index);
+ info->size = pci_resource_len(pdev, info->index);
+ if (!info->size) {
+ info->flags = 0;
+ break;
+ }
+ info->flags = VFIO_REGION_INFO_FLAG_READ |
+ VFIO_REGION_INFO_FLAG_WRITE;
+ break;
+ default:
+ info->offset = 0;
+ info->size = 0;
+ info->flags = 0;
+ }
+ return 0;
+}
+
+static const struct vfio_device_ops ism_pci_ops = {
+ .name = "ism-vfio-pci",
+ .init = vfio_pci_core_init_dev,
+ .release = vfio_pci_core_release_dev,
+ .open_device = ism_pci_open_device,
+ .close_device = vfio_pci_core_close_device,
+ .ioctl = vfio_pci_core_ioctl,
+ .get_region_info_caps = ism_vfio_pci_ioctl_get_region_info,
+ .device_feature = vfio_pci_core_ioctl_feature,
+ .read = ism_vfio_pci_read,
+ .write = ism_vfio_pci_write,
+ .request = vfio_pci_core_request,
+ .match = vfio_pci_core_match,
+ .match_token_uuid = vfio_pci_core_match_token_uuid,
+ .bind_iommufd = vfio_iommufd_physical_bind,
+ .unbind_iommufd = vfio_iommufd_physical_unbind,
+ .attach_ioas = vfio_iommufd_physical_attach_ioas,
+ .detach_ioas = vfio_iommufd_physical_detach_ioas,
+};
+
+static int ism_vfio_pci_probe(struct pci_dev *pdev,
+ const struct pci_device_id *id)
+{
+ struct ism_vfio_pci_core_device *ivpcd;
+ struct zpci_dev *zdev = to_zpci(pdev);
+ char cache_name[20];
+ int ret;
+
+ ivpcd = vfio_alloc_device(ism_vfio_pci_core_device, core_device.vdev,
+ &pdev->dev, &ism_pci_ops);
+ if (IS_ERR(ivpcd))
+ return PTR_ERR(ivpcd);
+
+ snprintf(cache_name, sizeof(cache_name), "ism_sb_fid_%08x", zdev->fid);
+ ivpcd->store_block_cache =
+ kmem_cache_create(cache_name, zdev->maxstbl, 0, 0, NULL);
+ if (!ivpcd->store_block_cache) {
+ vfio_put_device(&ivpcd->core_device.vdev);
+ return -ENOMEM;
+ }
+
+ dev_set_drvdata(&pdev->dev, &ivpcd->core_device);
+ ret = vfio_pci_core_register_device(&ivpcd->core_device);
+ if (ret) {
+ kmem_cache_destroy(ivpcd->store_block_cache);
+ vfio_put_device(&ivpcd->core_device.vdev);
+ }
+
+ return ret;
+}
+
+static void ism_vfio_pci_remove(struct pci_dev *pdev)
+{
+ struct vfio_pci_core_device *core_device;
+ struct ism_vfio_pci_core_device *ivpcd;
+
+ core_device = dev_get_drvdata(&pdev->dev);
+ ivpcd = container_of(core_device, struct ism_vfio_pci_core_device,
+ core_device);
+
+ vfio_pci_core_unregister_device(&ivpcd->core_device);
+ /* Destroy the cache before the final put, which may free ivpcd. */
+ kmem_cache_destroy(ivpcd->store_block_cache);
+ vfio_put_device(&ivpcd->core_device.vdev);
+}
+
+static const struct pci_device_id ism_device_table[] = {
+ { PCI_DRIVER_OVERRIDE_DEVICE_VFIO(PCI_VENDOR_ID_IBM,
+ PCI_DEVICE_ID_IBM_ISM) },
+ {}
+};
+MODULE_DEVICE_TABLE(pci, ism_device_table);
+
+static struct pci_driver ism_vfio_pci_driver = {
+ .name = KBUILD_MODNAME,
+ .id_table = ism_device_table,
+ .probe = ism_vfio_pci_probe,
+ .remove = ism_vfio_pci_remove,
+ .err_handler = &vfio_pci_core_err_handlers,
+ .driver_managed_dma = true,
+};
+
+module_pci_driver(ism_vfio_pci_driver);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("vfio-pci variant driver for the IBM Internal Shared Memory (ISM) device");
+MODULE_AUTHOR("IBM Corporation");
--
2.51.0
* [PATCH v4 3/3] MAINTAINERS: add VFIO ISM PCI DRIVER section
2026-03-13 14:40 [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Julian Ruess
2026-03-13 14:40 ` [PATCH v4 1/3] vfio/pci: Rename vfio_config_do_rw() to vfio_pci_config_rw_single() and export it Julian Ruess
2026-03-13 14:40 ` [PATCH v4 2/3] vfio/ism: Implement vfio_pci driver for ISM devices Julian Ruess
@ 2026-03-13 14:40 ` Julian Ruess
2026-03-13 15:41 ` [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Alex Williamson
3 siblings, 0 replies; 8+ messages in thread
From: Julian Ruess @ 2026-03-13 14:40 UTC
To: schnelle, wintera, ts, oberpar, gbayer, Alex Williamson,
Jason Gunthorpe, Yishai Hadas, Shameer Kolothum, Kevin Tian
Cc: mjrosato, alifm, raspl, hca, agordeev, gor, julianr, kvm,
linux-kernel, linux-s390, linux-pci
ism_vfio_pci is a new kernel component that allows
the ISM device to be used from user space. Add myself
as a maintainer.
Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
---
MAINTAINERS | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/MAINTAINERS b/MAINTAINERS
index bc3bcc641663343e9e49281a1b409c84d383583f..1ea2aab4a9bf1294b0f6a1a8c4cdb74188813816 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -27703,6 +27703,12 @@ L: kvm@vger.kernel.org
S: Maintained
F: drivers/vfio/pci/hisilicon/
+VFIO ISM PCI DRIVER
+M: Julian Ruess <julianr@linux.ibm.com>
+L: kvm@vger.kernel.org
+S: Maintained
+F: drivers/vfio/pci/ism/
+
VFIO MEDIATED DEVICE DRIVERS
M: Kirti Wankhede <kwankhede@nvidia.com>
L: kvm@vger.kernel.org
--
2.51.0
* Re: [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices
2026-03-13 14:40 [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Julian Ruess
` (2 preceding siblings ...)
2026-03-13 14:40 ` [PATCH v4 3/3] MAINTAINERS: add VFIO ISM PCI DRIVER section Julian Ruess
@ 2026-03-13 15:41 ` Alex Williamson
2026-03-16 12:33 ` Julian Ruess
3 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2026-03-13 15:41 UTC
To: Julian Ruess
Cc: schnelle, wintera, ts, oberpar, gbayer, Jason Gunthorpe,
Yishai Hadas, Shameer Kolothum, Kevin Tian, mjrosato, alifm,
raspl, hca, agordeev, gor, kvm, linux-kernel, linux-s390,
linux-pci, alex
On Fri, 13 Mar 2026 15:40:27 +0100
Julian Ruess <julianr@linux.ibm.com> wrote:
> Hi all,
>
> This series adds a vfio_pci variant driver for the s390-specific
> Internal Shared Memory (ISM) devices used for inter-VM communication
> including SMC-D.
>
> This is a prerequisite for an in-development open-source user space
> driver stack that will allow to use ISM devices to provide remote
> console and block device functionality. This stack will be part of
> s390-tools.
>
> This driver would also allow QEMU to mediate access to an ISM device,
> enabling a form of PCI pass-through even for guests whose hardware
> cannot directly execute PCI accesses, such as nested guests.
>
> On s390, kernel primitives such as ioread() and iowrite() are switched
> over from function handle based PCI load/stores instructions to PCI
> memory-I/O (MIO) loads/stores when these are available and not
> explicitly disabled. Since these instructions cannot be used with ISM
> devices, ensure that classic function handle-based PCI instructions are
> used instead.
>
> The driver is still required even when MIO instructions are disabled, as
> the ISM device relies on the PCI store‑block (PCISTB) instruction to
> perform write operations.
>
> Thank you,
> Julian
>
> Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
> ---
> Changes in v4:
> - Fix bug with < 8 byte reads. For code simplicity, only support 8 byte reads.
Does the ISM device define sub-8-byte accesses as valid? It looks like
if pread() doesn't return the desired size QEMU will fill the return
with -1. Unless such accesses are classified as undefined by ISM,
doesn't that suggest a potential data corruption issue to the guest
driver? Thanks,
Alex
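As a rough model of the behaviour in question, the alignment check in
ism_vfio_pci_do_io_r() from patch 2/3 can be written as the user-space
predicate below. The function names are made up for this sketch; the kernel
code returns -EINVAL where this returns false.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Model of the v4 read path check: both the BAR offset and the byte count
 * must be multiples of 8. Hypothetical helper, not part of the series.
 */
bool ism_read_accepted(uint64_t off, size_t count)
{
	return (off % sizeof(uint64_t)) == 0 && (count % sizeof(uint64_t)) == 0;
}

void ism_read_accepted_example(void)
{
	assert(ism_read_accepted(0, 8));    /* one aligned 8-byte read */
	assert(ism_read_accepted(16, 24));  /* several aligned 8-byte reads */
	assert(!ism_read_accepted(0, 1));   /* single-byte access */
	assert(!ism_read_accepted(0, 4));   /* 4-byte access */
	assert(!ism_read_accepted(4, 8));   /* misaligned offset */
}
```

Under this model a 1-, 2- or 4-byte pread() on a BAR region fails outright
rather than returning partial data, which is the case a VMM issuing narrow
accesses would run into.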
* Re: [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices
2026-03-13 15:41 ` [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Alex Williamson
@ 2026-03-16 12:33 ` Julian Ruess
2026-03-16 19:03 ` Alex Williamson
0 siblings, 1 reply; 8+ messages in thread
From: Julian Ruess @ 2026-03-16 12:33 UTC
To: Alex Williamson, Julian Ruess
Cc: schnelle, wintera, ts, oberpar, gbayer, Jason Gunthorpe,
Yishai Hadas, Shameer Kolothum, Kevin Tian, mjrosato, alifm,
raspl, hca, agordeev, gor, kvm, linux-kernel, linux-s390,
linux-pci
On Fri Mar 13, 2026 at 4:41 PM CET, Alex Williamson wrote:
> On Fri, 13 Mar 2026 15:40:27 +0100
> Julian Ruess <julianr@linux.ibm.com> wrote:
>
>> Hi all,
>>
>> This series adds a vfio_pci variant driver for the s390-specific
>> Internal Shared Memory (ISM) devices used for inter-VM communication
>> including SMC-D.
>>
>> This is a prerequisite for an in-development open-source user space
>> driver stack that will allow to use ISM devices to provide remote
>> console and block device functionality. This stack will be part of
>> s390-tools.
>>
>> This driver would also allow QEMU to mediate access to an ISM device,
>> enabling a form of PCI pass-through even for guests whose hardware
>> cannot directly execute PCI accesses, such as nested guests.
>>
>> On s390, kernel primitives such as ioread() and iowrite() are switched
>> over from function handle based PCI load/stores instructions to PCI
>> memory-I/O (MIO) loads/stores when these are available and not
>> explicitly disabled. Since these instructions cannot be used with ISM
>> devices, ensure that classic function handle-based PCI instructions are
>> used instead.
>>
>> The driver is still required even when MIO instructions are disabled, as
>> the ISM device relies on the PCI store‑block (PCISTB) instruction to
>> perform write operations.
>>
>> Thank you,
>> Julian
>>
>> Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
>> ---
>> Changes in v4:
>> - Fix bug with < 8 byte reads. For code simplicity, only support 8 byte reads.
>
> Does the ISM device define sub-8-byte accesses as valid? It looks like
> if pread() doesn't return the desired size QEMU will fill the return
> with -1. Unless such accesses are classified as undefined by ISM,
> doesn't that suggest a potential data corruption issue to the guest
> driver? Thanks,
>
> Alex
Hi Alex,
thanks for the quick feedback!
We are currently developing this extension for a non-QEMU vfio user space
driver. Reads smaller than 8 bytes are theoretically valid, but at the moment
they are used neither by this driver nor by the existing in-kernel driver. We
could extend this in the future if needed.
vfio-pci based PCI pass-through of the ISM device is already possible without
this extension. In that case, the ISM driver in the guest kernel accesses the
BARs directly through hardware virtualization, without using the new access
routines provided by this variant driver.
Julian
* Re: [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices
2026-03-16 12:33 ` Julian Ruess
@ 2026-03-16 19:03 ` Alex Williamson
2026-03-17 10:01 ` Niklas Schnelle
0 siblings, 1 reply; 8+ messages in thread
From: Alex Williamson @ 2026-03-16 19:03 UTC
To: Julian Ruess
Cc: schnelle, wintera, ts, oberpar, gbayer, Jason Gunthorpe,
Yishai Hadas, Shameer Kolothum, Kevin Tian, mjrosato, alifm,
raspl, hca, agordeev, gor, kvm, linux-kernel, linux-s390,
linux-pci, alex
On Mon, 16 Mar 2026 13:33:04 +0100
"Julian Ruess" <julianr@linux.ibm.com> wrote:
> On Fri Mar 13, 2026 at 4:41 PM CET, Alex Williamson wrote:
> > On Fri, 13 Mar 2026 15:40:27 +0100
> > Julian Ruess <julianr@linux.ibm.com> wrote:
> >
> >> Hi all,
> >>
> >> This series adds a vfio_pci variant driver for the s390-specific
> >> Internal Shared Memory (ISM) devices used for inter-VM communication
> >> including SMC-D.
> >>
> >> This is a prerequisite for an in-development open-source user space
> >> driver stack that will allow to use ISM devices to provide remote
> >> console and block device functionality. This stack will be part of
> >> s390-tools.
> >>
> >> This driver would also allow QEMU to mediate access to an ISM device,
> >> enabling a form of PCI pass-through even for guests whose hardware
> >> cannot directly execute PCI accesses, such as nested guests.
> >>
> >> On s390, kernel primitives such as ioread() and iowrite() are switched
> >> over from function handle based PCI load/stores instructions to PCI
> >> memory-I/O (MIO) loads/stores when these are available and not
> >> explicitly disabled. Since these instructions cannot be used with ISM
> >> devices, ensure that classic function handle-based PCI instructions are
> >> used instead.
> >>
> >> The driver is still required even when MIO instructions are disabled, as
> >> the ISM device relies on the PCI store‑block (PCISTB) instruction to
> >> perform write operations.
> >>
> >> Thank you,
> >> Julian
> >>
> >> Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
> >> ---
> >> Changes in v4:
> >> - Fix bug with < 8 byte reads. For code simplicity, only support 8 byte reads.
> >
> > Does the ISM device define sub-8-byte accesses as valid? It looks like
> > if pread() doesn't return the desired size QEMU will fill the return
> > with -1. Unless such accesses are classified as undefined by ISM,
> > doesn't that suggest a potential data corruption issue to the guest
> > driver? Thanks,
> >
> > Alex
>
> Hi Alex,
>
> thanks for the quick feedback!
>
> We are currently developing this extension for a non‑QEMU vfio user space
> driver. Reads smaller than 8 bytes are theoretically valid, but they are not
> used by this driver nor the existing in-kernel driver at the moment. We could
> extend this in the future if needed.
>
> vfio‑pci based PCI pass-through of the ISM device is already possible without
> this extension. In that case, the ISM driver in the guest kernel accesses the
> BARs directly through hardware virtualization, without using the new access
> routines provided by this variant driver.
Hi Julian,
The cover letter argues a secondary use case with QEMU, especially in a
nested environment. The ISM range appears to be an interface to a
variety of device types, console and block are noted. It's also noted
in the implementation that the z/Architecture allows sub-8-byte access.
I think we need to be cautious that the existence of this driver makes
it available for use with QEMU and other VMMs. In the case of QEMU,
vfio_region_ops will allow single-byte access by default.
The restricted access width is positioned as a simplification here, but
it needs to be evaluated against all the use cases. Unless we're 100%
sure none of those use cases rely on sub-8-byte accesses, we might be
setting ourselves up for hacks later to fix or detect partial access
support.
I'll leave it to IBM folks to determine if this is indeed a
simplification for long term support of all use cases and not a short
term fix for the short term use case. Thanks,
Alex
* Re: [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices
2026-03-16 19:03 ` Alex Williamson
@ 2026-03-17 10:01 ` Niklas Schnelle
0 siblings, 0 replies; 8+ messages in thread
From: Niklas Schnelle @ 2026-03-17 10:01 UTC (permalink / raw)
To: Alex Williamson, Julian Ruess
Cc: wintera, ts, oberpar, gbayer, Jason Gunthorpe, Yishai Hadas,
Shameer Kolothum, Kevin Tian, mjrosato, alifm, raspl, hca,
agordeev, gor, kvm, linux-kernel, linux-s390, linux-pci
On Mon, 2026-03-16 at 13:03 -0600, Alex Williamson wrote:
> On Mon, 16 Mar 2026 13:33:04 +0100
> "Julian Ruess" <julianr@linux.ibm.com> wrote:
>
> > On Fri Mar 13, 2026 at 4:41 PM CET, Alex Williamson wrote:
> > > On Fri, 13 Mar 2026 15:40:27 +0100
> > > Julian Ruess <julianr@linux.ibm.com> wrote:
> > >
--- snip ---
> >
> > Hi Alex,
> >
> > thanks for the quick feedback!
> >
> > We are currently developing this extension for a non‑QEMU vfio user space
> > driver. Reads smaller than 8 bytes are theoretically valid, but they are not
> > used by this driver nor the existing in-kernel driver at the moment. We could
> > extend this in the future if needed.
> >
> > vfio‑pci based PCI pass-through of the ISM device is already possible without
> > this extension. In that case, the ISM driver in the guest kernel accesses the
> > BARs directly through hardware virtualization, without using the new access
> > routines provided by this variant driver.
>
> Hi Julian,
>
> The cover letter argues a secondary use case with QEMU, especially in a
> nested environment. The ISM range appears to be an interface to a
> variety of device types, console and block are noted. It's also noted
> in the implementation that the z/Architecture allows sub-8-byte access.
>
> I think we need to be cautious that the existence of this driver makes
> it available for use with QEMU and other VMMs. In the case of QEMU
> vfio_region_ops will allow single-byte access by default.
>
> The restricted access width is positioned as a simplification here, but
> it needs to be evaluated against all the use cases. Unless we're 100%
> sure none of those use cases rely on sub-8-byte accesses, we might be
> setting ourselves up for hacks later to fix or detect partial access
> support.
>
> I'll leave it to IBM folks to determine if this is indeed a
> simplification for long term support of all use cases and not a short
> term fix for the short term use case. Thanks,
>
> Alex

Hi Alex,

Thank you for the insights and expertise; they are highly appreciated. Your
reasoning makes sense to me, and I agree that this simplification looks like
it may be OK for now but could cause more trouble than it is worth
later. There's really no reason not to just support < 8 byte
accesses too.
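
For illustration only, and not taken from the patch: one way sub-8-byte reads could be served on top of an 8-byte-only primitive is to read the enclosing aligned 8-byte chunk and copy out just the requested bytes. The names `read8_fn`, `mem_read8` and `read_any_width` are hypothetical, and the sketch assumes that reading the surrounding bytes has no side effects on the device, which may not hold for every ISM register.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for an 8-byte-only device read primitive. */
typedef int (*read8_fn)(void *ctx, uint64_t aligned_off, uint8_t out[8]);

/* Memory-backed mock of the primitive; ctx points at a plain byte buffer. */
static int mem_read8(void *ctx, uint64_t aligned_off, uint8_t out[8])
{
	memcpy(out, (const uint8_t *)ctx + aligned_off, 8);
	return 0;
}

/*
 * Serve an arbitrary (off, len) read by issuing aligned 8-byte reads and
 * copying only the requested bytes into dst. Returns 0 on success or the
 * first non-zero code returned by the primitive.
 */
static int read_any_width(read8_fn read8, void *ctx, uint64_t off,
			  uint8_t *dst, size_t len)
{
	assert(read8 != NULL && dst != NULL);

	while (len > 0) {
		uint8_t chunk[8];
		uint64_t base = off & ~(uint64_t)7;	/* round down to 8 */
		size_t skip = (size_t)(off - base);	/* bytes before off */
		size_t n = 8 - skip;			/* usable bytes in chunk */
		int rc;

		if (n > len)
			n = len;

		rc = read8(ctx, base, chunk);
		if (rc)
			return rc;

		memcpy(dst, chunk + skip, n);
		dst += n;
		off += n;
		len -= n;
	}
	return 0;
}
```

Working on a byte buffer rather than a `uint64_t` value keeps the extraction independent of host endianness. A request that straddles an 8-byte boundary simply turns into two aligned reads.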

One thing I'd like to explain, though, since you mention ISM potentially
being an interface to different device types: this part of the cover
letter is easy to misunderstand, since we haven't yet sent out patches
opening up that direction. There is only one ISM device, and even future
versions would likely just iterate but serve the same purpose. The
multiple device types, including consoles and block devices, will not
replace this ISM device but rather sit on top of a virtual bus called
the ISM Peer Bus, where the ISM device will serve as a communication
channel between two Linux instances connected via an ISM device.

Specifically, in the current design there will be a vfio-pci based
user-space daemon/driver on one Linux instance that provides virtual
consoles and block devices for other Linux instances on the other side
of ISM based communication channels. Those Linux instances will use
kernel integrated drivers for the console and block devices. Currently
there are no plans for passing these devices through to guests, since we
can also just pass through an ISM device and, ISM devices being virtual,
we're not limited in how many we can have.

Thanks,
Niklas
Thread overview: 8+ messages
2026-03-13 14:40 [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Julian Ruess
2026-03-13 14:40 ` [PATCH v4 1/3] vfio/pci: Rename vfio_config_do_rw() to vfio_pci_config_rw_single() and export it Julian Ruess
2026-03-13 14:40 ` [PATCH v4 2/3] vfio/ism: Implement vfio_pci driver for ISM devices Julian Ruess
2026-03-13 14:40 ` [PATCH v4 3/3] MAINTAINERS: add VFIO ISM PCI DRIVER section Julian Ruess
2026-03-13 15:41 ` [PATCH v4 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Alex Williamson
2026-03-16 12:33 ` Julian Ruess
2026-03-16 19:03 ` Alex Williamson
2026-03-17 10:01 ` Niklas Schnelle