From: Alex Williamson <alex@shazbot.org>
To: Niklas Schnelle <schnelle@linux.ibm.com>
Cc: Farhan Ali <alifm@linux.ibm.com>,
Julian Ruess <julianr@linux.ibm.com>,
wintera@linux.ibm.com, ts@linux.ibm.com, oberpar@linux.ibm.com,
gbayer@linux.ibm.com, Jason Gunthorpe <jgg@ziepe.ca>,
Yishai Hadas <yishaih@nvidia.com>,
Shameer Kolothum <skolothumtho@nvidia.com>,
Kevin Tian <kevin.tian@intel.com>,
mjrosato@linux.ibm.com, raspl@linux.ibm.com, hca@linux.ibm.com,
agordeev@linux.ibm.com, gor@linux.ibm.com, kvm@vger.kernel.org,
linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
linux-pci@vger.kernel.org, alex@shazbot.org
Subject: Re: [PATCH v8 2/3] vfio/ism: Implement vfio_pci driver for ISM devices
Date: Mon, 30 Mar 2026 09:36:46 -0600 [thread overview]
Message-ID: <20260330093646.03b0455f@shazbot.org> (raw)
In-Reply-To: <64e3158a441c79c55febead9aac956c31f034fb9.camel@linux.ibm.com>
On Fri, 27 Mar 2026 15:53:30 +0100
Niklas Schnelle <schnelle@linux.ibm.com> wrote:
> On Thu, 2026-03-26 at 12:05 -0700, Farhan Ali wrote:
> > On 3/26/2026 6:03 AM, Niklas Schnelle wrote:
> > > On Wed, 2026-03-25 at 14:31 +0100, Julian Ruess wrote:
> > > > Add a vfio_pci variant driver for the s390-specific Internal Shared
> > > > Memory (ISM) devices used for inter-VM communication.
> > > >
> > > > This enables the development of vfio-pci-based user space drivers for
> > > > ISM devices.
> > > >
> > > > On s390, kernel primitives such as ioread() and iowrite() are switched
> > > > over from function-handle-based PCI load/stores instructions to PCI
> > > > memory-I/O (MIO) loads/stores when these are available and not
> > > > explicitly disabled. Since these instructions cannot be used with ISM
> > > > devices, ensure that classic function-handle-based PCI instructions are
> > > > used instead.
> > > >
> > > > The driver is still required even when MIO instructions are disabled, as
> > > > the ISM device relies on the PCI store block (PCISTB) instruction to
> > > > perform write operations.
> > > >
> > > > Stores are not fragmented, therefore one ioctl corresponds to exactly
> > > > one PCISTB instruction. User space must ensure to not write more than
> > > > 4096 bytes at once to an ISM BAR which is the maximum payload of the
> > > > PCISTB instruction.
> > > >
> > > > Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>
> > > > Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> > > > Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
> > > > ---
> > > > drivers/vfio/pci/Kconfig | 2 +
> > > > drivers/vfio/pci/Makefile | 2 +
> > > > drivers/vfio/pci/ism/Kconfig | 10 ++
> > > > drivers/vfio/pci/ism/Makefile | 3 +
> > > > drivers/vfio/pci/ism/main.c | 408 ++++++++++++++++++++++++++++++++++++++++++
> > > > 5 files changed, 425 insertions(+)
> > > >
> > > --- snip ---
> > > > +
> > > > +static int ism_vfio_pci_ioctl_get_region_info(struct vfio_device *core_vdev,
> > > > + struct vfio_region_info *info,
> > > > + struct vfio_info_cap *caps)
> > > > +{
> > > > + struct vfio_pci_core_device *vdev =
> > > > + container_of(core_vdev, struct vfio_pci_core_device, vdev);
> > > > + struct pci_dev *pdev = vdev->pdev;
> > > > +
> > > > + switch (info->index) {
> > > > + case VFIO_PCI_CONFIG_REGION_INDEX:
> > > > + info->offset = ISM_VFIO_PCI_INDEX_TO_OFFSET(info->index);
> > > > + info->size = pdev->cfg_size;
> > > > + info->flags = VFIO_REGION_INFO_FLAG_READ |
> > > > + VFIO_REGION_INFO_FLAG_WRITE;
> > > > + break;
> > > > + case VFIO_PCI_BAR0_REGION_INDEX ... VFIO_PCI_BAR5_REGION_INDEX:
> > > > + info->offset = ISM_VFIO_PCI_INDEX_TO_OFFSET(info->index);
> > > > + info->size = pci_resource_len(pdev, info->index);
> > > > + if (!info->size) {
> > > > + info->flags = 0;
> > > > + break;
> > > > + }
> > > > + info->flags = VFIO_REGION_INFO_FLAG_READ |
> > > > + VFIO_REGION_INFO_FLAG_WRITE;
> > > > + break;
> > > > + default:
> > > > + info->offset = 0;
> > > > + info->size = 0;
> > > > + info->flags = 0;
> > > > + return -EINVAL;
> > > Thinking more about this, the above default handling actually breaks
> > > additional regions such as the one added by Farhan for the error
> > > events. I think this needs to instead call the generic
> > > vfio_pci_ioctl_get_region_info() for other regions.
> >
> > Note for error events we are using a VFIO device feature and not a
> > region. At this point we actually don't have additional regions. I do
> > agree that if we were to add a zpci specific region we would need to
> > update here and ism_vfio_pci_rw() to allow region read/write ops. This
> > ISM driver would also need to support all the possible callbacks in
> > struct vfio_pci_regops.
> >
> > Given that we don't have any additional regions yet, I am conflicted if
> > it makes sense to add all the code to support additional regions without
> > a good way to test it. Maybe its something we can defer when if we add
> > additional regions? Thanks
> >
> > Farhan
>
> @Alex, we experimented a bit and it turns out with the custom
> ISM_VFIO_PCI_OFFSET_SHIFT it becomes quite tricky to reliably call into
> common vfio-pci code for handling other regions as that again relies on
> the normal VFIO_PCI_OFFSET_SHIFT.
>
> So we came up with two options:
>
> 1) vfio-pci-ism will only support this basic set of regions. It seems
> to us that new extensions like Farhan's error events should use device
> features anyway, so this to us seems like an acceptable limitation and
> would essentially mean we take this patch as is.
>
> 2) We revisit the original idea of adjusting VFIO_PCI_OFFSET_SHIFT
> globally but do so only for CONFIG_64BIT so as not to break 32 bit
> platforms. This would simplify vfio-pci-ism and cause less code
> duplication but also affects everyone else.
>
> Either option works for us, so I'm hoping for your input.
There's risk involved with changing the default shift. The fear is
there's userspace drivers that hard code the shift. DPDK was even such
a user at one point, iirc. Maybe it's ok to break such users, maybe
there are actually no such users left and it's all FUD at this point.
Either way, I have a hard time justifying that risk for a single,
obscure S390 device.
Rather than using CONFIG_64BIT as a basis for using a different
default, should we start out with a more narrow scope of CONFIG_S390?
It's likely inevitable that we'll hit this wall in general, but maybe
S390 can be the "crash test dummy" to prove to userspace that they
really need to use the uAPI rather than hard coded values.
Farhan's series is a bit of a straw-man since it doesn't actually
convey the error codes via regions, but I can certainly agree that
using a unique region shift becomes an ongoing burden relative to
vfio-pci-core helpers. I can accept an S390 specific (for now) change
to the region spacing if that's the most sensible path. Thanks,
Alex
next prev parent reply other threads:[~2026-03-30 15:36 UTC|newest]
Thread overview: 19+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-25 13:31 [PATCH v8 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Julian Ruess
2026-03-25 13:31 ` [PATCH v8 1/3] vfio/pci: Rename vfio_config_do_rw() to vfio_pci_config_rw_single() and export it Julian Ruess
2026-03-25 13:31 ` [PATCH v8 2/3] vfio/ism: Implement vfio_pci driver for ISM devices Julian Ruess
2026-03-26 13:03 ` Niklas Schnelle
2026-03-26 19:05 ` Farhan Ali
2026-03-27 14:53 ` Niklas Schnelle
2026-03-30 15:36 ` Alex Williamson [this message]
2026-03-30 15:56 ` Jason Gunthorpe
2026-03-30 18:09 ` Alex Williamson
2026-03-30 18:16 ` Jason Gunthorpe
2026-03-30 18:39 ` Alex Williamson
2026-03-31 0:03 ` Jason Gunthorpe
2026-03-31 8:29 ` Niklas Schnelle
2026-03-31 20:44 ` Alex Williamson
2026-03-30 18:15 ` Niklas Schnelle
2026-04-01 16:28 ` Farhan Ali
2026-04-01 22:04 ` Alex Williamson
2026-04-02 9:06 ` Julian Ruess
2026-03-25 13:31 ` [PATCH v8 3/3] MAINTAINERS: add VFIO ISM PCI DRIVER section Julian Ruess
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260330093646.03b0455f@shazbot.org \
--to=alex@shazbot.org \
--cc=agordeev@linux.ibm.com \
--cc=alifm@linux.ibm.com \
--cc=gbayer@linux.ibm.com \
--cc=gor@linux.ibm.com \
--cc=hca@linux.ibm.com \
--cc=jgg@ziepe.ca \
--cc=julianr@linux.ibm.com \
--cc=kevin.tian@intel.com \
--cc=kvm@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-pci@vger.kernel.org \
--cc=linux-s390@vger.kernel.org \
--cc=mjrosato@linux.ibm.com \
--cc=oberpar@linux.ibm.com \
--cc=raspl@linux.ibm.com \
--cc=schnelle@linux.ibm.com \
--cc=skolothumtho@nvidia.com \
--cc=ts@linux.ibm.com \
--cc=wintera@linux.ibm.com \
--cc=yishaih@nvidia.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox