public inbox for kvm@vger.kernel.org
 help / color / mirror / Atom feed
From: "Julian Ruess" <julianr@linux.ibm.com>
To: "Niklas Schnelle" <schnelle@linux.ibm.com>,
	"Julian Ruess" <julianr@linux.ibm.com>, <wintera@linux.ibm.com>,
	<ts@linux.ibm.com>, <oberpar@linux.ibm.com>,
	<gbayer@linux.ibm.com>, "Alex Williamson" <alex@shazbot.org>,
	"Jason Gunthorpe" <jgg@ziepe.ca>,
	"Yishai Hadas" <yishaih@nvidia.com>,
	"Shameer Kolothum" <skolothumtho@nvidia.com>,
	"Kevin Tian" <kevin.tian@intel.com>
Cc: <mjrosato@linux.ibm.com>, <alifm@linux.ibm.com>,
	<raspl@linux.ibm.com>, <hca@linux.ibm.com>,
	<agordeev@linux.ibm.com>, <gor@linux.ibm.com>,
	<kvm@vger.kernel.org>, <linux-kernel@vger.kernel.org>,
	<linux-s390@vger.kernel.org>, <linux-pci@vger.kernel.org>
Subject: Re: [PATCH v6 2/3] vfio/ism: Implement vfio_pci driver for ISM devices
Date: Fri, 20 Mar 2026 14:37:31 +0100	[thread overview]
Message-ID: <DH7NFCRHDMI3.1S64YUGRBH7O8@linux.ibm.com> (raw)
In-Reply-To: <18f44f67f096e0003d4083da20e2ace4c118c2f4.camel@linux.ibm.com>

On Fri Mar 20, 2026 at 1:54 PM CET, Niklas Schnelle wrote:
> On Thu, 2026-03-19 at 16:42 +0100, Julian Ruess wrote:
>> Add a vfio_pci variant driver for the s390-specific Internal Shared
>> Memory (ISM) devices used for inter-VM communication.
>> 
>> This enables the development of vfio-pci-based user space drivers for
>> ISM devices.
>> 
>> On s390, kernel primitives such as ioread() and iowrite() are switched
>> over from function handle based PCI load/stores instructions to PCI
>> memory-I/O (MIO) loads/stores when these are available and not
>> explicitly disabled. Since these instructions cannot be used with ISM
>> devices, ensure that classic function handle-based PCI instructions are
>> used instead.
>> 
>> The driver is still required even when MIO instructions are disabled, as
>> the ISM device relies on the PCI store block (PCISTB) instruction to
>> perform write operations.
>> 
>> Stores are not fragmented, therefore one ioctl corresponds to exactly
>> one PCISTB instruction. User space must ensure to not write more than
>> 4096 bytes at once to an ISM BAR which is the maximum payload of the
>> PCISTB instruction.
>> 
>> Signed-off-by: Julian Ruess <julianr@linux.ibm.com>
>> ---
>>  drivers/vfio/pci/Kconfig      |   2 +
>>  drivers/vfio/pci/Makefile     |   2 +
>>  drivers/vfio/pci/ism/Kconfig  |  10 ++
>>  drivers/vfio/pci/ism/Makefile |   3 +
>>  drivers/vfio/pci/ism/main.c   | 400 ++++++++++++++++++++++++++++++++++++++++++
>>  5 files changed, 417 insertions(+)
>> 
>
> As a general note, ISM devices do not support any low power/sleep modes
> which is why there is no PM handling done.
>
>> +
>> +/*
>> + * ism_vfio_pci_do_io_w()
>> + *
>> + * Ensure that the PCI store block (PCISTB) instruction is used as required by the
>> + * ISM device. The ISM device also uses a 256 TiB BAR 0 for write operations,
>> + * which requires a 48bit region address space (ISM_VFIO_PCI_OFFSET_SHIFT).
>> + */
>> +static ssize_t ism_vfio_pci_do_io_w(struct vfio_pci_core_device *vdev,
>> +				    char __user *buf, loff_t off, size_t count,
>> +				    int bar)
>> +{
>> +	struct zpci_dev *zdev = to_zpci(vdev->pdev);
>> +	struct ism_vfio_pci_core_device *ivpcd;
>> +	ssize_t ret;
>> +	void *data;
>> +	u64 req;
>> +
>> +	if (count > zdev->maxstbl)
>> +		return -EINVAL;
>> +	if (((off % PAGE_SIZE) + count) > PAGE_SIZE)
>> +		return -EINVAL;
>> +
>> +	ivpcd = container_of(vdev, struct ism_vfio_pci_core_device,
>> +			     core_device);
>> +	data = kmem_cache_alloc(ivpcd->store_block_cache, GFP_KERNEL);
>> +	if (!data)
>> +		return -ENOMEM;
>> +
>> +	if (copy_from_user(data, buf, count)) {
>> +		ret = -EFAULT;
>> +		goto out_free;
>> +	}
>> +
>> +	req = ZPCI_CREATE_REQ(READ_ONCE(zdev->fh), bar, count);
>> +	ret = __zpci_store_block(data, req, off);
>
> Note: There is a Sashiko finding that PCI Store Block needs 8 byte
> alignment and we don't ensure that off is 8 byte aligned here. While
> generally true the ISM device has relaxed alignment rules and this
> requirement does not apply. That said we should set PAGE_SIZE alignment
> on the kmem_cache such that data is guaranteed PAGE_SIZE aligned and
> the page crossing check works correctly.
>
>> +	if (ret)
>> +		goto out_free;
>> +
>> +	ret = count;
>> +
>> +out_free:
>> +	kmem_cache_free(ivpcd->store_block_cache, data);
>> +	return ret;
>> +}
>> +
> --- snip ---
>
>> +
>> +static int ism_vfio_pci_init_dev(struct vfio_device *core_vdev)
>> +{
>> +	struct zpci_dev *zdev = to_zpci(to_pci_dev(core_vdev->dev));
>> +	struct ism_vfio_pci_core_device *ivpcd;
>> +	char cache_name[20];
>> +	int ret;
>> +
>> +	ivpcd = container_of(core_vdev, struct ism_vfio_pci_core_device,
>> +			     core_device.vdev);
>> +
>> +	snprintf(cache_name, sizeof(cache_name), "ism_sb_fid_%08x", zdev->fid);
>> +	ivpcd->store_block_cache =
>> +		kmem_cache_create(cache_name, zdev->maxstbl, 0, 0, NULL);
>
> Sashiko notes here that the cache_name is stack allocated and says that
> kmem_cache_create() doesn't copy it. Looking at
> __kmem_cache_create_args() however this seems to be not true as that
> does krstrdup_const() on the passed-in name.

I think we should change this to:

ivpcd->store_block_cache = kmem_cache_create(
       cache_name, zdev->maxstbl, PAGE_SIZE,
       (SLAB_RECLAIM_ACCOUNT | SLAB_ACCOUNT | SLAB_PANIC), NULL);

The alignment ensures that we do not cross the integral boundary, and as
Sashiko said, memcg accounting is also useful here.

-- snip --

  reply	other threads:[~2026-03-20 13:37 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-19 15:42 [PATCH v6 0/3] vfio/pci: Introduce vfio_pci driver for ISM devices Julian Ruess
2026-03-19 15:42 ` [PATCH v6 1/3] vfio/pci: Rename vfio_config_do_rw() to vfio_pci_config_rw_single() and export it Julian Ruess
2026-03-19 15:42 ` [PATCH v6 2/3] vfio/ism: Implement vfio_pci driver for ISM devices Julian Ruess
2026-03-20 12:54   ` Niklas Schnelle
2026-03-20 13:37     ` Julian Ruess [this message]
2026-03-20 13:55       ` Niklas Schnelle
2026-03-19 15:42 ` [PATCH v6 3/3] MAINTAINERS: add VFIO ISM PCI DRIVER section Julian Ruess

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=DH7NFCRHDMI3.1S64YUGRBH7O8@linux.ibm.com \
    --to=julianr@linux.ibm.com \
    --cc=agordeev@linux.ibm.com \
    --cc=alex@shazbot.org \
    --cc=alifm@linux.ibm.com \
    --cc=gbayer@linux.ibm.com \
    --cc=gor@linux.ibm.com \
    --cc=hca@linux.ibm.com \
    --cc=jgg@ziepe.ca \
    --cc=kevin.tian@intel.com \
    --cc=kvm@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-s390@vger.kernel.org \
    --cc=mjrosato@linux.ibm.com \
    --cc=oberpar@linux.ibm.com \
    --cc=raspl@linux.ibm.com \
    --cc=schnelle@linux.ibm.com \
    --cc=skolothumtho@nvidia.com \
    --cc=ts@linux.ibm.com \
    --cc=wintera@linux.ibm.com \
    --cc=yishaih@nvidia.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox