From: "Cédric Le Goater" <clegoate@redhat.com>
To: Niklas Schnelle <schnelle@linux.ibm.com>,
Matthew Rosato <mjrosato@linux.ibm.com>,
qemu-s390x@nongnu.org
Cc: farman@linux.ibm.com, thuth@redhat.com, pasic@linux.ibm.com,
borntraeger@linux.ibm.com, richard.henderson@linaro.org,
david@redhat.com, iii@linux.ibm.com, qemu-devel@nongnu.org
Subject: Re: [PATCH v4 1/2] s390x/pci: add support for guests that request direct mapping
Date: Mon, 10 Feb 2025 14:26:58 +0100 [thread overview]
Message-ID: <61b05250-dc77-4aa9-bb0f-8394edec203b@redhat.com> (raw)
In-Reply-To: <13cd0bc678f489ff26911362570efe1aca97a642.camel@linux.ibm.com>
On 2/10/25 14:12, Niklas Schnelle wrote:
> On Fri, 2025-02-07 at 15:56 -0500, Matthew Rosato wrote:
>> When receiving a guest mpcifc(4) or mpcifc(6) instruction without the T
>> bit set, treat this as a request to perform direct mapping instead of
>> address translation. In order to facilitate this, pin the entirety of
>> guest memory into the host iommu.
>>
>> Pinning for the direct mapping case is handled via vfio and its memory
>> listener. Additionally, ram discard settings are inherited from vfio:
>> coordinated discards (e.g. virtio-mem) are allowed while uncoordinated
>> discards (e.g. virtio-balloon) are disabled.
>>
>> Subsequent guest DMA operations are all expected to be of the format
>> guest_phys+sdma, allowing them to be used as lookup into the host
>> iommu table.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>> hw/s390x/s390-pci-bus.c | 38 +++++++++++++++++++++++++++++++--
>> hw/s390x/s390-pci-inst.c | 13 +++++++++--
>> hw/s390x/s390-pci-vfio.c | 23 ++++++++++++++++----
>> hw/s390x/s390-virtio-ccw.c | 5 +++++
>> include/hw/s390x/s390-pci-bus.h | 4 ++++
>> 5 files changed, 75 insertions(+), 8 deletions(-)
>>
>>
> ---8<---
>>
>> static const VMStateDescription s390_pci_device_vmstate = {
>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>> index e386d75d58..8cdeb6cb7f 100644
>> --- a/hw/s390x/s390-pci-inst.c
>> +++ b/hw/s390x/s390-pci-inst.c
>> @@ -16,6 +16,7 @@
>> #include "exec/memory.h"
>> #include "qemu/error-report.h"
>> #include "system/hw_accel.h"
>> +#include "hw/boards.h"
>> #include "hw/pci/pci_device.h"
>> #include "hw/s390x/s390-pci-inst.h"
>> #include "hw/s390x/s390-pci-bus.h"
>> @@ -1008,17 +1009,25 @@ static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
>> }
>>
>> /* currently we only support designation type 1 with translation */
>> - if (!(dt == ZPCI_IOTA_RTTO && t)) {
>> + if (t && dt != ZPCI_IOTA_RTTO) {
>> error_report("unsupported ioat dt %d t %d", dt, t);
>> s390_program_interrupt(env, PGM_OPERAND, ra);
>> return -EINVAL;
>> + } else if (!t && !pbdev->rtr_avail) {
>> + error_report("relaxed translation not allowed");
>> + s390_program_interrupt(env, PGM_OPERAND, ra);
>> + return -EINVAL;
>> }
>>
>> iommu->pba = pba;
>> iommu->pal = pal;
>> iommu->g_iota = g_iota;
>>
>> - s390_pci_iommu_enable(iommu);
>> + if (t) {
>> + s390_pci_iommu_enable(iommu);
>> + } else {
>> + s390_pci_iommu_direct_map_enable(iommu);
>> + }
>>
>> return 0;
>> }
>> diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
>> index 7dbbc76823..443e222912 100644
>> --- a/hw/s390x/s390-pci-vfio.c
>> +++ b/hw/s390x/s390-pci-vfio.c
>> @@ -131,13 +131,28 @@ static void s390_pci_read_base(S390PCIBusDevice *pbdev,
>> /* Store function type separately for type-specific behavior */
>> pbdev->pft = cap->pft;
>>
>> + /*
>> + * If the device is a passthrough ISM device, disallow relaxed
>> + * translation.
>> + */
>> + if (pbdev->pft == ZPCI_PFT_ISM) {
>> + pbdev->rtr_avail = false;
>> + }
>
> Just a note for external readers. The ISM device does work without the
> above but having explicit guest IOMMU mappings plus the respective
> shadow mappings in the host would catch driver misbehavior more easily.
> At the same time ISM uses long lived IOMMU mappings so the cost of
> shadowing its mappings is low.
>
>> +
>> /*
>> * If appropriate, reduce the size of the supported DMA aperture reported
>> - * to the guest based upon the vfio DMA limit.
>> + * to the guest based upon the vfio DMA limit. This is applicable for
>> + * devices that are guaranteed to not use relaxed translation. If the
>> + * device is capable of relaxed translation then we must advertise the
>> + * full aperture. In this case, if translation is used then we will
>> + * rely on the vfio DMA limit counting and use RPCIT CC1 / status 16
>> + * to request that the guest free DMA mappings as necessary.
>> */
>> - vfio_size = pbdev->iommu->max_dma_limit << TARGET_PAGE_BITS;
>> - if (vfio_size > 0 && vfio_size < cap->end_dma - cap->start_dma + 1) {
>> - pbdev->zpci_fn.edma = cap->start_dma + vfio_size - 1;
>> + if (!pbdev->rtr_avail) {
>> + vfio_size = pbdev->iommu->max_dma_limit << TARGET_PAGE_BITS;
>> + if (vfio_size > 0 && vfio_size < cap->end_dma - cap->start_dma + 1) {
>> + pbdev->zpci_fn.edma = cap->start_dma + vfio_size - 1;
>> + }
>> }
>> }
>>
>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>> index d9e683c5b4..6a6cb39808 100644
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -937,8 +937,13 @@ static void ccw_machine_9_2_instance_options(MachineState *machine)
>>
>> static void ccw_machine_9_2_class_options(MachineClass *mc)
>> {
>> + static GlobalProperty compat[] = {
>> + { TYPE_S390_PCI_DEVICE, "relaxed-translation", "off", },
>> + };
>> +
>> ccw_machine_10_0_class_options(mc);
>> compat_props_add(mc->compat_props, hw_compat_9_2, hw_compat_9_2_len);
>> + compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
>> }
>> DEFINE_CCW_MACHINE(9, 2);
>>
>> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
>> index 2c43ea123f..ea9e04ec49 100644
>> --- a/include/hw/s390x/s390-pci-bus.h
>> +++ b/include/hw/s390x/s390-pci-bus.h
>> @@ -277,7 +277,9 @@ struct S390PCIIOMMU {
>> AddressSpace as;
>> MemoryRegion mr;
>> IOMMUMemoryRegion iommu_mr;
>> + MemoryRegion dm_mr;
>> bool enabled;
>> + bool direct_map;
>> uint64_t g_iota;
>> uint64_t pba;
>> uint64_t pal;
>> @@ -362,6 +364,7 @@ struct S390PCIBusDevice {
>> bool interp;
>> bool forwarding_assist;
>> bool aif;
>> + bool rtr_avail;
>> QTAILQ_ENTRY(S390PCIBusDevice) link;
>> };
>>
>> @@ -389,6 +392,7 @@ int pci_chsc_sei_nt2_have_event(void);
>> void s390_pci_sclp_configure(SCCB *sccb);
>> void s390_pci_sclp_deconfigure(SCCB *sccb);
>> void s390_pci_iommu_enable(S390PCIIOMMU *iommu);
>> +void s390_pci_iommu_direct_map_enable(S390PCIIOMMU *iommu);
>> void s390_pci_iommu_disable(S390PCIIOMMU *iommu);
>> void s390_pci_generate_error_event(uint16_t pec, uint32_t fh, uint32_t fid,
>> uint64_t faddr, uint32_t e);
>
> I'm not too familiar with the existing code or QEMU in general, but the
> changes makes sense to me. I'm assuming the braces around single
> statement bodies are accepted style in QEMU?
They are required :
https://qemu.readthedocs.io/en/v9.2.0/devel/style.html#block-structure
>
> I retested this version together with the v4 of the kernel version too.
> So feel free to add:
>
> Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
>
Thanks,
C.
next prev parent reply other threads:[~2025-02-10 13:29 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-07 20:56 [PATCH v4 0/2] s390x/pci: relax I/O address translation requirement Matthew Rosato
2025-02-07 20:56 ` [PATCH v4 1/2] s390x/pci: add support for guests that request direct mapping Matthew Rosato
2025-02-10 13:12 ` Niklas Schnelle
2025-02-10 13:26 ` Cédric Le Goater [this message]
2025-02-10 14:52 ` David Hildenbrand
2025-02-07 20:56 ` [PATCH v4 2/2] s390x/pci: indicate QEMU supports relaxed translation for passthrough Matthew Rosato
2025-02-10 13:29 ` Niklas Schnelle
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=61b05250-dc77-4aa9-bb0f-8394edec203b@redhat.com \
--to=clegoate@redhat.com \
--cc=borntraeger@linux.ibm.com \
--cc=david@redhat.com \
--cc=farman@linux.ibm.com \
--cc=iii@linux.ibm.com \
--cc=mjrosato@linux.ibm.com \
--cc=pasic@linux.ibm.com \
--cc=qemu-devel@nongnu.org \
--cc=qemu-s390x@nongnu.org \
--cc=richard.henderson@linaro.org \
--cc=schnelle@linux.ibm.com \
--cc=thuth@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).