qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Cédric Le Goater" <clegoate@redhat.com>
To: Niklas Schnelle <schnelle@linux.ibm.com>,
	Matthew Rosato <mjrosato@linux.ibm.com>,
	qemu-s390x@nongnu.org
Cc: farman@linux.ibm.com, thuth@redhat.com, pasic@linux.ibm.com,
	borntraeger@linux.ibm.com, richard.henderson@linaro.org,
	david@redhat.com, iii@linux.ibm.com, qemu-devel@nongnu.org
Subject: Re: [PATCH v4 1/2] s390x/pci: add support for guests that request direct mapping
Date: Mon, 10 Feb 2025 14:26:58 +0100	[thread overview]
Message-ID: <61b05250-dc77-4aa9-bb0f-8394edec203b@redhat.com> (raw)
In-Reply-To: <13cd0bc678f489ff26911362570efe1aca97a642.camel@linux.ibm.com>

On 2/10/25 14:12, Niklas Schnelle wrote:
> On Fri, 2025-02-07 at 15:56 -0500, Matthew Rosato wrote:
>> When receiving a guest mpcifc(4) or mpcifc(6) instruction without the T
>> bit set, treat this as a request to perform direct mapping instead of
>> address translation.  In order to facilitate this, pin the entirety of
>> guest memory into the host iommu.
>>
>> Pinning for the direct mapping case is handled via vfio and its memory
>> listener.  Additionally, ram discard settings are inherited from vfio:
>> coordinated discards (e.g. virtio-mem) are allowed while uncoordinated
>> discards (e.g. virtio-balloon) are disabled.
>>
>> Subsequent guest DMA operations are all expected to be of the format
>> guest_phys+sdma, allowing them to be used as lookup into the host
>> iommu table.
>>
>> Signed-off-by: Matthew Rosato <mjrosato@linux.ibm.com>
>> ---
>>   hw/s390x/s390-pci-bus.c         | 38 +++++++++++++++++++++++++++++++--
>>   hw/s390x/s390-pci-inst.c        | 13 +++++++++--
>>   hw/s390x/s390-pci-vfio.c        | 23 ++++++++++++++++----
>>   hw/s390x/s390-virtio-ccw.c      |  5 +++++
>>   include/hw/s390x/s390-pci-bus.h |  4 ++++
>>   5 files changed, 75 insertions(+), 8 deletions(-)
>>
>>
> ---8<---
>>   
>>   static const VMStateDescription s390_pci_device_vmstate = {
>> diff --git a/hw/s390x/s390-pci-inst.c b/hw/s390x/s390-pci-inst.c
>> index e386d75d58..8cdeb6cb7f 100644
>> --- a/hw/s390x/s390-pci-inst.c
>> +++ b/hw/s390x/s390-pci-inst.c
>> @@ -16,6 +16,7 @@
>>   #include "exec/memory.h"
>>   #include "qemu/error-report.h"
>>   #include "system/hw_accel.h"
>> +#include "hw/boards.h"
>>   #include "hw/pci/pci_device.h"
>>   #include "hw/s390x/s390-pci-inst.h"
>>   #include "hw/s390x/s390-pci-bus.h"
>> @@ -1008,17 +1009,25 @@ static int reg_ioat(CPUS390XState *env, S390PCIBusDevice *pbdev, ZpciFib fib,
>>       }
>>   
>>       /* currently we only support designation type 1 with translation */
>> -    if (!(dt == ZPCI_IOTA_RTTO && t)) {
>> +    if (t && dt != ZPCI_IOTA_RTTO) {
>>           error_report("unsupported ioat dt %d t %d", dt, t);
>>           s390_program_interrupt(env, PGM_OPERAND, ra);
>>           return -EINVAL;
>> +    } else if (!t && !pbdev->rtr_avail) {
>> +        error_report("relaxed translation not allowed");
>> +        s390_program_interrupt(env, PGM_OPERAND, ra);
>> +        return -EINVAL;
>>       }
>>   
>>       iommu->pba = pba;
>>       iommu->pal = pal;
>>       iommu->g_iota = g_iota;
>>   
>> -    s390_pci_iommu_enable(iommu);
>> +    if (t) {
>> +        s390_pci_iommu_enable(iommu);
>> +    } else {
>> +        s390_pci_iommu_direct_map_enable(iommu);
>> +    }
>>   
>>       return 0;
>>   }
>> diff --git a/hw/s390x/s390-pci-vfio.c b/hw/s390x/s390-pci-vfio.c
>> index 7dbbc76823..443e222912 100644
>> --- a/hw/s390x/s390-pci-vfio.c
>> +++ b/hw/s390x/s390-pci-vfio.c
>> @@ -131,13 +131,28 @@ static void s390_pci_read_base(S390PCIBusDevice *pbdev,
>>       /* Store function type separately for type-specific behavior */
>>       pbdev->pft = cap->pft;
>>   
>> +    /*
>> +     * If the device is a passthrough ISM device, disallow relaxed
>> +     * translation.
>> +     */
>> +    if (pbdev->pft == ZPCI_PFT_ISM) {
>> +        pbdev->rtr_avail = false;
>> +    }
> 
> Just a note for external readers. The ISM device does work without the
> above but having explicit guest IOMMU mappings plus the respective
> shadow mappings in the host would catch driver misbehavior more easily.
> At the same time ISM uses long lived IOMMU mappings so the cost of
> shadowing its mappings is low.
> 
>> +
>>       /*
>>        * If appropriate, reduce the size of the supported DMA aperture reported
>> -     * to the guest based upon the vfio DMA limit.
>> +     * to the guest based upon the vfio DMA limit.  This is applicable for
>> +     * devices that are guaranteed to not use relaxed translation.  If the
>> +     * device is capable of relaxed translation then we must advertise the
>> +     * full aperture.  In this case, if translation is used then we will
>> +     * rely on the vfio DMA limit counting and use RPCIT CC1 / status 16
>> +     * to request that the guest free DMA mappings as necessary.
>>        */
>> -    vfio_size = pbdev->iommu->max_dma_limit << TARGET_PAGE_BITS;
>> -    if (vfio_size > 0 && vfio_size < cap->end_dma - cap->start_dma + 1) {
>> -        pbdev->zpci_fn.edma = cap->start_dma + vfio_size - 1;
>> +    if (!pbdev->rtr_avail) {
>> +        vfio_size = pbdev->iommu->max_dma_limit << TARGET_PAGE_BITS;
>> +        if (vfio_size > 0 && vfio_size < cap->end_dma - cap->start_dma + 1) {
>> +            pbdev->zpci_fn.edma = cap->start_dma + vfio_size - 1;
>> +        }
>>       }
>>   }
>>   
>> diff --git a/hw/s390x/s390-virtio-ccw.c b/hw/s390x/s390-virtio-ccw.c
>> index d9e683c5b4..6a6cb39808 100644
>> --- a/hw/s390x/s390-virtio-ccw.c
>> +++ b/hw/s390x/s390-virtio-ccw.c
>> @@ -937,8 +937,13 @@ static void ccw_machine_9_2_instance_options(MachineState *machine)
>>   
>>   static void ccw_machine_9_2_class_options(MachineClass *mc)
>>   {
>> +    static GlobalProperty compat[] = {
>> +        { TYPE_S390_PCI_DEVICE, "relaxed-translation", "off", },
>> +    };
>> +
>>       ccw_machine_10_0_class_options(mc);
>>       compat_props_add(mc->compat_props, hw_compat_9_2, hw_compat_9_2_len);
>> +    compat_props_add(mc->compat_props, compat, G_N_ELEMENTS(compat));
>>   }
>>   DEFINE_CCW_MACHINE(9, 2);
>>   
>> diff --git a/include/hw/s390x/s390-pci-bus.h b/include/hw/s390x/s390-pci-bus.h
>> index 2c43ea123f..ea9e04ec49 100644
>> --- a/include/hw/s390x/s390-pci-bus.h
>> +++ b/include/hw/s390x/s390-pci-bus.h
>> @@ -277,7 +277,9 @@ struct S390PCIIOMMU {
>>       AddressSpace as;
>>       MemoryRegion mr;
>>       IOMMUMemoryRegion iommu_mr;
>> +    MemoryRegion dm_mr;
>>       bool enabled;
>> +    bool direct_map;
>>       uint64_t g_iota;
>>       uint64_t pba;
>>       uint64_t pal;
>> @@ -362,6 +364,7 @@ struct S390PCIBusDevice {
>>       bool interp;
>>       bool forwarding_assist;
>>       bool aif;
>> +    bool rtr_avail;
>>       QTAILQ_ENTRY(S390PCIBusDevice) link;
>>   };
>>   
>> @@ -389,6 +392,7 @@ int pci_chsc_sei_nt2_have_event(void);
>>   void s390_pci_sclp_configure(SCCB *sccb);
>>   void s390_pci_sclp_deconfigure(SCCB *sccb);
>>   void s390_pci_iommu_enable(S390PCIIOMMU *iommu);
>> +void s390_pci_iommu_direct_map_enable(S390PCIIOMMU *iommu);
>>   void s390_pci_iommu_disable(S390PCIIOMMU *iommu);
>>   void s390_pci_generate_error_event(uint16_t pec, uint32_t fh, uint32_t fid,
>>                                      uint64_t faddr, uint32_t e);
> 
> I'm not too familiar with the existing code or QEMU in general, but the
> changes makes sense to me. I'm assuming the braces around single
> statement bodies are accepted style in QEMU?

They are required :

   https://qemu.readthedocs.io/en/v9.2.0/devel/style.html#block-structure


> 
> I retested this version together with the v4 of the kernel version too.
> So feel free to add:
> 
> Tested-by: Niklas Schnelle <schnelle@linux.ibm.com>
> Reviewed-by: Niklas Schnelle <schnelle@linux.ibm.com>
> 


Thanks,

C.




  reply	other threads:[~2025-02-10 13:29 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-07 20:56 [PATCH v4 0/2] s390x/pci: relax I/O address translation requirement Matthew Rosato
2025-02-07 20:56 ` [PATCH v4 1/2] s390x/pci: add support for guests that request direct mapping Matthew Rosato
2025-02-10 13:12   ` Niklas Schnelle
2025-02-10 13:26     ` Cédric Le Goater [this message]
2025-02-10 14:52   ` David Hildenbrand
2025-02-07 20:56 ` [PATCH v4 2/2] s390x/pci: indicate QEMU supports relaxed translation for passthrough Matthew Rosato
2025-02-10 13:29   ` Niklas Schnelle

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=61b05250-dc77-4aa9-bb0f-8394edec203b@redhat.com \
    --to=clegoate@redhat.com \
    --cc=borntraeger@linux.ibm.com \
    --cc=david@redhat.com \
    --cc=farman@linux.ibm.com \
    --cc=iii@linux.ibm.com \
    --cc=mjrosato@linux.ibm.com \
    --cc=pasic@linux.ibm.com \
    --cc=qemu-devel@nongnu.org \
    --cc=qemu-s390x@nongnu.org \
    --cc=richard.henderson@linaro.org \
    --cc=schnelle@linux.ibm.com \
    --cc=thuth@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).