qemu-devel.nongnu.org archive mirror
From: Eric Auger <eric.auger@redhat.com>
To: Zhenzhong Duan <zhenzhong.duan@intel.com>, qemu-devel@nongnu.org
Cc: alex@shazbot.org, clg@redhat.com, mst@redhat.com,
	jasowang@redhat.com, peterx@redhat.com, ddutile@redhat.com,
	jgg@nvidia.com, nicolinc@nvidia.com, skolothumtho@nvidia.com,
	joao.m.martins@oracle.com, clement.mathieu--drif@eviden.com,
	kevin.tian@intel.com, yi.l.liu@intel.com, chao.p.peng@intel.com
Subject: Re: [PATCH v8 21/23] Workaround for ERRATA_772415_SPR17
Date: Wed, 10 Dec 2025 18:52:11 +0100
Message-ID: <ed5a4f7d-a071-4c4c-8e78-ab2e1709f4eb@redhat.com>
In-Reply-To: <20251117093729.1121324-22-zhenzhong.duan@intel.com>



On 11/17/25 10:37 AM, Zhenzhong Duan wrote:
> On a system affected by ERRATA_772415, IOMMU_HW_INFO_VTD_ERRATA_772415_SPR17
> is reported by IOMMU_DEVICE_GET_HW_INFO. Due to this erratum, even a read-only
> range mapped in the second-stage page table can still be written.
>
> Reference from 4th Gen Intel Xeon Processor Scalable Family Specification
> Update, Errata Details, SPR17.
> https://edc.intel.com/content/www/us/en/design/products-and-solutions/processors-and-chipsets/eagle-stream/sapphire-rapids-specification-update/
>
> The SPR17 details, copied from the link above:
> "Problem: When remapping hardware is configured by system software in
> scalable mode as Nested (PGTT=011b) and with PWSNP field Set in the
> PASID-table-entry, it may Set Accessed bit and Dirty bit (and Extended
> Access bit if enabled) in first-stage page-table entries even when
> second-stage mappings indicate that corresponding first-stage page-table
> is Read-Only.
>
> Implication: Due to this erratum, pages mapped as Read-only in second-stage
> page-tables may be modified by remapping hardware Access/Dirty bit updates.
>
> Workaround: None identified. System software enabling nested translations
> for a VM should ensure that there are no read-only pages in the
> corresponding second-stage mappings."
>
> Introduce a helper vfio_device_get_host_iommu_quirk_bypass_ro to check if
> readonly mappings should be bypassed.
>
> Signed-off-by: Zhenzhong Duan <zhenzhong.duan@intel.com>

Since this patch will be moved to a different series, I will skip the review for now.

Thanks

Eric
> ---
>  include/hw/vfio/vfio-container.h |  1 +
>  include/hw/vfio/vfio-device.h    |  3 +++
>  hw/vfio/device.c                 | 14 ++++++++++++++
>  hw/vfio/iommufd.c                |  9 ++++++++-
>  hw/vfio/listener.c               |  6 ++++--
>  5 files changed, 30 insertions(+), 3 deletions(-)
>
> diff --git a/include/hw/vfio/vfio-container.h b/include/hw/vfio/vfio-container.h
> index 9f6e8cedfc..a7d5c5ed67 100644
> --- a/include/hw/vfio/vfio-container.h
> +++ b/include/hw/vfio/vfio-container.h
> @@ -52,6 +52,7 @@ struct VFIOContainer {
>      QLIST_HEAD(, VFIODevice) device_list;
>      GList *iova_ranges;
>      NotifierWithReturn cpr_reboot_notifier;
> +    bool bypass_ro;
>  };
>  
>  #define TYPE_VFIO_IOMMU "vfio-iommu"
> diff --git a/include/hw/vfio/vfio-device.h b/include/hw/vfio/vfio-device.h
> index 48d00c7bc4..f6f3d0e378 100644
> --- a/include/hw/vfio/vfio-device.h
> +++ b/include/hw/vfio/vfio-device.h
> @@ -268,6 +268,9 @@ void vfio_device_prepare(VFIODevice *vbasedev, VFIOContainer *bcontainer,
>  void vfio_device_unprepare(VFIODevice *vbasedev);
>  
>  bool vfio_device_get_viommu_flags_want_nesting(VFIODevice *vbasedev);
> +bool vfio_device_get_host_iommu_quirk_bypass_ro(VFIODevice *vbasedev,
> +                                                uint32_t type, void *caps,
> +                                                uint32_t size);
>  
>  int vfio_device_get_region_info(VFIODevice *vbasedev, int index,
>                                  struct vfio_region_info **info);
> diff --git a/hw/vfio/device.c b/hw/vfio/device.c
> index 71eb069eb6..290011e154 100644
> --- a/hw/vfio/device.c
> +++ b/hw/vfio/device.c
> @@ -533,6 +533,20 @@ bool vfio_device_get_viommu_flags_want_nesting(VFIODevice *vbasedev)
>      return false;
>  }
>  
> +bool vfio_device_get_host_iommu_quirk_bypass_ro(VFIODevice *vbasedev,
> +                                                uint32_t type, void *caps,
> +                                                uint32_t size)
> +{
> +    VFIOPCIDevice *vdev = vfio_pci_from_vfio_device(vbasedev);
> +
> +    if (vdev) {
> +        return !!(pci_device_get_host_iommu_quirks(PCI_DEVICE(vdev), type,
> +                                                   caps, size) &
> +                  HOST_IOMMU_QUIRK_NESTING_PARENT_BYPASS_RO);
> +    }
> +    return false;
> +}
> +
>  /*
>   * Traditional ioctl() based io
>   */
> diff --git a/hw/vfio/iommufd.c b/hw/vfio/iommufd.c
> index 63f8442865..2a7b0d0c07 100644
> --- a/hw/vfio/iommufd.c
> +++ b/hw/vfio/iommufd.c
> @@ -351,6 +351,7 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
>      VFIOContainer *bcontainer = VFIO_IOMMU(container);
>      uint32_t type, flags = 0;
>      uint64_t hw_caps;
> +    VendorCaps caps;
>      VFIOIOASHwpt *hwpt;
>      uint32_t hwpt_id;
>      int ret;
> @@ -396,7 +397,8 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
>       * instead.
>       */
>      if (!iommufd_backend_get_device_info(vbasedev->iommufd, vbasedev->devid,
> -                                         &type, NULL, 0, &hw_caps, errp)) {
> +                                         &type, &caps, sizeof(caps), &hw_caps,
> +                                         errp)) {
>          return false;
>      }
>  
> @@ -411,6 +413,11 @@ static bool iommufd_cdev_autodomains_get(VFIODevice *vbasedev,
>       */
>      if (vfio_device_get_viommu_flags_want_nesting(vbasedev)) {
>          flags |= IOMMU_HWPT_ALLOC_NEST_PARENT;
> +
> +        if (vfio_device_get_host_iommu_quirk_bypass_ro(vbasedev, type,
> +                                                       &caps, sizeof(caps))) {
> +            bcontainer->bypass_ro = true;
> +        }
>      }
>  
>      if (cpr_is_incoming()) {
> diff --git a/hw/vfio/listener.c b/hw/vfio/listener.c
> index ca2377d860..090f935d30 100644
> --- a/hw/vfio/listener.c
> +++ b/hw/vfio/listener.c
> @@ -502,7 +502,8 @@ void vfio_container_region_add(VFIOContainer *bcontainer,
>      int ret;
>      Error *err = NULL;
>  
> -    if (!vfio_listener_valid_section(section, false, "region_add")) {
> +    if (!vfio_listener_valid_section(section, bcontainer->bypass_ro,
> +                                     "region_add")) {
>          return;
>      }
>  
> @@ -668,7 +669,8 @@ static void vfio_listener_region_del(MemoryListener *listener,
>      int ret;
>      bool try_unmap = true;
>  
> -    if (!vfio_listener_valid_section(section, false, "region_del")) {
> +    if (!vfio_listener_valid_section(section, bcontainer->bypass_ro,
> +                                     "region_del")) {
>          return;
>      }
>  
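
For readers following the thread, the decision the quoted patch wires up can
be modelled in isolation. The sketch below is not QEMU code: SketchSection,
SKETCH_QUIRK_BYPASS_RO and both functions are hypothetical stand-ins, the
64-bit quirk mask is an assumption, and the real checks inside
vfio_listener_valid_section() are not reproduced here. It only illustrates
the two steps, reducing a quirk bitmask to a boolean and then skipping
read-only sections when that boolean is set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit; stands in for HOST_IOMMU_QUIRK_NESTING_PARENT_BYPASS_RO. */
#define SKETCH_QUIRK_BYPASS_RO (1ULL << 0)

/* Hypothetical, heavily simplified stand-in for a memory region section. */
typedef struct SketchSection {
    bool is_ram;    /* backed by guest RAM */
    bool readonly;  /* mapped read-only for the guest */
} SketchSection;

/* Reduce a quirk bitmask to one boolean, as the quoted helper does with !!. */
static bool sketch_quirk_bypass_ro(uint64_t quirks)
{
    return !!(quirks & SKETCH_QUIRK_BYPASS_RO);
}

/*
 * Decide whether a section would be mapped into the second-stage page table.
 * With bypass_ro set, read-only sections are skipped so that no read-only
 * page exists in the second stage, which is what the erratum text asks for.
 */
static bool sketch_section_is_mappable(const SketchSection *s, bool bypass_ro)
{
    assert(s != NULL);

    if (!s->is_ram) {
        return false;
    }
    if (bypass_ro && s->readonly) {
        return false;
    }
    return true;
}
```

In this model a read-only RAM section is mapped when bypass_ro is false and
skipped when it is true; writable RAM sections are mapped in both cases.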


