qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: Alex Williamson <alex.williamson@redhat.com>
To: "Cédric Le Goater" <clg@redhat.com>
Cc: qemu-devel@nongnu.org
Subject: Re: [PATCH v2 8/9] vfio: Check compatibility of CPU and IOMMU address space width
Date: Thu, 30 Jan 2025 11:58:00 -0700	[thread overview]
Message-ID: <20250130115800.60b7cbe6.alex.williamson@redhat.com> (raw)
In-Reply-To: <20250130134346.1754143-9-clg@redhat.com>

On Thu, 30 Jan 2025 14:43:45 +0100
Cédric Le Goater <clg@redhat.com> wrote:

> Print a warning if IOMMU address space width is smaller than the
> physical address width. In this case, PCI peer-to-peer transactions on
> BARs are not supported and failures of device MMIO regions are to be
> expected.
> 
> This can occur with the 39-bit IOMMU address space width as found on
> consumer grade processors or when using a vIOMMU device with default
> settings.
> 
> Signed-off-by: Cédric Le Goater <clg@redhat.com>
> ---
>  hw/vfio/common.c | 21 +++++++++++++++++++++
>  1 file changed, 21 insertions(+)
> 
> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> index 5c9d8657d746ce30af5ae8f9122101e086a61ef5..e5ef93c589b2bed68f790608868ec3c7779d4cb8 100644
> --- a/hw/vfio/common.c
> +++ b/hw/vfio/common.c
> @@ -44,6 +44,8 @@
>  #include "migration/qemu-file.h"
>  #include "system/tpm.h"
>  
> +#include "hw/core/cpu.h"
> +
>  VFIODeviceList vfio_device_list =
>      QLIST_HEAD_INITIALIZER(vfio_device_list);
>  static QLIST_HEAD(, VFIOAddressSpace) vfio_address_spaces =
> @@ -1546,12 +1548,28 @@ retry:
>      return info;
>  }
>  
> +static bool vfio_device_check_address_space(VFIODevice *vbasedev, Error **errp)
> +{
> +    uint32_t cpu_aw_bits = cpu_get_phys_bits(first_cpu);
> +    uint32_t iommu_aw_bits = vfio_device_get_aw_bits(vbasedev);
> +
> +    if (cpu_aw_bits && cpu_aw_bits > iommu_aw_bits) {

Should we be testing the 64-bit MMIO window and maximum RAM GPA rather
than the vCPU physical address width?

Maybe we're just stuck with an indirect solution currently.  AIUI,
we're testing the vCPU physical address width because (obviously) all
devices and memory need to be addressable within that address space.
However, as we've explored offline, there are bare metal systems where
the CPU physical address width exceeds the IOMMU address width and this
is not a fundamental problem.  It places restrictions on the maximum
RAM physical address and the location of the 64-bit MMIO space.

RAM tends not to be a problem on these bare metal systems since they
physically cannot hold enough RAM to exceed the IOMMU width and, for
the most part, we map RAM starting from the bottom of the address
space.  So long as the VMM doesn't decide to map RAM at the top of the
physical address space, this doesn't become a problem.

However, we've decided to do exactly that for the 64-bit MMIO window.
It's not that the vCPU width being greater than the IOMMU width is a
fundamental problem, it's that we've chosen a 64-bit MMIO policy that
makes this become a problem and QEMU only has a convenient mechanism to
check the host IOMMU width when a vfio device is present.  Arguably,
when a vIOMMU is present guest firmware should be enlightened to
understand the address width of that vIOMMU when placing the 64-bit
MMIO window.  I'd say the failure to do so is a current firmware bug.

If the vIOMMU address width were honored, we could recommend users set
that to match the host, regardless of the vCPU physical address width.
We also already have a failure condition if the vIOMMU address width
exceeds the vfio-device (ie. indirectly the host) IOMMU width.

Of course without a vIOMMU, given our 64-bit MMIO policy, we don't have
a good solution without specifying the 64-bit window or implementing a
more conservative placement.

Not sure how much of this is immediately solvable and some indication
to the user how they can resolve the issue, such as implemented here, is
better than none, but maybe we can elaborate in a comment that this is
really more of a workaround for the current behavior of firmware
relative to the 64-bit MMIO placement policy.  Thanks,

Alex

> +        error_setg(errp, "Host physical address space (%u) is larger than "
> +                   "the host IOMMU address space (%u).", cpu_aw_bits,
> +                   iommu_aw_bits);
> +        vfio_device_error_append(vbasedev, errp);
> +        return false;
> +    }
> +    return true;
> +}
> +
>  bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>                          AddressSpace *as, Error **errp)
>  {
>      const VFIOIOMMUClass *ops =
>          VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_LEGACY));
>      HostIOMMUDevice *hiod = NULL;
> +    Error *local_err = NULL;
>  
>      if (vbasedev->iommufd) {
>          ops = VFIO_IOMMU_CLASS(object_class_by_name(TYPE_VFIO_IOMMU_IOMMUFD));
> @@ -1571,6 +1589,9 @@ bool vfio_attach_device(char *name, VFIODevice *vbasedev,
>          return false;
>      }
>  
> +    if (!vfio_device_check_address_space(vbasedev, &local_err)) {
> +        warn_report_err(local_err);
> +    }
>      return true;
>  }
>  



  reply	other threads:[~2025-01-30 18:59 UTC|newest]

Thread overview: 35+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-01-30 13:43 [PATCH v2 0/9]vfio: Improve error reporting when MMIO region mapping fails Cédric Le Goater
2025-01-30 13:43 ` [PATCH v2 1/9] util/error: Introduce warn_report_once_err() Cédric Le Goater
2025-01-30 14:25   ` Markus Armbruster
2025-01-30 16:03     ` Cédric Le Goater
2025-01-30 16:28       ` Cédric Le Goater
2025-01-30 17:55   ` Alex Williamson
2025-01-30 21:26     ` Cédric Le Goater
2025-01-31  8:30       ` Markus Armbruster
2025-01-30 13:43 ` [PATCH v2 2/9] vfio/pci: Replace "iommu_device" by "vIOMMU" Cédric Le Goater
2025-02-10 14:28   ` Philippe Mathieu-Daudé
2025-01-30 13:43 ` [PATCH v2 3/9] vfio: Rephrase comment in vfio_listener_region_add() error path Cédric Le Goater
2025-01-30 13:43 ` [PATCH v2 4/9] vfio: Introduce vfio_get_vfio_device() Cédric Le Goater
2025-02-10 14:32   ` Philippe Mathieu-Daudé
2025-02-10 16:19     ` Cédric Le Goater
2025-01-30 13:43 ` [PATCH v2 5/9] vfio: Improve error reporting when MMIO region mapping fails Cédric Le Goater
2025-02-10 14:36   ` Philippe Mathieu-Daudé
2025-02-10 16:17     ` Cédric Le Goater
2025-01-30 13:43 ` [PATCH v2 6/9] vfio: Remove reports of DMA mapping errors in backends Cédric Le Goater
2025-01-30 13:43 ` [PATCH v2 7/9] cpu: Introduce cpu_get_phys_bits() Cédric Le Goater
2025-02-10 14:40   ` Philippe Mathieu-Daudé
2025-03-06 10:37   ` Philippe Mathieu-Daudé
2025-03-06 14:41     ` Cédric Le Goater
2025-01-30 13:43 ` [PATCH v2 8/9] vfio: Check compatibility of CPU and IOMMU address space width Cédric Le Goater
2025-01-30 18:58   ` Alex Williamson [this message]
2025-01-31 12:42     ` Cédric Le Goater
2025-01-31 13:23       ` Gerd Hoffmann
2025-01-31 17:03         ` Cédric Le Goater
2025-02-06  7:54           ` Gerd Hoffmann
2025-02-06 17:05             ` Cédric Le Goater
2025-01-31 22:18         ` Alex Williamson
2025-02-06  8:22           ` Gerd Hoffmann
2025-03-06 10:33   ` Philippe Mathieu-Daudé
2025-09-05 13:04   ` Daniel Kral
2025-09-08  8:35     ` Cédric Le Goater
2025-01-30 13:43 ` [PATCH v2 9/9] vfio: Remove superfluous error report in vfio_listener_region_add() Cédric Le Goater

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250130115800.60b7cbe6.alex.williamson@redhat.com \
    --to=alex.williamson@redhat.com \
    --cc=clg@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).