From: Alex Williamson <alex.williamson@redhat.com>
To: Auger Eric <eric.auger@redhat.com>
Cc: peter.maydell@linaro.org, aik@ozlabs.ru, qemu-devel@nongnu.org,
peterx@redhat.com, qemu-arm@nongnu.org, pbonzini@redhat.com,
eric.auger.pro@gmail.com
Subject: Re: [Qemu-devel] [PATCH v5 2/2] hw/vfio/common: Fail on VFIO/HW nested paging detection
Date: Fri, 30 Aug 2019 11:22:45 -0600
Message-ID: <20190830112245.7e98d32d@x1.home>
In-Reply-To: <979ca496-3401-0d53-e42d-8e04922ece52@redhat.com>
On Fri, 30 Aug 2019 10:06:56 +0200
Auger Eric <eric.auger@redhat.com> wrote:
> Hi Alex,
>
> On 8/29/19 8:14 PM, Alex Williamson wrote:
> > On Thu, 29 Aug 2019 11:01:41 +0200
> > Eric Auger <eric.auger@redhat.com> wrote:
> >
> >> As of today, VFIO only works along with vIOMMU supporting
> >> caching mode. The SMMUv3 does not support this mode and
> >> requires HW nested paging to work properly with VFIO.
> >>
> >> So any attempt to run a VFIO device protected by such IOMMU
> >> would prevent the assigned device from working and at the
> >> moment the guest does not even boot as the default
> >> memory_region_iommu_replay() implementation attempts to
> >> translate the whole address space and completely stalls
> >> the guest.
> >
> > Why doesn't this stall an x86 guest?
>
> It does not stall on x86 because intel_iommu implements a custom replay
> (see vtd_iommu_replay), so the dummy default one is never executed.
> That function performs a full page table walk, scanning all the valid
> entries and calling the MAP notifier on those. Although this operation
> is tedious, it is nothing compared to the dummy default replay
> function, which calls translate() on the whole address range (page by
> page).

Ah right. OTOH, what are the arguments against smmuv3 providing a
replay function?

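To make the cost difference discussed above concrete, here is a toy sketch in C. It is not QEMU code: ToyIOMMU, toy_translate(), replay_by_translate() and replay_by_walk() are hypothetical names, and the "page table" is just a flat list of mapped pages. It only contrasts probing every page of a range with visiting the valid entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PAGE_SHIFT 12
#define PAGE_SIZE  (1ULL << PAGE_SHIFT)

/* Hypothetical toy "page table": a flat list of mapped, page-aligned IOVAs. */
typedef struct {
    const uint64_t *valid_iovas;
    size_t          count;
} ToyIOMMU;

typedef void (*map_notify_fn)(uint64_t iova, void *opaque);

/* Stand-in for translate(): returns 1 if the page is mapped, 0 otherwise. */
static int toy_translate(const ToyIOMMU *mmu, uint64_t iova)
{
    for (size_t i = 0; i < mmu->count; i++) {
        if (mmu->valid_iovas[i] == iova) {
            return 1;
        }
    }
    return 0;
}

/*
 * Default-style replay: probe every page in [0, limit) and notify on hits.
 * Returns the number of translate calls, which grows with the size of the
 * address range, not with the number of mappings.
 */
static uint64_t replay_by_translate(const ToyIOMMU *mmu, uint64_t limit,
                                    map_notify_fn notify, void *opaque)
{
    uint64_t probes = 0;

    assert((limit & (PAGE_SIZE - 1)) == 0);
    for (uint64_t iova = 0; iova < limit; iova += PAGE_SIZE) {
        probes++;
        if (toy_translate(mmu, iova)) {
            notify(iova, opaque);
        }
    }
    return probes;
}

/*
 * Custom-style replay: visit only the valid entries and notify on each.
 * Returns the number of entries visited.
 */
static uint64_t replay_by_walk(const ToyIOMMU *mmu,
                               map_notify_fn notify, void *opaque)
{
    uint64_t visits = 0;

    for (size_t i = 0; i < mmu->count; i++) {
        visits++;
        notify(mmu->valid_iovas[i], opaque);
    }
    return visits;
}

/* Example MAP notifier: counts how many mappings were reported. */
static void count_maps(uint64_t iova, void *opaque)
{
    (void)iova;
    (*(uint64_t *)opaque)++;
}
```

With three mapped pages in a 1 MiB window, the translate-based replay makes 256 probes while the walk visits 3 entries; both report the same 3 mappings. Assuming 4 KiB pages and a 48-bit address space, the same per-page loop would need 2^36 probes.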
> > I'm a bit confused about what this provides versus the flag_changed
> > notifier looking for IOMMU_NOTIFIER_MAP, which AIUI is the common
> > deficiency between VT-d w/o caching-mode and SMMUv3 w/o nested mode.
> > The iommu notifier is registered prior to calling iommu_replay, so it
> > seems we already have an opportunity to do something there. Help me
> > understand why this is needed. Thanks,
>
> At the moment the smmuv3 notify_flag_changed callback implementation
> (smmuv3_notify_flag_changed) emits a warning when it detects that a MAP
> notifier is being registered:
>
>     warn_report("SMMUv3 does not support notification on MAP: "
>                 "device %s will not function properly", pcidev->name);
>
> and then the replay gets executed, looping forever.
>
> I could exit instead of emitting a warning but the drawback is that on
> vfio hotplug, it will also exit whereas we would rather simply reject
> the hotplug.

There are solutions to the above that modify the existing framework
rather than creating a parallel solution, though. For instance, could
memory_region_register_iommu_notifier() reject the notifier if the flag
change is incompatible, allowing the fault to propagate back to vfio
and take a similar exit path to the one provided here?
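
As a rough illustration of the idea above, the following toy C sketch shows a registration function that rejects an incompatible notifier and a caller that propagates the error instead of replaying. Nothing here is the QEMU API: ToyIOMMURegion, toy_register_notifier(), toy_region_add() and the TOY_NOTIFIER_* flags are hypothetical, and the sketch assumes the registration path is allowed to return an error code.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical flags mirroring the MAP/UNMAP notifier distinction. */
enum {
    TOY_NOTIFIER_UNMAP = 1 << 0,
    TOY_NOTIFIER_MAP   = 1 << 1,
};

/* Hypothetical IOMMU region: records whether MAP notification can work. */
typedef struct ToyIOMMURegion {
    bool supports_map_notify;  /* false models a vIOMMU unable to notify on MAP */
    int  registered_flags;
} ToyIOMMURegion;

/* Returns 0 on success, -EINVAL if the requested flags cannot be honoured. */
static int toy_register_notifier(ToyIOMMURegion *mr, int flags)
{
    if ((flags & TOY_NOTIFIER_MAP) && !mr->supports_map_notify) {
        return -EINVAL;        /* reject instead of warning and carrying on */
    }
    mr->registered_flags |= flags;
    return 0;
}

/*
 * vfio-like caller: on registration failure, report the error to the
 * caller (the equivalent of "goto fail") and never start the replay.
 */
static int toy_region_add(ToyIOMMURegion *mr, bool *replayed)
{
    int ret = toy_register_notifier(mr, TOY_NOTIFIER_MAP | TOY_NOTIFIER_UNMAP);

    *replayed = false;
    if (ret) {
        return ret;
    }
    *replayed = true;          /* the replay would run here */
    return 0;
}
```

In this model a region that cannot notify on MAP makes toy_region_add() return -EINVAL without replaying, so the same error path could serve both the static and the hotplug case.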
> I think the solution based on the IOMMU MR attribute handles both the
> static and hotplug cases. Also, looking further ahead, I will need this
> IOMMU MR attribute for 2stage SMMU integration (see [RFC v5 14/29]
> vfio: Force nested if iommu requires it). I know that series has been
> pending for a while and is still hypothetical, but setting up 2stage
> will require specific treatment in the vfio common.c code: opting in to
> the 2stage mode and registering specific iommu mr notifiers. Using the
> IOMMU MR attribute allows me to detect which kind of VFIO/IOMMU
> integration I need to set up.

Hmm, I'm certainly more on board with that use case. I guess the
question is whether the problem statement presented here justifies what
seems to be a parallel solution to what we have today, or could have
with some enhancements. Thanks,

Alex

> >>
> >> So let's fail on that case.
> >>
> >> Signed-off-by: Eric Auger <eric.auger@redhat.com>
> >>
> >> ---
> >>
> >> v3 -> v4:
> >> - use IOMMU_ATTR_HW_NESTED_PAGING
> >> - do not abort anymore but jump to fail
> >> ---
> >> hw/vfio/common.c | 10 ++++++++++
> >> 1 file changed, 10 insertions(+)
> >>
> >> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
> >> index 3e03c495d8..e8c009d019 100644
> >> --- a/hw/vfio/common.c
> >> +++ b/hw/vfio/common.c
> >> @@ -606,9 +606,19 @@ static void vfio_listener_region_add(MemoryListener *listener,
> >>      if (memory_region_is_iommu(section->mr)) {
> >>          VFIOGuestIOMMU *giommu;
> >>          IOMMUMemoryRegion *iommu_mr = IOMMU_MEMORY_REGION(section->mr);
> >> +        bool nested;
> >>          int iommu_idx;
> >>
> >>          trace_vfio_listener_region_add_iommu(iova, end);
> >> +
> >> +        if (!memory_region_iommu_get_attr(iommu_mr,
> >> +                                          IOMMU_ATTR_NEED_HW_NESTED_PAGING,
> >> +                                          (void *)&nested) && nested) {
> >> +            error_report("VFIO/vIOMMU integration based on HW nested paging "
> >> +                         "is not yet supported");
> >> +            ret = -EINVAL;
> >> +            goto fail;
> >> +        }
> >>          /*
> >>           * FIXME: For VFIO iommu types which have KVM acceleration to
> >>           * avoid bouncing all map/unmaps through qemu this way, this
> >
> >