qemu-devel.nongnu.org archive mirror
From: Eric Auger <eric.auger@redhat.com>
To: Jean-Philippe Brucker <jean-philippe@linaro.org>
Cc: "Duan, Zhenzhong" <zhenzhong.duan@intel.com>,
	"eric.auger.pro@gmail.com" <eric.auger.pro@gmail.com>,
	"qemu-devel@nongnu.org" <qemu-devel@nongnu.org>,
	"qemu-arm@nongnu.org" <qemu-arm@nongnu.org>,
	"alex.williamson@redhat.com" <alex.williamson@redhat.com>,
	"peter.maydell@linaro.org" <peter.maydell@linaro.org>,
	"peterx@redhat.com" <peterx@redhat.com>,
	"yanghliu@redhat.com" <yanghliu@redhat.com>,
	"pbonzini@redhat.com" <pbonzini@redhat.com>,
	"mst@redhat.com" <mst@redhat.com>,
	"clg@redhat.com" <clg@redhat.com>
Subject: Re: [RFC 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices
Date: Wed, 31 Jan 2024 12:22:39 +0100
Message-ID: <f139efc5-c34c-4c83-9954-a37bffcb5c90@redhat.com>
In-Reply-To: <20240130182239.GA1392966@myrica>

Hi Jean,

On 1/30/24 19:22, Jean-Philippe Brucker wrote:
> On Mon, Jan 29, 2024 at 05:38:55PM +0100, Eric Auger wrote:
>>> There may be a separate argument for clearing bypass. With a coldplugged
>>> VFIO device the flow is:
>>>
>>> 1. Map the whole guest address space in VFIO to implement boot-bypass.
>>>    This allocates all guest pages, which takes a while and is wasteful.
>>>    I've actually crashed a host that way, when spawning a guest with too
>>>    much RAM.
>> interesting
>>> 2. Start the VM
>>> 3. When the virtio-iommu driver attaches a (non-identity) domain to the
>>>    assigned endpoint, then unmap the whole address space in VFIO, and most
>>>    pages are given back to the host.
>>>
>>> We can't disable boot-bypass because the BIOS needs it. But instead the
>>> flow could be:
>>>
>>> 1. Start the VM, with only the virtual endpoints. Nothing to pin.
>>> 2. The virtio-iommu driver disables bypass during boot
>> We needed this boot-bypass mode for booting with virtio-blk-scsi
>> protected with virtio-iommu, for instance.
>> That was needed because edk2 has no virtio-iommu driver, whereas it
>> does have an Intel IOMMU driver, right?
> Yes. What I had in mind is the x86 SeaBIOS which doesn't have any IOMMU
> driver and accesses the default SATA device:
>
>  $ qemu-system-x86_64 -M q35 -device virtio-iommu,boot-bypass=off
>  qemu: virtio_iommu_translate sid=250 is not known!!
>  qemu: no buffer available in event queue to report event
>  qemu: AHCI: Failed to start FIS receive engine: bad FIS receive buffer address
>
> But it's the same problem with edk2. Also a guest OS without a
> virtio-iommu driver needs boot-bypass. Once firmware boot is complete, an
> OS with a virtio-iommu driver can normally turn bypass off in the config
> space, since it's no longer useful. If it needs to put some endpoints in
> bypass, then it can attach them to a bypass domain.

yup
>
>>> 3. Hotplug the VFIO device. With bypass disabled there is no need to pin
>>>    the whole guest address space, unless the guest explicitly asks for an
>>>    identity domain.
>>>
>>> However, I don't know if this is a realistic scenario that will actually
>>> be used.
>>>
>>> By the way, do you have an easy way to reproduce the issue described here?
>>> I've had to enable iommu.forcedac=1 on the command-line, otherwise Linux
>>> just allocates 32-bit IOVAs.
>> I don't have a simple generic reproducer. It happens when assigning this
>> device:
>> Ethernet Controller E810-C for QSFP (Ethernet Network Adapter E810-C-Q2)
>>
>> I have not encountered that issue with another device yet.
>> I see on guest side in dmesg:
>> [    6.849292] ice 0000:00:05.0: Using 64-bit DMA addresses
>>
>> That's emitted by iommu_dma_alloc_iova() in dma-iommu.c.
>> It looks like the guest first tries to allocate an IOVA in the 32-bit
>> address space and, if that fails, uses the whole dma_limit.
>> It seems the 32-bit IOVA allocation failed here ;-)
> Interesting, are you running some demanding workload and a lot of CPUs?
> That's a lot of IOVAs used up, I'm curious about what kind of DMA pattern
> does that.
Nothing fancy, just booting the guest with the assigned NIC and 8 vCPUs.
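
To make the fallback described above concrete, here is a toy Python model
of it. It is only a sketch under assumptions, not the kernel code:
ToyIovaAllocator, low_free and the top-down bump allocation are made up
for illustration, and the real allocator behind iommu_dma_alloc_iova() is
considerably more involved. It only shows the order of attempts: the
32-bit space first, then the whole dma_limit once that fails, with the
notice logged at the switch.

```python
DMA_32BIT_MASK = (1 << 32) - 1
FOUR_GIB = 1 << 32


class ToyIovaAllocator:
    """Hypothetical toy allocator: 32-bit space first, then the full dma_limit."""

    def __init__(self, dma_limit, low_free):
        self.dma_limit = dma_limit
        self.low_free = low_free        # assumed bytes still free below 4 GiB
        self.high_top = dma_limit + 1   # next top-down address above 4 GiB
        self.try_32bit_first = True
        self.log = []

    def _alloc_low(self, size):
        # Top-down allocation inside the assumed free window below 4 GiB.
        if size > self.low_free:
            return None
        self.low_free -= size
        return self.low_free

    def _alloc_high(self, size):
        # Top-down allocation from dma_limit, never dipping below 4 GiB.
        base = self.high_top - size
        if base < FOUR_GIB:
            return None
        self.high_top = base
        return base

    def alloc(self, size):
        wide = self.dma_limit > DMA_32BIT_MASK
        if not wide or self.try_32bit_first:
            iova = self._alloc_low(size)
            if iova is not None or not wide:
                return iova
            # First 32-bit failure: stop trying low addresses and say so once.
            self.try_32bit_first = False
            self.log.append(
                f"Using {self.dma_limit.bit_length()}-bit DMA addresses")
        return self._alloc_high(size)
```

With low_free set to two 4 KiB pages and a 64-bit dma_limit, the first two
4 KiB allocations land below 4 GiB, and the third one logs the notice and
lands above 4 GiB.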

Thanks

Eric
>
> Thanks,
> Jean
>
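
As a back-of-the-envelope illustration of the two flows discussed above
(coldplug with boot-bypass versus hotplug after the driver has disabled
bypass), the following Python sketch compares how many host pages end up
pinned in each case. Everything in it is an assumption made for
illustration: the function names, the 4 KiB page size and the idea of
summarising the driver's DMA mappings as a single byte count. It does not
model VFIO or QEMU code.

```python
PAGE_SIZE = 4096  # assumed 4 KiB host pages


def pages(nbytes):
    """Number of pages needed to cover nbytes, rounded up."""
    return -(-nbytes // PAGE_SIZE)


def coldplug_pinned(guest_ram, driver_mapped):
    """Coldplug with boot-bypass: (peak, steady-state) pinned pages."""
    peak = pages(guest_ram)        # whole guest address space mapped at boot
    steady = pages(driver_mapped)  # after a non-identity domain is attached
    return peak, steady


def hotplug_pinned(guest_ram, driver_mapped, identity_domain=False):
    """Hotplug after bypass is disabled: (peak, steady-state) pinned pages."""
    # Only an explicitly requested identity domain pins all guest RAM.
    steady = pages(guest_ram if identity_domain else driver_mapped)
    return steady, steady
```

For a hypothetical 64 GiB guest whose driver maps 256 MiB, the coldplug
flow peaks at 16777216 pinned pages before dropping to 65536, while the
hotplug flow never goes above 65536 unless the guest asks for an identity
domain.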




Thread overview: 20+ messages
2024-01-17  8:02 [RFC 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices Eric Auger
2024-01-17  8:02 ` [RFC 1/7] hw/pci: Introduce PCIIOMMUOps::set_host_iova_regions Eric Auger
2024-01-17  8:02 ` [RFC 2/7] hw/pci: Introduce pci_device_iommu_bus Eric Auger
2024-01-18  7:32   ` Duan, Zhenzhong
2024-01-17  8:02 ` [RFC 3/7] vfio/pci: Pass the usable IOVA ranges through PCIIOMMUOps Eric Auger
2024-01-17  8:02 ` [RFC 4/7] virtio-iommu: Implement PCIIOMMUOps set_host_resv_regions Eric Auger
2024-01-18  7:43   ` Duan, Zhenzhong
2024-01-18 12:25     ` Eric Auger
2024-01-19  7:00       ` Duan, Zhenzhong
2024-01-22  7:17         ` Duan, Zhenzhong
2024-01-17  8:02 ` [RFC 5/7] virtio-iommu: Remove the implementation of iommu_set_iova_ranges Eric Auger
2024-01-17  8:02 ` [RFC 6/7] hw/vfio: Remove memory_region_iommu_set_iova_ranges() call Eric Auger
2024-01-17  8:02 ` [RFC 7/7] memory: Remove IOMMU MR iommu_set_iova_range API Eric Auger
2024-01-18  7:10 ` [RFC 0/7] VIRTIO-IOMMU/VFIO: Fix host iommu geometry handling for hotplugged devices Duan, Zhenzhong
2024-01-18  9:43   ` Eric Auger
2024-01-19  6:46     ` Duan, Zhenzhong
2024-01-25 18:48     ` Jean-Philippe Brucker
2024-01-29 16:38       ` Eric Auger
2024-01-30 18:22         ` Jean-Philippe Brucker
2024-01-31 11:22           ` Eric Auger [this message]
