qemu-devel.nongnu.org archive mirror
 help / color / mirror / Atom feed
From: "Cédric Le Goater" <clg@redhat.com>
To: Joao Martins <joao.m.martins@oracle.com>
Cc: Alex Williamson <alex.williamson@redhat.com>,
	Avihai Horon <avihaih@nvidia.com>,
	Gerd Hoffmann <kraxel@redhat.com>,
	qemu-devel@nongnu.org
Subject: Re: [RFC PATCH] vfio/common: Separate vfio-pci ranges
Date: Fri, 8 Sep 2023 10:39:07 +0200	[thread overview]
Message-ID: <eeb858f8-be7a-9778-98dd-b5c472b56203@redhat.com> (raw)
In-Reply-To: <002da1e6-929e-3681-eb3c-1fea30c7adc0@oracle.com>

On 9/8/23 10:35, Joao Martins wrote:
> On 08/09/2023 09:28, Cédric Le Goater wrote:
>> On 9/8/23 10:16, Joao Martins wrote:
>>> On 08/09/2023 08:14, Cédric Le Goater wrote:
>>>> From: Joao Martins <joao.m.martins@oracle.com>
>>>>
>>>> QEMU computes the DMA logging ranges for two predefined ranges: 32-bit
>>>> and 64-bit. In the OVMF case, when the dynamic MMIO window is enabled,
>>>> QEMU includes in the 64-bit range the RAM regions at the lower part
>>>> and vfio-pci device RAM regions which are at the top of the address
>>>> space. This range contains a large gap and the size can be bigger than
>>>> the dirty tracking HW limits of some devices (MLX5 has a 2^42 limit).
>>>>
>>>> To avoid such large ranges, introduce a new PCI range covering the
>>>> vfio-pci device RAM regions, this only if the addresses are above 4GB
>>>> to avoid breaking potential SeaBIOS guests.
>>>>
>>>> Signed-off-by: Joao Martins <joao.m.martins@oracle.com>
>>>> [ clg: - wrote commit log
>>>>          - fixed overlapping 32-bit and PCI ranges when using SeaBIOS ]
>>>> Signed-off-by: Cédric Le Goater <clg@redhat.com>
>>>> ---
>>>>    hw/vfio/common.c     | 51 +++++++++++++++++++++++++++++++++++++++-----
>>>>    hw/vfio/trace-events |  2 +-
>>>>    2 files changed, 47 insertions(+), 6 deletions(-)
>>>>
>>>> diff --git a/hw/vfio/common.c b/hw/vfio/common.c
>>>> index
>>>> 237101d03844273f653d98b6d053a1ae9c05a247..a5548e3bebf999e6d9cef08bdaf1fbc3b437e5eb 100644
>>>> --- a/hw/vfio/common.c
>>>> +++ b/hw/vfio/common.c
>>>> @@ -27,6 +27,7 @@
>>>>      #include "hw/vfio/vfio-common.h"
>>>>    #include "hw/vfio/vfio.h"
>>>> +#include "hw/vfio/pci.h"
>>>>    #include "exec/address-spaces.h"
>>>>    #include "exec/memory.h"
>>>>    #include "exec/ram_addr.h"
>>>> @@ -1400,6 +1401,8 @@ typedef struct VFIODirtyRanges {
>>>>        hwaddr max32;
>>>>        hwaddr min64;
>>>>        hwaddr max64;
>>>> +    hwaddr minpci;
>>>> +    hwaddr maxpci;
>>>
>>> Considering this is about pci64 hole relocation, I wondered post-reading your
>>> feedback, that maybe we should rename {min,max}pci to {min,max}pci64 (...)
>>
>> yes.
>>
>>>
>>>>    } VFIODirtyRanges;
>>>>      typedef struct VFIODirtyRangesListener {
>>>> @@ -1408,6 +1411,31 @@ typedef struct VFIODirtyRangesListener {
>>>>        MemoryListener listener;
>>>>    } VFIODirtyRangesListener;
>>>>    +static bool vfio_section_is_vfio_pci(MemoryRegionSection *section,
>>>> +                                     VFIOContainer *container)
>>>> +{
>>>> +    VFIOPCIDevice *pcidev;
>>>> +    VFIODevice *vbasedev;
>>>> +    VFIOGroup *group;
>>>> +    Object *owner;
>>>> +
>>>> +    owner = memory_region_owner(section->mr);
>>>> +
>>>> +    QLIST_FOREACH(group, &container->group_list, container_next) {
>>>> +        QLIST_FOREACH(vbasedev, &group->device_list, next) {
>>>> +            if (vbasedev->type != VFIO_DEVICE_TYPE_PCI) {
>>>> +                continue;
>>>> +            }
>>>> +            pcidev = container_of(vbasedev, VFIOPCIDevice, vbasedev);
>>>> +            if (OBJECT(pcidev) == owner) {
>>>> +                return true;
>>>> +            }
>>>> +        }
>>>> +    }
>>>> +
>>>> +    return false;
>>>> +}
>>>> +
>>>>    static void vfio_dirty_tracking_update(MemoryListener *listener,
>>>>                                           MemoryRegionSection *section)
>>>>    {
>>>> @@ -1434,9 +1462,14 @@ static void vfio_dirty_tracking_update(MemoryListener
>>>> *listener,
>>>>         * would be an IOVATree but that has a much bigger runtime overhead and
>>>>         * unnecessary complexity.
>>>>         */
>>>> -    min = (end <= UINT32_MAX) ? &range->min32 : &range->min64;
>>>> -    max = (end <= UINT32_MAX) ? &range->max32 : &range->max64;
>>>> -
>>>> +    if (vfio_section_is_vfio_pci(section, dirty->container) &&
>>>> +        iova >= UINT32_MAX) {
>>>> +        min = &range->minpci;
>>>> +        max = &range->maxpci;
>>>
>>> (...) specially considering this check of making sure we skip the pci-hole32 (as
>>> that one is fixed)
>>
>> yep. That check above might deserve a comment also.
>>
>> Could you resend please ?
>>
> 
> yes. This is on top of your vfio-8.2 branch right?

yes. Just reset the HEAD, I will merge the new version.

C.



      reply	other threads:[~2023-09-08  8:39 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-08  7:14 [RFC PATCH] vfio/common: Separate vfio-pci ranges Cédric Le Goater
2023-09-08  8:16 ` Joao Martins
2023-09-08  8:28   ` Cédric Le Goater
2023-09-08  8:35     ` Joao Martins
2023-09-08  8:39       ` Cédric Le Goater [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=eeb858f8-be7a-9778-98dd-b5c472b56203@redhat.com \
    --to=clg@redhat.com \
    --cc=alex.williamson@redhat.com \
    --cc=avihaih@nvidia.com \
    --cc=joao.m.martins@oracle.com \
    --cc=kraxel@redhat.com \
    --cc=qemu-devel@nongnu.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).