From: Joao Martins <joao.m.martins@oracle.com>
To: "Cédric Le Goater" <clg@kaod.org>, "Jason Gunthorpe" <jgg@nvidia.com>
Cc: Yishai Hadas <yishaih@nvidia.com>,
alex.williamson@redhat.com, saeedm@nvidia.com,
kvm@vger.kernel.org, netdev@vger.kernel.org, kuba@kernel.org,
kevin.tian@intel.com, leonro@nvidia.com, maorg@nvidia.com,
cohuck@redhat.com, 'Avihai Horon' <avihaih@nvidia.com>,
Tarun Gupta <targupta@nvidia.com>
Subject: Re: [PATCH V7 vfio 07/10] vfio/mlx5: Create and destroy page tracker object
Date: Thu, 7 Sep 2023 11:51:36 +0100
Message-ID: <a8eceae4-84a5-06c4-29c3-5769d6f122ce@oracle.com>
In-Reply-To: <97d88872-e3c8-74f8-d93c-4368393ad0a5@kaod.org>
On 07/09/2023 10:56, Cédric Le Goater wrote:
> On 9/6/23 13:51, Jason Gunthorpe wrote:
>> On Wed, Sep 06, 2023 at 10:55:26AM +0200, Cédric Le Goater wrote:
>>
>>>> + WARN_ON(node);
>>>> + log_addr_space_size = ilog2(total_ranges_len);
>>>> + if (log_addr_space_size <
>>>> + (MLX5_CAP_ADV_VIRTUALIZATION(mdev, pg_track_log_min_addr_space)) ||
>>>> + log_addr_space_size >
>>>> + (MLX5_CAP_ADV_VIRTUALIZATION(mdev, pg_track_log_max_addr_space))) {
>>>> + err = -EOPNOTSUPP;
>>>> + goto out;
>>>> + }
>>>
>>>
>>> We are seeing an issue with dirty page tracking when doing migration
>>> of an OVMF VM guest. The vfio-pci variant driver for the MLX5 VF
>>> device complains when dirty page tracking is initialized from QEMU:
>>>
>>> qemu-kvm: 0000:b1:00.2: Failed to start DMA logging, err -95 (Operation
>>> not supported)
>>>
>>> The computed ranges are:
>>>
>>> vfio_device_dirty_tracking_start nr_ranges 2 32:[0x0 - 0x807fffff],
>>> 64:[0x100000000 - 0x3838000fffff]
>>>
>>> which seems to be too large for the HW. AFAICT, the MLX5 HW has a
>>> 42-bit address space limitation for dirty tracking (min is 12). Is it
>>> a FW tunable or a strict limitation?
>>
>> It would be good to explain where this is coming from; all devices
>> need to make some decision on what address space ranges to track, and
>> I would say 2^42 is already a pretty generous limit.
>
>
> QEMU computes the DMA logging ranges for two predefined ranges: 32-bit
> and 64-bit. In the OVMF case, QEMU includes in the 64-bit range RAM
> (at the lower part) and device RAM regions (at the top of the address
> space). The size of that range can be bigger than the 2^42 limit of
> the MLX5 HW for dirty tracking. QEMU is not making much effort to be
> smart; there is room for improvement.
>
Interesting, we haven't reproduced this in our testing with multi-TB
OVMF configs with these VFs. Could you share the OVMF base version you
were using? Or maybe we didn't trigger it because the total device RAM
regions were small enough to fit in the advertised 32G PCI hole64,
which avoids a hypothetical relocation.
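
For reference, plugging the reported ranges into the quoted check shows
where the -95 comes from. A back-of-the-envelope sketch in the driver's
own terms (it assumes total_ranges_len is the sum of the individual
range lengths, and takes the 42-bit cap you mention as the value of
pg_track_log_max_addr_space):

    /* 32:[0x0 - 0x807fffff] plus 64:[0x100000000 - 0x3838000fffff] */
    u64 total_ranges_len = (0x807fffffULL - 0x0ULL + 1) +
                           (0x3838000fffffULL - 0x100000000ULL + 1);
    /* total_ranges_len == 0x383780900000, between 2^45 and 2^46 */
    log_addr_space_size = ilog2(total_ranges_len); /* == 45 */
    /* 45 > 42 -> err = -EOPNOTSUPP, reported by QEMU as err -95 */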
We could use more than 2 ranges (or go back to sharing all ranges), or
add a set of ranges that represents the device RAM without computing a
min/max there (though I'm not sure we can figure that out from within
the memory listener that does all this logic); see the sketch below. It
would perhaps be a bit too BIOS-specific if we start looking at specific
parts of the address space (e.g. phys-bits-1) to compute these ranges.
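
To illustrate the per-section idea, here is a hypothetical sketch (the
names and types are invented, not actual QEMU code, and it assumes the
listener hands us sections already sorted by IOVA):

    #include <stddef.h>
    #include <stdint.h>

    struct dma_range { uint64_t iova, size; };

    /*
     * Emit one logging range per discontiguous region instead of a single
     * min/max range, merging only sections that touch or overlap, so no
     * range has to span the unpopulated part of the 64-bit hole.
     */
    static size_t build_ranges(const struct dma_range *sections, size_t n,
                               struct dma_range *out)
    {
        size_t nr = 0;

        for (size_t i = 0; i < n; i++) {
            uint64_t end = sections[i].iova + sections[i].size;

            if (nr && sections[i].iova <= out[nr - 1].iova + out[nr - 1].size) {
                /* Touching/overlapping: extend the previous range */
                uint64_t cur_end = out[nr - 1].iova + out[nr - 1].size;

                if (end > cur_end)
                    out[nr - 1].size = end - out[nr - 1].iova;
            } else {
                /* Discontiguous: start a new range */
                out[nr++] = sections[i];
            }
        }
        return nr;
    }

Kept discontiguous like this, the huge gap between the top of RAM and
the device RAM regions at the top of the address space no longer
inflates total_ranges_len, which is what trips the ilog2() check above.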
Joao