From: "Cédric Le Goater" <clg@redhat.com>
To: Alex Williamson <alex@shazbot.org>, John Levon <john.levon@nutanix.com>
Cc: qemu-devel@nongnu.org,
Alex Williamson <alex.williamson@redhat.com>,
John Johnson <john.g.johnson@oracle.com>,
Elena Ufimtseva <elena.ufimtseva@oracle.com>,
Jagannathan Raman <jag.raman@oracle.com>
Subject: Re: [PULL 25/28] vfio: add region info cache
Date: Tue, 14 Oct 2025 15:58:02 +0200
Message-ID: <c20591f7-d743-4380-ab89-0efe254acfb2@redhat.com>
In-Reply-To: <cb4b8412-f1a5-4c7c-b2f4-d65b72194412@app.fastmail.com>

On 10/14/25 15:49, Alex Williamson wrote:
> On Tue, Oct 14, 2025, at 7:32 AM, John Levon wrote:
>> On Tue, Oct 14, 2025 at 03:16:46PM +0200, Cédric Le Goater wrote:
>>
>>>> + /* check cache */
>>>
>>> It would be good to add an assert to check the index value. More importantly,
>>> we need to fix an ugly "index out-of-bounds" bug that can occur when booting
>>> a VM with a vGPU:
>>>
>>> -device vfio-pci-nohotplug,host=0000:27:00.4,display=on,ramfb=true ...
>>>
>>> The interesting part is :
>>>
>>> Thread 1 (Thread 0x7ffff6891ec0 (LWP 11372) "qemu-kvm"):
>>> #0 0x000055555581b83d in vfio_region_setup (obj=0x5555588c0b70, vbasedev=0x5555588c1630, region=0x555558a9c040, index=9, name=0x555555de94ba <str.68.llvm> "display") at ../hw/vfio/region.c:199
>>> #1 0x00005555558208a4 in vfio_display_region_update (opaque=<optimized out>) at ../hw/vfio/display.c:449
>>> #2 0x00005555556bdd6c in graphic_hw_update (con=0x555558acf830) at ../ui/console.c:143
>>> #3 vnc_refresh (dcl=0x7fffec048050) at ../ui/vnc.c:3262
>>> #4 0x00005555556a15cb in dpy_refresh (s=0x555558acf980) at ../ui/console.c:880
>>> #5 gui_update (opaque=0x555558acf980) at ../ui/console.c:90
>>> (gdb) p vbasedev->num_regions
>>> $9 = 9
>>>
>>> Index 9 is beyond the maximum valid index of the reginfo array :/
>>>
>>> We didn't take into account the VFIO_DEVICE_QUERY_GFX_PLANE ioctl,
>>> which can return a region index (9 here) outside the range covered
>>> by num_regions.
>>
>> My apologies - we hit the exact same issue internally, but with a much older
>> codebase, so I did not realise this could be an upstream problem as well!
>>
>> We put this down to a bug in the nvidia driver - surely it shouldn't be
>> reporting fewer regions than are actually in use. So we applied what we thought
>> was a gross hack: bounds-check the index, and bypass the region cache when
>> it is beyond num_regions.
>>
>> To put it another way, the header file says:
>>
>>     __u32 num_regions; /* Max region index + 1 */
>>
>> If it's not actually the max region index + 1, what are the expected semantics
>> of this field, or of region indices more generally? We could not find any clear
>> documentation on the topic other than this comment.
>
> '9' only defines the end of the fixed, pre-defined region indexes for
> vfio-pci, i.e. VFIO_PCI_NUM_REGIONS. Beyond that, we support device
> specific regions. The GFX region is one such device specific region.
>
> We enumerate these regions based on vfio_device_info.num_regions and use the capabilities feature of the vfio_region_info to introspect the region type provided.
>
> There is no fixed limit to the number of regions a device may expose, nor is vfio_device_info.num_regions necessarily a static value. We're currently discussing a uAPI for generating special mappings to a region that could dynamically increase the reported regions. Thanks,

We should then improve the VFIO region cache handling by reallocating
the reginfo array on demand.
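
As a rough illustration of growing the array on demand, here is a minimal
sketch in plain C. It is not the QEMU code: struct region_info,
struct region_cache, region_cache_slot() and region_cache_free() are
made-up names, and the real cache would presumably hold
struct vfio_region_info pointers instead.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a cached region info entry (not QEMU's type). */
struct region_info {
    uint32_t index;
    uint64_t size;
};

/* Hypothetical cache: slots[i] stays NULL until region i has been queried. */
struct region_cache {
    struct region_info **slots;
    uint32_t nr_slots;
};

/*
 * Return the address of the slot for 'index', growing the array when the
 * index lies beyond its current end. Returns NULL on overflow or allocation
 * failure, in which case the cache is left unchanged.
 */
static struct region_info **region_cache_slot(struct region_cache *cache,
                                              uint32_t index)
{
    assert(cache != NULL);

    if (index >= cache->nr_slots) {
        struct region_info **grown;
        uint32_t new_nr, i;

        if (index == UINT32_MAX) {
            return NULL;            /* index + 1 would wrap around */
        }
        new_nr = index + 1;
        if (new_nr > SIZE_MAX / sizeof(*grown)) {
            return NULL;            /* byte count would overflow size_t */
        }

        grown = realloc(cache->slots, (size_t)new_nr * sizeof(*grown));
        if (grown == NULL) {
            return NULL;
        }

        /* New slots start empty so a later lookup knows to ask the device. */
        for (i = cache->nr_slots; i < new_nr; i++) {
            grown[i] = NULL;
        }
        cache->slots = grown;
        cache->nr_slots = new_nr;
    }

    return &cache->slots[index];
}

/* Release every cached entry and the slot array itself. */
static void region_cache_free(struct region_cache *cache)
{
    uint32_t i;

    for (i = 0; i < cache->nr_slots; i++) {
        free(cache->slots[i]);
    }
    free(cache->slots);
    cache->slots = NULL;
    cache->nr_slots = 0;
}
```

With a cache initially sized for 9 regions, a lookup for index 9 would grow
the array to 10 slots instead of reading past its end. A returned slot
pointer is only valid until the next call that grows the array.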
Thanks,
C.