From: "Alex Williamson" <alex@shazbot.org>
To: "John Levon" <john.levon@nutanix.com>,
	"Cédric Le Goater" <clg@redhat.com>
Cc: qemu-devel@nongnu.org,
	"Alex Williamson" <alex.williamson@redhat.com>,
	"John Johnson" <john.g.johnson@oracle.com>,
	"Elena Ufimtseva" <elena.ufimtseva@oracle.com>,
	"Jagannathan Raman" <jag.raman@oracle.com>
Subject: Re: [PULL 25/28] vfio: add region info cache
Date: Tue, 14 Oct 2025 07:49:24 -0600
Message-ID: <cb4b8412-f1a5-4c7c-b2f4-d65b72194412@app.fastmail.com>
In-Reply-To: <aO5RAIX6WI0MerI-@lent>

On Tue, Oct 14, 2025, at 7:32 AM, John Levon wrote:
> On Tue, Oct 14, 2025 at 03:16:46PM +0200, Cédric Le Goater wrote:
>
>> > +    /* check cache */
>> 
>> It would be good to add an assert to check the index value. More important
>> we need to fix an ugly "index out-of-bounds" bug that can occur when booting
>> a VM with a vGPU :
>> 
>>   -device vfio-pci-nohotplug,host=0000:27:00.4,display=on,ramfb=true ...
>> 
>> The interesting part is :
>> 
>>   Thread 1 (Thread 0x7ffff6891ec0 (LWP 11372) "qemu-kvm"):
>>   #0  0x000055555581b83d in vfio_region_setup (obj=0x5555588c0b70, vbasedev=0x5555588c1630, region=0x555558a9c040, index=9, name=0x555555de94ba <str.68.llvm> "display") at ../hw/vfio/region.c:199
>>   #1  0x00005555558208a4 in vfio_display_region_update (opaque=<optimized out>) at ../hw/vfio/display.c:449
>>   #2  0x00005555556bdd6c in graphic_hw_update (con=0x555558acf830) at ../ui/console.c:143
>>   #3  vnc_refresh (dcl=0x7fffec048050) at ../ui/vnc.c:3262
>>   #4  0x00005555556a15cb in dpy_refresh (s=0x555558acf980) at ../ui/console.c:880
>>   #5  gui_update (opaque=0x555558acf980) at ../ui/console.c:90
>>   (gdb) p vbasedev->num_regions
>>   $9 = 9
>> 
>> Index 9 is beyond the maximum valid index of the reginfo array :/
>> 
>> We didn't take into account the ioctl VFIO_DEVICE_QUERY_GFX_PLANE
>> which can return region index 9 which is beyond the maximum valid
>> index of the reginfo array :/
>
> My apologies - we hit the exact same issue internally, but with a much older
> codebase, so I did not realise this could be an upstream problem as well!
>
> We put this down to a bug in the nvidia driver - surely it shouldn't be
> reporting fewer regions than are actually in use. So we applied what we thought
> to be a gross hack of boundary checking, and not using the region cache in case
> it's beyond num_regions.
>
> To put it another way, the header file says:
>
>    217         __u32   num_regions;    /* Max region index + 1 */
>
> If it's not actually the max region index + 1, what are the expected semantics
> of this field, or of region indices more generally? We could not find any clear
> documentation on the topic other than this comment.

'9' only marks the end of the fixed, pre-defined region indexes for vfio-pci, i.e. VFIO_PCI_NUM_REGIONS.  Beyond that, we support device-specific regions.  The GFX region is one such device-specific region.

We enumerate these regions based on vfio_device_info.num_regions and use the capability chain of each vfio_region_info to introspect the region type provided.
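
A minimal sketch of that introspection step, assuming the caller has already filled a vfio_region_info buffer via VFIO_DEVICE_GET_REGION_INFO with argsz large enough to hold the capability chain.  The helper names region_find_cap() and region_get_type() are hypothetical and are not QEMU or kernel functions; the structures and constants come from <linux/vfio.h>.

```c
#include <assert.h>
#include <linux/vfio.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: find capability `id` in the chain attached to a
 * filled-in vfio_region_info, or return NULL if it is absent.
 * info->argsz must describe the size of the whole buffer.
 */
static const struct vfio_info_cap_header *
region_find_cap(const struct vfio_region_info *info, uint16_t id)
{
    const struct vfio_info_cap_header *hdr;
    uint32_t off, steps;

    assert(info != NULL);
    if (!(info->flags & VFIO_REGION_INFO_FLAG_CAPS)) {
        return NULL;
    }

    /* Offsets are relative to the start of the buffer; 0 ends the chain. */
    off = info->cap_offset;
    /* Bound the walk so a corrupt chain cannot loop forever. */
    steps = info->argsz / sizeof(*hdr);

    while (off != 0 && steps-- > 0) {
        if (off < sizeof(*info) ||
            (uint64_t)off + sizeof(*hdr) > info->argsz) {
            return NULL; /* header would fall outside the buffer */
        }
        hdr = (const struct vfio_info_cap_header *)((const char *)info + off);
        if (hdr->id == id) {
            return hdr;
        }
        off = hdr->next;
    }
    return NULL;
}

/* Hypothetical helper: report type/subtype of a device-specific region. */
static bool region_get_type(const struct vfio_region_info *info,
                            uint32_t *type, uint32_t *subtype)
{
    const struct vfio_info_cap_header *hdr =
        region_find_cap(info, VFIO_REGION_INFO_CAP_TYPE);
    const struct vfio_region_info_cap_type *cap;
    size_t off;

    if (hdr == NULL) {
        return false; /* no type capability on this region */
    }
    off = (size_t)((const char *)hdr - (const char *)info);
    if (off + sizeof(*cap) > info->argsz) {
        return false; /* truncated capability */
    }
    cap = (const struct vfio_region_info_cap_type *)hdr;
    *type = cap->type;
    *subtype = cap->subtype;
    return true;
}
```

A caller would loop over the region indexes it knows about, fetch each vfio_region_info, and call region_get_type() to see whether the region carries a type capability.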

There is no fixed limit on the number of regions a device may expose, nor is vfio_device_info.num_regions necessarily a static value.  We're currently discussing a uAPI for generating special mappings to a region, which could dynamically increase the number of reported regions.  Thanks,

Alex
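
For illustration only, one way a region info cache could tolerate an index at or beyond the num_regions value seen at setup time is to bounds-check every lookup and grow on demand.  This is a sketch under assumptions, not the QEMU implementation and not necessarily the fix this thread settles on: struct region_entry, struct region_cache, the region_cache_*() functions and the REGION_CACHE_MAX cap are all hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Arbitrary sanity cap on cached indexes (an assumption, not a VFIO limit). */
#define REGION_CACHE_MAX 4096u

/* Hypothetical stand-in for whatever is cached per region. */
struct region_entry {
    bool valid;
    uint64_t size;
    uint64_t offset;
};

struct region_cache {
    struct region_entry *entries;
    uint32_t count; /* number of slots currently allocated */
};

/* Size the cache from the num_regions value reported at setup time. */
static int region_cache_init(struct region_cache *c, uint32_t num_regions)
{
    assert(c != NULL);
    c->entries = NULL;
    c->count = 0;
    if (num_regions == 0) {
        return 0;
    }
    if (num_regions > REGION_CACHE_MAX) {
        return -1;
    }
    c->entries = calloc(num_regions, sizeof(*c->entries));
    if (c->entries == NULL) {
        return -1;
    }
    c->count = num_regions;
    return 0;
}

/*
 * Return the slot for `index`, growing the array when the index lies
 * beyond the current size.  Returns NULL on allocation failure or when
 * the index exceeds the sanity cap.  Pointers returned earlier may be
 * invalidated by a later call that grows the cache.
 */
static struct region_entry *region_cache_get(struct region_cache *c,
                                             uint32_t index)
{
    assert(c != NULL);
    if (index >= REGION_CACHE_MAX) {
        return NULL;
    }
    if (index >= c->count) {
        uint32_t new_count = index + 1;
        struct region_entry *grown =
            realloc(c->entries, (size_t)new_count * sizeof(*grown));

        if (grown == NULL) {
            return NULL;
        }
        /* New slots start out empty (valid == false). */
        memset(&grown[c->count], 0,
               (size_t)(new_count - c->count) * sizeof(*grown));
        c->entries = grown;
        c->count = new_count;
    }
    return &c->entries[index];
}

static void region_cache_free(struct region_cache *c)
{
    assert(c != NULL);
    free(c->entries);
    c->entries = NULL;
    c->count = 0;
}
```

With a cache initialised for 9 regions, region_cache_get(&cache, 9) returns a fresh empty slot instead of reading past the end of the array; the caller would then fill it from a real region info query.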

