qemu-devel.nongnu.org archive mirror
* [PATCH] vfio/igd: Check host PCI address when probing
@ 2025-03-25 17:22 Tomita Moeko
  2025-04-09 17:18 ` Alex Williamson
  0 siblings, 1 reply; 13+ messages in thread
From: Tomita Moeko @ 2025-03-25 17:22 UTC (permalink / raw)
  To: Alex Williamson, Cédric Le Goater, Tomita Moeko; +Cc: qemu-devel

So far, all Intel VGA adapters, including discrete GPUs like the A770
and B580, have been treated as IGD devices. While this has no functional
impact, an "unsupported IGD device" error is printed when passing
through an Intel discrete GPU.

Since IGD devices must be at "0000:00:02.0", let's check the host PCI
address when probing.

Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
---
 hw/vfio/igd.c | 23 +++++++++--------------
 1 file changed, 9 insertions(+), 14 deletions(-)

diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
index 265fffc2aa..ff250017b0 100644
--- a/hw/vfio/igd.c
+++ b/hw/vfio/igd.c
@@ -53,6 +53,13 @@
  * headless setup is desired, the OpRegion gets in the way of that.
  */
 
+static bool vfio_is_igd(VFIOPCIDevice *vdev)
+{
+    return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
+           vfio_is_vga(vdev) &&
+           vfio_pci_host_match(&vdev->host, "0000:00:02.0");
+}
+
 /*
  * This presumes the device is already known to be an Intel VGA device, so we
  * take liberties in which device ID bits match which generation.  This should
@@ -427,13 +434,7 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr)
     VFIOConfigMirrorQuirk *ggc_mirror, *bdsm_mirror;
     int gen;
 
-    /*
-     * This must be an Intel VGA device at address 00:02.0 for us to even
-     * consider enabling legacy mode. Some driver have dependencies on the PCI
-     * bus address.
-     */
-    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
-        !vfio_is_vga(vdev) || nr != 0) {
+    if (nr != 0 || !vfio_is_igd(vdev)) {
         return;
     }
 
@@ -490,13 +491,7 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
     bool legacy_mode_enabled = false;
     Error *err = NULL;
 
-    /*
-     * This must be an Intel VGA device at address 00:02.0 for us to even
-     * consider enabling legacy mode.  The vBIOS has dependencies on the
-     * PCI bus address.
-     */
-    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
-        !vfio_is_vga(vdev)) {
+    if (!vfio_is_igd(vdev)) {
         return true;
     }
 
-- 
2.47.2



^ permalink raw reply related	[flat|nested] 13+ messages in thread

* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-03-25 17:22 [PATCH] vfio/igd: Check host PCI address when probing Tomita Moeko
@ 2025-04-09 17:18 ` Alex Williamson
  2025-04-10  7:34   ` Cédric Le Goater
  2025-04-13 11:30   ` Tomita Moeko
  0 siblings, 2 replies; 13+ messages in thread
From: Alex Williamson @ 2025-04-09 17:18 UTC (permalink / raw)
  To: Tomita Moeko; +Cc: Cédric Le Goater, qemu-devel

On Wed, 26 Mar 2025 01:22:39 +0800
Tomita Moeko <tomitamoeko@gmail.com> wrote:

> So far, all Intel VGA adapters, including discrete GPUs like A770 and
> B580, were treated as IGD devices. While this had no functional impact,
> a error about "unsupported IGD device" will be printed when passthrough
> Intel discrete GPUs.
> 
> Since IGD devices must be at "0000:00:02.0", let's check the host PCI
> address when probing.
> 
> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
> ---
>  hw/vfio/igd.c | 23 +++++++++--------------
>  1 file changed, 9 insertions(+), 14 deletions(-)
> 
> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
> index 265fffc2aa..ff250017b0 100644
> --- a/hw/vfio/igd.c
> +++ b/hw/vfio/igd.c
> @@ -53,6 +53,13 @@
>   * headless setup is desired, the OpRegion gets in the way of that.
>   */
>  
> +static bool vfio_is_igd(VFIOPCIDevice *vdev)
> +{
> +    return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
> +           vfio_is_vga(vdev) &&
> +           vfio_pci_host_match(&vdev->host, "0000:00:02.0");
> +}

vfio-pci devices can also be specified via sysfsdev= rather than host=,
so at a minimum I think we'd need to test against vdev->vbasedev.name,
as other callers of vfio_pci_host_match do.  For example building a
local PCIHostDeviceAddress and comparing it to name.  This is also not
foolproof though if we start taking advantage of devices passed by fd.
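[Editor's note: a minimal standalone sketch of the name-based match Alex describes, assuming the name string uses the canonical "dddd:bb:ss.f" form; the struct and helper below are simplified stand-ins for QEMU's PCIHostDeviceAddress and vfio_pci_host_match(), not actual QEMU code.]

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Simplified stand-in for QEMU's PCIHostDeviceAddress. */
typedef struct {
    unsigned int domain;
    unsigned int bus;
    unsigned int slot;
    unsigned int function;
} HostAddr;

/* Format the address the way lore device names are spelled and
 * compare it against the device name string. */
static bool name_matches_addr(const char *name, const HostAddr *addr)
{
    char buf[32];

    snprintf(buf, sizeof(buf), "%04x:%02x:%02x.%01x",
             addr->domain, addr->bus, addr->slot, addr->function);
    return strcmp(buf, name) == 0;
}
```

As noted, this still fails for devices passed by fd, where no address-shaped name is available.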

Could we instead rely PCIe capabilities?  A discrete GPU should
identify as either an endpoint or legacy endpoint and IGD should
identify as a root complex integrated endpoint, or maybe older versions
would lack the PCIe capability altogether.
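[Editor's note: a sketch of the capability-type test suggested here. Register layout and values follow the PCIe spec / Linux pci_regs.h (device/port type is bits 7:4 of the PCIe Capabilities register, offset 0x02 into the PCI Express capability); the helper names are hypothetical.]

```c
#include <stdbool.h>
#include <stdint.h>

#define PCI_EXP_FLAGS_TYPE    0x00f0
#define PCI_EXP_TYPE_ENDPOINT 0x0   /* PCIe endpoint (e.g. discrete GPU) */
#define PCI_EXP_TYPE_LEG_END  0x1   /* legacy endpoint */
#define PCI_EXP_TYPE_RC_END   0x9   /* root complex integrated endpoint */

/* Extract the device/port type from the PCIe Capabilities register. */
static unsigned int pcie_dev_type(uint16_t exp_flags)
{
    return (exp_flags & PCI_EXP_FLAGS_TYPE) >> 4;
}

static bool is_rc_integrated_endpoint(uint16_t exp_flags)
{
    return pcie_dev_type(exp_flags) == PCI_EXP_TYPE_RC_END;
}
```

A device with no PCIe capability at all (the older-IGD case raised below) never reaches this test and would need separate handling.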

Also I think the comments that were dropped below are still valid and
useful to transfer to this new helper.  I think those are actually
referring to the guest address of 00:02.0 though, which should maybe be
a test as well.  Thanks,

Alex

> +
>  /*
>   * This presumes the device is already known to be an Intel VGA device, so we
>   * take liberties in which device ID bits match which generation.  This should
> @@ -427,13 +434,7 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr)
>      VFIOConfigMirrorQuirk *ggc_mirror, *bdsm_mirror;
>      int gen;
>  
> -    /*
> -     * This must be an Intel VGA device at address 00:02.0 for us to even
> -     * consider enabling legacy mode. Some driver have dependencies on the PCI
> -     * bus address.
> -     */
> -    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
> -        !vfio_is_vga(vdev) || nr != 0) {
> +    if (nr != 0 || !vfio_is_igd(vdev)) {
>          return;
>      }
>  
> @@ -490,13 +491,7 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
>      bool legacy_mode_enabled = false;
>      Error *err = NULL;
>  
> -    /*
> -     * This must be an Intel VGA device at address 00:02.0 for us to even
> -     * consider enabling legacy mode.  The vBIOS has dependencies on the
> -     * PCI bus address.
> -     */
> -    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
> -        !vfio_is_vga(vdev)) {
> +    if (!vfio_is_igd(vdev)) {
>          return true;
>      }
>  




* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-09 17:18 ` Alex Williamson
@ 2025-04-10  7:34   ` Cédric Le Goater
  2025-04-13 17:23     ` Tomita Moeko
  2025-04-13 11:30   ` Tomita Moeko
  1 sibling, 1 reply; 13+ messages in thread
From: Cédric Le Goater @ 2025-04-10  7:34 UTC (permalink / raw)
  To: Alex Williamson, Tomita Moeko; +Cc: qemu-devel, Corvin Köhne

+ Corvin

On 4/9/25 19:18, Alex Williamson wrote:
> On Wed, 26 Mar 2025 01:22:39 +0800
> Tomita Moeko <tomitamoeko@gmail.com> wrote:
> 
>> So far, all Intel VGA adapters, including discrete GPUs like A770 and
>> B580, were treated as IGD devices. While this had no functional impact,
>> a error about "unsupported IGD device" will be printed when passthrough
>> Intel discrete GPUs.
>>
>> Since IGD devices must be at "0000:00:02.0", let's check the host PCI
>> address when probing.
>>
>> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
>> ---
>>   hw/vfio/igd.c | 23 +++++++++--------------
>>   1 file changed, 9 insertions(+), 14 deletions(-)
>>
>> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
>> index 265fffc2aa..ff250017b0 100644
>> --- a/hw/vfio/igd.c
>> +++ b/hw/vfio/igd.c
>> @@ -53,6 +53,13 @@
>>    * headless setup is desired, the OpRegion gets in the way of that.
>>    */
>>   
>> +static bool vfio_is_igd(VFIOPCIDevice *vdev)
>> +{
>> +    return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
>> +           vfio_is_vga(vdev) &&
>> +           vfio_pci_host_match(&vdev->host, "0000:00:02.0");
>> +}
> 
> vfio-pci devices can also be specified via sysfsdev= rather than host=,
> so at a minimum I think we'd need to test against vdev->vbasedev.name,
> as other callers of vfio_pci_host_match do.  For example building a
> local PCIHostDeviceAddress and comparing it to name.  This is also not
> foolproof though if we start taking advantage of devices passed by fd.
> 
> Could we instead rely PCIe capabilities?  A discrete GPU should
> identify as either an endpoint or legacy endpoint and IGD should
> identify as a root complex integrated endpoint, or maybe older versions
> would lack the PCIe capability altogether.

Maintaining a list of PCI IDs for Intel GPU devices as Corvin was
proposing in [1] is not a viable solution ?

Thanks,

C.

[1] https://lore.kernel.org/qemu-devel/20250206121341.118337-1-corvin.koehne@gmail.com/
  
> Also I think the comments that were dropped below are still valid and
> useful to transfer to this new helper.  I think those are actually
> referring to the guest address of 00:02.0 though, which should maybe be
> a test as well.  Thanks,
> 
> Alex
> 
>> +
>>   /*
>>    * This presumes the device is already known to be an Intel VGA device, so we
>>    * take liberties in which device ID bits match which generation.  This should
>> @@ -427,13 +434,7 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr)
>>       VFIOConfigMirrorQuirk *ggc_mirror, *bdsm_mirror;
>>       int gen;
>>   
>> -    /*
>> -     * This must be an Intel VGA device at address 00:02.0 for us to even
>> -     * consider enabling legacy mode. Some driver have dependencies on the PCI
>> -     * bus address.
>> -     */
>> -    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
>> -        !vfio_is_vga(vdev) || nr != 0) {
>> +    if (nr != 0 || !vfio_is_igd(vdev)) {
>>           return;
>>       }
>>   
>> @@ -490,13 +491,7 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
>>       bool legacy_mode_enabled = false;
>>       Error *err = NULL;
>>   
>> -    /*
>> -     * This must be an Intel VGA device at address 00:02.0 for us to even
>> -     * consider enabling legacy mode.  The vBIOS has dependencies on the
>> -     * PCI bus address.
>> -     */
>> -    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
>> -        !vfio_is_vga(vdev)) {
>> +    if (!vfio_is_igd(vdev)) {
>>           return true;
>>       }
>>   
> 




* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-09 17:18 ` Alex Williamson
  2025-04-10  7:34   ` Cédric Le Goater
@ 2025-04-13 11:30   ` Tomita Moeko
  2025-04-14 21:51     ` Alex Williamson
  1 sibling, 1 reply; 13+ messages in thread
From: Tomita Moeko @ 2025-04-13 11:30 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Cédric Le Goater, qemu-devel

On 4/10/25 01:18, Alex Williamson wrote:
> On Wed, 26 Mar 2025 01:22:39 +0800
> Tomita Moeko <tomitamoeko@gmail.com> wrote:
> 
>> So far, all Intel VGA adapters, including discrete GPUs like A770 and
>> B580, were treated as IGD devices. While this had no functional impact,
>> a error about "unsupported IGD device" will be printed when passthrough
>> Intel discrete GPUs.
>>
>> Since IGD devices must be at "0000:00:02.0", let's check the host PCI
>> address when probing.
>>
>> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
>> ---
>>  hw/vfio/igd.c | 23 +++++++++--------------
>>  1 file changed, 9 insertions(+), 14 deletions(-)
>>
>> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
>> index 265fffc2aa..ff250017b0 100644
>> --- a/hw/vfio/igd.c
>> +++ b/hw/vfio/igd.c
>> @@ -53,6 +53,13 @@
>>   * headless setup is desired, the OpRegion gets in the way of that.
>>   */
>>  
>> +static bool vfio_is_igd(VFIOPCIDevice *vdev)
>> +{
>> +    return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
>> +           vfio_is_vga(vdev) &&
>> +           vfio_pci_host_match(&vdev->host, "0000:00:02.0");
>> +}
> 
> vfio-pci devices can also be specified via sysfsdev= rather than host=,
> so at a minimum I think we'd need to test against vdev->vbasedev.name,
> as other callers of vfio_pci_host_match do.  For example building a
> local PCIHostDeviceAddress and comparing it to name.  This is also not
> foolproof though if we start taking advantage of devices passed by fd.

Sorry for my late reply. Yes, `vfio_pci_host_match` does not work for
sysfsdev or fd, so I will drop this change.

> Could we instead rely PCIe capabilities?  A discrete GPU should
> identify as either an endpoint or legacy endpoint and IGD should
> identify as a root complex integrated endpoint, or maybe older versions
> would lack the PCIe capability altogether.

Older generations seem not to have PCIe capabilities in their config
space; the Sandy Bridge spec [1], for example, does not mention it. I
don't have a Sandy Bridge box at the moment to verify :(

> Also I think the comments that were dropped below are still valid and
> useful to transfer to this new helper.  I think those are actually
> referring to the guest address of 00:02.0 though, which should maybe be
> a test as well.  Thanks,
> 
> Alex

A guest address of 00:02.0 is not mandatory; newer drivers do not depend
on the address (probably thanks to the discrete GPUs, the hardcoded
addresses were removed). Placing it at guest 00:02.0 is still
recommended, though. I would prefer not to make this a hard requirement.

[1] https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/3rd-gen-core-desktop-vol-2-datasheet.pdf

Thanks,
Moeko

>> +
>>  /*
>>   * This presumes the device is already known to be an Intel VGA device, so we
>>   * take liberties in which device ID bits match which generation.  This should
>> @@ -427,13 +434,7 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr)
>>      VFIOConfigMirrorQuirk *ggc_mirror, *bdsm_mirror;
>>      int gen;
>>  
>> -    /*
>> -     * This must be an Intel VGA device at address 00:02.0 for us to even
>> -     * consider enabling legacy mode. Some driver have dependencies on the PCI
>> -     * bus address.
>> -     */
>> -    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
>> -        !vfio_is_vga(vdev) || nr != 0) {
>> +    if (nr != 0 || !vfio_is_igd(vdev)) {
>>          return;
>>      }
>>  
>> @@ -490,13 +491,7 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
>>      bool legacy_mode_enabled = false;
>>      Error *err = NULL;
>>  
>> -    /*
>> -     * This must be an Intel VGA device at address 00:02.0 for us to even
>> -     * consider enabling legacy mode.  The vBIOS has dependencies on the
>> -     * PCI bus address.
>> -     */
>> -    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
>> -        !vfio_is_vga(vdev)) {
>> +    if (!vfio_is_igd(vdev)) {
>>          return true;
>>      }
>>  
> 



* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-10  7:34   ` Cédric Le Goater
@ 2025-04-13 17:23     ` Tomita Moeko
  2025-04-14 22:05       ` Alex Williamson
  0 siblings, 1 reply; 13+ messages in thread
From: Tomita Moeko @ 2025-04-13 17:23 UTC (permalink / raw)
  To: Cédric Le Goater, Alex Williamson; +Cc: qemu-devel, Corvin Köhne


On 4/10/25 15:34, Cédric Le Goater wrote:
> + Corvin
> 
> On 4/9/25 19:18, Alex Williamson wrote:
>> On Wed, 26 Mar 2025 01:22:39 +0800
>> Tomita Moeko <tomitamoeko@gmail.com> wrote:
>>
>>> So far, all Intel VGA adapters, including discrete GPUs like A770 and
>>> B580, were treated as IGD devices. While this had no functional impact,
>>> a error about "unsupported IGD device" will be printed when passthrough
>>> Intel discrete GPUs.
>>>
>>> Since IGD devices must be at "0000:00:02.0", let's check the host PCI
>>> address when probing.
>>>
>>> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
>>> ---
>>>   hw/vfio/igd.c | 23 +++++++++--------------
>>>   1 file changed, 9 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
>>> index 265fffc2aa..ff250017b0 100644
>>> --- a/hw/vfio/igd.c
>>> +++ b/hw/vfio/igd.c
>>> @@ -53,6 +53,13 @@
>>>    * headless setup is desired, the OpRegion gets in the way of that.
>>>    */
>>>   +static bool vfio_is_igd(VFIOPCIDevice *vdev)
>>> +{
>>> +    return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
>>> +           vfio_is_vga(vdev) &&
>>> +           vfio_pci_host_match(&vdev->host, "0000:00:02.0");
>>> +}
>>
>> vfio-pci devices can also be specified via sysfsdev= rather than host=,
>> so at a minimum I think we'd need to test against vdev->vbasedev.name,
>> as other callers of vfio_pci_host_match do.  For example building a
>> local PCIHostDeviceAddress and comparing it to name.  This is also not
>> foolproof though if we start taking advantage of devices passed by fd.
>>
>> Could we instead rely PCIe capabilities?  A discrete GPU should
>> identify as either an endpoint or legacy endpoint and IGD should
>> identify as a root complex integrated endpoint, or maybe older versions
>> would lack the PCIe capability altogether.
> 
> Maintaining a list of PCI IDs for Intel GPU devices as Corvin was
> proposing in [1] is not a viable solution ?
> 
> Thanks,
> 
> C.
> 
> [1] https://lore.kernel.org/qemu-devel/20250206121341.118337-1-corvin.koehne@gmail.com/

I checked the Intel docs; maintaining a device ID list is probably the
only feasible way. But given that Intel is moving to the xe driver, the
generation becomes unclear, so I'd like to propose a list with quirk
flags for IGD:

static const struct igd_device igd_devices[] = {
    INTEL_SNB_IDS(IGD_DEVICE, OPREGION_QUIRK | BDSM_QUIRK),
    INTEL_TGL_IDS(IGD_DEVICE, OPREGION_QUIRK | BDSM64_QUIRK),
};

Matching against the list is more time-consuming than the current
switch-case, so it's better to add a new field to cache the result.
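[Editor's note: a self-contained sketch of such a table, with the macros expanded into plain entries. The device IDs and flag values are illustrative assumptions, not an authoritative IGD list; lookup falls back to 0 (no quirks) for unknown devices, and the result would be cached as described.]

```c
#include <stddef.h>
#include <stdint.h>

enum {
    OPREGION_QUIRK = 1u << 0,
    BDSM_QUIRK     = 1u << 1,
    BDSM64_QUIRK   = 1u << 2,
};

struct igd_device {
    uint16_t device_id;
    uint32_t quirks;
};

static const struct igd_device igd_devices[] = {
    { 0x0102, OPREGION_QUIRK | BDSM_QUIRK },    /* e.g. a Sandy Bridge ID */
    { 0x9a49, OPREGION_QUIRK | BDSM64_QUIRK },  /* e.g. a Tiger Lake ID */
};

/* Linear scan over the table; return the quirk set, or 0 if the
 * device is not a known IGD. */
static uint32_t igd_device_quirks(uint16_t device_id)
{
    for (size_t i = 0; i < sizeof(igd_devices) / sizeof(igd_devices[0]); i++) {
        if (igd_devices[i].device_id == device_id) {
            return igd_devices[i].quirks;
        }
    }
    return 0;
}
```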

I will go with Corvin's first two patches, with the reordering
suggested by Cornelia.

Thanks,
Moeko
  
>> Also I think the comments that were dropped below are still valid and
>> useful to transfer to this new helper.  I think those are actually
>> referring to the guest address of 00:02.0 though, which should maybe be
>> a test as well.  Thanks,
>>
>> Alex
>>
>>> +
>>>   /*
>>>    * This presumes the device is already known to be an Intel VGA device, so we
>>>    * take liberties in which device ID bits match which generation.  This should
>>> @@ -427,13 +434,7 @@ void vfio_probe_igd_bar0_quirk(VFIOPCIDevice *vdev, int nr)
>>>       VFIOConfigMirrorQuirk *ggc_mirror, *bdsm_mirror;
>>>       int gen;
>>>   -    /*
>>> -     * This must be an Intel VGA device at address 00:02.0 for us to even
>>> -     * consider enabling legacy mode. Some driver have dependencies on the PCI
>>> -     * bus address.
>>> -     */
>>> -    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
>>> -        !vfio_is_vga(vdev) || nr != 0) {
>>> +    if (nr != 0 || !vfio_is_igd(vdev)) {
>>>           return;
>>>       }
>>>   @@ -490,13 +491,7 @@ static bool vfio_pci_igd_config_quirk(VFIOPCIDevice *vdev, Error **errp)
>>>       bool legacy_mode_enabled = false;
>>>       Error *err = NULL;
>>>   -    /*
>>> -     * This must be an Intel VGA device at address 00:02.0 for us to even
>>> -     * consider enabling legacy mode.  The vBIOS has dependencies on the
>>> -     * PCI bus address.
>>> -     */
>>> -    if (!vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) ||
>>> -        !vfio_is_vga(vdev)) {
>>> +    if (!vfio_is_igd(vdev)) {
>>>           return true;
>>>       }
>>>   
>>
> 



* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-13 11:30   ` Tomita Moeko
@ 2025-04-14 21:51     ` Alex Williamson
  0 siblings, 0 replies; 13+ messages in thread
From: Alex Williamson @ 2025-04-14 21:51 UTC (permalink / raw)
  To: Tomita Moeko; +Cc: Cédric Le Goater, qemu-devel

On Sun, 13 Apr 2025 19:30:10 +0800
Tomita Moeko <tomitamoeko@gmail.com> wrote:

> On 4/10/25 01:18, Alex Williamson wrote:
> > On Wed, 26 Mar 2025 01:22:39 +0800
> > Tomita Moeko <tomitamoeko@gmail.com> wrote:
> >   
> >> So far, all Intel VGA adapters, including discrete GPUs like A770 and
> >> B580, were treated as IGD devices. While this had no functional impact,
> >> a error about "unsupported IGD device" will be printed when passthrough
> >> Intel discrete GPUs.
> >>
> >> Since IGD devices must be at "0000:00:02.0", let's check the host PCI
> >> address when probing.
> >>
> >> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
> >> ---
> >>  hw/vfio/igd.c | 23 +++++++++--------------
> >>  1 file changed, 9 insertions(+), 14 deletions(-)
> >>
> >> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
> >> index 265fffc2aa..ff250017b0 100644
> >> --- a/hw/vfio/igd.c
> >> +++ b/hw/vfio/igd.c
> >> @@ -53,6 +53,13 @@
> >>   * headless setup is desired, the OpRegion gets in the way of that.
> >>   */
> >>  
> >> +static bool vfio_is_igd(VFIOPCIDevice *vdev)
> >> +{
> >> +    return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
> >> +           vfio_is_vga(vdev) &&
> >> +           vfio_pci_host_match(&vdev->host, "0000:00:02.0");
> >> +}  
> > 
> > vfio-pci devices can also be specified via sysfsdev= rather than host=,
> > so at a minimum I think we'd need to test against vdev->vbasedev.name,
> > as other callers of vfio_pci_host_match do.  For example building a
> > local PCIHostDeviceAddress and comparing it to name.  This is also not
> > foolproof though if we start taking advantage of devices passed by fd.  
> 
> Sorry for my late reply. Yes `vfio_pci_host_match` does not work for
> sysfsdev of fd, I will drop this change.
> 
> > Could we instead rely PCIe capabilities?  A discrete GPU should
> > identify as either an endpoint or legacy endpoint and IGD should
> > identify as a root complex integrated endpoint, or maybe older versions
> > would lack the PCIe capability altogether.  
> 
> Older generations seems do not have PCIe capabilities in their config
> space, like the sandy bridge spec [1] does not mention it. I don't have
> a sandy bridge box for now to verify it :(

I have a Sandy Bridge i7-2760QM and can confirm there is no PCIe
capability:

# lspci -vvvs 02.0
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09) (prog-if 00 [VGA controller])
	Subsystem: Lenovo Device 21d1
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx+
	Status: Cap+ 66MHz- UDF- FastB2B+ ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0
	Interrupt: pin A routed to IRQ 35
	Region 0: Memory at f0000000 (64-bit, non-prefetchable) [size=4M]
	Region 2: Memory at e0000000 (64-bit, prefetchable) [size=256M]
	Region 4: I/O ports at 5000 [size=64]
	Expansion ROM at 000c0000 [virtual] [disabled] [size=128K]
	Capabilities: [90] MSI: Enable+ Count=1/1 Maskable- 64bit-
		Address: fee00018  Data: 0000
	Capabilities: [d0] Power Management version 2
		Flags: PMEClk- DSI+ D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst- PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [a4] PCI Advanced Features
		AFCap: TP+ FLR+
		AFCtrl: FLR-
		AFStatus: TP-
	Kernel driver in use: i915
	Kernel modules: i915

# lspci -xxxs 02.0
00:02.0 VGA compatible controller: Intel Corporation 2nd Generation Core Processor Family Integrated Graphics Controller (rev 09)
00: 86 80 26 01 07 04 90 00 09 00 00 03 00 00 00 00
10: 04 00 00 f0 00 00 00 00 0c 00 00 e0 00 00 00 00
20: 01 50 00 00 00 00 00 00 00 00 00 00 aa 17 d1 21
30: 00 00 00 00 90 00 00 00 00 00 00 00 0b 01 00 00
40: 09 00 0c 01 9d 21 00 e2 90 00 00 14 00 00 00 00
50: 11 02 00 00 11 00 00 00 00 00 00 00 01 00 a0 db
60: 00 00 02 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 05 d0 01 00 18 00 e0 fe 00 00 00 00 00 00 00 00
a0: 00 00 00 00 13 00 06 03 00 00 00 00 00 00 00 00
b0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 01 a4 22 00 00 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 01 00 00 00 00 80 00 00 00 00 00 00
f0: 00 00 00 00 00 00 00 00 00 00 06 00 18 60 ef da
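[Editor's note: the absence of the PCIe capability can be confirmed mechanically from a dump like the one above. A standalone sketch (not QEMU code) that walks the standard capability list of a 256-byte config-space image:]

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define PCI_CAPABILITY_LIST 0x34
#define PCI_CAP_ID_MSI      0x05
#define PCI_CAP_ID_EXP      0x10  /* PCI Express capability ID */

/* Follow the capability chain starting at the pointer in config
 * offset 0x34.  Offsets below 0x40 terminate the walk; the guard
 * bounds the loop against malformed (looping) lists. */
static bool has_capability(const uint8_t cfg[256], uint8_t cap_id)
{
    uint8_t pos = cfg[PCI_CAPABILITY_LIST] & ~3u;
    int guard = 48;

    while (pos >= 0x40 && guard-- > 0) {
        if (cfg[pos] == cap_id) {
            return true;
        }
        pos = cfg[pos + 1] & ~3u;
    }
    return false;
}
```

Seeding the image with the offsets from the dump above (0x34 -> 0x90 MSI -> 0xd0 PM -> 0xa4 AF -> end) finds MSI but no PCI Express capability, matching the lspci output.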

> > Also I think the comments that were dropped below are still valid and
> > useful to transfer to this new helper.  I think those are actually
> > referring to the guest address of 00:02.0 though, which should maybe be
> > a test as well.  Thanks,
> > 
> > Alex  
> 
> Guest address of 00:02.0 is not mandatory, newer drivers does not depend
> on the address (probably thanks to the discrete GPU, they removed these
> hardcode). Though putting it at guest 00:02.0 is still recommended.
> I would prefer not making this a limit.

I suppose it was never enforced beyond the comment, so if you vouch
that it's not mandatory, let's not make it a requirement now.  Thanks,

Alex




* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-13 17:23     ` Tomita Moeko
@ 2025-04-14 22:05       ` Alex Williamson
  2025-04-15 17:36         ` Tomita Moeko
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Williamson @ 2025-04-14 22:05 UTC (permalink / raw)
  To: Tomita Moeko; +Cc: Cédric Le Goater, qemu-devel, Corvin Köhne

On Mon, 14 Apr 2025 01:23:56 +0800
Tomita Moeko <tomitamoeko@gmail.com> wrote:

> On 4/10/25 15:34, Cédric Le Goater wrote:
> > + Corvin
> > 
> > On 4/9/25 19:18, Alex Williamson wrote:  
> >> On Wed, 26 Mar 2025 01:22:39 +0800
> >> Tomita Moeko <tomitamoeko@gmail.com> wrote:
> >>  
> >>> So far, all Intel VGA adapters, including discrete GPUs like A770 and
> >>> B580, were treated as IGD devices. While this had no functional impact,
> >>> a error about "unsupported IGD device" will be printed when passthrough
> >>> Intel discrete GPUs.
> >>>
> >>> Since IGD devices must be at "0000:00:02.0", let's check the host PCI
> >>> address when probing.
> >>>
> >>> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
> >>> ---
> >>>   hw/vfio/igd.c | 23 +++++++++--------------
> >>>   1 file changed, 9 insertions(+), 14 deletions(-)
> >>>
> >>> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
> >>> index 265fffc2aa..ff250017b0 100644
> >>> --- a/hw/vfio/igd.c
> >>> +++ b/hw/vfio/igd.c
> >>> @@ -53,6 +53,13 @@
> >>>    * headless setup is desired, the OpRegion gets in the way of that.
> >>>    */
> >>>   +static bool vfio_is_igd(VFIOPCIDevice *vdev)
> >>> +{
> >>> +    return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
> >>> +           vfio_is_vga(vdev) &&
> >>> +           vfio_pci_host_match(&vdev->host, "0000:00:02.0");
> >>> +}  
> >>
> >> vfio-pci devices can also be specified via sysfsdev= rather than host=,
> >> so at a minimum I think we'd need to test against vdev->vbasedev.name,
> >> as other callers of vfio_pci_host_match do.  For example building a
> >> local PCIHostDeviceAddress and comparing it to name.  This is also not
> >> foolproof though if we start taking advantage of devices passed by fd.
> >>
> >> Could we instead rely PCIe capabilities?  A discrete GPU should
> >> identify as either an endpoint or legacy endpoint and IGD should
> >> identify as a root complex integrated endpoint, or maybe older versions
> >> would lack the PCIe capability altogether.  
> > 
> > Maintaining a list of PCI IDs for Intel GPU devices as Corvin was
> > proposing in [1] is not a viable solution ?
> > 
> > Thanks,
> > 
> > C.
> > 
> > [1] https://lore.kernel.org/qemu-devel/20250206121341.118337-1-corvin.koehne@gmail.com/  
> 
> I checked Intel doc, probably maintaining an device ID list is the only
> possible way. But given that intel is moving to xe driver, generation
> becomes unclear, I'd like to propose a list with quirk flags for igd.
> 
> static const struct igd_device igd_devices[] = {
>     INTEL_SNB_IDS(IGD_DEVICE, OPREGION_QUIRK | BDSM_QUIRK),
>     INTEL_TGL_IDS(IGD_DEVICE, OPREGION_QUIRK | BDSM64_QUIRK),
> }
> 
> Matching in the list is more time consuming than current switch-case,
> it's better to have a new field to cache it.
> 
> I will go with Corvin's first 2 patches with reordering suggested by
> Cornelia.

If I recall the discussion correctly, Corvin's series was mapping device
IDs to generation, where I had the concern that it creates ongoing
overhead to sync with the i915 driver to create new mappings.  There
was a suggestion that newer hardware has a register that reports the
generation, so maybe we only need to manage creating the mapping table
up to the point we can rely on getting the generation information from
hardware (with the massive caveat that Intel could drop that generation
register in the future, or maybe already has).

The above table however suggests yet another use case of the table, a
mapping of quirks to specific devices.  It seems this once again
introduces the maintenance issue.  Why would it not be sufficient to
determine the quirks based on the generation alone?  Thanks,

Alex




* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-14 22:05       ` Alex Williamson
@ 2025-04-15 17:36         ` Tomita Moeko
  2025-04-15 19:04           ` Alex Williamson
  0 siblings, 1 reply; 13+ messages in thread
From: Tomita Moeko @ 2025-04-15 17:36 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Cédric Le Goater, qemu-devel, Corvin Köhne

On 4/15/25 06:05, Alex Williamson wrote:
> On Mon, 14 Apr 2025 01:23:56 +0800
> Tomita Moeko <tomitamoeko@gmail.com> wrote:
> 
>> On 4/10/25 15:34, Cédric Le Goater wrote:
>>> + Corvin
>>>
>>> On 4/9/25 19:18, Alex Williamson wrote:  
>>>> On Wed, 26 Mar 2025 01:22:39 +0800
>>>> Tomita Moeko <tomitamoeko@gmail.com> wrote:
>>>>  
>>>>> So far, all Intel VGA adapters, including discrete GPUs like A770 and
>>>>> B580, were treated as IGD devices. While this had no functional impact,
>>>>> a error about "unsupported IGD device" will be printed when passthrough
>>>>> Intel discrete GPUs.
>>>>>
>>>>> Since IGD devices must be at "0000:00:02.0", let's check the host PCI
>>>>> address when probing.
>>>>>
>>>>> Signed-off-by: Tomita Moeko <tomitamoeko@gmail.com>
>>>>> ---
>>>>>   hw/vfio/igd.c | 23 +++++++++--------------
>>>>>   1 file changed, 9 insertions(+), 14 deletions(-)
>>>>>
>>>>> diff --git a/hw/vfio/igd.c b/hw/vfio/igd.c
>>>>> index 265fffc2aa..ff250017b0 100644
>>>>> --- a/hw/vfio/igd.c
>>>>> +++ b/hw/vfio/igd.c
>>>>> @@ -53,6 +53,13 @@
>>>>>    * headless setup is desired, the OpRegion gets in the way of that.
>>>>>    */
>>>>>   +static bool vfio_is_igd(VFIOPCIDevice *vdev)
>>>>> +{
>>>>> +    return vfio_pci_is(vdev, PCI_VENDOR_ID_INTEL, PCI_ANY_ID) &&
>>>>> +           vfio_is_vga(vdev) &&
>>>>> +           vfio_pci_host_match(&vdev->host, "0000:00:02.0");
>>>>> +}  
>>>>
>>>> vfio-pci devices can also be specified via sysfsdev= rather than host=,
>>>> so at a minimum I think we'd need to test against vdev->vbasedev.name,
>>>> as other callers of vfio_pci_host_match do.  For example building a
>>>> local PCIHostDeviceAddress and comparing it to name.  This is also not
>>>> foolproof though if we start taking advantage of devices passed by fd.
>>>>
>>>> Could we instead rely PCIe capabilities?  A discrete GPU should
>>>> identify as either an endpoint or legacy endpoint and IGD should
>>>> identify as a root complex integrated endpoint, or maybe older versions
>>>> would lack the PCIe capability altogether.  
>>>
>>> Maintaining a list of PCI IDs for Intel GPU devices as Corvin was
>>> proposing in [1] is not a viable solution ?
>>>
>>> Thanks,
>>>
>>> C.
>>>
>>> [1] https://lore.kernel.org/qemu-devel/20250206121341.118337-1-corvin.koehne@gmail.com/  
>>
>> I checked Intel doc, probably maintaining an device ID list is the only
>> possible way. But given that intel is moving to xe driver, generation
>> becomes unclear, I'd like to propose a list with quirk flags for igd.
>>
>> static const struct igd_device igd_devices[] = {
>>     INTEL_SNB_IDS(IGD_DEVICE, OPREGION_QUIRK | BDSM_QUIRK),
>>     INTEL_TGL_IDS(IGD_DEVICE, OPREGION_QUIRK | BDSM64_QUIRK),
>> }
>>
>> Matching in the list is more time consuming than current switch-case,
>> it's better to have a new field to cache it.
>>
>> I will go with Corvin's first 2 patches with reordering suggested by
>> Cornelia.
> 
> If I recall the discussion correctly, Corvin's series was mapping device
> IDs to generation, where I had the concern that it creates ongoing
> overhead to sync with the i915 driver to create new mappings.  There
> was a suggestion that newer hardware has a register that reports the
> generation, so maybe we only need to manage creating the mapping table
> up to the point we can rely on getting the generation information from
> hardware (with the massive caveat that Intel could drop that generation
> register in the future, or maybe already has).
> 
> The above table however suggests yet another use case of the table, a
> mapping of quirks to specific devices.  It seems this once again
> introduces the maintenance issue.  Why would it not be sufficient to
> determine the quirks based on the generation alone?  Thanks,
> 
> Alex

The generation register also exists on discrete GPUs. In the new xe
driver [1], the Battlemage discrete GPU shares the same logic reading
GMD_ID_DISPLAY register. The driver itself uses is_dgfx bit mapped to
device id. In QEMU, we need to know whether the device is a supported
IGD device first before applying the IGD-specific quirk, especially
for legacy mode.

The most feasible way is to check if kernel exposes VFIO_REGION_SUBTYPE_
INTEL_IGD_OPREGION on that device I think, as only IGD has OpRegion.

i915 driver [2] and Arrow Lake datasheet [3] shows that Intel has
removed the BDSM register by making the DSM range part of BAR2 since
Meteor Lake and onwards. QEMU only need to quirk on the register for
IGD devices until Raptor Lake, meaning that the device list is fixed
for now.

By the way, for legacy mode, I think we should only support it until
Gen 9, as Intel only provide VBIOS or CSM support until that generation,
and seabios cannot handle 64 bit BDSM register. I'm also wondering if
VGA really works on newer generations.

Maybe we can continue with current igd_gen, but implement a logic like:
    if (!intel graphics)
        return;
    if (!has VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION)
        return;
    setup_opregion();  // make x-igd-opregion automatically enabled?
    if (gen <= 9)
        setup_legacy_mode();
    if (gen >= 6 && gen <=9)
        setup_32bit_bdsm():
    else if (gen >= 9 && gen <= 12)
        setup_64bit_bdsm();
    // ...
    // optional quirks like lpc bridge id

A table can also be used to precisely track all the gen 6-12 devices.

[1] https://github.com/torvalds/linux/blob/master/drivers/gpu/drm/xe/xe_pci.c#L630
[2] https://github.com/torvalds/linux/blob/69b8923f5003664e3ffef102e73333edfa2abdcf/drivers/gpu/drm/i915/gem/i915_gem_stolen.c#L918
[3] https://edc.intel.com/content/www/us/en/design/publications/core-ultra-p200s-series-processors-cfg-mem-registers/d2-f0-processor-graphics-registers/

Thanks,
Moeko

Attached a config space dump of Intel A770 discrete GPU for reference

03:00.0 VGA compatible controller: Intel Corporation DG2 [Arc A770] (rev 08) (prog-if 00 [VGA controller])
	Subsystem: Intel Corporation Device 1020
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin ? routed to IRQ 181
	IOMMU group: 19
	Region 0: Memory at 81000000 (64-bit, non-prefetchable) [size=16M]
	Region 2: Memory at 6000000000 (64-bit, prefetchable) [size=16G]
	Expansion ROM at 82000000 [disabled] [size=2M]
	Capabilities: [40] Vendor Specific Information: Len=0c <?>
	Capabilities: [70] Express (v2) Endpoint, IntMsgNum 0
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W TEE-IO-
		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
			MaxPayload 128 bytes, MaxReadReq 128 bytes
		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
		LnkCtl:	ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x1
			TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
			 10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ EETLPPrefix-
			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
			 FRS- TPHComp- ExtTPHComp-
			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
			 AtomicOpsCtl: ReqEn-
			 IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
			 10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
		LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS-
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
			 Retimer- 2Retimers- CrosslinkRes: unsupported
	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit+
		Address: 00000000fee008b8  Data: 0000
		Masking: 00000000  Pending: 00000000
	Capabilities: [d0] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
		ARICap:	MFVC- ACS-, Next Function: 0
		ARICtl:	MFVC- ACS-, Function Group: 0
	Capabilities: [420 v1] Physical Resizable BAR
		BAR 2: current size: 16GB, supported: 256MB 512MB 1GB 2GB 4GB 8GB 16GB
	Capabilities: [400 v1] Latency Tolerance Reporting
		Max snoop latency: 15728640ns
		Max no snoop latency: 15728640ns
	Kernel driver in use: i915
	Kernel modules: i915, xe
00: 86 80 a0 56 07 00 10 00 08 00 00 03 10 00 00 00
10: 04 00 00 81 00 00 00 00 0c 00 00 00 60 00 00 00
20: 00 00 00 00 00 00 00 00 00 00 00 00 86 80 20 10
30: 00 00 00 82 40 00 00 00 00 00 00 00 ff 00 00 00
40: 09 70 0c 01 03 00 00 00 00 00 00 00 00 00 00 00
50: c0 f5 00 00 00 00 00 00 00 00 00 00 00 00 00 00
60: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
70: 10 ac 02 00 20 80 00 10 10 09 08 00 11 0c 40 00
80: 00 00 11 00 00 00 00 00 00 00 00 00 00 00 00 00
90: 00 00 00 00 12 08 13 00 00 04 00 00 02 00 00 00
a0: 00 00 00 00 00 00 00 00 00 00 00 00 05 d0 81 01
b0: b8 08 e0 fe 00 00 00 00 00 00 00 00 00 00 00 00
c0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
d0: 01 00 03 48 08 00 00 00 00 00 00 00 00 00 00 00
e0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
f0: 47 01 c0 ff 03 00 00 00 00 00 00 00 00 00 00 00



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-15 17:36         ` Tomita Moeko
@ 2025-04-15 19:04           ` Alex Williamson
  2025-04-16 15:45             ` Tomita Moeko
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Williamson @ 2025-04-15 19:04 UTC (permalink / raw)
  To: Tomita Moeko; +Cc: Cédric Le Goater, qemu-devel, Corvin Köhne

On Wed, 16 Apr 2025 01:36:15 +0800
Tomita Moeko <tomitamoeko@gmail.com> wrote:
> 
> The generation register also exists on discrete GPUs. In the new xe
> driver [1], the Battlemage discrete GPU shares the same logic reading
> GMD_ID_DISPLAY register. The driver itself uses is_dgfx bit mapped to
> device id. In QEMU, we need to know whether the device is a supported
> IGD device first before applying the IGD-specific quirk, especially
> for legacy mode.
> 
> The most feasible way is to check if kernel exposes VFIO_REGION_SUBTYPE_
> INTEL_IGD_OPREGION on that device I think, as only IGD has OpRegion.
> 
> i915 driver [2] and Arrow Lake datasheet [3] shows that Intel has
> removed the BDSM register by making the DSM range part of BAR2 since
> Meteor Lake and onwards. QEMU only need to quirk on the register for
> IGD devices until Raptor Lake, meaning that the device list is fixed
> for now.
> 
> By the way, for legacy mode, I think we should only support it until
> Gen 9, as Intel only provide VBIOS or CSM support until that generation,
> and seabios cannot handle 64 bit BDSM register. I'm also wondering if
> VGA really works on newer generations.

If it's a VGA class device, it really should, but without CSM I could
see why you have doubts.
 
> Maybe we can continue with current igd_gen, but implement a logic like:
>     if (!intel graphics)
>         return;
>     if (!has VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION)
>         return;
>     setup_opregion();  // make x-igd-opregion automatically enabled?
>     if (gen <= 9)
>         setup_legacy_mode();
>     if (gen >= 6 && gen <=9)
>         setup_32bit_bdsm():
>     else if (gen >= 9 && gen <= 12)
>         setup_64bit_bdsm();
>     // ...
>     // optional quirks like lpc bridge id
> 
> A table can also be used to precisely track all the gen 6-12 devices.

This seems reasonable to me.

> Attached a config space dump of Intel A770 discrete GPU for reference
> 
> 03:00.0 VGA compatible controller: Intel Corporation DG2 [Arc A770] (rev 08) (prog-if 00 [VGA controller])
> 	Subsystem: Intel Corporation Device 1020
> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> 	Latency: 0, Cache Line Size: 64 bytes
> 	Interrupt: pin ? routed to IRQ 181
> 	IOMMU group: 19
> 	Region 0: Memory at 81000000 (64-bit, non-prefetchable) [size=16M]
> 	Region 2: Memory at 6000000000 (64-bit, prefetchable) [size=16G]
> 	Expansion ROM at 82000000 [disabled] [size=2M]
> 	Capabilities: [40] Vendor Specific Information: Len=0c <?>
> 	Capabilities: [70] Express (v2) Endpoint, IntMsgNum 0
> 		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
> 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W TEE-IO-
> 		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
> 			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
> 			MaxPayload 128 bytes, MaxReadReq 128 bytes
> 		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
> 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us

Hmm, hardware bug?  Surely the A770 is not a Gen1, x1 device.
Something is going on with the interrupt pin above too.  At least it
claims FLReset+ above, does it work reliably though?  Seems like there
are various reports of Arc GPUs not working well with assignment due to
reset issues.  Thanks,

Alex

> 			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
> 		LnkCtl:	ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk-
> 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
> 		LnkSta:	Speed 2.5GT/s, Width x1
> 			TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
> 		DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
> 			 10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ EETLPPrefix-
> 			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
> 			 FRS- TPHComp- ExtTPHComp-
> 			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
> 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
> 			 AtomicOpsCtl: ReqEn-
> 			 IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
> 			 10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
> 		LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS-
> 		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
> 			 Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
> 		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
> 			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
> 			 Retimer- 2Retimers- CrosslinkRes: unsupported
> 	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit+
> 		Address: 00000000fee008b8  Data: 0000
> 		Masking: 00000000  Pending: 00000000
> 	Capabilities: [d0] Power Management version 3
> 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
> 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
> 	Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
> 		ARICap:	MFVC- ACS-, Next Function: 0
> 		ARICtl:	MFVC- ACS-, Function Group: 0
> 	Capabilities: [420 v1] Physical Resizable BAR
> 		BAR 2: current size: 16GB, supported: 256MB 512MB 1GB 2GB 4GB 8GB 16GB
> 	Capabilities: [400 v1] Latency Tolerance Reporting
> 		Max snoop latency: 15728640ns
> 		Max no snoop latency: 15728640ns
> 	Kernel driver in use: i915
> 	Kernel modules: i915, xe



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-15 19:04           ` Alex Williamson
@ 2025-04-16 15:45             ` Tomita Moeko
  2025-04-16 16:10               ` Alex Williamson
  0 siblings, 1 reply; 13+ messages in thread
From: Tomita Moeko @ 2025-04-16 15:45 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Cédric Le Goater, qemu-devel, Corvin Köhne

On 4/16/25 03:04, Alex Williamson wrote:
> On Wed, 16 Apr 2025 01:36:15 +0800
> Tomita Moeko <tomitamoeko@gmail.com> wrote:
>>
>> The generation register also exists on discrete GPUs. In the new xe
>> driver [1], the Battlemage discrete GPU shares the same logic reading
>> GMD_ID_DISPLAY register. The driver itself uses is_dgfx bit mapped to
>> device id. In QEMU, we need to know whether the device is a supported
>> IGD device first before applying the IGD-specific quirk, especially
>> for legacy mode.
>>
>> The most feasible way is to check if kernel exposes VFIO_REGION_SUBTYPE_
>> INTEL_IGD_OPREGION on that device I think, as only IGD has OpRegion.
>>
>> i915 driver [2] and Arrow Lake datasheet [3] shows that Intel has
>> removed the BDSM register by making the DSM range part of BAR2 since
>> Meteor Lake and onwards. QEMU only need to quirk on the register for
>> IGD devices until Raptor Lake, meaning that the device list is fixed
>> for now.
>>
>> By the way, for legacy mode, I think we should only support it until
>> Gen 9, as Intel only provide VBIOS or CSM support until that generation,
>> and seabios cannot handle 64 bit BDSM register. I'm also wondering if
>> VGA really works on newer generations.
> 
> If it's a VGA class device, it really should, but without CSM I could
> see why you have doubts.

Without CSM/VBIOS there is no pre-boot video, but when it booted to OS,
driver is used for video rather than VGA. Though it claims itself as
VGA class, it does not have VGA compatiblity. A770 even does not have
IO BAR, definitely it cannot handle VGA decoding.

Similar behavior is also found on AMD GPUs. They still claim as VGA
controller, but without CSM support.
https://www.amd.com/en/resources/support-articles/faqs/GPU-N4XCSM.html

>> Maybe we can continue with current igd_gen, but implement a logic like:
>>     if (!intel graphics)
>>         return;
>>     if (!has VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION)
>>         return;
>>     setup_opregion();  // make x-igd-opregion automatically enabled?
>>     if (gen <= 9)
>>         setup_legacy_mode();
>>     if (gen >= 6 && gen <=9)
>>         setup_32bit_bdsm():
>>     else if (gen >= 9 && gen <= 12)
>>         setup_64bit_bdsm();
>>     // ...
>>     // optional quirks like lpc bridge id
>>
>> A table can also be used to precisely track all the gen 6-12 devices.
> 
> This seems reasonable to me.
> 
>> Attached a config space dump of Intel A770 discrete GPU for reference
>>
>> 03:00.0 VGA compatible controller: Intel Corporation DG2 [Arc A770] (rev 08) (prog-if 00 [VGA controller])
>> 	Subsystem: Intel Corporation Device 1020
>> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>> 	Latency: 0, Cache Line Size: 64 bytes
>> 	Interrupt: pin ? routed to IRQ 181
>> 	IOMMU group: 19
>> 	Region 0: Memory at 81000000 (64-bit, non-prefetchable) [size=16M]
>> 	Region 2: Memory at 6000000000 (64-bit, prefetchable) [size=16G]
>> 	Expansion ROM at 82000000 [disabled] [size=2M]
>> 	Capabilities: [40] Vendor Specific Information: Len=0c <?>
>> 	Capabilities: [70] Express (v2) Endpoint, IntMsgNum 0
>> 		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
>> 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W TEE-IO-
>> 		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
>> 			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
>> 			MaxPayload 128 bytes, MaxReadReq 128 bytes
>> 		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
>> 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us
> 
> Hmm, hardware bug?  Surely the A770 is not a Gen1, x1 device.
> Something is going on with the interrupt pin above too.  At least it
> claims FLReset+ above, does it work reliably though?  Seems like there
> are various reports of Arc GPUs not working well with assignment due to
> reset issues.  Thanks,
> 
> Alex

Just did a quick search, link speed reporting is a known issue on Arc
A-series GPU.
https://www.intel.com/content/www/us/en/support/articles/000094587/graphics.html

The root port reports correct `LnkSta:	Speed 16GT/s, Width x16`

I also tried the passthough it to VM and not meeting any reset issue.
Probably it was fixed, intel often publishes VBIOS updates (they call
it IFWI) with windows driver releases. 

Thanks,
Moeko

> 
>> 			ClockPM- Surprise- LLActRep- BwNot- ASPMOptComp+
>> 		LnkCtl:	ASPM Disabled; RCB 64 bytes, LnkDisable- CommClk-
>> 			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
>> 		LnkSta:	Speed 2.5GT/s, Width x1
>> 			TrErr- Train- SlotClk- DLActive- BWMgmt- ABWMgmt-
>> 		DevCap2: Completion Timeout: Range B, TimeoutDis+ NROPrPrP- LTR+
>> 			 10BitTagComp+ 10BitTagReq+ OBFF Not Supported, ExtFmt+ EETLPPrefix-
>> 			 EmergencyPowerReduction Not Supported, EmergencyPowerReductionInit-
>> 			 FRS- TPHComp- ExtTPHComp-
>> 			 AtomicOpsCap: 32bit- 64bit- 128bitCAS-
>> 		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-
>> 			 AtomicOpsCtl: ReqEn-
>> 			 IDOReq- IDOCompl- LTR+ EmergencyPowerReductionReq-
>> 			 10BitTagReq- OBFF Disabled, EETLPPrefixBlk-
>> 		LnkCap2: Supported Link Speeds: 2.5GT/s, Crosslink- Retimer- 2Retimers- DRS-
>> 		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
>> 			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
>> 			 Compliance Preset/De-emphasis: -6dB de-emphasis, 0dB preshoot
>> 		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete- EqualizationPhase1-
>> 			 EqualizationPhase2- EqualizationPhase3- LinkEqualizationRequest-
>> 			 Retimer- 2Retimers- CrosslinkRes: unsupported
>> 	Capabilities: [ac] MSI: Enable+ Count=1/1 Maskable+ 64bit+
>> 		Address: 00000000fee008b8  Data: 0000
>> 		Masking: 00000000  Pending: 00000000
>> 	Capabilities: [d0] Power Management version 3
>> 		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0+,D1-,D2-,D3hot+,D3cold-)
>> 		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
>> 	Capabilities: [100 v1] Alternative Routing-ID Interpretation (ARI)
>> 		ARICap:	MFVC- ACS-, Next Function: 0
>> 		ARICtl:	MFVC- ACS-, Function Group: 0
>> 	Capabilities: [420 v1] Physical Resizable BAR
>> 		BAR 2: current size: 16GB, supported: 256MB 512MB 1GB 2GB 4GB 8GB 16GB
>> 	Capabilities: [400 v1] Latency Tolerance Reporting
>> 		Max snoop latency: 15728640ns
>> 		Max no snoop latency: 15728640ns
>> 	Kernel driver in use: i915
>> 	Kernel modules: i915, xe
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-16 15:45             ` Tomita Moeko
@ 2025-04-16 16:10               ` Alex Williamson
  2025-04-16 17:41                 ` Tomita Moeko
  0 siblings, 1 reply; 13+ messages in thread
From: Alex Williamson @ 2025-04-16 16:10 UTC (permalink / raw)
  To: Tomita Moeko; +Cc: Cédric Le Goater, qemu-devel, Corvin Köhne

On Wed, 16 Apr 2025 23:45:08 +0800
Tomita Moeko <tomitamoeko@gmail.com> wrote:

> On 4/16/25 03:04, Alex Williamson wrote:
> > On Wed, 16 Apr 2025 01:36:15 +0800
> > Tomita Moeko <tomitamoeko@gmail.com> wrote:  
> >>
> >> The generation register also exists on discrete GPUs. In the new xe
> >> driver [1], the Battlemage discrete GPU shares the same logic reading
> >> GMD_ID_DISPLAY register. The driver itself uses is_dgfx bit mapped to
> >> device id. In QEMU, we need to know whether the device is a supported
> >> IGD device first before applying the IGD-specific quirk, especially
> >> for legacy mode.
> >>
> >> The most feasible way is to check if kernel exposes VFIO_REGION_SUBTYPE_
> >> INTEL_IGD_OPREGION on that device I think, as only IGD has OpRegion.
> >>
> >> i915 driver [2] and Arrow Lake datasheet [3] shows that Intel has
> >> removed the BDSM register by making the DSM range part of BAR2 since
> >> Meteor Lake and onwards. QEMU only need to quirk on the register for
> >> IGD devices until Raptor Lake, meaning that the device list is fixed
> >> for now.
> >>
> >> By the way, for legacy mode, I think we should only support it until
> >> Gen 9, as Intel only provide VBIOS or CSM support until that generation,
> >> and seabios cannot handle 64 bit BDSM register. I'm also wondering if
> >> VGA really works on newer generations.  
> > 
> > If it's a VGA class device, it really should, but without CSM I could
> > see why you have doubts.  
> 
> Without CSM/VBIOS there is no pre-boot video, but when it booted to OS,
> driver is used for video rather than VGA. Though it claims itself as
> VGA class, it does not have VGA compatiblity. A770 even does not have
> IO BAR, definitely it cannot handle VGA decoding.

VGA ranges are implicit in a VGA class device, they're not backed by
BARs.  Lack of CSM support makes it more difficult to prove whether VGA
support is present since we can't easily initialize the device for a
legacy OS, but I don't think lack of CSM necessary proves the hardware
doesn't support VGA.  If we really cared, we could probably do some low
level experiments writing into the VGA frame buffer range to test if
it's present and behaves as expected relative to memory and IO enable
bits.

> 
> Similar behavior is also found on AMD GPUs. They still claim as VGA
> controller, but without CSM support.
> https://www.amd.com/en/resources/support-articles/faqs/GPU-N4XCSM.html
> 
> >> Maybe we can continue with current igd_gen, but implement a logic like:
> >>     if (!intel graphics)
> >>         return;
> >>     if (!has VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION)
> >>         return;
> >>     setup_opregion();  // make x-igd-opregion automatically enabled?
> >>     if (gen <= 9)
> >>         setup_legacy_mode();
> >>     if (gen >= 6 && gen <=9)
> >>         setup_32bit_bdsm():
> >>     else if (gen >= 9 && gen <= 12)
> >>         setup_64bit_bdsm();
> >>     // ...
> >>     // optional quirks like lpc bridge id
> >>
> >> A table can also be used to precisely track all the gen 6-12 devices.  
> > 
> > This seems reasonable to me.
> >   
> >> Attached a config space dump of Intel A770 discrete GPU for reference
> >>
> >> 03:00.0 VGA compatible controller: Intel Corporation DG2 [Arc A770] (rev 08) (prog-if 00 [VGA controller])
> >> 	Subsystem: Intel Corporation Device 1020
> >> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
> >> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
> >> 	Latency: 0, Cache Line Size: 64 bytes
> >> 	Interrupt: pin ? routed to IRQ 181
> >> 	IOMMU group: 19
> >> 	Region 0: Memory at 81000000 (64-bit, non-prefetchable) [size=16M]
> >> 	Region 2: Memory at 6000000000 (64-bit, prefetchable) [size=16G]
> >> 	Expansion ROM at 82000000 [disabled] [size=2M]
> >> 	Capabilities: [40] Vendor Specific Information: Len=0c <?>
> >> 	Capabilities: [70] Express (v2) Endpoint, IntMsgNum 0
> >> 		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
> >> 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W TEE-IO-
> >> 		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
> >> 			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
> >> 			MaxPayload 128 bytes, MaxReadReq 128 bytes
> >> 		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
> >> 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us  
> > 
> > Hmm, hardware bug?  Surely the A770 is not a Gen1, x1 device.
> > Something is going on with the interrupt pin above too.  At least it
> > claims FLReset+ above, does it work reliably though?  Seems like there
> > are various reports of Arc GPUs not working well with assignment due to
> > reset issues.  Thanks,
> > 
> > Alex  
> 
> Just did a quick search, link speed reporting is a known issue on Arc
> A-series GPU.
> https://www.intel.com/content/www/us/en/support/articles/000094587/graphics.html
> 
> The root port reports correct `LnkSta:	Speed 16GT/s, Width x16`

Interesting, suppose I just hadn't noticed before.
 
> I also tried the passthough it to VM and not meeting any reset issue.
> Probably it was fixed, intel often publishes VBIOS updates (they call
> it IFWI) with windows driver releases. 

Ok, I have an A380 that doesn't behave well with resets, I suppose
I should boot it on a bare metal Windows system to try to get any
available firmware updates.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-16 16:10               ` Alex Williamson
@ 2025-04-16 17:41                 ` Tomita Moeko
  2025-04-16 17:57                   ` Alex Williamson
  0 siblings, 1 reply; 13+ messages in thread
From: Tomita Moeko @ 2025-04-16 17:41 UTC (permalink / raw)
  To: Alex Williamson; +Cc: Cédric Le Goater, qemu-devel, Corvin Köhne



On 4/17/25 00:10, Alex Williamson wrote:
> On Wed, 16 Apr 2025 23:45:08 +0800
> Tomita Moeko <tomitamoeko@gmail.com> wrote:
> 
>> On 4/16/25 03:04, Alex Williamson wrote:
>>> On Wed, 16 Apr 2025 01:36:15 +0800
>>> Tomita Moeko <tomitamoeko@gmail.com> wrote:  
>>>>
>>>> The generation register also exists on discrete GPUs. In the new xe
>>>> driver [1], the Battlemage discrete GPU shares the same logic reading
>>>> GMD_ID_DISPLAY register. The driver itself uses is_dgfx bit mapped to
>>>> device id. In QEMU, we need to know whether the device is a supported
>>>> IGD device first before applying the IGD-specific quirk, especially
>>>> for legacy mode.
>>>>
>>>> The most feasible way is to check if kernel exposes VFIO_REGION_SUBTYPE_
>>>> INTEL_IGD_OPREGION on that device I think, as only IGD has OpRegion.
>>>>
>>>> i915 driver [2] and Arrow Lake datasheet [3] shows that Intel has
>>>> removed the BDSM register by making the DSM range part of BAR2 since
>>>> Meteor Lake and onwards. QEMU only need to quirk on the register for
>>>> IGD devices until Raptor Lake, meaning that the device list is fixed
>>>> for now.
>>>>
>>>> By the way, for legacy mode, I think we should only support it until
>>>> Gen 9, as Intel only provide VBIOS or CSM support until that generation,
>>>> and seabios cannot handle 64 bit BDSM register. I'm also wondering if
>>>> VGA really works on newer generations.  
>>>
>>> If it's a VGA class device, it really should, but without CSM I could
>>> see why you have doubts.  
>>
>> Without CSM/VBIOS there is no pre-boot video, but when it booted to OS,
>> driver is used for video rather than VGA. Though it claims itself as
>> VGA class, it does not have VGA compatiblity. A770 even does not have
>> IO BAR, definitely it cannot handle VGA decoding.
> 
> VGA ranges are implicit in a VGA class device, they're not backed by
> BARs.  Lack of CSM support makes it more difficult to prove whether VGA
> support is present since we can't easily initialize the device for a
> legacy OS, but I don't think lack of CSM necessary proves the hardware
> doesn't support VGA.  If we really cared, we could probably do some low
> level experiments writing into the VGA frame buffer range to test if
> it's present and behaves as expected relative to memory and IO enable
> bits.

Sorry for my misunderstanding. The bridge control register in PCI bridge
config space determines forwarding VGA IO/MMIO accesses to which device,
BAR is not related in this process.

As device initialization (by legacy VBIOS) is required for booting
legacy OS with seabios, limiting legacy mode to gen9 and older sounds
resonable.

Trying VGA framebuffer would be difficult we have to disable EFI GOP and
manually instruct the device to enter VGA mode without VBIOS routines.
It's not a major problem here, let's skip it for now?

Thanks,
Moeko

>>
>> Similar behavior is also found on AMD GPUs. They still claim as VGA
>> controller, but without CSM support.
>> https://www.amd.com/en/resources/support-articles/faqs/GPU-N4XCSM.html
>>
>>>> Maybe we can continue with current igd_gen, but implement a logic like:
>>>>     if (!intel graphics)
>>>>         return;
>>>>     if (!has VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION)
>>>>         return;
>>>>     setup_opregion();  // make x-igd-opregion automatically enabled?
>>>>     if (gen <= 9)
>>>>         setup_legacy_mode();
>>>>     if (gen >= 6 && gen <=9)
>>>>         setup_32bit_bdsm():
>>>>     else if (gen >= 9 && gen <= 12)
>>>>         setup_64bit_bdsm();
>>>>     // ...
>>>>     // optional quirks like lpc bridge id
>>>>
>>>> A table can also be used to precisely track all the gen 6-12 devices.  
>>>
>>> This seems reasonable to me.
>>>   
>>>> Attached a config space dump of Intel A770 discrete GPU for reference
>>>>
>>>> 03:00.0 VGA compatible controller: Intel Corporation DG2 [Arc A770] (rev 08) (prog-if 00 [VGA controller])
>>>> 	Subsystem: Intel Corporation Device 1020
>>>> 	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
>>>> 	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
>>>> 	Latency: 0, Cache Line Size: 64 bytes
>>>> 	Interrupt: pin ? routed to IRQ 181
>>>> 	IOMMU group: 19
>>>> 	Region 0: Memory at 81000000 (64-bit, non-prefetchable) [size=16M]
>>>> 	Region 2: Memory at 6000000000 (64-bit, prefetchable) [size=16G]
>>>> 	Expansion ROM at 82000000 [disabled] [size=2M]
>>>> 	Capabilities: [40] Vendor Specific Information: Len=0c <?>
>>>> 	Capabilities: [70] Express (v2) Endpoint, IntMsgNum 0
>>>> 		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <64ns, L1 <1us
>>>> 			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset+ SlotPowerLimit 0W TEE-IO-
>>>> 		DevCtl:	CorrErr- NonFatalErr- FatalErr- UnsupReq-
>>>> 			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop+ FLReset-
>>>> 			MaxPayload 128 bytes, MaxReadReq 128 bytes
>>>> 		DevSta:	CorrErr- NonFatalErr- FatalErr- UnsupReq+ AuxPwr- TransPend-
>>>> 		LnkCap:	Port #0, Speed 2.5GT/s, Width x1, ASPM L0s L1, Exit Latency L0s <64ns, L1 <1us  
>>>
>>> Hmm, hardware bug?  Surely the A770 is not a Gen1, x1 device.
>>> Something is going on with the interrupt pin above too.  At least it
>>> claims FLReset+ above, does it work reliably though?  Seems like there
>>> are various reports of Arc GPUs not working well with assignment due to
>>> reset issues.  Thanks,
>>>
>>> Alex  
>>
>> Just did a quick search, link speed reporting is a known issue on Arc
>> A-series GPU.
>> https://www.intel.com/content/www/us/en/support/articles/000094587/graphics.html
>>
>> The root port reports correct `LnkSta:	Speed 16GT/s, Width x16`
> 
> Interesting, suppose I just hadn't noticed before.
>  
>> I also tried passing it through to a VM and did not hit any reset
>> issue. Probably it was fixed; Intel often publishes VBIOS updates
>> (they call it IFWI) with Windows driver releases.
> 
> Ok, I have an A380 that doesn't behave well with resets, I suppose
> I should boot it on a bare metal Windows system to try to get any
> available firmware updates.  Thanks,
> 
> Alex
> 


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] vfio/igd: Check host PCI address when probing
  2025-04-16 17:41                 ` Tomita Moeko
@ 2025-04-16 17:57                   ` Alex Williamson
  0 siblings, 0 replies; 13+ messages in thread
From: Alex Williamson @ 2025-04-16 17:57 UTC (permalink / raw)
  To: Tomita Moeko; +Cc: Cédric Le Goater, qemu-devel, Corvin Köhne

On Thu, 17 Apr 2025 01:41:22 +0800
Tomita Moeko <tomitamoeko@gmail.com> wrote:

> On 4/17/25 00:10, Alex Williamson wrote:
> > On Wed, 16 Apr 2025 23:45:08 +0800
> > Tomita Moeko <tomitamoeko@gmail.com> wrote:
> >   
> >> On 4/16/25 03:04, Alex Williamson wrote:  
> >>> On Wed, 16 Apr 2025 01:36:15 +0800
> >>> Tomita Moeko <tomitamoeko@gmail.com> wrote:    
> >>>>
> >>>> The generation register also exists on discrete GPUs. In the new xe
> >>>> driver [1], the Battlemage discrete GPU shares the same logic reading
> >>>> GMD_ID_DISPLAY register. The driver itself uses is_dgfx bit mapped to
> >>>> device id. In QEMU, we need to know whether the device is a supported
> >>>> IGD device first before applying the IGD-specific quirk, especially
> >>>> for legacy mode.
> >>>>
> >>>> The most feasible way, I think, is to check whether the kernel exposes
> >>>> VFIO_REGION_SUBTYPE_INTEL_IGD_OPREGION on that device, as only IGD has
> >>>> an OpRegion.
> >>>>
> >>>> The i915 driver [2] and the Arrow Lake datasheet [3] show that Intel
> >>>> has removed the BDSM register by making the DSM range part of BAR2 from
> >>>> Meteor Lake onwards. QEMU only needs to quirk the register for IGD
> >>>> devices up to Raptor Lake, meaning that the device list is fixed for
> >>>> now.
> >>>>
> >>>> By the way, for legacy mode, I think we should only support it up to
> >>>> Gen 9, as Intel only provides VBIOS/CSM support up to that generation,
> >>>> and SeaBIOS cannot handle the 64-bit BDSM register. I'm also wondering
> >>>> whether VGA really works on newer generations.    
> >>>
> >>> If it's a VGA class device, it really should, but without CSM I could
> >>> see why you have doubts.    
> >>
> >> Without CSM/VBIOS there is no pre-boot video, but once the OS has
> >> booted, the driver is used for video rather than VGA. Though it claims
> >> to be a VGA class device, it does not have VGA compatibility. The A770
> >> does not even have an IO BAR, so it definitely cannot handle VGA
> >> decoding.  
> > 
> > VGA ranges are implicit in a VGA class device, they're not backed by
> > BARs.  Lack of CSM support makes it more difficult to prove whether VGA
> > support is present since we can't easily initialize the device for a
> > legacy OS, but I don't think lack of CSM necessarily proves the hardware
> > doesn't support VGA.  If we really cared, we could probably do some low
> > level experiments writing into the VGA frame buffer range to test if
> > it's present and behaves as expected relative to memory and IO enable
> > bits.  
> 
> Sorry for my misunderstanding. The bridge control register in the PCI
> bridge's config space determines which device VGA IO/MMIO accesses are
> forwarded to; BARs are not involved in this process.
> 
> As device initialization (by a legacy VBIOS) is required for booting a
> legacy OS with SeaBIOS, limiting legacy mode to Gen 9 and older sounds
> reasonable.
> 
> Testing the VGA framebuffer would be difficult: we would have to disable
> the EFI GOP and manually instruct the device to enter VGA mode without
> VBIOS routines. It's not a major problem here, so let's skip it for now?

This is only a curiosity as far as I'm concerned, I agree with your
proposal that we can drop legacy mode where the bare metal system
doesn't even support CSM.  Thanks,

Alex



^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2025-04-16 17:57 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2025-03-25 17:22 [PATCH] vfio/igd: Check host PCI address when probing Tomita Moeko
2025-04-09 17:18 ` Alex Williamson
2025-04-10  7:34   ` Cédric Le Goater
2025-04-13 17:23     ` Tomita Moeko
2025-04-14 22:05       ` Alex Williamson
2025-04-15 17:36         ` Tomita Moeko
2025-04-15 19:04           ` Alex Williamson
2025-04-16 15:45             ` Tomita Moeko
2025-04-16 16:10               ` Alex Williamson
2025-04-16 17:41                 ` Tomita Moeko
2025-04-16 17:57                   ` Alex Williamson
2025-04-13 11:30   ` Tomita Moeko
2025-04-14 21:51     ` Alex Williamson
