All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Mario Limonciello <superm1@kernel.org>
Cc: mario.limonciello@amd.com, bhelgaas@google.com,
	Thomas Zimmermann <tzimmermann@suse.de>,
	linux-pci@vger.kernel.org
Subject: Re: [PATCH] PCI/VGA: Look at all PCI display devices in VGA arbiter
Date: Fri, 13 Jun 2025 14:07:18 -0500	[thread overview]
Message-ID: <20250613190718.GA968774@bhelgaas> (raw)
In-Reply-To: <20250613031215.615885-1-superm1@kernel.org>

On Thu, Jun 12, 2025 at 10:12:14PM -0500, Mario Limonciello wrote:
> From: Mario Limonciello <mario.limonciello@amd.com>
> 
> On an A+N mobile system the APU is not being selected by some desktop
> environments for any rendering tasks. This is because the neither GPU
> is being treated as "boot_vga" but that is what some environments use
> to select a GPU [1].

What is "A+N" and "APU"?

I didn't quite follow the second sentence.  I guess you're saying some
userspace environments use the "boot_vga" sysfs file to select a GPU?
And on this A+N system, neither device has the file?

> The VGA arbiter driver only looks at devices that report as PCI display
> VGA class devices. Neither GPU on the system is a display VGA class
> device:
> 
> c5:00.0 3D controller: NVIDIA Corporation Device 2db9 (rev a1)
> c6:00.0 Display controller: Advanced Micro Devices, Inc. [AMD/ATI] Device 150e (rev d1)
> 
> So neither device gets the boot_vga sysfs file. The vga_is_boot_device()
> function already has some handling for integrated GPUs by looking at the
> ACPI HID entries, so if all PCI display class devices are looked at it
> actually can detect properly with this device ordering.  However if there
> is a different ordering it could flag the wrong device.
> 
> Modify the VGA arbiter code and matching sysfs file entries to examine all
> PCI display class devices. After every device is added to the arbiter list
> make a pass on all devices and explicitly check whether one is integrated.
> 
> This will cause all GPUs to gain a `boot_vga` file, but the correct device
> (APU in this case) will now show `1` and the incorrect device shows `0`.
> Userspace then picks the right device as well.

What determines whether the APU is the "correct" device?

> Link: https://github.com/robherring/libpciaccess/commit/b2838fb61c3542f107014b285cbda097acae1e12 [1]
> Acked-by: Thomas Zimmermann <tzimmermann@suse.de>
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> RFC->v1:
>  * Add tag
>  * Drop unintended whitespace change
>  * Use vgaarb_dbg instead of vgaarb_info
> ---
>  drivers/pci/pci-sysfs.c |  2 +-
>  drivers/pci/vgaarb.c    | 48 +++++++++++++++++++++++++++--------------
>  include/linux/pci.h     | 15 +++++++++++++
>  3 files changed, 48 insertions(+), 17 deletions(-)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 268c69daa4d57..c314ee1b3f9ac 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -1707,7 +1707,7 @@ static umode_t pci_dev_attrs_are_visible(struct kobject *kobj,
>  	struct device *dev = kobj_to_dev(kobj);
>  	struct pci_dev *pdev = to_pci_dev(dev);
>  
> -	if (a == &dev_attr_boot_vga.attr && pci_is_vga(pdev))
> +	if (a == &dev_attr_boot_vga.attr && pci_is_display(pdev))
>  		return a->mode;
>  
>  	return 0;
> diff --git a/drivers/pci/vgaarb.c b/drivers/pci/vgaarb.c
> index 78748e8d2dbae..0eb1274d52169 100644
> --- a/drivers/pci/vgaarb.c
> +++ b/drivers/pci/vgaarb.c
> @@ -140,6 +140,7 @@ void vga_set_default_device(struct pci_dev *pdev)
>  	if (vga_default == pdev)
>  		return;
>  
> +	vgaarb_dbg(&pdev->dev, "selecting as VGA boot device\n");

I guess this is essentially a move of the vgaarb_info() message from
vga_arbiter_add_pci_device() to here?  If so, I think I would preserve
the _info() level.  Including non-VGA devices is fairly subtle and I
don't think we need to discard potentially useful information about
what we're doing.

>  	pci_dev_put(vga_default);
>  	vga_default = pci_dev_get(pdev);
>  }
> @@ -751,6 +752,31 @@ static void vga_arbiter_check_bridge_sharing(struct vga_device *vgadev)
>  		vgaarb_info(&vgadev->pdev->dev, "no bridge control possible\n");
>  }
>  
> +static
> +void vga_arbiter_select_default_device(void)

Signature on one line.

> +{
> +	struct pci_dev *candidate = vga_default_device();
> +	struct vga_device *vgadev;
> +
> +	list_for_each_entry(vgadev, &vga_list, list) {
> +		if (vga_is_boot_device(vgadev)) {
> +			/* check if one is an integrated GPU, use that if so */
> +			if (candidate) {
> +				if (vga_arb_integrated_gpu(&candidate->dev))
> +					break;
> +				if (vga_arb_integrated_gpu(&vgadev->pdev->dev)) {
> +					candidate = vgadev->pdev;
> +					break;
> +				}
> +			} else
> +				candidate = vgadev->pdev;
> +		}
> +	}
> +
> +	if (candidate)
> +		vga_set_default_device(candidate);
> +}

It looks like this is related to the integrated GPU code in
vga_is_boot_device().  Can this be done by updating the logic there,
so it's more clear what change this patch makes?

It seems like this patch would change the selection in a way that
makes some of the vga_is_boot_device() comments obsolete.  E.g., "We
select the default VGA device in this order"?  Or "we use the *last*
[integrated GPU] because that was the previous behavior"?

The end of vga_is_boot_device() mentions non-legacy (non-VGA) devices,
but I don't remember now how we ever got there because
vga_arb_device_init() and pci_notify() only call
vga_arbiter_add_pci_device() for VGA devices.

>  /*
>   * Currently, we assume that the "initial" setup of the system is not sane,
>   * that is, we come up with conflicting devices and let the arbiter's
> @@ -816,23 +842,17 @@ static bool vga_arbiter_add_pci_device(struct pci_dev *pdev)
>  		bus = bus->parent;
>  	}
>  
> -	if (vga_is_boot_device(vgadev)) {
> -		vgaarb_info(&pdev->dev, "setting as boot VGA device%s\n",
> -			    vga_default_device() ?
> -			    " (overriding previous)" : "");
> -		vga_set_default_device(pdev);
> -	}
> -
>  	vga_arbiter_check_bridge_sharing(vgadev);
>  
>  	/* Add to the list */
>  	list_add_tail(&vgadev->list, &vga_list);
>  	vga_count++;
> -	vgaarb_info(&pdev->dev, "VGA device added: decodes=%s,owns=%s,locks=%s\n",
> +	vgaarb_dbg(&pdev->dev, "VGA device added: decodes=%s,owns=%s,locks=%s\n",

Looks like an unrelated change.

>  		vga_iostate_to_str(vgadev->decodes),
>  		vga_iostate_to_str(vgadev->owns),
>  		vga_iostate_to_str(vgadev->locks));
>  
> +	vga_arbiter_select_default_device();
>  	spin_unlock_irqrestore(&vga_lock, flags);
>  	return true;
>  fail:
> @@ -1499,8 +1519,8 @@ static int pci_notify(struct notifier_block *nb, unsigned long action,
>  
>  	vgaarb_dbg(dev, "%s\n", __func__);
>  
> -	/* Only deal with VGA class devices */
> -	if (!pci_is_vga(pdev))
> +	/* Only deal with display devices */
> +	if (!pci_is_display(pdev))
>  		return 0;

Are there any other pci_is_vga() users that might want
pci_is_display() instead?  virtio_gpu_pci_quirk()?

>  	/*
> @@ -1548,12 +1568,8 @@ static int __init vga_arb_device_init(void)
>  
>  	/* Add all VGA class PCI devices by default */
>  	pdev = NULL;
> -	while ((pdev =
> -		pci_get_subsys(PCI_ANY_ID, PCI_ANY_ID, PCI_ANY_ID,
> -			       PCI_ANY_ID, pdev)) != NULL) {
> -		if (pci_is_vga(pdev))
> -			vga_arbiter_add_pci_device(pdev);
> -	}
> +	while ((pdev = pci_get_base_class(PCI_BASE_CLASS_DISPLAY, pdev)))
> +		vga_arbiter_add_pci_device(pdev);

I guess pci_get_base_class(PCI_BASE_CLASS_DISPLAY) is sort of a source
code optimization and this one-line change would be equivalent?

  -		if (pci_is_vga(pdev))
  +		if (pci_is_display(pdev))
  			vga_arbiter_add_pci_device(pdev);

If so, I think I prefer the one-liner because then everything in this
file would use pci_is_display() and we wouldn't have to figure out the
equivalent-ness of pci_get_base_class(PCI_BASE_CLASS_DISPLAY).

>  	pr_info("loaded\n");
>  	return rc;
> diff --git a/include/linux/pci.h b/include/linux/pci.h
> index 05e68f35f3923..e77754e43c629 100644
> --- a/include/linux/pci.h
> +++ b/include/linux/pci.h
> @@ -744,6 +744,21 @@ static inline bool pci_is_vga(struct pci_dev *pdev)
>  	return false;
>  }
>  
> +/**
> + * pci_is_display - Check if a PCI device is a display controller
> + * @pdev: Pointer to the PCI device structure
> + *
> + * This function determines whether the given PCI device corresponds
> + * to a display controller. Display controllers are typically used
> + * for graphical output and are identified based on their class code.
> + *
> + * Return: true if the PCI device is a display controller, false otherwise.
> + */
> +static inline bool pci_is_display(struct pci_dev *pdev)
> +{
> +	return (pdev->class >> 16) == PCI_BASE_CLASS_DISPLAY;
> +}

Could use in vga_switcheroo_client_probe_defer(), IS_GFX_DEVICE(),
vfio_pci_is_intel_display(), i915_gfx_present(), get_bound_vga().
Arguable whether it's worth changing them, but I guess it's nice to
test for the same thing the same way.

Bjorn

  reply	other threads:[~2025-06-13 19:07 UTC|newest]

Thread overview: 20+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-13  3:12 [PATCH] PCI/VGA: Look at all PCI display devices in VGA arbiter Mario Limonciello
2025-06-13 19:07 ` Bjorn Helgaas [this message]
2025-06-13 19:31   ` Mario Limonciello
2025-06-13 20:31     ` Bjorn Helgaas
2025-06-13 20:56       ` Mario Limonciello
2025-06-13 21:47         ` Daniel Dadap
2025-06-14  7:04           ` Lukas Wunner
2025-06-16 15:05             ` Daniel Dadap
2025-06-16 19:14               ` Lukas Wunner
2025-06-17 15:53                 ` Daniel Dadap
2025-06-17 16:06                   ` Mario Limonciello
2025-06-17 16:36                     ` Daniel Dadap
2025-06-17 17:01                       ` Mario Limonciello
2025-06-13 21:19     ` Alex Williamson
2025-06-13 21:29       ` Mario Limonciello
2025-06-16  6:42         ` Thomas Zimmermann
2025-06-16  6:50           ` David Airlie
2025-06-16  7:21             ` Thomas Zimmermann
2025-06-16  7:24               ` David Airlie
2025-06-16  9:11       ` Gerd Hoffmann

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250613190718.GA968774@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=bhelgaas@google.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=mario.limonciello@amd.com \
    --cc=superm1@kernel.org \
    --cc=tzimmermann@suse.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.