Intel-XE Archive on lore.kernel.org
 help / color / mirror / Atom feed
From: Jani Nikula <jani.nikula@intel.com>
To: Vivek Kasireddy <vivek.kasireddy@intel.com>,
	intel-xe@lists.freedesktop.org
Cc: Vivek Kasireddy <vivek.kasireddy@intel.com>,
	Matt Roper <matthew.d.roper@intel.com>,
	Lucas De Marchi <lucas.demarchi@intel.com>
Subject: Re: [PATCH] Revert "drm/xe/display: use xe->display to decide whether to do anything"
Date: Thu, 05 Jun 2025 10:55:59 +0300	[thread overview]
Message-ID: <fc535785263b34ee08802dba3a451fb99469ca58@intel.com> (raw)
In-Reply-To: <20250605054247.386633-1-vivek.kasireddy@intel.com>

On Wed, 04 Jun 2025, Vivek Kasireddy <vivek.kasireddy@intel.com> wrote:
> This reverts commit 5a9f299f956ef9764f56044cfca7aafa23cea1d1.
>
> The following crash/regression was seen with the reverted commit
> on a specific BMG SKU with no display capabilities:
>
> [  115.582833] BUG: kernel NULL pointer dereference, address: 00000000000005d0
> [  115.589775] #PF: supervisor write access in kernel mode
> [  115.594976] #PF: error_code(0x0002) - not-present page
> [  115.600088] PGD 0 P4D 0
> [  115.602617] Oops: Oops: 0002 [#1] SMP
> [  115.606267] CPU: 14 UID: 0 PID: 1547 Comm: kworker/14:3 Tainted: G 	U      E       6.15.0-local+ #62 PREEMPT(voluntary)
> [  115.617332] Tainted: [U]=USER, [E]=UNSIGNED_MODULE
> [  115.622100] Hardware name: Intel Corporation Meteor Lake Client Platform/MTL-P DDR5 SODIMM SBS RVP, BIOS MTLPEMI1.R00.3471.D49.2401260852 01/26/2024
> [  115.635314] Workqueue: pm pm_runtime_work
> [  115.639309] RIP: 0010:_raw_spin_lock+0x17/0x30
> [  115.662382] RSP: 0018:ffffd13f82e7bc30 EFLAGS: 00010246
> [  115.667581] RAX: 0000000000000000 RBX: ffff8be919076000 RCX: 0000000000000002
> [  115.674675] RDX: 0000000000000001 RSI: 000000000000004b RDI: 00000000000005d0
> [  115.681775] RBP: ffffd13f82e7bc60 R08: ffffd13f82e7bb00 R09: ffff8beb0c1b06c0
> [  115.688869] R10: ffff8be7c034f4c0 R11: fefefefefefefeff R12: fffffffffffffff0
> [  115.695965] R13: ffff8be9190762e8 R14: ffff8be919077798 R15: 00000000000005d0
> [  115.703062] FS:  0000000000000000(0000) GS:ffff8beb552b6000(0000) knlGS:0000000000000000
> [  115.711106] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [  115.716826] CR2: 00000000000005d0 CR3: 000000024c68d002 CR4: 0000000000f72ef0
> [  115.723921] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [  115.731015] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
> [  115.738113] PKRU: 55555554
> [  115.740816] Call Trace:
> [  115.743258]  <TASK>
> [  115.745363]  ? xe_display_flush_cleanup_work+0x92/0x120 [xe]
> [  115.751102]  xe_display_pm_runtime_suspend+0x42/0x80 [xe]
> [  115.756542]  xe_pm_runtime_suspend+0x11b/0x1b0 [xe]
> [  115.761463]  xe_pci_runtime_suspend+0x23/0xd0 [xe]
> [  115.766291]  pci_pm_runtime_suspend+0x6b/0x1a0
> [  115.770717]  ? pci_pm_thaw_noirq+0xa0/0xa0
> [  115.774797]  __rpm_callback+0x48/0x1e0
> [  115.778531]  ? pci_pm_thaw_noirq+0xa0/0xa0
> [  115.782614]  rpm_callback+0x66/0x70
> [  115.786090]  ? pci_pm_thaw_noirq+0xa0/0xa0
> [  115.790173]  rpm_suspend+0xe1/0x5e0
> [  115.793647]  ? psi_task_switch+0xb8/0x200
> [  115.797643]  ? finish_task_switch.isra.0+0x8d/0x270
> [  115.802502]  pm_runtime_work+0xa6/0xc0
> [  115.806238]  process_one_work+0x186/0x350
> [  115.810234]  worker_thread+0x33a/0x480
> [  115.813968]  ? process_one_work+0x350/0x350
> [  115.818132]  kthread+0x10c/0x220
> [  115.821350]  ? kthreads_online_cpu+0x120/0x120
> [  115.825774]  ret_from_fork+0x3a/0x60
> [  115.829339]  ? kthreads_online_cpu+0x120/0x120
> [  115.833768]  ret_from_fork_asm+0x11/0x20
> [  115.829339]  ? kthreads_online_cpu+0x120/0x120
> [  115.833768]  ret_from_fork_asm+0x11/0x20
> [  115.837680]  </TASK>
> [  115.839907]  acpi_tad(E) drm(E)
> [  115.931629] CR2: 00000000000005d0
> [  115.934935] ---[ end trace 0000000000000000 ]---
> [  115.939531] RIP: 0010:_raw_spin_lock+0x17/0x30
>
> We cannot yet use xe->display to determine whether display hardware
> has been successfully probed/initialized or not. This is because
> xe->display would not be set to NULL even with GPUs with no display
> capabilities (e.g, GMD_ID_DISPLAY = 0). However, this might change
> in the future as Xe and i915 code is unified to deal with no display
> cases.
>
> Therefore, for now we have to continue to rely on xe->info.probe_display
> (which would be set to false with display-less GPUs) to decide
> whether to invoke any display related functions or not.
>
> Cc: Jani Nikula <jani.nikula@intel.com>
> Cc: Matt Roper <matthew.d.roper@intel.com>
> Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> Signed-off-by: Vivek Kasireddy <vivek.kasireddy@intel.com>

Thanks, and sorry again for the trouble,

Reviewed-by: Jani Nikula <jani.nikula@intel.com>

> ---
>  drivers/gpu/drm/xe/display/xe_display.c | 40 ++++++++++++-------------
>  1 file changed, 20 insertions(+), 20 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c
> index eafe2f093a6c..e2e0771cf274 100644
> --- a/drivers/gpu/drm/xe/display/xe_display.c
> +++ b/drivers/gpu/drm/xe/display/xe_display.c
> @@ -93,7 +93,7 @@ static void xe_display_fini_early(void *arg)
>  	struct xe_device *xe = arg;
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	intel_display_driver_remove_nogem(display);
> @@ -107,7 +107,7 @@ int xe_display_init_early(struct xe_device *xe)
>  	struct intel_display *display = xe->display;
>  	int err;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return 0;
>  
>  	/* Fake uncore lock */
> @@ -163,7 +163,7 @@ int xe_display_init(struct xe_device *xe)
>  	struct intel_display *display = xe->display;
>  	int err;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return 0;
>  
>  	err = intel_display_driver_probe(display);
> @@ -177,7 +177,7 @@ void xe_display_register(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	intel_display_driver_register(display);
> @@ -188,7 +188,7 @@ void xe_display_unregister(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	intel_power_domains_disable(display);
> @@ -201,7 +201,7 @@ void xe_display_irq_handler(struct xe_device *xe, u32 master_ctl)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	if (master_ctl & DISPLAY_IRQ)
> @@ -212,7 +212,7 @@ void xe_display_irq_enable(struct xe_device *xe, u32 gu_misc_iir)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	if (gu_misc_iir & GU_MISC_GSE)
> @@ -223,7 +223,7 @@ void xe_display_irq_reset(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	gen11_display_irq_reset(display);
> @@ -233,7 +233,7 @@ void xe_display_irq_postinstall(struct xe_device *xe, struct xe_gt *gt)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	if (gt->info.id == XE_GT0)
> @@ -274,7 +274,7 @@ static void xe_display_enable_d3cold(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	/*
> @@ -297,7 +297,7 @@ static void xe_display_disable_d3cold(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	intel_dmc_resume(display);
> @@ -322,7 +322,7 @@ void xe_display_pm_suspend(struct xe_device *xe)
>  	struct intel_display *display = xe->display;
>  	bool s2idle = suspend_to_idle();
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	/*
> @@ -356,7 +356,7 @@ void xe_display_pm_shutdown(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	intel_power_domains_disable(display);
> @@ -387,7 +387,7 @@ void xe_display_pm_runtime_suspend(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	if (xe->d3cold.allowed) {
> @@ -403,7 +403,7 @@ void xe_display_pm_suspend_late(struct xe_device *xe)
>  	struct intel_display *display = xe->display;
>  	bool s2idle = suspend_to_idle();
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	intel_display_power_suspend_late(display, s2idle);
> @@ -413,7 +413,7 @@ void xe_display_pm_runtime_suspend_late(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	if (xe->d3cold.allowed)
> @@ -431,7 +431,7 @@ void xe_display_pm_shutdown_late(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	/*
> @@ -446,7 +446,7 @@ void xe_display_pm_resume_early(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	intel_display_power_resume_early(display);
> @@ -456,7 +456,7 @@ void xe_display_pm_resume(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	intel_dmc_resume(display);
> @@ -491,7 +491,7 @@ void xe_display_pm_runtime_resume(struct xe_device *xe)
>  {
>  	struct intel_display *display = xe->display;
>  
> -	if (!display)
> +	if (!xe->info.probe_display)
>  		return;
>  
>  	if (xe->d3cold.allowed) {

-- 
Jani Nikula, Intel

  parent reply	other threads:[~2025-06-05  7:56 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-06-05  5:42 [PATCH] Revert "drm/xe/display: use xe->display to decide whether to do anything" Vivek Kasireddy
2025-06-05  7:27 ` ✓ CI.Patch_applied: success for " Patchwork
2025-06-05  7:27 ` ✓ CI.checkpatch: " Patchwork
2025-06-05  7:29 ` ✓ CI.KUnit: " Patchwork
2025-06-05  7:42 ` ✓ CI.Build: " Patchwork
2025-06-05  7:45 ` ✓ CI.Hooks: " Patchwork
2025-06-05  7:46 ` ✓ CI.checksparse: " Patchwork
2025-06-05  7:55 ` Jani Nikula [this message]
2025-06-10  7:56   ` [PATCH] " Jani Nikula
2025-06-11  5:03     ` Kasireddy, Vivek
2025-06-11 11:02       ` Jani Nikula
2025-06-05  9:03 ` ✓ Xe.CI.BAT: success for " Patchwork
2025-06-06 18:19 ` ✗ Xe.CI.Full: failure " Patchwork

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=fc535785263b34ee08802dba3a451fb99469ca58@intel.com \
    --to=jani.nikula@intel.com \
    --cc=intel-xe@lists.freedesktop.org \
    --cc=lucas.demarchi@intel.com \
    --cc=matthew.d.roper@intel.com \
    --cc=vivek.kasireddy@intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox