From: Matthew Auld <matthew.auld@intel.com>
To: Tejas Upadhyay <tejas.upadhyay@intel.com>,
intel-xe@lists.freedesktop.org
Cc: Jani Nikula <jani.nikula@linux.intel.com>,
Badal Nilawar <badal.nilawar@intel.com>,
Raag Jadav <raag.jadav@intel.com>
Subject: Re: [PATCH] drm/xe: Move drm_client_dev_suspend/resume outside pm block signalling
Date: Wed, 17 Jun 2026 12:25:17 +0100 [thread overview]
Message-ID: <e8e282bf-c7e5-4fe5-b228-cddcea4f8b35@intel.com> (raw)
In-Reply-To: <20260617082104.610620-2-tejas.upadhyay@intel.com>
On 17/06/2026 09:21, Tejas Upadhyay wrote:
> A circular locking dependency is reported between xe_pm_block_map and
> dev->clientlist_mutex:
>
> &dev->clientlist_mutex --> &vm->lock --> xe_pm_block_map --> &dev->clientlist_mutex
>
> The dependency chain is:
> 1. xe_pm_suspend() calls xe_pm_block_begin_signalling(), then
> xe_display_pm_suspend() which calls drm_client_dev_suspend()
> acquiring clientlist_mutex. (xe_pm_block_map -> clientlist_mutex)
>
> 2. drm_client_dev_unregister() -> drm_file_free() -> xe_file_close() ->
> prelim_xe_eudebug_file_close() acquires discovery_lock.
> (clientlist_mutex -> discovery_lock)
>
> 3. xe_vm_close_and_put() acquires vm->lock under discovery_lock.
> (discovery_lock -> vm->lock)
>
> 4. xe_exec_ioctl() holds vm->lock and calls xe_pm_block_on_suspend()
> which waits on xe_pm_block_map. (vm->lock -> xe_pm_block_map)
>
> Fix this by moving drm_client_dev_suspend() before
> xe_pm_block_begin_signalling() in xe_pm_suspend(), and
> drm_client_dev_resume() after xe_pm_block_end_signalling() in
> xe_pm_resume(). This breaks the xe_pm_block_map -> clientlist_mutex
> edge in the dependency chain.
Is this not just hiding the issue from lockdep though? The pm_begin/end
are not real locks, they are just there as annotations to teach lockdep
about the hidden pm dependencies. For example, any locks you grab in rpm
suspend()/resume() you are not allowed to hold when waking the device
up, otherwise you can potentially deadlock. So here moving stuff outside
the annotations probably doesn't really fix anything, it just hides it
from lockdep so it doesn't complain.
The above looks like it's complaining that clientlist_mutex is held and
indirectly ends up waiting on something pm related, but the pm side
itself can also grab that lock? That seems like a legit complaint and
should be fixed in the upper layers?
>
> The drm_client_dev_suspend/resume calls are safe to move because they
> only blank/unblank the fbdev console and don't depend on display power
> state or any operations within the block signalling scope.
>
> Cc: Jani Nikula <jani.nikula@linux.intel.com>
> Cc: Badal Nilawar <badal.nilawar@intel.com>
> Cc: Raag Jadav <raag.jadav@intel.com>
> Signed-off-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
> ---
> drivers/gpu/drm/xe/display/xe_display.c | 3 ---
> drivers/gpu/drm/xe/xe_pm.c | 5 +++++
> 2 files changed, 5 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/display/xe_display.c b/drivers/gpu/drm/xe/display/xe_display.c
> index 810d93fefcbc..ec65dcbc3fdc 100644
> --- a/drivers/gpu/drm/xe/display/xe_display.c
> +++ b/drivers/gpu/drm/xe/display/xe_display.c
> @@ -304,7 +304,6 @@ void xe_display_pm_suspend(struct xe_device *xe)
> * properly.
> */
> intel_display_power_disable(display);
> - drm_client_dev_suspend(&xe->drm);
>
> if (intel_display_device_present(display)) {
> drm_kms_helper_poll_disable(&xe->drm);
> @@ -461,8 +460,6 @@ void xe_display_pm_resume(struct xe_device *xe)
>
> intel_opregion_resume(display);
>
> - drm_client_dev_resume(&xe->drm);
> -
> intel_display_power_enable(display);
> }
>
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index 99562f691080..d00c620d0926 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -10,6 +10,7 @@
> #include <linux/suspend.h>
> #include <linux/dmi.h>
>
> +#include <drm/drm_client_event.h>
> #include <drm/drm_managed.h>
> #include <drm/ttm/ttm_placement.h>
>
> @@ -177,6 +178,7 @@ int xe_pm_suspend(struct xe_device *xe)
> int err;
>
> drm_dbg(&xe->drm, "Suspending device\n");
> + drm_client_dev_suspend(&xe->drm);
> xe_pm_block_begin_signalling();
> trace_xe_pm_suspend(xe, __builtin_return_address(0));
>
> @@ -219,6 +221,7 @@ int xe_pm_suspend(struct xe_device *xe)
> err:
> drm_dbg(&xe->drm, "Device suspend failed %d\n", err);
> xe_pm_block_end_signalling();
> + drm_client_dev_resume(&xe->drm);
> return err;
> }
>
> @@ -292,10 +295,12 @@ int xe_pm_resume(struct xe_device *xe)
>
> drm_dbg(&xe->drm, "Device resumed\n");
> xe_pm_block_end_signalling();
> + drm_client_dev_resume(&xe->drm);
> return 0;
> err:
> drm_dbg(&xe->drm, "Device resume failed %d\n", err);
> xe_pm_block_end_signalling();
> + drm_client_dev_resume(&xe->drm);
> return err;
> }
>
next prev parent reply other threads:[~2026-06-17 11:25 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-06-17 8:21 [PATCH] drm/xe: Move drm_client_dev_suspend/resume outside pm block signalling Tejas Upadhyay
2026-06-17 11:25 ` Matthew Auld [this message]
2026-06-17 12:11 ` Upadhyay, Tejas
2026-06-17 14:13 ` Matthew Auld
2026-06-17 11:30 ` Jani Nikula
2026-06-17 11:56 ` Raag Jadav
2026-06-17 12:12 ` Upadhyay, Tejas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=e8e282bf-c7e5-4fe5-b228-cddcea4f8b35@intel.com \
--to=matthew.auld@intel.com \
--cc=badal.nilawar@intel.com \
--cc=intel-xe@lists.freedesktop.org \
--cc=jani.nikula@linux.intel.com \
--cc=raag.jadav@intel.com \
--cc=tejas.upadhyay@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox