From: Matthew Auld <matthew.auld@intel.com>
To: "Thomas Hellström" <thomas.hellstrom@linux.intel.com>,
intel-xe@lists.freedesktop.org
Cc: "Vivi, Rodrigo" <rodrigo.vivi@intel.com>
Subject: Re: [PATCH v5] drm/xe: Use separate rpm lockdep map for non-d3cold-capable devices
Date: Tue, 27 Aug 2024 09:44:14 +0100 [thread overview]
Message-ID: <aff78d94-7172-455c-9940-0719a9137a47@intel.com> (raw)
In-Reply-To: <20240826143450.92511-1-thomas.hellstrom@linux.intel.com>
On 26/08/2024 15:34, Thomas Hellström wrote:
> For non-d3cold-capable devices we'd like to be able to wake up the
> device from reclaim. In particular, for Lunar Lake we'd like to be
> able to blit CCS metadata to system memory at shrink time; at least from
> kswapd, where it's reasonably OK to wait for rpm resume and a
> preceding rpm suspend.
>
> Therefore use a separate lockdep map for such devices and prime it
> reclaim-tainted.
>
> v2:
> - Rename lockmap acquire- and release functions. (Rodrigo Vivi).
> - Reinstate the old xe_pm_runtime_lockdep_prime() function and
> rename it to xe_rpm_might_enter_cb(). (Matthew Auld).
> - Introduce a separate xe_pm_runtime_lockdep_prime function
> called from module init for known required locking orders.
> v3:
> - Actually hook up the prime function at module init.
> v4:
> - Rebase.
> v5:
> - Don't use reclaim-safe RPM with SR-IOV.
>
> Cc: "Vivi, Rodrigo" <rodrigo.vivi@intel.com>
> Cc: "Auld, Matthew" <matthew.auld@intel.com>
> Signed-off-by: Thomas Hellström <thomas.hellstrom@linux.intel.com>
> Reviewed-by: Matthew Auld <matthew.auld@intel.com> #v2.
r-b still holds on v5.
> ---
> drivers/gpu/drm/xe/xe_module.c | 9 ++++
> drivers/gpu/drm/xe/xe_pm.c | 84 ++++++++++++++++++++++++++++------
> drivers/gpu/drm/xe/xe_pm.h | 1 +
> 3 files changed, 80 insertions(+), 14 deletions(-)
>
> diff --git a/drivers/gpu/drm/xe/xe_module.c b/drivers/gpu/drm/xe/xe_module.c
> index 923460119cec..0899bdc721c5 100644
> --- a/drivers/gpu/drm/xe/xe_module.c
> +++ b/drivers/gpu/drm/xe/xe_module.c
> @@ -11,6 +11,7 @@
> #include "xe_drv.h"
> #include "xe_hw_fence.h"
> #include "xe_pci.h"
> +#include "xe_pm.h"
> #include "xe_observation.h"
> #include "xe_sched_job.h"
>
> @@ -69,6 +70,10 @@ struct init_funcs {
> void (*exit)(void);
> };
>
> +static void xe_dummy_exit(void)
> +{
> +}
> +
> static const struct init_funcs init_funcs[] = {
> {
> .init = xe_hw_fence_module_init,
> @@ -86,6 +91,10 @@ static const struct init_funcs init_funcs[] = {
> .init = xe_observation_sysctl_register,
> .exit = xe_observation_sysctl_unregister,
> },
> + {
> + .init = xe_pm_module_init,
> + .exit = xe_dummy_exit,
> + },
> };
>
> static int __init xe_init(void)
> diff --git a/drivers/gpu/drm/xe/xe_pm.c b/drivers/gpu/drm/xe/xe_pm.c
> index 49cfa2a4a07e..2e2accd76fb2 100644
> --- a/drivers/gpu/drm/xe/xe_pm.c
> +++ b/drivers/gpu/drm/xe/xe_pm.c
> @@ -70,11 +70,34 @@
> */
>
> #ifdef CONFIG_LOCKDEP
> -static struct lockdep_map xe_pm_runtime_lockdep_map = {
> - .name = "xe_pm_runtime_lockdep_map"
> +static struct lockdep_map xe_pm_runtime_d3cold_map = {
> + .name = "xe_rpm_d3cold_map"
> +};
> +
> +static struct lockdep_map xe_pm_runtime_nod3cold_map = {
> + .name = "xe_rpm_nod3cold_map"
> };
> #endif
>
> +static bool __maybe_unused xe_rpm_reclaim_safe(const struct xe_device *xe)
> +{
> + return !xe->d3cold.capable && !xe->info.has_sriov;
> +}
> +
> +static void xe_rpm_lockmap_acquire(const struct xe_device *xe)
> +{
> + lock_map_acquire(xe_rpm_reclaim_safe(xe) ?
> + &xe_pm_runtime_nod3cold_map :
> + &xe_pm_runtime_d3cold_map);
> +}
> +
> +static void xe_rpm_lockmap_release(const struct xe_device *xe)
> +{
> + lock_map_release(xe_rpm_reclaim_safe(xe) ?
> + &xe_pm_runtime_nod3cold_map :
> + &xe_pm_runtime_d3cold_map);
> +}
> +
> /**
> * xe_pm_suspend - Helper for System suspend, i.e. S0->S3 / S0->S2idle
> * @xe: xe device instance
> @@ -354,7 +377,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
> * annotation here and in xe_pm_runtime_get() lockdep will see
> * the potential lock inversion and give us a nice splat.
> */
> - lock_map_acquire(&xe_pm_runtime_lockdep_map);
> + xe_rpm_lockmap_acquire(xe);
>
> /*
> * Applying lock for entire list op as xe_ttm_bo_destroy and xe_bo_move_notify
> @@ -389,7 +412,7 @@ int xe_pm_runtime_suspend(struct xe_device *xe)
> out:
> if (err)
> xe_display_pm_resume(xe, true);
> - lock_map_release(&xe_pm_runtime_lockdep_map);
> + xe_rpm_lockmap_release(xe);
> xe_pm_write_callback_task(xe, NULL);
> return err;
> }
> @@ -410,7 +433,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> /* Disable access_ongoing asserts and prevent recursive pm calls */
> xe_pm_write_callback_task(xe, current);
>
> - lock_map_acquire(&xe_pm_runtime_lockdep_map);
> + xe_rpm_lockmap_acquire(xe);
>
> if (xe->d3cold.allowed) {
> err = xe_pcode_ready(xe, true);
> @@ -442,7 +465,7 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> }
>
> out:
> - lock_map_release(&xe_pm_runtime_lockdep_map);
> + xe_rpm_lockmap_release(xe);
> xe_pm_write_callback_task(xe, NULL);
> return err;
> }
> @@ -456,15 +479,37 @@ int xe_pm_runtime_resume(struct xe_device *xe)
> * stuff that can happen inside the runtime_resume callback by acquiring
> * a dummy lock (it doesn't protect anything and gets compiled out on
> * non-debug builds). Lockdep then only needs to see the
> - * xe_pm_runtime_lockdep_map -> runtime_resume callback once, and then can
> - * hopefully validate all the (callers_locks) -> xe_pm_runtime_lockdep_map.
> + * xe_pm_runtime_xxx_map -> runtime_resume callback once, and then can
> + * hopefully validate all the (callers_locks) -> xe_pm_runtime_xxx_map.
> * For example if the (callers_locks) are ever grabbed in the
> * runtime_resume callback, lockdep should give us a nice splat.
> */
> -static void pm_runtime_lockdep_prime(void)
> +static void xe_rpm_might_enter_cb(const struct xe_device *xe)
> {
> - lock_map_acquire(&xe_pm_runtime_lockdep_map);
> - lock_map_release(&xe_pm_runtime_lockdep_map);
> + xe_rpm_lockmap_acquire(xe);
> + xe_rpm_lockmap_release(xe);
> +}
> +
> +/*
> + * Prime the lockdep maps for known locking orders that need to
> + * be supported but that may not always occur on all systems.
> + */
> +static void xe_pm_runtime_lockdep_prime(void)
> +{
> + struct dma_resv lockdep_resv;
> +
> + dma_resv_init(&lockdep_resv);
> + lock_map_acquire(&xe_pm_runtime_d3cold_map);
> + /* D3Cold takes the dma_resv locks to evict bos */
> + dma_resv_lock(&lockdep_resv, NULL);
> + dma_resv_unlock(&lockdep_resv);
> + lock_map_release(&xe_pm_runtime_d3cold_map);
> +
> + /* Shrinkers might like to wake up the device under reclaim. */
> + fs_reclaim_acquire(GFP_KERNEL);
> + lock_map_acquire(&xe_pm_runtime_nod3cold_map);
> + lock_map_release(&xe_pm_runtime_nod3cold_map);
> + fs_reclaim_release(GFP_KERNEL);
> }
>
> /**
> @@ -479,7 +524,7 @@ void xe_pm_runtime_get(struct xe_device *xe)
> if (xe_pm_read_callback_task(xe) == current)
> return;
>
> - pm_runtime_lockdep_prime();
> + xe_rpm_might_enter_cb(xe);
> pm_runtime_resume(xe->drm.dev);
> }
>
> @@ -511,7 +556,7 @@ int xe_pm_runtime_get_ioctl(struct xe_device *xe)
> if (WARN_ON(xe_pm_read_callback_task(xe) == current))
> return -ELOOP;
>
> - pm_runtime_lockdep_prime();
> + xe_rpm_might_enter_cb(xe);
> return pm_runtime_get_sync(xe->drm.dev);
> }
>
> @@ -579,7 +624,7 @@ bool xe_pm_runtime_resume_and_get(struct xe_device *xe)
> return true;
> }
>
> - pm_runtime_lockdep_prime();
> + xe_rpm_might_enter_cb(xe);
> return pm_runtime_resume_and_get(xe->drm.dev) >= 0;
> }
>
> @@ -671,3 +716,14 @@ void xe_pm_d3cold_allowed_toggle(struct xe_device *xe)
> drm_dbg(&xe->drm,
> "d3cold: allowed=%s\n", str_yes_no(xe->d3cold.allowed));
> }
> +
> +/**
> + * xe_pm_module_init() - Perform xe_pm specific module initialization.
> + *
> + * Return: 0 on success. Currently doesn't fail.
> + */
> +int __init xe_pm_module_init(void)
> +{
> + xe_pm_runtime_lockdep_prime();
> + return 0;
> +}
> diff --git a/drivers/gpu/drm/xe/xe_pm.h b/drivers/gpu/drm/xe/xe_pm.h
> index 104a21ae6dfd..9aef673b1c8a 100644
> --- a/drivers/gpu/drm/xe/xe_pm.h
> +++ b/drivers/gpu/drm/xe/xe_pm.h
> @@ -32,5 +32,6 @@ void xe_pm_assert_unbounded_bridge(struct xe_device *xe);
> int xe_pm_set_vram_threshold(struct xe_device *xe, u32 threshold);
> void xe_pm_d3cold_allowed_toggle(struct xe_device *xe);
> struct task_struct *xe_pm_read_callback_task(struct xe_device *xe);
> +int xe_pm_module_init(void);
>
> #endif