From: Aradhya Bhatia <aradhya.bhatia@intel.com>
To: Matt Roper <matthew.d.roper@intel.com>,
Intel XE List <intel-xe@lists.freedesktop.org>
Cc: Matthew Auld <matthew.auld@intel.com>,
Lucas De Marchi <lucas.demarchi@intel.com>,
Thomas Hellstrom <thomas.hellstrom@intel.com>,
Ayaz A Siddiqui <ayaz.siddiqui@intel.com>,
Tejas Upadhyay <tejas.upadhyay@intel.com>,
Himal Prasad Ghimiray <himal.prasad.ghimiray@intel.com>,
Aradhya Bhatia <aradhya.bhatia@intel.com>
Subject: [RESEND PATCH v2] drm/xe/migrate: Switch from drm to dev managed actions
Date: Wed, 26 Mar 2025 20:49:29 +0530
Message-ID: <20250326151929.1495972-1-aradhya.bhatia@intel.com>
Change the scope of the migrate subsystem to be dev managed instead of
drm managed.

The parent PCI struct &device, which the xe struct &drm_device is a part
of, gets removed when a hot unplug is triggered, which causes the
underlying IOMMU group to get destroyed as well.

The migrate subsystem, which handles the lifetime of the page-table tree
(pt) BO, doesn't get a chance to release the BO during the hot unplug,
because not all the references to the DRM device have been put by then.

When all the references to the DRM device are eventually put, the
migrate subsystem tries to release the pt BO. Since the underlying IOMMU
group has already been destroyed, a kernel NULL pointer dereference
takes place while attempting to release the pt BO.

With dev managed actions, the cleanup runs when the driver is unbound
from the struct &device, while the IOMMU group is still around.

Closes: https://gitlab.freedesktop.org/drm/xe/kernel/-/issues/3914
Suggested-by: Thomas Hellstrom <thomas.hellstrom@intel.com>
Reviewed-by: Tejas Upadhyay <tejas.upadhyay@intel.com>
Signed-off-by: Aradhya Bhatia <aradhya.bhatia@intel.com>
---
Note:
This is a resend of the original v2. For an unknown reason, that patch
did not get registered with the intel-xe freedesktop patchwork setup,
despite reaching the intel-xe mailing list server.
Original v2:
https://lore.kernel.org/intel-xe/20250325114225.1231973-1-aradhya.bhatia@intel.com/T/#u
Changes in v2:
- Rebase to latest drm-tip.
- Add tags: Closes, Suggested-by (Thomas Hellstrom), Reviewed-by (Tejas
  Upadhyay).
- Drop patch 2/2 from the series, as memory eviction is now being
  comprehensively handled in
  https://patchwork.freedesktop.org/series/146383/#rev5.
- v1: https://lore.kernel.org/all/20250228065224.320811-1-aradhya.bhatia@intel.com/
---
 drivers/gpu/drm/xe/xe_migrate.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/drivers/gpu/drm/xe/xe_migrate.c b/drivers/gpu/drm/xe/xe_migrate.c
index df4282c71bf0..6c26892b05d5 100644
--- a/drivers/gpu/drm/xe/xe_migrate.c
+++ b/drivers/gpu/drm/xe/xe_migrate.c
@@ -97,7 +97,7 @@ struct xe_exec_queue *xe_tile_migrate_exec_queue(struct xe_tile *tile)
 	return tile->migrate->q;
 }
 
-static void xe_migrate_fini(struct drm_device *dev, void *arg)
+static void xe_migrate_fini(void *arg)
 {
 	struct xe_migrate *m = arg;
 
@@ -401,7 +401,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
 	struct xe_vm *vm;
 	int err;
 
-	m = drmm_kzalloc(&xe->drm, sizeof(*m), GFP_KERNEL);
+	m = devm_kzalloc(xe->drm.dev, sizeof(*m), GFP_KERNEL);
 	if (!m)
 		return ERR_PTR(-ENOMEM);
 
@@ -455,7 +455,7 @@ struct xe_migrate *xe_migrate_init(struct xe_tile *tile)
 	might_lock(&m->job_mutex);
 	fs_reclaim_release(GFP_KERNEL);
 
-	err = drmm_add_action_or_reset(&xe->drm, xe_migrate_fini, m);
+	err = devm_add_action_or_reset(xe->drm.dev, xe_migrate_fini, m);
 	if (err)
 		return ERR_PTR(err);
 
base-commit: 9a42bdcde0f77b2c1e947e283cc3b267b1ce2056
--
2.34.1
Thread overview: 12+ messages
2025-03-26 15:19 Aradhya Bhatia [this message]
2025-03-26 16:11 ` ✓ CI.Patch_applied: success for drm/xe/migrate: Switch from drm to dev managed actions Patchwork
2025-03-26 16:11 ` ✓ CI.checkpatch: " Patchwork
2025-03-26 16:12 ` ✓ CI.KUnit: " Patchwork
2025-03-26 16:29 ` ✓ CI.Build: " Patchwork
2025-03-26 16:31 ` ✓ CI.Hooks: " Patchwork
2025-03-26 16:33 ` ✓ CI.checksparse: " Patchwork
2025-03-26 16:53 ` ✓ Xe.CI.BAT: " Patchwork
2025-03-27 5:52 ` ✗ Xe.CI.Full: failure " Patchwork
2025-03-27 8:47 ` Patchwork
2025-03-27 11:16 ` Patchwork
2025-03-27 11:34 ` Aradhya Bhatia