From: "Vivekanandan, Balasubramani" <balasubramani.vivekanandan@intel.com>
To: "Summers, Stuart" <stuart.summers@intel.com>,
"intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>
Cc: "De Marchi, Lucas" <lucas.demarchi@intel.com>
Subject: Re: [PATCH v2 1/2] drm/xe/gt: Synchronize GT reset with device unbind
Date: Fri, 31 Oct 2025 20:14:55 +0530
Message-ID: <aQTLZ83wKHHNriZz@bvivekan-mobl1>
In-Reply-To: <b3d8c801d84a807fab0e53d1e6134e3234024b44.camel@intel.com>
On 30.10.2025 23:07, Summers, Stuart wrote:
> On Thu, 2025-10-30 at 20:41 +0530, Balasubramani Vivekanandan wrote:
> > When unbinding wait for any GT reset in progress to complete. Unbinding
> > will release the mmio mapping but mmio operations are performed during
> > GT reset causing Kernel panic.
>
> Do you have an example kernel panic you can provide in the commit here?
> I've seen similar.
This is the call stack that was reported. I have seen some variations as
well, but they always arise from the do_gt_restart function.
Do you really want me to add this to the commit message? I think it adds
more noise than help.
[ 2935.688873] BUG: unable to handle page fault for address: ffffc9000500c000
[ 2935.689773] #PF: supervisor read access in kernel mode
[ 2935.690464] #PF: error_code(0x0000) - not-present page
[ 2935.691154] PGD 100000067 P4D 100000067 PUD 1009b8067 PMD 0
[ 2935.691955] Oops: Oops: 0000 [#1] SMP PTI
[ 2935.692506] CPU: 0 UID: 0 PID: 91 Comm: kworker/u4:7 Kdump: loaded Tainted: G U 6.17.0-rc3-lgci-xe-kernel-xe-internal+ #1 PREEMPT(voluntary)
[ 2935.694307] Tainted: [U]=USER
[ 2935.695508] Workqueue: gt-ordered-wq gt_reset_worker [xe]
[ 2935.696253] RIP: 0010:xe_mmio_read32+0x78/0x2e0 [xe]
[ 2935.696938] Code: 00 9f 0f 00 00 0f 97 c0 81 e3 ff ff ff fe 44 09 f0 44 0f b6 d0 c1 e0 18 09 c3 85 c0 0f 84 bb 00 00 00 44 89 e8 49 03 44 24 08 <8b> 08 89 de 41 c1 e2 18 44 89 ea 4c 89 e7 81 e6 ff ff ff fe 44 09
[ 2935.699299] RSP: 0000:ffffc90000cffa80 EFLAGS: 00010286
[ 2935.700010] RAX: ffffc9000500c000 RBX: 000000000000c000 RCX: 0000000000000000
[ 2935.700950] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
[ 2935.701891] RBP: ffffc90000cffaf8 R08: 0000000000000000 R09: 0000000000000000
[ 2935.702832] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888134968078
[ 2935.703773] R13: 000000000000c000 R14: 0000000000000000 R15: ffff88813a340000
[ 2935.704713] FS: 0000000000000000(0000) GS:ffff8882b3ef7000(0000) knlGS:0000000000000000
[ 2935.705764] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 2935.706535] CR2: ffffc9000500c000 CR3: 000000016365e006 CR4: 0000000000572ef0
[ 2935.707475] PKRU: 00000000
[ 2935.707866] Call Trace:
[ 2935.708226] <TASK>
[ 2935.708547] ? usleep_range_state+0x6e/0xe0
[ 2935.709117] ? usleep_range_state+0x93/0xe0
[ 2935.709688] __xe_mmio_wait32+0x7d/0x190 [xe]
[ 2935.710303] xe_mmio_wait32_not+0x18/0x30 [xe]
[ 2935.710928] __xe_guc_upload+0x1e8/0x7d0 [xe]
[ 2935.711543] xe_guc_upload+0x5b/0x70 [xe]
[ 2935.712107] xe_uc_load_hw+0xa4/0x3f0 [xe]
[ 2935.712682] ? cancel_work_sync+0x50/0x90
[ 2935.713233] do_gt_restart+0xea/0x670 [xe]
[ 2935.713808] ? do_gt_reset+0xc9/0x2c0 [xe]
[ 2935.714382] ? mutex_unlock+0x12/0x20
[ 2935.714893] ? xe_gt_tlb_invalidation_reset+0x108/0x130 [xe]
[ 2935.715668] gt_reset_worker+0x2a6/0x460 [xe]
[ 2935.716283] ? lock_acquire+0xc4/0x2e0
[ 2935.716803] ? process_one_work+0x1ee/0x6f0
[ 2935.717374] process_one_work+0x22e/0x6f0
[ 2935.717925] worker_thread+0x1e8/0x3d0
[ 2935.718445] ? __pfx_worker_thread+0x10/0x10
[ 2935.719036] kthread+0x11f/0x250
[ 2935.719497] ? __pfx_kthread+0x10/0x10
[ 2935.720017] ret_from_fork+0x26f/0x2e0
[ 2935.720538] ? __pfx_kthread+0x10/0x10
[ 2935.721058] ret_from_fork_asm+0x1a/0x30
[ 2935.721599] </TASK>
Regards,
Bala
>
> Thanks,
> Stuart
>
> >
> > Cc: Lucas De Marchi <lucas.demarchi@intel.com>
> > Signed-off-by: Balasubramani Vivekanandan <balasubramani.vivekanandan@intel.com>
> > ---
> > v2:
> > - Use the managed resource release function to wait for GT reset
> >   during unbind (Lucas)
> > ---
> > drivers/gpu/drm/xe/xe_gt.c | 2 ++
> > 1 file changed, 2 insertions(+)
> >
> > diff --git a/drivers/gpu/drm/xe/xe_gt.c b/drivers/gpu/drm/xe/xe_gt.c
> > index 89808b33d0a8..d0f8c40bc51e 100644
> > --- a/drivers/gpu/drm/xe/xe_gt.c
> > +++ b/drivers/gpu/drm/xe/xe_gt.c
> > @@ -607,6 +607,8 @@ static void xe_gt_fini(void *arg)
> > struct xe_gt *gt = arg;
> > int i;
> >
> > + disable_work_sync(&gt->reset.worker);
> > +
> > for (i = 0; i < XE_ENGINE_CLASS_MAX; ++i)
> > xe_hw_fence_irq_finish(&gt->fence_irq[i]);
> >
>
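
For readers following the ordering argument, here is a minimal userspace
sketch of the rule the patch enforces: the teardown path must wait for the
reset worker before the register mapping goes away. It is an analogy only.
A pthread stands in for the kernel workqueue item and a heap buffer stands
in for the MMIO mapping. Every name below (fake_gt, fake_reset_worker,
fake_gt_init, fake_gt_queue_reset, fake_gt_fini) is hypothetical and does
not exist in the xe driver.

```c
/* Userspace analogy only: this is not kernel code. */
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

struct fake_gt {
	volatile uint32_t *regs;   /* stands in for the MMIO mapping */
	pthread_t reset_worker;    /* stands in for gt->reset.worker */
	bool worker_started;
	atomic_bool stop;          /* teardown asks the worker to finish */
	atomic_uint reads_done;    /* number of "register" reads performed */
};

/* Polls a "register" the way a reset path might; every read needs regs alive. */
static void *fake_reset_worker(void *arg)
{
	struct fake_gt *gt = arg;

	do {
		(void)gt->regs[0];
		atomic_fetch_add(&gt->reads_done, 1);
	} while (!atomic_load(&gt->stop));

	return NULL;
}

static int fake_gt_init(struct fake_gt *gt)
{
	gt->regs = calloc(16, sizeof(*gt->regs));
	if (!gt->regs)
		return -1;
	gt->worker_started = false;
	atomic_init(&gt->stop, false);
	atomic_init(&gt->reads_done, 0);
	return 0;
}

static int fake_gt_queue_reset(struct fake_gt *gt)
{
	if (gt->worker_started)
		return 0;
	if (pthread_create(&gt->reset_worker, NULL, fake_reset_worker, gt))
		return -1;
	gt->worker_started = true;
	return 0;
}

static void fake_gt_fini(struct fake_gt *gt)
{
	/* Step 1: stop the worker and wait until it has fully returned. */
	if (gt->worker_started) {
		atomic_store(&gt->stop, true);
		pthread_join(gt->reset_worker, NULL);
		gt->worker_started = false;
	}

	/* Step 2: only now is it safe to release the "mapping". */
	free((void *)gt->regs);
	gt->regs = NULL;
}
```

Swapping the two steps in fake_gt_fini() would let the worker read freed
memory, which is the userspace counterpart of the page fault in
xe_mmio_read32 shown in the trace. In the kernel, disable_work_sync() also
prevents the work item from being queued again after it returns. This
sketch does not model that part.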