From: "Dixit, Ashutosh" <ashutosh.dixit@intel.com>
To: "Souza, Jose" <jose.souza@intel.com>
Cc: "intel-xe@lists.freedesktop.org" <intel-xe@lists.freedesktop.org>,
"Nerlige Ramappa, Umesh" <umesh.nerlige.ramappa@intel.com>
Subject: Re: [PATCH v4 0/2] Fixes for MI_REPORT_PERF_COUNT
Date: Fri, 20 Dec 2024 09:19:24 -0800
Message-ID: <85ikre1dcz.wl-ashutosh.dixit@intel.com>
In-Reply-To: <f76db0029f55ff14f96c5508e295be20acf04356.camel@intel.com>
On Fri, 20 Dec 2024 08:16:45 -0800, Souza, Jose wrote:
>
Hi Jose,
> On Thu, 2024-12-19 at 16:22 -0800, Umesh Nerlige Ramappa wrote:
> > OA programming sequence for query mode or MI_REPORT_PERF_COUNT requires
> > modifying some HW registers in the same hw context as the user exec
> > queue. User passes the exec_queue to the OA interface and OA
> > implementation submits an MI_LOAD_REGISTER_IMM to this queue to modify
> > the registers.
> >
> > The OA implementation submits a batch mapped in GGTT to the user exec
> > queue and hence, some plumbing is added into relevant code to enable
> > that (as per suggestions from Matthew Brost).
> >
> > v2: review rework
> > v3:
> > - review rework
> > - original patches squashed for porting to stable
> > - code cleanup
> >
> > v4:
> > - review rework/fixes
>
> Got this oops with this version:
>
> [ 176.066578] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
> [ 176.068577] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
> [ 176.072629] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
> [ 176.078117] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
> [ 176.081285] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
> [ 176.093564] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
> [ 176.102886] xe 0000:00:02.0: [drm:xe_oa_config_locked [xe]] changed to oa config uuid=4ccd6535-fb9a-440f-b0f5-882879dc4cb0
> [ 194.119229] Oops: general protection fault, probably for non-canonical address 0x6b6b6b6b6b6b6ba3: 0000 [#1] PREEMPT SMP
> [ 194.130187] CPU: 3 UID: 1000 PID: 2240 Comm: ReplayManager Not tainted 6.13.0-rc3-zeh-xe+ #1454
> [ 194.138931] Hardware name: Intel Corporation Lunar Lake Client Platform/LNL-M LP5 RVP1, BIOS LNLMFWI1.R00.3152.D83.2404190622 04/19/2024
> [ 194.151258] RIP: 0010:xe_sync_entry_add_deps+0x1c/0x60 [xe]
> [ 194.157013] Code: c7 43 18 f4 ff ff ff e9 9b fe ff ff 66 90 55 53 48 8b 5f 08 48 85 db 75 05 31 c0 5b 5d c3 48 89 f5 48 8d 7b 38 b8 01 00 00 00 <f0> 0f c1 43 38 85 c0 74 20 8d 50 01 09 c2 78 0d 48 89 de 48 89 ef
> [ 194.175863] RSP: 0018:ffffc90001f93de8 EFLAGS: 00010202
> [ 194.181136] RAX: 0000000000000001 RBX: 6b6b6b6b6b6b6b6b RCX: 0000000000000000
> [ 194.188331] RDX: ffff88815ee8edc0 RSI: ffff88814ebb0840 RDI: 6b6b6b6b6b6b6ba3
> [ 194.195520] RBP: ffff88814ebb0840 R08: 0000000000000001 R09: 0000000000000000
> [ 194.202707] R10: 0000000000000001 R11: 0000000000000003 R12: ffff88814ebb0840
> [ 194.209889] R13: ffff8881457f9900 R14: ffff888173075800 R15: 0000000000000000
> [ 194.217071] FS: 00007f6c80db9640(0000) GS:ffff88885e580000(0000) knlGS:0000000000000000
> [ 194.225216] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 194.231014] CR2: 00007f6bdb33a000 CR3: 0000000144f44001 CR4: 0000000000772ef0
> [ 194.238201] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 194.245386] DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
> [ 194.252575] PKRU: 55555554
> [ 194.255315] Call Trace:
> [ 194.257794] <TASK>
> [ 194.259932] ? __die_body.cold+0x19/0x21
> [ 194.263899] ? die_addr+0x33/0x50
> [ 194.267256] ? exc_general_protection+0x19e/0x450
> [ 194.272002] ? asm_exc_general_protection+0x22/0x30
> [ 194.276930] ? xe_sync_entry_add_deps+0x1c/0x60 [xe]
This looks related to an inadvertent change that I noticed yesterday and
pointed out in the thread:
>> static int xe_oa_load_with_lri(struct xe_oa_stream *stream, struct xe_oa_reg *reg_lri)
>> {
>> ...
>> - fence = xe_oa_submit_bb(stream, XE_OA_SUBMIT_NO_DEPS, bb);
>> + fence = xe_oa_submit_bb(stream, XE_OA_SUBMIT_ADD_DEPS, bb);
>
> This looks like a copy-paste error, could you please change this back to
> XE_OA_SUBMIT_NO_DEPS as it used to be.
Sorry you ran into this. We'll fix it and then ask for your help to test again.
> [ 194.282052] xe_oa_submit_bb.constprop.0+0x9d/0x1c0 [xe]
> [ 194.287517] xe_oa_load_with_lri.constprop.0+0xc4/0x130 [xe]
> [ 194.293313] xe_oa_configure_oa_context+0x1fd/0x210 [xe]
> [ 194.298770] xe_oa_disable_metric_set+0x4b/0xc0 [xe]
> [ 194.303857] xe_oa_stream_destroy+0x3a/0x140 [xe]
> [ 194.308698] xe_oa_release+0x3a/0xe0 [xe]
> [ 194.312833] __fput+0xee/0x2a0
> [ 194.315934] __x64_sys_close+0x49/0xb0
> [ 194.319722] do_syscall_64+0x64/0x130
> [ 194.323417] entry_SYSCALL_64_after_hwframe+0x4b/0x53
> [ 194.328511] RIP: 0033:0x7f6ca8b14f8b
Thanks.
--
Ashutosh