From: "Moger, Babu" <bmoger@amd.com>
To: Reinette Chatre <reinette.chatre@intel.com>,
babu.moger@amd.com, tony.luck@intel.com, Dave.Martin@arm.com,
james.morse@arm.com, dave.hansen@linux.intel.com, bp@alien8.de
Cc: kas@kernel.org, rick.p.edgecombe@intel.com,
linux-kernel@vger.kernel.org, x86@kernel.org,
linux-coco@lists.linux.dev, kvm@vger.kernel.org
Subject: Re: [PATCH] fs/resctrl: Fix MBM events being unconditionally enabled in mbm_event mode
Date: Wed, 15 Oct 2025 09:55:27 -0500
Message-ID: <a2961f11-705a-4d75-85ee-bf96c8091647@amd.com>
In-Reply-To: <5163ce35-f843-41a3-abfc-5af91b7c68bc@intel.com>
Hi Reinette,
On 10/14/2025 6:09 PM, Reinette Chatre wrote:
> Hi Babu,
>
> On 10/14/25 3:45 PM, Moger, Babu wrote:
>> On 10/14/2025 3:57 PM, Reinette Chatre wrote:
>>> On 10/14/25 10:43 AM, Babu Moger wrote:
>
>
>>>>> Yes. I saw the issues. It fails to mount in my case with panic trace.
>>>
>>> (Just to ensure that there is not anything else going on) Could you please confirm if the panic is from
>>> mon_add_all_files()->mon_event_read()->mon_event_count()->__mon_event_count()->resctrl_arch_reset_rmid()
>>> that creates the MBM event files during mount and then does the initial read of RMID to determine the
>>> starting count?
>>
>> It happens just before that (at mbm_cntr_get). We have not allocated d->cntr_cfg for the counters.
>> ===================Panic trace =================================
>>
>> [ 349.330416] BUG: kernel NULL pointer dereference, address: 0000000000000008
>> [ 349.338187] #PF: supervisor read access in kernel mode
>> [ 349.343914] #PF: error_code(0x0000) - not-present page
>> [ 349.349644] PGD 10419f067 P4D 0
>> [ 349.353241] Oops: Oops: 0000 [#1] SMP NOPTI
>> [ 349.357905] CPU: 45 UID: 0 PID: 3449 Comm: mount Not tainted 6.18.0-rc1+ #120 PREEMPT(voluntary)
>> [ 349.367803] Hardware name: AMD Corporation PURICO/PURICO, BIOS RPUT1003E 12/11/2024
>> [ 349.376334] RIP: 0010:mbm_cntr_get+0x56/0x90
>> [ 349.381096] Code: 45 8d 41 fe 83 f8 01 77 3d 8b 7b 50 85 ff 7e 36 49 8b 84 24 f0 04 00 00 45 31 c0 eb 0d 41 83 c0 01 48 83 c0 10 44 39 c7 74 1c <48> 3b 50 08 75 ed 3b 08 75 e9 48 83 c4 10 44 89 c0 5b 41 5c 41 5d
>> [ 349.402037] RSP: 0018:ff56bba58655f958 EFLAGS: 00010246
>> [ 349.407861] RAX: 0000000000000000 RBX: ffffffff9525b900 RCX: 0000000000000002
>> [ 349.415818] RDX: ffffffff95d526a0 RSI: ff1f5d52517c1800 RDI: 0000000000000020
>> [ 349.423774] RBP: ff56bba58655f980 R08: 0000000000000000 R09: 0000000000000001
>> [ 349.431730] R10: ff1f5d52c616a6f0 R11: fffc6a2f046c3980 R12: ff1f5d52517c1800
>> [ 349.439687] R13: 0000000000000001 R14: ffffffff95d526a0 R15: ffffffff9525b968
>> [ 349.447635] FS: 00007f17926b7800(0000) GS:ff1f5d59d45ff000(0000) knlGS:0000000000000000
>> [ 349.456659] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [ 349.463064] CR2: 0000000000000008 CR3: 0000000147afe002 CR4: 0000000000771ef0
>> [ 349.471022] PKRU: 55555554
>> [ 349.474033] Call Trace:
>> [ 349.476755] <TASK>
>> [ 349.479091] ? kernfs_add_one+0x114/0x170
>> [ 349.483560] rdtgroup_assign_cntr_event+0x9b/0xd0
>> [ 349.488795] rdtgroup_assign_cntrs+0xab/0xb0
>> [ 349.493553] rdt_get_tree+0x4be/0x770
>> [ 349.497623] vfs_get_tree+0x2e/0xf0
>> [ 349.501508] fc_mount+0x18/0x90
>> [ 349.505007] path_mount+0x360/0xc50
>> [ 349.508884] ? putname+0x68/0x80
>> [ 349.512479] __x64_sys_mount+0x124/0x150
>> [ 349.516848] x64_sys_call+0x2133/0x2190
>> [ 349.521123] do_syscall_64+0x74/0x970
>>
>> ==================================================================
>
> Thank you for capturing this. This is a different trace but it confirms that it is the
> same root cause. Specifically, event is enabled after the state it depends on is (not) allocated
> during domain online.
>
Yes. Thanks
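To illustrate the failure mode, here is a minimal user-space sketch of the lookup that faults. It is not the kernel code: the struct layouts and the names cntr_cfg_sketch, mon_domain_sketch and cntr_get_sketch are simplified, hypothetical stand-ins. The faulting address 0x8 with RAX == 0 in the trace looks consistent with comparing a field at offset 8 of the first array element while the array base is NULL. The NULL guard is only there to show where the sketch would otherwise fault; it is not the proposed fix.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for a per-counter config entry. */
struct cntr_cfg_sketch {
	int evtid;          /* event the counter is assigned to */
	const void *rdtgrp; /* owning group; lands at offset 8 on LP64 */
};

/* Hypothetical, simplified stand-in for the monitoring domain. */
struct mon_domain_sketch {
	struct cntr_cfg_sketch *cntr_cfg; /* NULL when never allocated */
};

/* Return the counter assigned to (rdtgrp, evtid), or -ENOENT if none. */
static int cntr_get_sketch(const struct mon_domain_sketch *d, int num_cntrs,
			   const void *rdtgrp, int evtid)
{
	int i;

	/* Without this guard the loop reads cntr_cfg[0].rdtgrp at NULL + 8. */
	if (!d->cntr_cfg)
		return -ENOENT;

	for (i = 0; i < num_cntrs; i++) {
		if (d->cntr_cfg[i].rdtgrp == rdtgrp &&
		    d->cntr_cfg[i].evtid == evtid)
			return i;
	}

	return -ENOENT;
}
```

With the array left unallocated, the unguarded version of this loop faults on its first iteration, which matches the mount path in the trace reaching mbm_cntr_get() through rdtgroup_assign_cntrs().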
Here is the changelog.
x86,fs/resctrl: Fix BUG with mbm_event mode when MBM events are disabled
The following BUG is encountered when mounting the resctrl filesystem
after booting a system with X86_FEATURE_ABMC support and the kernel
parameter 'rdt=!mbmtotal,!mbmlocal'.
===========================================================================
[ 349.330416] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 349.338187] #PF: supervisor read access in kernel mode
[ 349.343914] #PF: error_code(0x0000) - not-present page
[ 349.349644] PGD 10419f067 P4D 0
[ 349.353241] Oops: Oops: 0000 [#1] SMP NOPTI
[ 349.357905] CPU: 45 UID: 0 PID: 3449 Comm: mount Not tainted 6.18.0-rc1+ #120 PREEMPT(voluntary)
[ 349.367803] Hardware name: AMD Corporation
[ 349.376334] RIP: 0010:mbm_cntr_get+0x56/0x90
[ 349.381096] Code: 45 8d 41 fe 83 f8 01 77 3d 8b 7b 50 85 ff 7e 36 49 8b 84 24 f0 04 00 00 45 31 c0 eb 0d 41 83 c0 01 48 83 c0 10 44 39 c7 74 1c <48> 3b 50 08 75 ed 3b 08 75 e9 48 83 c4 10 44 89 c0 5b 41 5c 41 5d
[ 349.402037] RSP: 0018:ff56bba58655f958 EFLAGS: 00010246
[ 349.407861] RAX: 0000000000000000 RBX: ffffffff9525b900 RCX: 0000000000000002
[ 349.415818] RDX: ffffffff95d526a0 RSI: ff1f5d52517c1800 RDI: 0000000000000020
[ 349.423774] RBP: ff56bba58655f980 R08: 0000000000000000 R09: 0000000000000001
[ 349.431730] R10: ff1f5d52c616a6f0 R11: fffc6a2f046c3980 R12: ff1f5d52517c1800
[ 349.439687] R13: 0000000000000001 R14: ffffffff95d526a0 R15: ffffffff9525b968
[ 349.447635] FS: 00007f17926b7800(0000) GS:ff1f5d59d45ff000(0000) knlGS:0000000000000000
[ 349.456659] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 349.463064] CR2: 0000000000000008 CR3: 0000000147afe002 CR4: 0000000000771ef0
[ 349.471022] PKRU: 55555554
[ 349.474033] Call Trace:
[ 349.476755] <TASK>
[ 349.479091] ? kernfs_add_one+0x114/0x170
[ 349.483560] rdtgroup_assign_cntr_event+0x9b/0xd0
[ 349.488795] rdtgroup_assign_cntrs+0xab/0xb0
[ 349.493553] rdt_get_tree+0x4be/0x770
[ 349.497623] vfs_get_tree+0x2e/0xf0
[ 349.501508] fc_mount+0x18/0x90
[ 349.505007] path_mount+0x360/0xc50
[ 349.508884] ? putname+0x68/0x80
[ 349.512479] __x64_sys_mount+0x124/0x150
When mbm_event mode is enabled, it implicitly enables both MBM total and
local events. However, specifying the kernel parameter
"rdt=!mbmtotal,!mbmlocal" disables these events during resctrl
initialization. As a result, related data structures, such as
rdt_mon_domain::mbm_states, rdt_mon_domain::cntr_cfg, and
rdt_hw_mon_domain::arch_mbm_states, are not allocated. This
leads to a BUG when the user attempts to mount the resctrl filesystem,
which tries to access these unallocated structures.
Fix the issue by making X86_FEATURE_ABMC depend on
X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL. This is
acceptable for now, as X86_FEATURE_ABMC currently implies support for
MBM total and local events. However, this dependency should be revisited
and removed in the future to decouple feature handling more cleanly.
Fixes: 13390861b426e ("x86,fs/resctrl: Detect Assignable Bandwidth Monitoring feature details")
Co-developed-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
Signed-off-by: Babu Moger <babu.moger@amd.com>
====================================================
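For reference, the dependency idea in the changelog can be sketched in user space as below. This is only an illustration under stated assumptions: the enum, the table and clear_feature_sketch() are hypothetical names, not the kernel's feature-flag code, and the actual patch is not shown in this message. The sketch assumes that disabling either MBM event (as "rdt=!mbmtotal,!mbmlocal" does) should also disable the ABMC stand-in that depends on it.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical feature identifiers standing in for the X86_FEATURE_* bits. */
enum feat_sketch { FEAT_MBM_TOTAL, FEAT_MBM_LOCAL, FEAT_ABMC, FEAT_COUNT };

/* One row says: "feature" is only usable while "depends" is enabled. */
struct feat_dep_sketch {
	enum feat_sketch feature;
	enum feat_sketch depends;
};

/* Assumed dependency table: ABMC needs both MBM events. */
static const struct feat_dep_sketch deps_sketch[] = {
	{ FEAT_ABMC, FEAT_MBM_TOTAL },
	{ FEAT_ABMC, FEAT_MBM_LOCAL },
};

/* Disable a feature and, recursively, every feature that depends on it. */
static void clear_feature_sketch(bool enabled[FEAT_COUNT], enum feat_sketch f)
{
	size_t i;

	enabled[f] = false;

	/* The table is acyclic, so the recursion terminates. */
	for (i = 0; i < sizeof(deps_sketch) / sizeof(deps_sketch[0]); i++) {
		if (deps_sketch[i].depends == f)
			clear_feature_sketch(enabled, deps_sketch[i].feature);
	}
}
```

With such a dependency in place, force-disabling an MBM event also turns off the dependent feature, so mbm_event mode is never entered without the state it relies on having been allocated.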
thanks
Babu