From: "Moger, Babu" <bmoger@amd.com>
To: Reinette Chatre <reinette.chatre@intel.com>,
	babu.moger@amd.com, tony.luck@intel.com, Dave.Martin@arm.com,
	james.morse@arm.com, dave.hansen@linux.intel.com, bp@alien8.de
Cc: kas@kernel.org, rick.p.edgecombe@intel.com,
	linux-kernel@vger.kernel.org, x86@kernel.org,
	linux-coco@lists.linux.dev, kvm@vger.kernel.org
Subject: Re: [PATCH] fs/resctrl: Fix MBM events being unconditionally enabled in mbm_event mode
Date: Wed, 15 Oct 2025 15:37:19 -0500
Message-ID: <dcc64b09-117c-4d25-957d-e97ef49a8100@amd.com>
In-Reply-To: <5645dec8-e344-44d3-82f7-327259a53906@intel.com>

Hi Reinette,

On 10/15/2025 2:56 PM, Reinette Chatre wrote:
> Hi Babu,
> 
> On 10/15/25 7:55 AM, Moger, Babu wrote:
>> Hi Reinette,
>>
>> On 10/14/2025 6:09 PM, Reinette Chatre wrote:
>>> Hi Babu,
>>>
>>> On 10/14/25 3:45 PM, Moger, Babu wrote:
>>>> On 10/14/2025 3:57 PM, Reinette Chatre wrote:
>>>>> On 10/14/25 10:43 AM, Babu Moger wrote:
>>>
>>>
>>>>>>> Yes. I saw the issues. It fails to mount in my case with panic trace.
>>>>>
>>>>> (Just to ensure that there is not anything else going on) Could you please confirm if the panic is from
>>>>> mon_add_all_files()->mon_event_read()->mon_event_count()->__mon_event_count()->resctrl_arch_reset_rmid()
>>>>> that creates the MBM event files during mount and then does the initial read of RMID to determine the
>>>>> starting count?
>>>>
>>>> It happens just before that (at mbm_cntr_get). We have not allocated d->cntr_cfg for the counters.
>>>> ===================Panic trace =================================
>>>>
>>>> [  349.330416] BUG: kernel NULL pointer dereference, address: 0000000000000008
>>>> [  349.338187] #PF: supervisor read access in kernel mode
>>>> [  349.343914] #PF: error_code(0x0000) - not-present page
>>>> [  349.349644] PGD 10419f067 P4D 0
>>>> [  349.353241] Oops: Oops: 0000 [#1] SMP NOPTI
>>>> [  349.357905] CPU: 45 UID: 0 PID: 3449 Comm: mount Not tainted 6.18.0-rc1+ #120 PREEMPT(voluntary)
>>>> [  349.367803] Hardware name: AMD Corporation PURICO/PURICO, BIOS RPUT1003E 12/11/2024
>>>> [  349.376334] RIP: 0010:mbm_cntr_get+0x56/0x90
>>>> [  349.381096] Code: 45 8d 41 fe 83 f8 01 77 3d 8b 7b 50 85 ff 7e 36 49 8b 84 24 f0 04 00 00 45 31 c0 eb 0d 41 83 c0 01 48 83 c0 10 44 39 c7 74 1c <48> 3b 50 08 75 ed 3b 08 75 e9 48 83 c4 10 44 89 c0 5b 41 5c 41 5d
>>>> [  349.402037] RSP: 0018:ff56bba58655f958 EFLAGS: 00010246
>>>> [  349.407861] RAX: 0000000000000000 RBX: ffffffff9525b900 RCX: 0000000000000002
>>>> [  349.415818] RDX: ffffffff95d526a0 RSI: ff1f5d52517c1800 RDI: 0000000000000020
>>>> [  349.423774] RBP: ff56bba58655f980 R08: 0000000000000000 R09: 0000000000000001
>>>> [  349.431730] R10: ff1f5d52c616a6f0 R11: fffc6a2f046c3980 R12: ff1f5d52517c1800
>>>> [  349.439687] R13: 0000000000000001 R14: ffffffff95d526a0 R15: ffffffff9525b968
>>>> [  349.447635] FS:  00007f17926b7800(0000) GS:ff1f5d59d45ff000(0000) knlGS:0000000000000000
>>>> [  349.456659] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>>> [  349.463064] CR2: 0000000000000008 CR3: 0000000147afe002 CR4: 0000000000771ef0
>>>> [  349.471022] PKRU: 55555554
>>>> [  349.474033] Call Trace:
>>>> [  349.476755]  <TASK>
>>>> [  349.479091]  ? kernfs_add_one+0x114/0x170
>>>> [  349.483560]  rdtgroup_assign_cntr_event+0x9b/0xd0
>>>> [  349.488795]  rdtgroup_assign_cntrs+0xab/0xb0
>>>> [  349.493553]  rdt_get_tree+0x4be/0x770
>>>> [  349.497623]  vfs_get_tree+0x2e/0xf0
>>>> [  349.501508]  fc_mount+0x18/0x90
>>>> [  349.505007]  path_mount+0x360/0xc50
>>>> [  349.508884]  ? putname+0x68/0x80
>>>> [  349.512479]  __x64_sys_mount+0x124/0x150
>>>> [  349.516848]  x64_sys_call+0x2133/0x2190
>>>> [  349.521123]  do_syscall_64+0x74/0x970
>>>>
>>>> ==================================================================
>>>
>>> Thank you for capturing this. This is a different trace but it confirms that it is the
>>> same root cause. Specifically, event is enabled after the state it depends on is (not) allocated
>>> during domain online.
>>>
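For readers following the trace: the faulting instruction bytes (<48> 3b 50 08, with RAX = 0) are consistent with reading a field at offset 8 of an array element whose base pointer is NULL, which matches the fault address 0000000000000008. The ordering problem described above can be modeled in a few lines of standalone C. This is only a sketch under stated assumptions: the names cntr_cfg_model, mon_domain_model, domain_online(), cntr_get() and late_enable_demo() are hypothetical stand-ins, not the resctrl code, and the NULL check in cntr_get() exists only so the sketch runs safely; it is not the fix being proposed in this thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define NUM_CNTRS 4

/* Hypothetical stand-in for one per-counter configuration slot. */
struct cntr_cfg_model {
	int evtid;
	void *owner;	/* lands at offset 8 on typical 64-bit ABIs */
};

/* Hypothetical stand-in for a monitoring domain. */
struct mon_domain_model {
	struct cntr_cfg_model *cntr_cfg;	/* NULL unless allocated at online time */
};

/* Models whether the MBM events are enabled at a given point in time. */
static bool event_enabled;

/* Per-domain state is allocated only for events enabled when the domain comes online. */
static int domain_online(struct mon_domain_model *d)
{
	d->cntr_cfg = NULL;
	if (!event_enabled)
		return 0;
	d->cntr_cfg = calloc(NUM_CNTRS, sizeof(*d->cntr_cfg));
	return d->cntr_cfg ? 0 : -1;
}

/* Return the index of the counter assigned to (owner, evtid), or -1 if none. */
static int cntr_get(const struct mon_domain_model *d, const void *owner, int evtid)
{
	/* Guard added for the sketch only; without it the loop reads through NULL. */
	if (!d->cntr_cfg)
		return -1;
	for (int i = 0; i < NUM_CNTRS; i++)
		if (d->cntr_cfg[i].owner == owner && d->cntr_cfg[i].evtid == evtid)
			return i;
	return -1;
}

/* Replays the ordering: events off at domain online, switched on afterwards. */
static int late_enable_demo(void)
{
	struct mon_domain_model d;
	int owner;
	int ret;

	event_enabled = false;	/* models "rdt=!mbmtotal,!mbmlocal" */
	domain_online(&d);	/* nothing is allocated */
	event_enabled = true;	/* models the late enabling during fs init */

	ret = cntr_get(&d, &owner, 1);	/* the event looks enabled, but has no state */
	free(d.cntr_cfg);
	return ret;
}
```

Calling late_enable_demo() returns -1 because the guard catches the missing allocation. Dropping the guard turns the same sequence into a read through a NULL base pointer, which is the shape of the failure in the trace.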
>>
>> Yes. Thanks
>>
>> Here is the changelog.
>>
>> x86,fs/resctrl: Fix BUG with mbm_event mode when MBM events are disabled
>>
>> The following BUG is encountered when mounting the resctrl filesystem after
>> booting a system with X86_FEATURE_ABMC support and the kernel parameter
>> 'rdt=!mbmtotal,!mbmlocal'.
> 
> "booting a system with X86_FEATURE_ABMC" sounds like this is a feature enabled
> during boot?

Yea.

> 
>>   
>> ===========================================================================
>> [  349.330416] BUG: kernel NULL pointer dereference, address: 0000000000000008
>> [  349.338187] #PF: supervisor read access in kernel mode
>> [  349.343914] #PF: error_code(0x0000) - not-present page
>> [  349.349644] PGD 10419f067 P4D 0
>> [  349.353241] Oops: Oops: 0000 [#1] SMP NOPTI
>> [  349.357905] CPU: 45 UID: 0 PID: 3449 Comm: mount Not tainted
>>                     6.18.0-rc1+ #120 PREEMPT(voluntary)
>> [  349.367803] Hardware name: AMD Corporation
> 
> This backtrace needs to be trimmed. See "Backtraces in commit messages" in
> Documentation/process/submitting-patches.rst

Yes. Sure.

> 
>> [  349.376334] RIP: 0010:mbm_cntr_get+0x56/0x90
>> [  349.381096] Code: 45 8d 41 fe 83 f8 01 77 3d 8b 7b 50 85 ff 7e 36 49 8b 84 24 f0 04 00 00 45 31 c0 eb 0d 41 83 c0 01 48 83 c0 10 44 39 c7 74 1c <48> 3b 50 08 75 ed 3b 08 75 e9 48 83 c4 10 44 89 c0 5b 41 5c 41 5d
>> [  349.402037] RSP: 0018:ff56bba58655f958 EFLAGS: 00010246
>> [  349.407861] RAX: 0000000000000000 RBX: ffffffff9525b900 RCX: 0000000000000002
>> [  349.415818] RDX: ffffffff95d526a0 RSI: ff1f5d52517c1800 RDI: 0000000000000020
>> [  349.423774] RBP: ff56bba58655f980 R08: 0000000000000000 R09: 0000000000000001
>> [  349.431730] R10: ff1f5d52c616a6f0 R11: fffc6a2f046c3980 R12: ff1f5d52517c1800
>> [  349.439687] R13: 0000000000000001 R14: ffffffff95d526a0 R15: ffffffff9525b968
>> [  349.447635] FS:  00007f17926b7800(0000) GS:ff1f5d59d45ff000(0000)
>>                      knlGS:0000000000000000
>> [  349.456659] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>> [  349.463064] CR2: 0000000000000008 CR3: 0000000147afe002 CR4: 0000000000771ef0
>> [  349.471022] PKRU: 55555554
>> [  349.474033] Call Trace:
>> [  349.476755]  <TASK>
>> [  349.479091]  ? kernfs_add_one+0x114/0x170
>> [  349.483560]  rdtgroup_assign_cntr_event+0x9b/0xd0
>> [  349.488795]  rdtgroup_assign_cntrs+0xab/0xb0
>> [  349.493553]  rdt_get_tree+0x4be/0x770
>> [  349.497623]  vfs_get_tree+0x2e/0xf0
>> [  349.501508]  fc_mount+0x18/0x90
>> [  349.505007]  path_mount+0x360/0xc50
>> [  349.508884]  ? putname+0x68/0x80
>> [  349.512479]  __x64_sys_mount+0x124/0x150
>>
>> When mbm_event mode is enabled, it implicitly enables both MBM total and
>> local events. However, specifying the kernel parameter
>> "rdt=!mbmtotal,!mbmlocal" disables these events during resctrl
>> initialization. As a result, related data structures, such as
>> rdt_mon_domain::mbm_states, cntr_cfg, and
>> rdt_hw_mon_domain::arch_mbm_states are not allocated. This
> 
> This may be a bit confusing with the jumps from "enabled" to "disabled" without noting the
> contexts (arch vs fs, early init vs late init).
> 
>> leads to a BUG when the user attempts to mount the resctrl filesystem,
>> which tries to access these un-allocated structures.
>>
>>
>> Fix the issue by adding a dependency on X86_FEATURE_CQM_MBM_TOTAL and
>> X86_FEATURE_CQM_MBM_LOCAL for X86_FEATURE_ABMC to be enabled. This is
>> acceptable for now, as X86_FEATURE_ABMC currently implies support for MBM
>> total and local events. However, this dependency should be revisited and
>> removed in the future to decouple feature handling more cleanly.
> 
> If I understand correctly the fix for the NULL pointer access is to remove
> the late event enabling from resctrl fs. The new dependency fixes a related but different
> issue that limits the scenarios in which mbm_event mode is enabled and when it may be possible
> to switch between modes.
> 
> I think the changelog can be made more specific with some adjustments. Here is an attempt
> at doing so but I think it can still be improved for flow.
> 
> 	x86,fs/resctrl: Fix NULL pointer dereference when events force disabled while in mbm_event mode
> 
> 	The following NULL pointer dereference is encountered on mount of resctrl fs after booting
> 	a system that supports assignable counters with the "rdt=!mbmtotal,!mbmlocal" kernel parameter:
> 
> 	BUG: kernel NULL pointer dereference, address: 0000000000000008
> 	#PF: supervisor read access in kernel mode
> 	#PF: error_code(0x0000) - not-present page
> 	RIP: 0010:mbm_cntr_get
> 	Call Trace:
> 	rdtgroup_assign_cntr_event
> 	rdtgroup_assign_cntrs
> 	rdt_get_tree
> 
> 	Specifying the kernel parameter "rdt=!mbmtotal,!mbmlocal" effectively disables the legacy
> 	X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL features and thus the MBM events
> 	they represent. As a result, the per-domain MBM event related data structures are not
> 	allocated during resctrl early initialization.
> 
> 	resctrl fs initialization follows by implicitly enabling both MBM total and local
> 	events on a system that supports assignable counters (mbm_event mode), but this enabling
> 	occurs after the per-domain data structures have been created.
> 
> 	During runtime resctrl fs assumes that an enabled event can access all its state.
> 	This results in NULL pointer dereference when resctrl attempts to access the
> 	un-allocated structures of an enabled event.
> 
> 	Remove the late MBM event enabling from resctrl fs.
> 
> 	This leaves a problem where the X86_FEATURE_CQM_MBM_TOTAL and X86_FEATURE_CQM_MBM_LOCAL
> 	features may be disabled while assignable counter (mbm_event) mode is enabled without
> 	any events to support. Switching between the "default" and "mbm_event" mode without
> 	any events is not practical.
> 
> 	Create a dependency between the X86_FEATURE_CQM_MBM_TOTAL/X86_FEATURE_CQM_MBM_LOCAL
> 	and X86_FEATURE_ABMC (assignable counter) hardware features. An x86 system that supports
> 	assignable counters now requires support of X86_FEATURE_CQM_MBM_TOTAL or X86_FEATURE_CQM_MBM_LOCAL.
> 	This ensures all needed MBM related data structures are created before use and that it is
> 	only possible to switch between "default" and "mbm_event" mode when the same events are
> 	available in both modes. This dependency does not exist in the hardware but this usage of
> 	these feature settings works for known systems.
> 	
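The rule stated in the last paragraph of the draft changelog (assignable counter support requires X86_FEATURE_CQM_MBM_TOTAL or X86_FEATURE_CQM_MBM_LOCAL) can be written down as a small standalone C model. This is a sketch of the rule as worded in the changelog, not of how the kernel expresses CPU feature dependencies; struct mon_features and resolve_features() are hypothetical names invented for illustration.

```c
#include <stdbool.h>

/* Hypothetical feature flags; the names mirror the thread, not kernel definitions. */
struct mon_features {
	bool mbm_total;	/* models X86_FEATURE_CQM_MBM_TOTAL */
	bool mbm_local;	/* models X86_FEATURE_CQM_MBM_LOCAL */
	bool abmc;	/* models X86_FEATURE_ABMC */
};

/*
 * Apply "rdt=" style force-disables to the hardware-reported features,
 * then drop assignable counter support if no MBM event remains.
 */
static struct mon_features resolve_features(struct mon_features hw,
					    bool disable_total,
					    bool disable_local)
{
	struct mon_features f = hw;

	if (disable_total)
		f.mbm_total = false;
	if (disable_local)
		f.mbm_local = false;

	/* Assumed reading of the changelog: at least one MBM event is required. */
	if (!f.mbm_total && !f.mbm_local)
		f.abmc = false;

	return f;
}
```

With both events force disabled, as with "rdt=!mbmtotal,!mbmlocal", the model reports no assignable counter support, so mbm_event mode is never entered without events behind it. With only one event disabled, the model keeps assignable counter support.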

Looks good to me.

thanks
Babu

Thread overview: 15+ messages
2025-09-30 20:26 [PATCH] fs/resctrl: Fix MBM events being unconditionally enabled in mbm_event mode Babu Moger
2025-10-06 17:56 ` Reinette Chatre
2025-10-06 20:38   ` Moger, Babu
2025-10-07  1:23     ` Reinette Chatre
2025-10-07 17:36       ` Babu Moger
2025-10-08  2:38         ` Reinette Chatre
2025-10-14 16:24           ` Reinette Chatre
2025-10-14 17:38             ` Babu Moger
2025-10-14 17:43               ` Babu Moger
2025-10-14 20:57                 ` Reinette Chatre
2025-10-14 22:45                   ` Moger, Babu
2025-10-14 23:09                     ` Reinette Chatre
2025-10-15 14:55                       ` Moger, Babu
2025-10-15 19:56                         ` Reinette Chatre
2025-10-15 20:37                           ` Moger, Babu [this message]
