From: Babu Moger <bmoger@amd.com>
To: Greg Kroah-Hartman <gregkh@linuxfoundation.org>, stable@vger.kernel.org
Cc: patches@lists.linux.dev, Babu Moger <babu.moger@amd.com>,
"Borislav Petkov (AMD)" <bp@alien8.de>,
Reinette Chatre <reinette.chatre@intel.com>
Subject: Re: [PATCH 6.6 80/84] x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID
Date: Tue, 28 Oct 2025 11:00:21 -0500 [thread overview]
Message-ID: <6da4a9cb-38df-41de-b714-44b64c9b3b60@amd.com> (raw)
In-Reply-To: <20251027183440.942556856@linuxfoundation.org>
On 10/27/25 13:37, Greg Kroah-Hartman wrote:
> 6.6-stable review patch. If anyone has any objections, please let me know.
>
> ------------------
>
> From: Babu Moger <babu.moger@amd.com>
>
> commit 15292f1b4c55a3a7c940dbcb6cb8793871ed3d92 upstream.
>
> Users can create as many monitoring groups as the number of RMIDs supported
> by the hardware. However, on AMD systems, only a limited number of RMIDs
> are guaranteed to be actively tracked by the hardware. RMIDs that exceed
> this limit are placed in an "Unavailable" state.
>
> When a bandwidth counter is read for such an RMID, the hardware sets
> MSR_IA32_QM_CTR.Unavailable (bit 62). When such an RMID starts being tracked
> again the hardware counter is reset to zero. MSR_IA32_QM_CTR.Unavailable
> remains set on first read after tracking re-starts and is clear on all
> subsequent reads as long as the RMID is tracked.
>
> resctrl miscounts the bandwidth events after an RMID transitions from the
> "Unavailable" state back to being tracked. This happens because when the
> hardware starts counting again after resetting the counter to zero, resctrl
> in turn compares the new count against the counter value stored from the
> previous time the RMID was tracked.
>
> This results in resctrl computing an event value that is either undercounting
> (when new counter is more than stored counter) or a mistaken overflow (when
> new counter is less than stored counter).
>
> Reset the stored value (arch_mbm_state::prev_msr) of MSR_IA32_QM_CTR to
> zero whenever the RMID is in the "Unavailable" state to ensure accurate
> counting after the RMID resets to zero when it starts to be tracked again.
>
> Example scenario that results in mistaken overflow
> ==================================================
> 1. The resctrl filesystem is mounted, and a task is assigned to a
> monitoring group.
>
> $mount -t resctrl resctrl /sys/fs/resctrl
> $mkdir /sys/fs/resctrl/mon_groups/test1/
> $echo 1234 > /sys/fs/resctrl/mon_groups/test1/tasks
>
> $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
> 21323 <- Total bytes on domain 0
> "Unavailable" <- Total bytes on domain 1
>
> Task is running on domain 0. Counter on domain 1 is "Unavailable".
>
> 2. The task runs on domain 0 for a while and then moves to domain 1. The
> counter starts incrementing on domain 1.
>
> $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
> 7345357 <- Total bytes on domain 0
> 4545 <- Total bytes on domain 1
>
> 3. At some point, the RMID in domain 0 transitions to the "Unavailable"
> state because the task is no longer executing in that domain.
>
> $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
> "Unavailable" <- Total bytes on domain 0
> 434341 <- Total bytes on domain 1
>
> 4. Since the task continues to migrate between domains, it may eventually
> return to domain 0.
>
> $cat /sys/fs/resctrl/mon_groups/test1/mon_data/mon_L3_*/mbm_total_bytes
> 17592178699059 <- Overflow on domain 0
> 3232332 <- Total bytes on domain 1
>
> In this case, the RMID on domain 0 transitions from "Unavailable" state to
> active state. The hardware sets MSR_IA32_QM_CTR.Unavailable (bit 62) when
> the counter is read and begins tracking the RMID counting from 0.
>
> Subsequent reads succeed but return a value smaller than the previously
> saved MSR value (7345357). Consequently, the resctrl's overflow logic is
> triggered, it compares the previous value (7345357) with the new, smaller
> value and incorrectly interprets this as a counter overflow, adding a large
> delta.
>
> In reality, this is a false positive: the counter did not overflow but was
> simply reset when the RMID transitioned from "Unavailable" back to active
> state.
>
> Here is the text from APM [1] available from [2].
>
> "In PQOS Version 2.0 or higher, the MBM hardware will set the U bit on the
> first QM_CTR read when it begins tracking an RMID that it was not
> previously tracking. The U bit will be zero for all subsequent reads from
> that RMID while it is still tracked by the hardware. Therefore, a QM_CTR
> read with the U bit set when that RMID is in use by a processor can be
> considered 0 when calculating the difference with a subsequent read."
>
> [1] AMD64 Architecture Programmer's Manual Volume 2: System Programming
> Publication # 24593 Revision 3.41 section 19.3.3 Monitoring L3 Memory
> Bandwidth (MBM).
>
> [ bp: Split commit message into smaller paragraph chunks for better
> consumption. ]
>
> Fixes: 4d05bf71f157d ("x86/resctrl: Introduce AMD QOS feature")
> Signed-off-by: Babu Moger <babu.moger@amd.com>
> Signed-off-by: Borislav Petkov (AMD) <bp@alien8.de>
> Reviewed-by: Reinette Chatre <reinette.chatre@intel.com>
> Tested-by: Reinette Chatre <reinette.chatre@intel.com>
> Cc: stable@vger.kernel.org # needs adjustments for <= v6.17
> Link: https://bugzilla.kernel.org/show_bug.cgi?id=206537 # [2]
> [babu.moger@amd.com: Fix conflict for v6.6 stable]
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Tested-by: Babu Moger <babu.moger@amd.com>
Thanks
Babu Moger
next prev parent reply other threads:[~2025-10-28 16:00 UTC|newest]
Thread overview: 95+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-10-27 18:35 [PATCH 6.6 00/84] 6.6.115-rc1 review Greg Kroah-Hartman
2025-10-27 18:35 ` [PATCH 6.6 01/84] exec: Fix incorrect type for ret Greg Kroah-Hartman
2025-10-27 18:35 ` [PATCH 6.6 02/84] nios2: ensure that memblock.current_limit is set when setting pfn limits Greg Kroah-Hartman
2025-10-27 18:35 ` [PATCH 6.6 03/84] hfs: clear offset and space out of valid records in b-tree node Greg Kroah-Hartman
2025-10-27 18:35 ` [PATCH 6.6 04/84] hfs: make proper initalization of struct hfs_find_data Greg Kroah-Hartman
2025-10-27 18:35 ` [PATCH 6.6 05/84] hfsplus: fix KMSAN uninit-value issue in __hfsplus_ext_cache_extent() Greg Kroah-Hartman
2025-10-27 18:35 ` [PATCH 6.6 06/84] hfs: validate record offset in hfsplus_bmap_alloc Greg Kroah-Hartman
2025-10-27 18:35 ` [PATCH 6.6 07/84] hfsplus: fix KMSAN uninit-value issue in hfsplus_delete_cat() Greg Kroah-Hartman
2025-10-27 18:35 ` [PATCH 6.6 08/84] dlm: check for defined force value in dlm_lockspace_release Greg Kroah-Hartman
2025-10-27 18:35 ` [PATCH 6.6 09/84] hfs: fix KMSAN uninit-value issue in hfs_find_set_zero_bits() Greg Kroah-Hartman
2025-10-27 18:35 ` [PATCH 6.6 10/84] hfsplus: return EIO when type of hidden directory mismatch in hfsplus_fill_super() Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 11/84] lkdtm: fortify: Fix potential NULL dereference on kmalloc failure Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 12/84] m68k: bitops: Fix find_*_bit() signatures Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 13/84] powerpc/32: Remove PAGE_KERNEL_TEXT to fix startup failure Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 14/84] drivers/perf: hisi: Relax the event ID check in the framework Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 15/84] smb: server: let smb_direct_flush_send_list() invalidate a remote key first Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 16/84] Unbreak make tools/* for user-space targets Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 17/84] net/mlx5e: Return 1 instead of 0 in invalid case in mlx5e_mpwrq_umr_entry_size() Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 18/84] rtnetlink: Allow deleting FDB entries in user namespace Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 19/84] net: Tree wide: Replace xdp_do_flush_map() with xdp_do_flush() Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 20/84] net: enetc: fix the deadlock of enetc_mdio_lock Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 21/84] net: enetc: correct the value of ENETC_RXB_TRUESIZE Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 22/84] dpaa2-eth: fix the pointer passed to PTR_ALIGN on Tx path Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 23/84] can: bxcan: bxcan_start_xmit(): use can_dev_dropped_skb() instead of can_dropped_invalid_skb() Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 24/84] selftests/net: convert sctp_vrf.sh to run it in unique namespace Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 25/84] selftests: net: fix server bind failure in sctp_vrf.sh Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 26/84] net/mlx5e: Reuse per-RQ XDP buffer to avoid stack zeroing overhead Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 27/84] net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for legacy RQ Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 28/84] net/mlx5e: RX, Fix generating skb from non-linear xdp_buff for striding RQ Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 29/84] arm64, mm: avoid always making PTE dirty in pte_mkwrite() Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 30/84] sctp: avoid NULL dereference when chunk data buffer is missing Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 31/84] net: bonding: fix possible peer notify event loss or dup issue Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 32/84] dma-debug: dont report false positives with DMA_BOUNCE_UNALIGNED_KMALLOC Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 33/84] arch_topology: Fix incorrect error check in topology_parse_cpu_capacity() Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 34/84] gpio: pci-idio-16: Define maximum valid register address offset Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 35/84] gpio: 104-idio-16: " Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 36/84] Revert "cpuidle: menu: Avoid discarding useful information" Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 37/84] ACPICA: Work around bogus -Wstringop-overread warning since GCC 11 Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 38/84] can: netlink: can_changelink(): allow disabling of automatic restart Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 39/84] cifs: Fix TCP_Server_Info::credits to be signed Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 40/84] MIPS: Malta: Fix keyboard resource preventing i8042 driver from registering Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 41/84] ocfs2: clear extent cache after moving/defragmenting extents Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 42/84] vsock: fix lock inversion in vsock_assign_transport() Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 43/84] net: stmmac: dwmac-rk: Fix disabling set_clock_selection Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 44/84] net: usb: rtl8150: Fix frame padding Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 45/84] net: ravb: Enforce descriptor type ordering Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 46/84] net: ravb: Ensure memory write completes before ringing TX doorbell Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 47/84] selftests: mptcp: join: mark flush re-add as skipped if not supported Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 48/84] selftests: mptcp: join: mark implicit tests " Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 49/84] spi: spi-nxp-fspi: add extra delay after dll locked Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 50/84] firmware: arm_scmi: Account for failed debug initialization Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 51/84] firmware: arm_scmi: Fix premature SCMI_XFER_FLAG_IS_RAW clearing in raw mode Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 52/84] RISC-V: Define pgprot_dmacoherent() for non-coherent devices Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 53/84] RISC-V: Dont print details of CPUs disabled in DT Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 54/84] hwmon: (sht3x) Fix error handling Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 55/84] gpio: update Intel LJCA USB GPIO driver Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 56/84] gpio: ljca: Fix duplicated IRQ mapping Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 57/84] io_uring: correct __must_hold annotation in io_install_fixed_file Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 58/84] sched: Remove never used code in mm_cid_get() Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 59/84] USB: serial: option: add UNISOC UIS7720 Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 60/84] USB: serial: option: add Quectel RG255C Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 61/84] USB: serial: option: add Telit FN920C04 ECM compositions Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 62/84] usb/core/quirks: Add Huawei ME906S to wakeup quirk Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 63/84] usb: raw-gadget: do not limit transfer length Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 64/84] xhci: dbc: enable back DbC in resume if it was enabled before suspend Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 65/84] x86/microcode: Fix Entrysign revision check for Zen1/Naples Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 66/84] binder: remove "invalid inc weak" check Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 67/84] comedi: fix divide-by-zero in comedi_buf_munge() Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 68/84] mei: me: add wildcat lake P DID Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 69/84] misc: fastrpc: Fix dma_buf object leak in fastrpc_map_lookup Greg Kroah-Hartman
2025-10-27 18:36 ` [PATCH 6.6 70/84] most: usb: Fix use-after-free in hdm_disconnect Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 71/84] most: usb: hdm_probe: Fix calling put_device() before device initialization Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 72/84] tcpm: switch check for role_sw device with fw_node Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 73/84] dt-bindings: usb: dwc3-imx8mp: dma-range is required only for imx8mp Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 74/84] serial: 8250_dw: handle reset control deassert error Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 75/84] serial: 8250_exar: add support for Advantech 2 port card with Device ID 0x0018 Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 76/84] serial: 8250_mtk: Enable baud clock and manage in runtime PM Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 77/84] devcoredump: Fix circular locking dependency with devcd->mutex Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 78/84] xfs: always warn about deprecated mount options Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 79/84] fs/notify: call exportfs_encode_fid with s_umount Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 80/84] x86/resctrl: Fix miscount of bandwidth event when reactivating previously unavailable RMID Greg Kroah-Hartman
2025-10-28 16:00 ` Babu Moger [this message]
2025-10-27 18:37 ` [PATCH 6.6 81/84] s390/cio: Update purge function to unregister the unused subchannels Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 82/84] fuse: allocate ff->release_args only if release is needed Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 83/84] fuse: fix livelock in synchronous file put from fuseblk workers Greg Kroah-Hartman
2025-10-27 18:37 ` [PATCH 6.6 84/84] gpio: ljca: Initialize num before accessing item in ljca_gpio_config Greg Kroah-Hartman
2025-10-27 21:50 ` [PATCH 6.6 00/84] 6.6.115-rc1 review Florian Fainelli
2025-10-28 3:45 ` Peter Schneider
2025-10-28 11:28 ` Jon Hunter
2025-10-28 11:51 ` Ron Economos
2025-10-28 13:42 ` Naresh Kamboju
2025-10-28 13:53 ` Brett A C Sheffield
2025-10-28 19:25 ` Shuah Khan
2025-10-28 22:11 ` Slade Watkins
2025-10-29 11:21 ` Miguel Ojeda
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=6da4a9cb-38df-41de-b714-44b64c9b3b60@amd.com \
--to=bmoger@amd.com \
--cc=babu.moger@amd.com \
--cc=bp@alien8.de \
--cc=gregkh@linuxfoundation.org \
--cc=patches@lists.linux.dev \
--cc=reinette.chatre@intel.com \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).