From: Xiaochen Shen <xiaochen.shen@intel.com>
To: stable@vger.kernel.org, gregkh@linuxfoundation.org,
sashal@kernel.org, bp@suse.de, tony.luck@intel.com
Cc: fenghua.yu@intel.com, reinette.chatre@intel.com,
pei.p.jia@intel.com, xiaochen.shen@intel.com
Subject: [PATCH 5.9] x86/resctrl: Fix incorrect local bandwidth when mba_sc is enabled
Date: Tue, 15 Dec 2020 02:05:56 +0800 [thread overview]
Message-ID: <1607969156-3365-1-git-send-email-xiaochen.shen@intel.com> (raw)
In-Reply-To: <160795595120091@kroah.com>
commit 06c5fe9b12dde1b62821f302f177c972bb1c81f9 upstream.
The MBA software controller (mba_sc) is a feedback loop which
periodically reads MBM counters and tries to restrict the bandwidth
below a user-specified value. It tags along the MBM counter overflow
handler to do the updates with 1s interval in mbm_update() and
update_mba_bw().
The purpose of mbm_update() is to periodically read the MBM counters to
make sure that the hardware counter doesn't wrap around more than once
between user samplings. mbm_update() calls __mon_event_count() for local
bandwidth updating when mba_sc is not enabled, but calls mbm_bw_count()
instead when mba_sc is enabled. __mon_event_count() will not be called
for local bandwidth updating in MBM counter overflow handler, but it is
still called when reading MBM local bandwidth counter file
'mbm_local_bytes', the call path is as below:
rdtgroup_mondata_show()
mon_event_read()
mon_event_count()
__mon_event_count()
In __mon_event_count(), m->chunks is updated by delta chunks which is
calculated from previous MSR value (m->prev_msr) and current MSR value.
When mba_sc is enabled, m->chunks is also updated in mbm_update() by
mistake by the delta chunks which is calculated from m->prev_bw_msr
instead of m->prev_msr. But m->chunks is not used in update_mba_bw() in
the mba_sc feedback loop.
When reading MBM local bandwidth counter file, m->chunks was changed
unexpectedly by mbm_bw_count(). As a result, the incorrect local
bandwidth counter which calculated from incorrect m->chunks is shown to
the user.
Fix this by removing incorrect m->chunks updating in mbm_bw_count() in
MBM counter overflow handler, and always calling __mon_event_count() in
mbm_update() to make sure that the hardware local bandwidth counter
doesn't wrap around.
Test steps:
# Run workload with aggressive memory bandwidth (e.g., 10 GB/s)
git clone https://github.com/intel/intel-cmt-cat && cd intel-cmt-cat
&& make
./tools/membw/membw -c 0 -b 10000 --read
# Enable MBA software controller
mount -t resctrl resctrl -o mba_MBps /sys/fs/resctrl
# Create control group c1
mkdir /sys/fs/resctrl/c1
# Set MB throttle to 6 GB/s
echo "MB:0=6000;1=6000" > /sys/fs/resctrl/c1/schemata
# Write PID of the workload to tasks file
echo `pidof membw` > /sys/fs/resctrl/c1/tasks
# Read local bytes counters twice with 1s interval, the calculated
# local bandwidth is not as expected (approaching to 6 GB/s):
local_1=`cat /sys/fs/resctrl/c1/mon_data/mon_L3_00/mbm_local_bytes`
sleep 1
local_2=`cat /sys/fs/resctrl/c1/mon_data/mon_L3_00/mbm_local_bytes`
echo "local b/w (bytes/s):" `expr $local_2 - $local_1`
Before fix:
local b/w (bytes/s): 11076796416
After fix:
local b/w (bytes/s): 5465014272
Backporting notes:
Upstream commit abe8f12b4425 ("x86/resctrl: Remove unused struct
mbm_state::chunks_bw") removed unused struct mbm_state::chunks_bw. It
changes the code base in mbm_bw_count().
Apply the change against the code base in older stable trees.
Fixes: ba0f26d8529c (x86/intel_rdt/mba_sc: Prepare for feedback loop)
Signed-off-by: Xiaochen Shen <xiaochen.shen@intel.com>
Signed-off-by: Borislav Petkov <bp@suse.de>
Reviewed-by: Tony Luck <tony.luck@intel.com>
Cc: <stable@vger.kernel.org>
Link: https://lkml.kernel.org/r/1607063279-19437-1-git-send-email-xiaochen.shen@intel.com
---
arch/x86/kernel/cpu/resctrl/monitor.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/arch/x86/kernel/cpu/resctrl/monitor.c b/arch/x86/kernel/cpu/resctrl/monitor.c
index 837d7d0..fe19e67 100644
--- a/arch/x86/kernel/cpu/resctrl/monitor.c
+++ b/arch/x86/kernel/cpu/resctrl/monitor.c
@@ -280,7 +280,6 @@ static void mbm_bw_count(u32 rmid, struct rmid_read *rr)
chunks = mbm_overflow_count(m->prev_bw_msr, tval, rr->r->mbm_width);
m->chunks_bw += chunks;
- m->chunks = m->chunks_bw;
cur_bw = (chunks * r->mon_scale) >> 20;
if (m->delta_comp)
@@ -451,15 +450,14 @@ static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid)
}
if (is_mbm_local_enabled()) {
rr.evtid = QOS_L3_MBM_LOCAL_EVENT_ID;
+ __mon_event_count(rmid, &rr);
/*
* Call the MBA software controller only for the
* control groups and when user has enabled
* the software controller explicitly.
*/
- if (!is_mba_sc(NULL))
- __mon_event_count(rmid, &rr);
- else
+ if (is_mba_sc(NULL))
mbm_bw_count(rmid, &rr);
}
}
--
1.8.3.1
next prev parent reply other threads:[~2020-12-14 17:44 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2020-12-14 14:25 FAILED: patch "[PATCH] x86/resctrl: Fix incorrect local bandwidth when mba_sc is" failed to apply to 5.9-stable tree gregkh
2020-12-14 18:05 ` Xiaochen Shen [this message]
2020-12-14 19:19 ` Sudip Mukherjee
2020-12-19 12:48 ` Greg KH
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1607969156-3365-1-git-send-email-xiaochen.shen@intel.com \
--to=xiaochen.shen@intel.com \
--cc=bp@suse.de \
--cc=fenghua.yu@intel.com \
--cc=gregkh@linuxfoundation.org \
--cc=pei.p.jia@intel.com \
--cc=reinette.chatre@intel.com \
--cc=sashal@kernel.org \
--cc=stable@vger.kernel.org \
--cc=tony.luck@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).