From: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
To: Reinette Chatre <reinette.chatre@intel.com>
Cc: "Shuah Khan" <shuah@kernel.org>,
"Shuah Khan" <skhan@linuxfoundation.org>,
linux-kselftest@vger.kernel.org,
"Maciej Wieczór-Retman" <maciej.wieczor-retman@intel.com>,
LKML <linux-kernel@vger.kernel.org>,
"Shaopeng Tan" <tan.shaopeng@jp.fujitsu.com>,
stable@vger.kernel.org
Subject: Re: [PATCH 5/5] selftests/resctrl: Reduce failures due to outliers in MBA/MBM tests
Date: Wed, 13 Sep 2023 14:43:40 +0300 (EEST)
Message-ID: <c1518af-cc3c-3aa7-a3c-4bbfe8cc6cd@linux.intel.com>
In-Reply-To: <cf7439c4-f72c-a145-5a65-84ae15c5d96f@intel.com>
On Tue, 12 Sep 2023, Reinette Chatre wrote:
> On 9/11/2023 4:19 AM, Ilpo Järvinen wrote:
> > 5% difference upper bound for success is a bit on the low side for the
>
> "a bit on the low side" is very vague.
The commit that introduced that 5% bound plainly admitted it's a "randomly
chosen value". At least that wasn't vague, I guess. :-)
So what I'm trying to do here is to replace that "randomly chosen value"
with a value that seems to work well enough based on measurements on
a large set of platforms.
Personally, I don't care much about this. I can just ignore the failures
due to outliers (and also reports about failing MBA/MBM tests if somebody
ever sends one to me), but if I were the one running automated tests it
would be annoying to have a problem like this left unaddressed.
> > MBA and MBM tests. Some platforms produce outliers that are slightly
> > above that, typically 6-7%.
> >
> > Relaxing the MBA/MBM success bound to 8% removes most of the failures
> > due those frequent outliers.
>
> This description needs more context on what issue is being solved here.
> What does the % difference represent? How was new percentage determined?
>
> Did you investigate why there are differences between platforms? From
> what I understand these tests measure memory bandwidth using perf and
> resctrl and then compare the difference. Are there interesting things
> about the platforms on which the difference is higher than 5%?
Not really, I think. The number just isn't stable enough to always remain
below 5% (even if it usually does).
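To make "% difference" concrete, here is a minimal sketch of the kind of
check being discussed. It is not the selftest's actual code: the function
names and the MAX_DIFF_PERCENT macro are made up for illustration, and both
averages are assumed to be in the same unit with a non-zero iMC value.

```c
/* Hypothetical name for the bound; the patch proposes going from 5 to 8. */
#define MAX_DIFF_PERCENT 8

/*
 * Percentage by which the resctrl average deviates from the iMC (perf)
 * average, truncated to an integer. Assumes avg_bw_imc is non-zero.
 */
static int bw_diff_percent(unsigned long avg_bw_imc,
			   unsigned long avg_bw_resctrl)
{
	unsigned long diff = avg_bw_imc > avg_bw_resctrl ?
			     avg_bw_imc - avg_bw_resctrl :
			     avg_bw_resctrl - avg_bw_imc;

	return (int)(diff * 100 / avg_bw_imc);
}

/* Returns 1 when the run exceeds the bound (test fails), 0 otherwise. */
static int bw_check_fails(unsigned long avg_bw_imc,
			  unsigned long avg_bw_resctrl)
{
	return bw_diff_percent(avg_bw_imc, avg_bw_resctrl) > MAX_DIFF_PERCENT;
}
```

With made-up averages of 1000 and 1065, the difference comes out as 6%,
which would fail a 5% bound but pass an 8% one.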
The only systematic thing I've come across is that if I play with the read
pattern for defeating the hw prefetcher (you've seen a patch earlier and
it will be among the series I'll send after this one), it has an impact
which looks more systematic across all MBM/MBA tests. But that's not what
I'm trying to address with this patch.
> Could
> those be systems with multiple sockets (and thus multiple PMUs that need
> to be setup, reset, and read)? Can the reading of the counters be improved
> instead of relaxing the success criteria? A quick comparison between
> get_mem_bw_imc() and get_mem_bw_resctrl() makes me think that a difference
> is not surprising ... note how the PMU counters are started and reset
> (potentially on multiple sockets) at every iteration while the resctrl
> counters keep rolling and new values are just subtracted from previous.
Perhaps. I can try to look into it (I'll add it to my todo list so I won't
forget). But in the meantime, this new value is picked using a criterion
that looks better than "randomly chosen value". If I ever manage to
address the outliers, the bound could be lowered again.
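As a toy illustration of the counter handling difference raised above
(reset-and-restart on every iteration versus a rolling counter whose
previous value is subtracted), consider the sketch below. It is only a
model under stated assumptions: traffic flows at a constant rate, the
"setup_ticks" gap is hypothetical, and none of these names exist in the
selftests. Whether this actually explains the outliers is not established.

```c
/*
 * Toy model only: traffic flows at a constant 'rate' units per tick.
 * All names and numbers here are hypothetical.
 */

/*
 * Rolling style: the counter is never reset. Each read subtracts the
 * previous value, so no traffic is lost between iterations.
 */
static unsigned long total_rolling(unsigned long rate,
				   unsigned long ticks_per_iter,
				   unsigned int iters)
{
	unsigned long counter = 0, prev = 0, total = 0;
	unsigned int i;

	for (i = 0; i < iters; i++) {
		counter += rate * ticks_per_iter;	/* keeps running */
		total += counter - prev;		/* new minus previous */
		prev = counter;
	}
	return total;
}

/*
 * Reset style: the counter is reset and restarted on every iteration,
 * and 'setup_ticks' worth of traffic passes uncounted while that
 * happens. Assumes setup_ticks <= ticks_per_iter.
 */
static unsigned long total_reset(unsigned long rate,
				 unsigned long ticks_per_iter,
				 unsigned long setup_ticks,
				 unsigned int iters)
{
	unsigned long total = 0;
	unsigned int i;

	for (i = 0; i < iters; i++) {
		unsigned long counter = 0;		/* reset */

		counter += rate * (ticks_per_iter - setup_ticks);
		total += counter;
	}
	return total;
}
```

In this model, total_rolling(10, 100, 5) counts 5000 units while
total_reset(10, 100, 3, 5) counts 4850, a 3% gap that comes purely from
how the counters are read.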
I'll update the changelog to explain things better.
--
i.