From: Reinette Chatre <reinette.chatre@intel.com>
To: "Ilpo Järvinen" <ilpo.jarvinen@linux.intel.com>
Cc: fenghua.yu@intel.com, shuah@kernel.org, tony.luck@intel.com,
peternewman@google.com, babu.moger@amd.com,
"Maciej Wieczór-Retman" <maciej.wieczor-retman@intel.com>,
linux-kselftest@vger.kernel.org,
LKML <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH 5/6] selftests/resctrl: Do not compare performance counters and resctrl at low bandwidth
Date: Fri, 6 Sep 2024 17:05:29 -0700
Message-ID: <a2c7baf1-1c69-45a5-8755-38d152d33fae@intel.com>
In-Reply-To: <c3aa4289-40aa-c158-8fde-8a35a6002783@linux.intel.com>
Hi Ilpo,
On 9/6/24 1:44 AM, Ilpo Järvinen wrote:
> On Thu, 5 Sep 2024, Reinette Chatre wrote:
>> On 9/5/24 4:45 AM, Ilpo Järvinen wrote:
>>> On Wed, 4 Sep 2024, Reinette Chatre wrote:
>>>> On 9/4/24 4:43 AM, Ilpo Järvinen wrote:
>>>>> On Fri, 30 Aug 2024, Reinette Chatre wrote:
>>>>>> On 8/30/24 4:42 AM, Ilpo Järvinen wrote:
>>>>>>> On Thu, 29 Aug 2024, Reinette Chatre wrote:
>>>>>>>
>>>>>>>> The MBA test incrementally throttles memory bandwidth, each time
>>>>>>>> followed by a comparison between the memory bandwidth observed
>>>>>>>> by the performance counters and resctrl respectively.
>>>>>>>>
>>>>>>>> While a comparison between performance counters and resctrl is
>>>>>>>> generally appropriate, they do not have an identical view of
>>>>>>>> memory bandwidth. For example RAS features or memory performance
>>>>>>>> features that generate memory traffic may drive accesses that are
>>>>>>>> counted differently by performance counters and MBM respectively,
>>>>>>>> for instance generating "overhead" traffic which is not counted
>>>>>>>> against any specific RMID. As a ratio, this different view of
>>>>>>>> memory bandwidth becomes more apparent at low memory bandwidths.
>>>>>>>
>>>>>>> Interesting.
>>>>>>>
>>>>>>> Some time back I prototyped a change to the MBM test such that,
>>>>>>> instead of using once=false, I changed fill_buf to be able to run N
>>>>>>> passes through the buffer, which allowed me to know how many reads
>>>>>>> were performed by the benchmark. This yielded numerical differences
>>>>>>> between all those 3 values (# of reads, MBM, perf) which also varied
>>>>>>> from one arch to another, so it didn't end up making a usable test.
>>>>>>>
>>>>>>> I guess I now have an explanation for at least a part of the
>>>>>>> differences.
>>>>>>>
>>>>>>>> It is not practical to enable/disable the various features that
>>>>>>>> may generate memory bandwidth to give performance counters and
>>>>>>>> resctrl an identical view. Instead, do not compare performance
>>>>>>>> counters and resctrl view of memory bandwidth when the memory
>>>>>>>> bandwidth is low.
>>>>>>>>
>>>>>>>> Bandwidth throttling behaves differently across platforms
>>>>>>>> so it is not appropriate to drop measurement data simply based
>>>>>>>> on the throttling level. Instead, use a threshold of 750MiB
>>>>>>>> that has been observed to support adequate comparison between
>>>>>>>> performance counters and resctrl.
>>>>>>>>
>>>>>>>> Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
>>>>>>>> ---
>>>>>>>> tools/testing/selftests/resctrl/mba_test.c | 7 +++++++
>>>>>>>> tools/testing/selftests/resctrl/resctrl.h | 6 ++++++
>>>>>>>> 2 files changed, 13 insertions(+)
>>>>>>>>
>>>>>>>> diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
>>>>>>>> index cad473b81a64..204b9ac4b108 100644
>>>>>>>> --- a/tools/testing/selftests/resctrl/mba_test.c
>>>>>>>> +++ b/tools/testing/selftests/resctrl/mba_test.c
>>>>>>>> @@ -96,6 +96,13 @@ static bool show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
>>>>>>>>  	avg_bw_imc = sum_bw_imc / (NUM_OF_RUNS - 1);
>>>>>>>>  	avg_bw_resc = sum_bw_resc / (NUM_OF_RUNS - 1);
>>>>>>>> +	if (avg_bw_imc < THROTTLE_THRESHOLD || avg_bw_resc < THROTTLE_THRESHOLD) {
>>>>>>>> +		ksft_print_msg("Bandwidth below threshold (%d MiB). Dropping results from MBA schemata %u.\n",
>>>>>>>> +			       THROTTLE_THRESHOLD,
>>>>>>>> +			       ALLOCATION_MAX - ALLOCATION_STEP * allocation);
>>>>>>>
>>>>>>> The second one too should be %d.
>>>>>>>
>>>>>>
>>>>>> hmmm ... I intended to have it be consistent with the
>>>>>> ksft_print_msg() that follows. Perhaps allocation can be made
>>>>>> unsigned instead?
>>>>>
>>>>> If you go that way, then it would be good to make the related defines
>>>>> and allocation in mba_setup() unsigned too, although the latter is a
>>>>> bit scary
>>>>
>>>> Sure, will look into that.
>>>>
>>>>> because it does allocation -= ALLOCATION_STEP which could underflow
>>>>> if the defines are ever changed.
>>>>>
>>>>
>>>> Is this not already covered in the following check:
>>>> 	if (allocation < ALLOCATION_MIN || allocation > ALLOCATION_MAX)
>>>> 		return END_OF_TESTS;
>>>>
>>>> We are talking about hardcoded constants though.
>>>
>>> Borderline yes. It is "covered" by the allocation > ALLOCATION_MAX but
>>> it's also very non-intuitive to let the value underflow and then check for
>>> that with the > operator.
>>
>> My understanding is that this is the traditional way of checking overflow
>> (or more accurately wraparound). There are several examples of this pattern
>> in the kernel and it is also the pattern that I understand Linus referred
>> to as "traditional" in [1]. Even the compiler's intrinsic overflow checkers
>> do checking in this way (perform the subtraction and then check if it
>> overflowed) [2].
>
> Fair enough. I've never come across that kind of claim before.
>
>>> You're correct that they're constants so one would need to tweak the
>>> source to end up into this condition in the first place.
>>>
>>> Perhaps I'm being overly pedantic here but in general I don't like
>>> leaving trappy and non-obvious logic like that lying around because one
>>> day one of these will come back to bite.
>>
>> It is not clear to me what is "trappy" about this. The current checks will
>> catch the wraparound if somebody changes ALLOCATION_STEP in your scenario, no?
>>
>>> So, if a variable is unsigned and we ever do subtraction (or adding
>>> negative numbers to it), I'd prefer additional check to prevent ever
>>> underflowing it unexpectedly. Or leave them signed.
>>
>> To support checking at the time of subtraction we either need to change the
>> flow of that function since it cannot exit with failure if that subtraction
>> fails because of overflow/wraparound, or we need to introduce more state that
>> will be an additional check that the existing
>> "if (allocation < ALLOCATION_MIN || allocation > ALLOCATION_MAX)"
>> would have caught anyway.
>>
>> In either case, to do this checking at the time of subtraction the ideal way
>> would be to use check_sub_overflow() ... but it again does exactly what
>> you find to be non-intuitive since the implementation in
>> tools/include/linux/overflow.h uses the gcc intrinsics that does subtraction
>> first and then checks if overflow occurred.
>
> It's trappy because at a glance, that check looks unnecessary since
> allocation starts from max and goes downwards. Also worth noting,
> the check is not immediately after the decrement but done on the next
> iteration.
Right. This is probably what causes most confusion.
Considering that, what do you think of below that avoids wraparound entirely:
---8<---
From: Reinette Chatre <reinette.chatre@intel.com>
Subject: [PATCH] selftests/resctrl: Make wraparound handling obvious
Within mba_setup() the programmed bandwidth delay value starts
at the maximum (100, or rather ALLOCATION_MAX) and progresses
towards ALLOCATION_MIN by decrementing with ALLOCATION_STEP.
The programmed bandwidth delay should never be negative, so
representing it with an unsigned int is most appropriate. This
may introduce confusion because of the "allocation > ALLOCATION_MAX"
check used to check wraparound of the subtraction.
Modify the mba_setup() flow to start at the minimum, ALLOCATION_MIN,
and incrementally, with ALLOCATION_STEP steps, adjust the
bandwidth delay value. This avoids wraparound while making the purpose
of "allocation > ALLOCATION_MAX" clear and eliminates the
need for the "allocation < ALLOCATION_MIN" check.
Reported-by: Ilpo Järvinen <ilpo.jarvinen@linux.intel.com>
Signed-off-by: Reinette Chatre <reinette.chatre@intel.com>
---
Changes since V1:
- New patch
---
tools/testing/selftests/resctrl/mba_test.c | 12 +++++++-----
1 file changed, 7 insertions(+), 5 deletions(-)
diff --git a/tools/testing/selftests/resctrl/mba_test.c b/tools/testing/selftests/resctrl/mba_test.c
index ab8496a4925b..947d5699f0c8 100644
--- a/tools/testing/selftests/resctrl/mba_test.c
+++ b/tools/testing/selftests/resctrl/mba_test.c
@@ -39,7 +39,8 @@ static int mba_setup(const struct resctrl_test *test,
 		     const struct user_params *uparams,
 		     struct resctrl_val_param *p)
 {
-	static int runs_per_allocation, allocation = 100;
+	static unsigned int allocation = ALLOCATION_MIN;
+	static int runs_per_allocation = 0;
 	char allocation_str[64];
 	int ret;
 
@@ -50,7 +51,7 @@ static int mba_setup(const struct resctrl_test *test,
 	if (runs_per_allocation++ != 0)
 		return 0;
 
-	if (allocation < ALLOCATION_MIN || allocation > ALLOCATION_MAX)
+	if (allocation > ALLOCATION_MAX)
 		return END_OF_TESTS;
 
 	sprintf(allocation_str, "%d", allocation);
@@ -59,7 +60,7 @@ static int mba_setup(const struct resctrl_test *test,
 	if (ret < 0)
 		return ret;
 
-	allocation -= ALLOCATION_STEP;
+	allocation += ALLOCATION_STEP;
 
 	return 0;
 }
@@ -72,8 +73,9 @@ static int mba_measure(const struct user_params *uparams,
 
 static bool show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
 {
-	int allocation, runs;
+	unsigned int allocation;
 	bool ret = false;
+	int runs;
 
 	ksft_print_msg("Results are displayed in (MB)\n");
 	/* Memory bandwidth from 100% down to 10% */
@@ -103,7 +105,7 @@ static bool show_mba_info(unsigned long *bw_imc, unsigned long *bw_resc)
 				avg_diff_per > MAX_DIFF_PERCENT ?
 				"Fail:" : "Pass:",
 				MAX_DIFF_PERCENT,
-				ALLOCATION_MAX - ALLOCATION_STEP * allocation);
+				ALLOCATION_MIN + ALLOCATION_STEP * allocation);
 	ksft_print_msg("avg_diff_per: %d%%\n", avg_diff_per);
 	ksft_print_msg("avg_bw_imc: %lu\n", avg_bw_imc);
>
> The risk here is that somebody removes allocation > ALLOCATION_MAX check.
>
> Something called check_sub_overflow() is not subject to a similar risk
> regardless of what operations occur inside it.
Reinette