public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Reinette Chatre <reinette.chatre@intel.com>
To: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
Cc: <fenghua.yu@intel.com>, <shuah@kernel.org>,
	<linux-kernel@vger.kernel.org>, <linux-kselftest@vger.kernel.org>,
	<ilpo.jarvinen@linux.intel.com>
Subject: Re: [PATCH v3 5/5] selftests/resctrl: Add non-contiguous CBMs CAT test
Date: Thu, 1 Feb 2024 11:47:44 -0800	[thread overview]
Message-ID: <6a7b14a7-966d-4953-9b9b-0c847c83ddbb@intel.com> (raw)
In-Reply-To: <jbgkiwfkotntcdzhwf27dceit4w3j37sumkayixmyuee6zyaqg@un22a2iljgmr>

Hi Maciej,

On 1/31/2024 4:55 AM, Maciej Wieczor-Retman wrote:
> On 2024-01-26 at 13:10:18 -0800, Reinette Chatre wrote:
>> On 1/25/2024 3:13 AM, Maciej Wieczor-Retman wrote:
>>> Add tests for both L2 and L3 CAT to verify the return values
>>> generated by writing non-contiguous CBMs don't contradict the
>>> reported non-contiguous support information.
>>>
>>> Use a logical XOR to confirm return value of write_schemata() and
>>> non-contiguous CBMs support information match.
>>>
>>> Signed-off-by: Maciej Wieczor-Retman <maciej.wieczor-retman@intel.com>
>>> ---
>>> Changelog v3:
>>> - Roll back __cpuid_count part. (Reinette)
>>> - Update function name to read sparse_masks file.
>>> - Roll back get_cache_level() changes.
>>> - Add ksft_print_msg() to contiguous schemata write error handling
>>>   (Reinette).
>>>
>>> Changelog v2:
>>> - Redo the patch message. (Ilpo)
>>> - Tidy up __cpuid_count calls. (Ilpo)
>>> - Remove redundant AND in noncont_mask calculations (Ilpo)
>>> - Fix bit_center offset.
>>> - Add newline before function return. (Ilpo)
>>> - Group non-contiguous tests with CAT tests. (Ilpo)
>>> - Use a helper for reading sparse_masks file. (Ilpo)
>>> - Make get_cache_level() available in other source files. (Ilpo)
>>>
>>>  tools/testing/selftests/resctrl/cat_test.c    | 81 +++++++++++++++++++
>>>  tools/testing/selftests/resctrl/resctrl.h     |  2 +
>>>  .../testing/selftests/resctrl/resctrl_tests.c |  2 +
>>>  3 files changed, 85 insertions(+)
>>>
>>> diff --git a/tools/testing/selftests/resctrl/cat_test.c b/tools/testing/selftests/resctrl/cat_test.c
>>> index 39fc9303b8e8..9086bf359072 100644
>>> --- a/tools/testing/selftests/resctrl/cat_test.c
>>> +++ b/tools/testing/selftests/resctrl/cat_test.c
>>> @@ -294,6 +294,71 @@ static int cat_run_test(const struct resctrl_test *test, const struct user_param
>>>  	return ret;
>>>  }
>>>  
>>> +static int noncont_cat_run_test(const struct resctrl_test *test,
>>> +				const struct user_params *uparams)
>>> +{
>>> +	unsigned long full_cache_mask, cont_mask, noncont_mask;
>>> +	unsigned int eax, ebx, ecx, edx, ret, sparse_masks;
>>> +	char schemata[64];
>>> +	int bit_center;
>>> +
>>> +	/* Check to compare sparse_masks content to CPUID output. */
>>> +	ret = resource_info_unsigned_get(test->resource, "sparse_masks", &sparse_masks);
>>> +	if (ret)
>>> +		return ret;
>>> +
>>> +	if (!strcmp(test->resource, "L3"))
>>> +		__cpuid_count(0x10, 1, eax, ebx, ecx, edx);
>>> +	else if (!strcmp(test->resource, "L2"))
>>> +		__cpuid_count(0x10, 2, eax, ebx, ecx, edx);
>>> +	else
>>> +		return -EINVAL;
>>> +
>>> +	if (sparse_masks != ((ecx >> 3) & 1)) {
>>> +		ksft_print_msg("CPUID output doesn't match 'sparse_masks' file content!\n");
>>> +		return -1;
>>
>> If I understand correctly this falls into the "test failure" [1] category
>> and should return 1? ...
>>
>>> +	}
>>> +
>>> +	/* Write checks initialization. */
>>> +	ret = get_full_cbm(test->resource, &full_cache_mask);
>>> +	if (ret < 0)
>>> +		return ret;
>>> +	bit_center = count_bits(full_cache_mask) / 2;
>>> +	cont_mask = full_cache_mask >> bit_center;
>>> +
>>> +	/* Contiguous mask write check. */
>>> +	snprintf(schemata, sizeof(schemata), "%lx", cont_mask);
>>> +	ret = write_schemata("", schemata, uparams->cpu, test->resource);
>>> +	if (ret) {
>>> +		ksft_print_msg("Write of contiguous CBM failed\n");
>>> +		return ret;
>>
>> ... although here I think the goal to distinguish between test error and test failure
>> falls apart since it is not possible to tell within the test if the failure is
>> because of error in the test or if test failed.
> 
> Is there even a distinction between test error and failure in resctrl selftest?

There is such a distinction in the current tests (and from what I understand the reason
behind the logical XOR used in this test) . In existing tests the running of
the test precedes and is clearly separate from determining of the test pass/fail.
All the current tests have a clear "run the test" phase where data is collected to
a file, followed by an analysis (aka "check results") phase that looks at collected
data to determine if the test passes or fails.
Note how all the "check results" return either 0 or 1 to indicate test pass
or fail respectively. Specifically, you can refer to:
mbm_test.c->check_results()
mba_test.c->check_results()
cmt_test.c->check_results()
cat_test.c->check_results()

> I've been looking at it for a while and can't find any instances where
> ksft_test_result_error() would be used. Everywhere I look it's either pass or
> fail. By grep-ing over all selftests I found only five tests that use
> ksft_test_result_error().

Yes, from the user perspective there is no such distinction. This seems to
be entirely internal to the resctrl selftests (but I do not think that this
should or can be a hard requirement).

> 
> Furthermore there is this one "TODO" in kselftests.h:
> 
> 	/* TODO: how does "error" differ from "fail" or "skip"? */
> 
> If you meant the distintion less literally then I'd say the sparse_masks
> comparison to CPUID would be a failure. What I had in mind is that it tries to
> validate a resctrl interface relevant to non-contiguous CBMs. If it fails
> there is probably something wrong with the code concerning non-contiguous CBMs.

Wrong with which code? As I understand this particular check compares the
resctrl view of the world to the hardware realities. If this check fails
then I do not think this is an issue with the test code (which would make it a test
error) but instead a resctrl bug and thus a test failure.

> On the other hand writing contiguous CBMs shouldn't fail as far as the
> non-contiguous CBMs in CAT test is concerned. So if that fails there might be
> something wrong on a higher level and I'd say that can be more of an error than
> a failure.

I think that the write_schemata() can fail for a variety of reasons, some may
indicate an issue with the test while some may indicate an issue with resctrl.
It is not possible for the caller of write_schemata() to distinguish.

> But I'm just saying how I undestood it so far. If there is some clear
> distinction between error and failure definitions I could try to separate it
> more explicitly.

I do not think it is possible to clearly distinguish between error and failure.
These are already lumped together as a ksft_test_result_fail() anyway so no
risk of confusion to folks just running the tests.
I think the final test result may be confusing to folks parsing the
resctrl selftest internals:

	run_single_test()
	{
		...
		ret = test->run_test(test, uparams);
		ksft_test_result(!ret, "%s: test\n", test->name);
		...
	}

above means that a test returning negative or greater than zero value is
considered a test failure and resctrl tests may return either in the case of
an actual test failure ... but from user perspective there is no difference
so I do not think it is an issue, just lack of consistency in the resctrl
test internals in cases like write_schemata() failure where a possible
test fail is captured as a test error. 

I do not think it is required to be strict here. Keeping "test returns
negative or greater than zero on test failure" seems reasonable to me.

Reinette

  reply	other threads:[~2024-02-01 19:47 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-01-25 11:09 [PATCH v3 0/5] selftests/resctrl: Add non-contiguous CBMs in Intel CAT selftest Maciej Wieczor-Retman
2024-01-25 11:10 ` [PATCH v3 1/5] selftests/resctrl: Add test groups and name L3 CAT test L3_CAT Maciej Wieczor-Retman
2024-01-25 11:10 ` [PATCH v3 2/5] selftests/resctrl: Add helpers for the non-contiguous test Maciej Wieczor-Retman
2024-01-25 12:14   ` Ilpo Järvinen
2024-01-26 18:58     ` Reinette Chatre
2024-01-31 11:57       ` Maciej Wieczor-Retman
2024-01-31 12:04         ` Ilpo Järvinen
2024-01-26 21:08   ` Reinette Chatre
2024-01-31 11:48     ` Maciej Wieczor-Retman
2024-01-25 11:12 ` [PATCH v3 3/5] selftests/resctrl: Split validate_resctrl_feature_request() Maciej Wieczor-Retman
2024-01-25 11:46   ` Ilpo Järvinen
2024-01-31 10:05     ` Maciej Wieczor-Retman
2024-01-25 11:12 ` [PATCH v3 4/5] selftests/resctrl: Add resource_info_file_exists() Maciej Wieczor-Retman
2024-01-25 12:16   ` Ilpo Järvinen
2024-01-26 21:08   ` Reinette Chatre
2024-01-31 12:07     ` Maciej Wieczor-Retman
2024-01-25 11:13 ` [PATCH v3 5/5] selftests/resctrl: Add non-contiguous CBMs CAT test Maciej Wieczor-Retman
2024-01-26 21:10   ` Reinette Chatre
2024-01-31 12:55     ` Maciej Wieczor-Retman
2024-02-01 19:47       ` Reinette Chatre [this message]
2024-02-02 10:17         ` Maciej Wieczor-Retman
2024-02-02 17:10           ` Reinette Chatre

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=6a7b14a7-966d-4953-9b9b-0c847c83ddbb@intel.com \
    --to=reinette.chatre@intel.com \
    --cc=fenghua.yu@intel.com \
    --cc=ilpo.jarvinen@linux.intel.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=maciej.wieczor-retman@intel.com \
    --cc=shuah@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox