Re: KCIDB: Support one more test status

public inbox for kernelci@lists.linux.dev

From: Nikolai Kondrashov <Nikolai.Kondrashov@redhat.com>
To: "Bird, Tim" <Tim.Bird@sony.com>,
	"kernelci@lists.linux.dev" <kernelci@lists.linux.dev>,
	Dmitry Vyukov <dvyukov@google.com>,
	Cristian Marussi <cristian.marussi@arm.com>,
	Alice Ferrazzi <alicef@gentoo.org>,
	Philip Li <philip.li@intel.com>,
	Vishal Bhoj <vishal.bhoj@linaro.org>,
	"automated-testing@lists.yoctoproject.org"
	<automated-testing@lists.yoctoproject.org>,
	CKI <cki-project@redhat.com>, Mark Brown <broonie@kernel.org>,
	Johnson George <Johnson.George@microsoft.com>,
	Sachin Sant <sachinp@linux.ibm.com>
Subject: Re: KCIDB: Support one more test status
Date: Thu, 20 Apr 2023 15:08:42 +0300
Message-ID: <a5747b70-2dbe-769d-82be-9f9e96ee283f@redhat.com>
In-Reply-To: <BYAPR13MB2503FE8D3D92A95B8B331B8DFD629@BYAPR13MB2503.namprd13.prod.outlook.com>

Hi Tim,

Thanks a lot for your response!
I will do some snipping and answering below.

On 4/19/23 22:38, Bird, Tim wrote:
>> After the testing is done, or we gave up trying, we send the same KCIDB test
>> objects to the database, but this time only containing whatever results we
>> got, including the "status" fields. However, with the current set of status
>> strings [1], the only way we can try to express "wanted to run, but couldn't"
>> is with "SKIP", which is not supposed to alert anyone, yet this situation
>> should be treated as a problem.
>
> Why can't "wanted to run, but couldn't" be expressed with "ERROR"?

This is a matter of areas of responsibility, and of distinguishing who should
be fixing the problem.

The three layers I listed below correspond to the three distinct parties
involved in testing at Red Hat and in other large CI systems: "CODE" is for
kernel developers/maintainers, "TEST" is for test maintainers, and "HARNESS+"
is for CI system maintainers.

At Red Hat we have the CKI project, which is responsible for maintaining the
pipeline, builds, provisioning, reporting, etc. - "HARNESS+". Then we have *a
lot* of test maintainers, both internal (for the tests Red Hat needs) and
external (for test suites like LTP) - "TEST". Finally, we have the kernel
developers/maintainers, of course - "CODE".

Naturally, if there is an issue with the test (normally reported as ERROR),
we don't want to bother kernel developers. A test maintainer would need to
deal with that (and they would get their notification), although they would
also be interested in regular PASS/FAIL results.

Similarly, if the CI system couldn't manage to run the test, we wouldn't want
to report it as ERROR, because that would alert the test maintainer even
though it isn't their fault at all, and they shouldn't waste their time
investigating it.

Now, of course, this particular split is not always there, or it's not so 
clear-cut. E.g. kunit tests are normally maintained by kernel developers 
themselves. So they would be interested in both "CODE" and "TEST" layers for 
those, and so wouldn't really need the "ERROR" status - "FAIL" would be enough.

CI system maintainers often take the role of test maintainers as well, so
they wouldn't really need to distinguish "TEST" and "HARNESS+". For them it
would be "TEST+", and they wouldn't need the "MISS" status - "ERROR" would be
enough.

However, this split (and the various statuses) is a good tool for handling all 
the various responsibility combinations the CI systems submitting to KCIDB 
have. It allows precise targeting of notifications and dashboard data, saving 
time and effort in many cases.
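
As an illustration only, the routing described above could be sketched like
this in Python. The mapping and the function name are hypothetical and are not
part of KCIDB; the sketch only shows who is responsible for fixing a problem,
not everyone who might want to see a result.

```python
# Hypothetical sketch: the party responsible for fixing a problem, by status.
# Party names follow the CODE / TEST / HARNESS+ split described above.
RESPONSIBLE = {
    "FAIL": "kernel developers",      # CODE: the tested code failed
    "ERROR": "test maintainers",      # TEST: the test itself failed
    "MISS": "CI system maintainers",  # HARNESS+: the test could not be run
}


def party_to_alert(status):
    """Return the party to alert, or None if the status raises no alert."""
    return RESPONSIBLE.get(status)


for status in ("FAIL", "ERROR", "MISS", "PASS", "DONE", "SKIP"):
    print(status, "->", party_to_alert(status))
```

A CI system where one team covers several layers would simply map several
statuses to the same party.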

>> We propose to call this new status "MISS" (as in "the test result should be
>> there, but isn't"), and think it would be useful to others as well.
>>
>> We can break down the testing stack into three layers: the tested code, the
>> test, and the harness (and everything above it) that runs the test. If we then
>> express each existing test status as one trinary outcome per each of those
>> layers, we would get this table (in order of descending status priority):
>>
>>       STATUS      CODE TEST HARNESS+           LEGEND
>>
>>       FAIL        ❌   ✅   ✅                 ❌ - failure
>>       ERROR       ➖   ❌   ✅                 ✅ - success
>>       PASS        ✅   ✅   ✅                 ➖ - no data
>>       DONE        ➖   ✅   ✅
>>       SKIP        ➖   ➖   ✅
>>                   ➖   ➖   ➖
>>
>> If you look at the above closely, you will notice one possible state missing
>> (because we didn't need to express failing harnesses), and that is the status
>> we want to introduce:
>>
>>       STATUS      CODE TEST HARNESS+           LEGEND
>>
>>       FAIL        ❌   ✅   ✅                 ❌ - failure
>>       ERROR       ➖   ❌   ✅                 ✅ - success
>>   =>  MISS        ➖   ➖   ❌ <=              ➖ - no data
>>       PASS        ✅   ✅   ✅
>>       DONE        ➖   ✅   ✅
>>       SKIP        ➖   ➖   ✅
>>                   ➖   ➖   ➖
>>
>> Please respond with comments, objections, and (counter-)proposals,
>> if you have them.
> 
> I don't understand the rationale for distinguishing a test error from a harness
> error.  In either case the test was not executed properly, and so there is no
> useful test result data available.  Diagnostic information should enable
> the user to determine whether the problem was due to the test code failing
> or the test harness failing.

This works when the test maintainer and the harness/framework/CI system
maintainer are the same person or team. It doesn't work when everything from
the CI system down to the test (suite) harness is maintained by one team, and
the test by a completely different team (e.g. CKI and LTP).
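
For illustration, the table quoted above can be written down as data, which
makes it easy to ask which layer a status blames. This is a minimal Python
sketch; the names LAYERS, failing_layer and status_for are hypothetical and
not taken from KCIDB.

```python
# Hypothetical encoding of the status table quoted above.
# Each status maps to (CODE, TEST, HARNESS+) outcomes:
# True = success, False = failure, None = no data.
LAYER_NAMES = ("CODE", "TEST", "HARNESS+")

# Listed in the table's order of descending status priority.
LAYERS = {
    "FAIL": (False, True, True),
    "ERROR": (None, False, True),
    "MISS": (None, None, False),
    "PASS": (True, True, True),
    "DONE": (None, True, True),
    "SKIP": (None, None, True),
}


def failing_layer(status):
    """Return the name of the layer that failed, or None if none did."""
    for name, outcome in zip(LAYER_NAMES, LAYERS[status]):
        if outcome is False:
            return name
    return None


def status_for(code, test, harness):
    """Return the status matching the three outcomes, or None if none does.

    All-None outcomes match no status, like the unnamed last table row.
    """
    for status, outcomes in LAYERS.items():
        if outcomes == (code, test, harness):
            return status
    return None


print(failing_layer("MISS"))
print(status_for(None, False, True))
```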

> I think I'm missing something.  Are you trying to distinguish these so you
> can determine whether there is a problem with the test itself, vs. the harness?

Yes.

> Are you automatically re-running a test if the harness is the problem?

We do try rerunning tests when we hit a faulty host in our inventory (this
happens), or when e.g. a network problem occurs. However, at some point we
have to give up, and then we need a way to say: "this test result is not just
missing or in progress (as specified by a missing "status" property), but
we're done testing, and we couldn't run that test".
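
A rough sketch of that retry-then-give-up logic, in Python. Everything here
is hypothetical (the InfrastructureError exception, the run_once callable,
the attempt limit) and the result is reduced to a dict holding only the
"status" property; it is not how CKI or KCIDB implement it.

```python
class InfrastructureError(Exception):
    """Hypothetical: the harness or anything above it could not run the test."""


def run_with_retries(run_once, max_attempts=3):
    """Try to run a test a few times; report MISS once we give up.

    run_once is a hypothetical callable that returns a status string such
    as "PASS" or "FAIL", or raises InfrastructureError when the test could
    not be run at all (faulty host, network problem, and so on).
    """
    for _ in range(max_attempts):
        try:
            return {"status": run_once()}
        except InfrastructureError:
            continue  # the test never ran: try again
    # Done testing, and the test could not be run.
    return {"status": "MISS"}


def always_broken():
    raise InfrastructureError("no usable host")


print(run_with_retries(always_broken))
```

A test that is still planned or in progress would be sent with no "status"
at all, which is what separates it from a MISS.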

Because we usually run multiple suites on each machine, one after another, if 
one of them crashes/locks up the machine (we're testing the kernel after all), 
then the following suites won't be able to run. In this case we also need a 
way to say "we finished testing, but those tests didn't even get to run".
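
The cascade described above could look roughly like this in Python. The
run_suite callable and its return shape are assumptions made for the sketch;
which status the crashing suite itself gets is left to that callable.

```python
def run_suites(suites, run_suite):
    """Run suites one after another on a single machine.

    run_suite is a hypothetical callable taking a suite name and returning
    (status, machine_usable). Once the machine is no longer usable, every
    remaining suite is reported as MISS because it never got to run.
    """
    results = {}
    machine_usable = True
    for name in suites:
        if not machine_usable:
            results[name] = "MISS"
            continue
        results[name], machine_usable = run_suite(name)
    return results


def fake_run(name):
    # Pretend the second suite locks up the machine.
    if name == "suite-b":
        return "FAIL", False
    return "PASS", True


print(run_suites(["suite-a", "suite-b", "suite-c", "suite-d"], fake_run))
```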

> Why do you want to distinguish these error cases?

As described above, to alert the right people and avoid wasting the time of
others.

Nick
