From: Peter Senna Tschudin <peter.senna@linux.intel.com>
To: "Zbigniew Kempczyński" <zbigniew.kempczynski@intel.com>
Cc: "igt-dev@lists.freedesktop.org" <igt-dev@lists.freedesktop.org>,
Ryszard Knop <ryszard.knop@intel.com>,
Janusz Krzysztofik <janusz.krzysztofik@linux.intel.com>,
Lucas De Marchi <lucas.demarchi@intel.com>,
luciano.coelho@intel.com, nirmoy.das@intel.com,
stuart.summers@intel.com, himal.prasad.ghimiray@intel.com,
katarzyna.piecielska@intel.com
Subject: Re: [PATCH i-g-t v8] igt-runner fact checking
Date: Mon, 25 Nov 2024 11:21:08 +0100 [thread overview]
Message-ID: <923b0d07-a4bc-4c40-a4bc-65829360fb94@linux.intel.com> (raw)
In-Reply-To: <20241125094949.gxqftc3fccggwfo5@zkempczy-mobl2>
On 25.11.2024 10:49, Zbigniew Kempczyński wrote:
> On Thu, Nov 21, 2024 at 03:22:30PM +0100, Peter Senna Tschudin wrote:
>> When using igt-runner, collect facts before each test and after the
>> last test, and report when facts change. The facts are:
>> - GPUs on PCI bus: hardware.pci.gpu_at_addr.0000:03:00.0: 8086:e20b Intel Battlemage (Gen20)
>> - Associations between PCI GPU and DRM card: hardware.pci.drm_card_at_addr.0000:03:00.0: card1
>> - Kernel taints: kernel.is_tainted.taint_warn: true
>> - GPU kernel modules loaded: kernel.kmod_is_loaded.i915: true
>>
>> This change imposes little execution overhead and adds just a few
>> lines of logging. The facts will be printed on normal igt-runner
>> output. Here is a real example from our CI shwoing
>> hotreplug-lateclose changing the DRM card number and tainting the
>> kernel on the abort path:
>>
>> [245.316207] [056/121] (816s left) core_hotunplug (hotreplug-lateclose)
>> [245.383596] Starting subtest: hotreplug-lateclose
>> [249.843361] Aborting: Lockdep not active
>> [249.858249] [FACT core_hotunplug (hotreplug-lateclose)] changed: hardware.pci.drm_card_at_addr.0000:00:02.0: card0 -> card1
>> [249.858392] [FACT core_hotunplug (hotreplug-lateclose)] new: kernel.is_tainted.taint_die: true
>> [249.859075] Closing watchdogs
>
> <cut>
>
> Regardless implementation - I wondered a bit about igt_runner and using
> it by the others - instead of turning on gathering the facts from the
> default I would add separate option to enable it. I see -f/--facts would
> appropriate. This would allow to enable/disable it from CI perspective
> if we would notice some problems - instead reverting the code we can
> just disable it from CI perspective.
Thank you for the input. Can you expand on the cost for others? It is just a few extra lines of log.
igt-facts are primarily looking for sub-tests that make changes to the environment, that cause issues downstream. Here is an example. Imagine test B runs after test A, and that there are 100 tests in between. If test A has tainted the kernel or changed modules loaded, it can cause B to fail. The value is identifying test A as the offender. Can you expand on how this is a problem for others?
The secondary goal is to report when weird stuff happens such as disappearing PCI GPU. As the facts goal is to detect events that are not expected to happen, having options to choose facts or to disable it makes no sense.
You mention folks who have non-PCI gpu. Covering their use case can be added later if they want. Lets wait for them to manifest interest instead of trying to come up with something perfect. Please :-)
Just to give you an idea of how pervasive the problem of tests changing the environment is: 49% of all test-lists from IGTPW_12121 had at least one kernel taint. With a little bit of an approximation we can estimate that each sub-test had a 25% chance of running in a tainted kernel.
For me having 25% chance of any sub-tests running in a tainted kernel is a problem, for everyone. Can you expand on why this would be different for others?
In short, my take is:
- overhead is low: just a few lines of code
- the value is there for everyone
- if we need to adapt facts for others, I will be happy to do it when others manifest
Can I get your reviewed-by? Please :-)
Thanks
next prev parent reply other threads:[~2024-11-25 10:22 UTC|newest]
Thread overview: 113+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-11-02 11:37 [PATCH i-g-t] igt-runner fact checking Peter Senna Tschudin
2024-11-04 19:27 ` ✓ CI.xeBAT: success for " Patchwork
2024-11-05 5:09 ` [PATCH i-g-t] " Peter Senna Tschudin
2024-11-05 8:18 ` ✗ CI.xeFULL: failure for " Patchwork
2024-11-06 9:28 ` [PATCH i-g-t v2] " Peter Senna Tschudin
2024-11-06 12:44 ` Kamil Konieczny
2024-11-07 7:03 ` Peter Senna Tschudin
2024-11-07 17:44 ` Kamil Konieczny
2024-11-06 11:15 ` ✗ Fi.CI.BAT: failure for igt-runner fact checking (rev2) Patchwork
2024-11-06 11:23 ` ✓ CI.xeBAT: success " Patchwork
2024-11-06 14:13 ` [PATCH i-g-t] igt-runner fact checking Lucas De Marchi
2024-11-07 6:57 ` [PATCH i-g-t v3] " Peter Senna Tschudin
2024-11-07 7:13 ` ✗ GitLab.Pipeline: warning for igt-runner fact checking (rev3) Patchwork
2024-11-07 7:18 ` [PATCH i-g-t] igt-runner fact checking Peter Senna Tschudin
2024-11-07 15:55 ` Lucas De Marchi
2024-11-07 17:48 ` Peter Senna Tschudin
2024-11-07 19:29 ` Lucas De Marchi
2024-11-08 5:30 ` Peter Senna Tschudin
2024-11-08 16:24 ` Lucas De Marchi
2024-11-08 1:15 ` Knop, Ryszard
2024-11-08 6:51 ` Peter Senna Tschudin
2024-11-08 13:41 ` Knop, Ryszard
2024-11-07 7:26 ` ✓ CI.xeBAT: success for igt-runner fact checking (rev3) Patchwork
2024-11-07 7:30 ` [PATCH i-g-t v3] igt-runner fact checking Peter Senna Tschudin
2024-11-07 17:39 ` Kamil Konieczny
2024-11-07 7:51 ` ✓ Fi.CI.BAT: success for igt-runner fact checking (rev3) Patchwork
2024-11-07 8:40 ` ✗ Fi.CI.IGT: failure " Patchwork
2024-11-07 13:21 ` ✗ CI.xeFULL: failure for igt-runner fact checking (rev2) Patchwork
2024-11-08 12:54 ` ✗ CI.xeFULL: failure for igt-runner fact checking (rev3) Patchwork
2024-11-09 7:15 ` [PATCH i-g-t v4] igt-runner fact checking Peter Senna Tschudin
2024-11-09 7:46 ` [PATCH i-g-t v5] " Peter Senna Tschudin
2024-11-09 8:33 ` ✓ Fi.CI.BAT: success for igt-runner fact checking (rev5) Patchwork
2024-11-09 8:36 ` ✗ CI.xeBAT: failure " Patchwork
2024-11-11 5:55 ` Peter Senna Tschudin
2024-11-09 9:33 ` ✗ Fi.CI.IGT: " Patchwork
2024-11-11 5:54 ` Peter Senna Tschudin
2024-11-10 5:09 ` ✗ CI.xeFULL: " Patchwork
2024-11-11 5:53 ` Peter Senna Tschudin
2024-11-18 8:24 ` [PATCH i-g-t v6] igt-runner fact checking Peter Senna Tschudin
2024-11-18 13:07 ` Luca Coelho
2024-11-18 16:03 ` Peter Senna Tschudin
2024-11-19 8:19 ` Luca Coelho
2024-11-19 10:07 ` Peter Senna Tschudin
2024-11-19 10:24 ` Luca Coelho
2024-11-20 6:09 ` Peter Senna Tschudin
2024-11-18 20:45 ` ✓ CI.xeBAT: success for igt-runner fact checking (rev6) Patchwork
2024-11-18 21:00 ` ✗ Fi.CI.BAT: failure " Patchwork
2024-11-19 5:31 ` Peter Senna Tschudin
2024-11-19 5:08 ` ✗ CI.xeFULL: " Patchwork
2024-11-19 6:15 ` Peter Senna Tschudin
2024-11-21 12:35 ` [PATCH i-g-t v7] igt-runner fact checking Peter Senna Tschudin
2024-11-21 14:22 ` [PATCH i-g-t v8] " Peter Senna Tschudin
2024-11-25 9:49 ` Zbigniew Kempczyński
2024-11-25 10:21 ` Peter Senna Tschudin [this message]
2024-11-25 11:32 ` Zbigniew Kempczyński
2024-11-21 20:48 ` ✓ Xe.CI.BAT: success for igt-runner fact checking (rev8) Patchwork
2024-11-21 20:57 ` ✗ i915.CI.BAT: failure " Patchwork
2024-11-22 6:36 ` Peter Senna Tschudin
2024-11-22 8:08 ` Illipilli, TejasreeX
2024-11-22 8:06 ` ✓ i915.CI.BAT: success " Patchwork
2024-11-22 8:11 ` ✗ Xe.CI.Full: failure " Patchwork
2024-11-22 8:27 ` Peter Senna Tschudin
2024-11-25 7:15 ` Peter Senna Tschudin
2024-11-25 7:20 ` Musial, Ewelina
2024-11-25 7:27 ` Musial, Ewelina
2024-11-25 10:54 ` Illipilli, TejasreeX
2024-11-24 15:01 ` ✗ i915.CI.Full: " Patchwork
2024-11-25 6:57 ` Peter Senna Tschudin
2024-11-25 10:54 ` Illipilli, TejasreeX
2024-11-25 10:39 ` ✓ i915.CI.Full: success " Patchwork
2024-11-29 7:08 ` [PATCH i-g-t v9] igt-runner fact checking Peter Senna Tschudin
2024-12-04 13:17 ` Piatkowski, Dominik Karol
2024-11-29 7:45 ` ✓ Xe.CI.BAT: success for igt-runner fact checking (rev9) Patchwork
2024-11-29 8:07 ` ✗ i915.CI.BAT: failure " Patchwork
2024-11-29 8:16 ` Peter Senna Tschudin
2024-11-29 13:48 ` Illipilli, TejasreeX
2024-11-29 13:42 ` ✓ i915.CI.BAT: success " Patchwork
2024-11-29 17:26 ` ✗ Xe.CI.Full: failure " Patchwork
2024-12-02 5:01 ` Peter Senna Tschudin
2024-11-29 17:28 ` ✗ i915.CI.Full: " Patchwork
2024-12-02 5:07 ` Peter Senna Tschudin
2024-12-03 6:43 ` Illipilli, TejasreeX
2024-12-02 14:29 ` ✓ i915.CI.Full: success " Patchwork
2024-12-05 4:54 ` [PATCH i-g-t v10] igt-runner fact checking Peter Senna Tschudin
2024-12-05 9:08 ` Piatkowski, Dominik Karol
2024-12-06 11:42 ` Kamil Konieczny
2024-12-06 13:16 ` Peter Senna Tschudin
2024-12-06 16:46 ` Kamil Konieczny
2024-12-05 5:41 ` ✓ i915.CI.BAT: success for igt-runner fact checking (rev10) Patchwork
2024-12-05 6:07 ` ✓ Xe.CI.BAT: " Patchwork
2024-12-05 6:58 ` ✗ i915.CI.Full: failure " Patchwork
2024-12-05 8:15 ` Peter Senna Tschudin
2024-12-05 13:01 ` Illipilli, TejasreeX
2024-12-05 8:02 ` ✗ Xe.CI.Full: " Patchwork
2024-12-05 8:35 ` Peter Senna Tschudin
2024-12-05 10:51 ` [PATCH i-g-t v11] igt-runner fact checking Peter Senna Tschudin
2024-12-09 10:53 ` Zbigniew Kempczyński
2024-12-09 17:16 ` Kamil Konieczny
2024-12-10 12:00 ` Knop, Ryszard
2024-12-10 14:14 ` Knop, Ryszard
2024-12-05 12:59 ` ✓ i915.CI.Full: success for igt-runner fact checking (rev10) Patchwork
2024-12-12 7:15 ` [PATCH i-g-t v12 0/3] igt_facts for fact tracking Peter Senna Tschudin
2024-12-12 7:15 ` [PATCH i-g-t v12 1/3] lib/igt_facts: Library and unit testing " Peter Senna Tschudin
2024-12-12 16:19 ` Zbigniew Kempczyński
2024-12-12 7:15 ` [PATCH i-g-t v12 2/3] tools/lsfacts: Add tool for listing facts Peter Senna Tschudin
2024-12-12 16:20 ` Zbigniew Kempczyński
2024-12-12 7:15 ` [PATCH i-g-t v12 3/3] runner/executor: Integrate igt_facts functionality Peter Senna Tschudin
2024-12-12 16:23 ` Zbigniew Kempczyński
2024-12-12 17:33 ` Kamil Konieczny
2024-12-12 8:16 ` ✓ Xe.CI.BAT: success for igt-runner fact checking (rev12) Patchwork
2024-12-12 8:43 ` ✓ i915.CI.BAT: " Patchwork
2024-12-12 12:57 ` ✗ i915.CI.Full: failure " Patchwork
2024-12-12 14:30 ` ✗ Xe.CI.Full: " Patchwork
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=923b0d07-a4bc-4c40-a4bc-65829360fb94@linux.intel.com \
--to=peter.senna@linux.intel.com \
--cc=himal.prasad.ghimiray@intel.com \
--cc=igt-dev@lists.freedesktop.org \
--cc=janusz.krzysztofik@linux.intel.com \
--cc=katarzyna.piecielska@intel.com \
--cc=lucas.demarchi@intel.com \
--cc=luciano.coelho@intel.com \
--cc=nirmoy.das@intel.com \
--cc=ryszard.knop@intel.com \
--cc=stuart.summers@intel.com \
--cc=zbigniew.kempczynski@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox