* [PATCH v2 0/3] kselftest: Add test to report device log errors
@ 2024-07-05 23:29 Nícolas F. R. A. Prado
2024-07-05 23:29 ` [PATCH v2 1/3] kselftest: devices: Move discoverable devices test to subdirectory Nícolas F. R. A. Prado
` (3 more replies)
0 siblings, 4 replies; 12+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-07-05 23:29 UTC (permalink / raw)
To: Shuah Khan, Greg Kroah-Hartman
Cc: kernel, linux-kselftest, linux-kernel, kernelci,
Nícolas F. R. A. Prado
Log errors are the most widely used mechanism for reporting issues in
the kernel. When an error is logged using the device helpers, eg
dev_err(), it gets metadata attached that identifies the subsystem and
device where the message is coming from. This series makes use of that
metadata in a new test to report which devices logged errors.
The first two patches move a test and a helper script to keep things
organized before this new test is added in the third patch.
It is expected that there might be many false-positive error messages
throughout the drivers code which will be reported by this test. By
having this test in the first place and working through the results we
can address those occurrences by adjusting the loglevel of the messages
that turn out to not be real errors that require the user's attention.
It will also motivate additional error messages to be introduced in the
code to detect real errors where they turn out to be missing, since
it will be possible to detect said issues automatically.
As an example, below you can see the test result for
mt8192-asurada-spherion. The single standing issue has been investigated
and will be addressed in an EC firmware update [1]:
TAP version 13
1..1
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `model_name' property: -6
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `energy_full_design' property: -6
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
not ok 1 +power_supply:sbs-8-000b
Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
[1] https://lore.kernel.org/all/cf4d8131-4b63-4c7a-9f27-5a0847c656c4@notapiano
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
---
Changes in v2:
- Rebased onto next-20240703
- Link to v1: https://lore.kernel.org/r/20240423-dev-err-log-selftest-v1-0-690c1741d68b@collabora.com
---
Nícolas F. R. A. Prado (3):
kselftest: devices: Move discoverable devices test to subdirectory
kselftest: Move ksft helper module to common directory
kselftest: devices: Add test to detect device error logs
tools/testing/selftests/Makefile | 4 +-
tools/testing/selftests/devices/Makefile | 4 -
.../testing/selftests/devices/error_logs/Makefile | 3 +
.../devices/error_logs/test_device_error_logs.py | 85 ++++++++++++++++++++++
tools/testing/selftests/devices/probe/Makefile | 4 +
.../{ => probe}/boards/Dell Inc.,XPS 13 9300.yaml | 0
.../{ => probe}/boards/google,spherion.yaml | 0
.../{ => probe}/test_discoverable_devices.py | 7 +-
.../selftests/{devices => kselftest}/ksft.py | 0
9 files changed, 101 insertions(+), 6 deletions(-)
---
base-commit: 0b58e108042b0ed28a71cd7edf5175999955b233
change-id: 20240421-dev-err-log-selftest-28f5b8fc7cd0
Best regards,
--
Nícolas F. R. A. Prado <nfraprado@collabora.com>
^ permalink raw reply [flat|nested] 12+ messages in thread
* [PATCH v2 1/3] kselftest: devices: Move discoverable devices test to subdirectory
2024-07-05 23:29 [PATCH v2 0/3] kselftest: Add test to report device log errors Nícolas F. R. A. Prado
@ 2024-07-05 23:29 ` Nícolas F. R. A. Prado
2024-07-05 23:29 ` [PATCH v2 2/3] kselftest: Move ksft helper module to common directory Nícolas F. R. A. Prado
` (2 subsequent siblings)
3 siblings, 0 replies; 12+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-07-05 23:29 UTC (permalink / raw)
To: Shuah Khan, Greg Kroah-Hartman
Cc: kernel, linux-kselftest, linux-kernel, kernelci,
Nícolas F. R. A. Prado
Move the discoverable devices test to a subdirectory to allow other
related tests to be added to the devices directory.
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
---
tools/testing/selftests/Makefile | 2 +-
tools/testing/selftests/devices/{ => probe}/Makefile | 2 +-
.../selftests/devices/{ => probe}/boards/Dell Inc.,XPS 13 9300.yaml | 0
tools/testing/selftests/devices/{ => probe}/boards/google,spherion.yaml | 0
tools/testing/selftests/devices/{ => probe}/ksft.py | 0
.../testing/selftests/devices/{ => probe}/test_discoverable_devices.py | 0
6 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 9039f3709aff..37214201d974 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -13,7 +13,7 @@ TARGETS += core
TARGETS += cpufreq
TARGETS += cpu-hotplug
TARGETS += damon
-TARGETS += devices
+TARGETS += devices/probe
TARGETS += dmabuf-heaps
TARGETS += drivers/dma-buf
TARGETS += drivers/s390x/uvdevice
diff --git a/tools/testing/selftests/devices/Makefile b/tools/testing/selftests/devices/probe/Makefile
similarity index 77%
rename from tools/testing/selftests/devices/Makefile
rename to tools/testing/selftests/devices/probe/Makefile
index ca29249b30c3..7a6eaa031cfe 100644
--- a/tools/testing/selftests/devices/Makefile
+++ b/tools/testing/selftests/devices/probe/Makefile
@@ -1,4 +1,4 @@
TEST_PROGS := test_discoverable_devices.py
TEST_FILES := boards ksft.py
-include ../lib.mk
+include ../../lib.mk
diff --git a/tools/testing/selftests/devices/boards/Dell Inc.,XPS 13 9300.yaml b/tools/testing/selftests/devices/probe/boards/Dell Inc.,XPS 13 9300.yaml
similarity index 100%
rename from tools/testing/selftests/devices/boards/Dell Inc.,XPS 13 9300.yaml
rename to tools/testing/selftests/devices/probe/boards/Dell Inc.,XPS 13 9300.yaml
diff --git a/tools/testing/selftests/devices/boards/google,spherion.yaml b/tools/testing/selftests/devices/probe/boards/google,spherion.yaml
similarity index 100%
rename from tools/testing/selftests/devices/boards/google,spherion.yaml
rename to tools/testing/selftests/devices/probe/boards/google,spherion.yaml
diff --git a/tools/testing/selftests/devices/ksft.py b/tools/testing/selftests/devices/probe/ksft.py
similarity index 100%
rename from tools/testing/selftests/devices/ksft.py
rename to tools/testing/selftests/devices/probe/ksft.py
diff --git a/tools/testing/selftests/devices/test_discoverable_devices.py b/tools/testing/selftests/devices/probe/test_discoverable_devices.py
similarity index 100%
rename from tools/testing/selftests/devices/test_discoverable_devices.py
rename to tools/testing/selftests/devices/probe/test_discoverable_devices.py
--
2.45.2
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 2/3] kselftest: Move ksft helper module to common directory
2024-07-05 23:29 [PATCH v2 0/3] kselftest: Add test to report device log errors Nícolas F. R. A. Prado
2024-07-05 23:29 ` [PATCH v2 1/3] kselftest: devices: Move discoverable devices test to subdirectory Nícolas F. R. A. Prado
@ 2024-07-05 23:29 ` Nícolas F. R. A. Prado
2024-07-05 23:29 ` [PATCH v2 3/3] kselftest: devices: Add test to detect device error logs Nícolas F. R. A. Prado
2024-07-10 13:11 ` [PATCH v2 0/3] kselftest: Add test to report device log errors Greg Kroah-Hartman
3 siblings, 0 replies; 12+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-07-05 23:29 UTC (permalink / raw)
To: Shuah Khan, Greg Kroah-Hartman
Cc: kernel, linux-kselftest, linux-kernel, kernelci,
Nícolas F. R. A. Prado
Move the ksft python module, which provides generic helpers for
kselftests, to a common directory so it can be more easily shared
between different tests.
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
---
tools/testing/selftests/Makefile | 1 +
tools/testing/selftests/devices/probe/Makefile | 2 +-
tools/testing/selftests/devices/probe/test_discoverable_devices.py | 7 ++++++-
tools/testing/selftests/{devices/probe => kselftest}/ksft.py | 0
4 files changed, 8 insertions(+), 2 deletions(-)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 37214201d974..7bd78b9f5cdd 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -251,6 +251,7 @@ ifdef INSTALL_PATH
install -m 744 kselftest/runner.sh $(INSTALL_PATH)/kselftest/
install -m 744 kselftest/prefix.pl $(INSTALL_PATH)/kselftest/
install -m 744 kselftest/ktap_helpers.sh $(INSTALL_PATH)/kselftest/
+ install -m 744 kselftest/ksft.py $(INSTALL_PATH)/kselftest/
install -m 744 run_kselftest.sh $(INSTALL_PATH)/
rm -f $(TEST_LIST)
@ret=1; \
diff --git a/tools/testing/selftests/devices/probe/Makefile b/tools/testing/selftests/devices/probe/Makefile
index 7a6eaa031cfe..f630108c3fdf 100644
--- a/tools/testing/selftests/devices/probe/Makefile
+++ b/tools/testing/selftests/devices/probe/Makefile
@@ -1,4 +1,4 @@
TEST_PROGS := test_discoverable_devices.py
-TEST_FILES := boards ksft.py
+TEST_FILES := boards
include ../../lib.mk
diff --git a/tools/testing/selftests/devices/probe/test_discoverable_devices.py b/tools/testing/selftests/devices/probe/test_discoverable_devices.py
index 8f2200540a1f..d94a74b8a054 100755
--- a/tools/testing/selftests/devices/probe/test_discoverable_devices.py
+++ b/tools/testing/selftests/devices/probe/test_discoverable_devices.py
@@ -16,12 +16,17 @@
import argparse
import glob
-import ksft
import os
import re
import sys
import yaml
+# Allow ksft module to be imported from different directory
+this_dir = os.path.dirname(os.path.realpath(__file__))
+sys.path.append(os.path.join(this_dir, "../../kselftest/"))
+
+import ksft
+
pci_controllers = []
usb_controllers = []
diff --git a/tools/testing/selftests/devices/probe/ksft.py b/tools/testing/selftests/kselftest/ksft.py
similarity index 100%
rename from tools/testing/selftests/devices/probe/ksft.py
rename to tools/testing/selftests/kselftest/ksft.py
--
2.45.2
^ permalink raw reply related [flat|nested] 12+ messages in thread
* [PATCH v2 3/3] kselftest: devices: Add test to detect device error logs
2024-07-05 23:29 [PATCH v2 0/3] kselftest: Add test to report device log errors Nícolas F. R. A. Prado
2024-07-05 23:29 ` [PATCH v2 1/3] kselftest: devices: Move discoverable devices test to subdirectory Nícolas F. R. A. Prado
2024-07-05 23:29 ` [PATCH v2 2/3] kselftest: Move ksft helper module to common directory Nícolas F. R. A. Prado
@ 2024-07-05 23:29 ` Nícolas F. R. A. Prado
2024-07-10 13:11 ` [PATCH v2 0/3] kselftest: Add test to report device log errors Greg Kroah-Hartman
3 siblings, 0 replies; 12+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-07-05 23:29 UTC (permalink / raw)
To: Shuah Khan, Greg Kroah-Hartman
Cc: kernel, linux-kselftest, linux-kernel, kernelci,
Nícolas F. R. A. Prado
Log errors are the most widely used mechanism for reporting issues in
the kernel. When an error is logged using the device helpers, eg
dev_err(), it gets metadata attached that identifies the subsystem and
device where the message is coming from. Introduce a new test that makes
use of that metadata to report which devices logged errors (or more
critical messages).
Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
---
tools/testing/selftests/Makefile | 1 +
.../testing/selftests/devices/error_logs/Makefile | 3 +
.../devices/error_logs/test_device_error_logs.py | 85 ++++++++++++++++++++++
3 files changed, 89 insertions(+)
diff --git a/tools/testing/selftests/Makefile b/tools/testing/selftests/Makefile
index 7bd78b9f5cdd..c4937c87df22 100644
--- a/tools/testing/selftests/Makefile
+++ b/tools/testing/selftests/Makefile
@@ -13,6 +13,7 @@ TARGETS += core
TARGETS += cpufreq
TARGETS += cpu-hotplug
TARGETS += damon
+TARGETS += devices/error_logs
TARGETS += devices/probe
TARGETS += dmabuf-heaps
TARGETS += drivers/dma-buf
diff --git a/tools/testing/selftests/devices/error_logs/Makefile b/tools/testing/selftests/devices/error_logs/Makefile
new file mode 100644
index 000000000000..d546c3fb0a7f
--- /dev/null
+++ b/tools/testing/selftests/devices/error_logs/Makefile
@@ -0,0 +1,3 @@
+TEST_PROGS := test_device_error_logs.py
+
+include ../../lib.mk
diff --git a/tools/testing/selftests/devices/error_logs/test_device_error_logs.py b/tools/testing/selftests/devices/error_logs/test_device_error_logs.py
new file mode 100755
index 000000000000..3dd56c8ec92c
--- /dev/null
+++ b/tools/testing/selftests/devices/error_logs/test_device_error_logs.py
@@ -0,0 +1,85 @@
+#!/usr/bin/env python3
+# SPDX-License-Identifier: GPL-2.0
+#
+# Copyright (c) 2024 Collabora Ltd
+#
+# This test checks for the presence of error (or more critical) log messages
+# coming from devices in the kernel log.
+#
+# One failed test case is reported for each device that has outputted error
+# logs. Devices with no errors do not produce a passing test case to avoid
+# polluting the results, therefore a successful run will list 0 tests run.
+#
+
+import glob
+import os
+import re
+import sys
+
+# Allow ksft module to be imported from different directory
+this_dir = os.path.dirname(os.path.realpath(__file__))
+sys.path.append(os.path.join(this_dir, "../../kselftest/"))
+
+import ksft
+
+kmsg = "/dev/kmsg"
+
+RE_log = re.compile(
+ r"(?P<prefix>[0-9]+),(?P<sequence>[0-9]+),(?P<timestamp>[0-9]+),(?P<flag>[^;]*)(,[^;]*)*;(?P<message>.*)"
+)
+RE_tag = re.compile(r" (?P<key>[^=]+)=(?P<value>.*)")
+
+PREFIX_ERROR = 3
+
+logs = []
+error_log_per_device = {}
+
+
+def parse_kmsg():
+ current_log = {}
+
+ with open(kmsg) as f:
+ os.set_blocking(f.fileno(), False)
+
+ for line in f:
+ tag_line = RE_tag.match(line)
+ log_line = RE_log.match(line)
+
+ if log_line:
+ if current_log:
+ logs.append(current_log) # Save last log
+
+ current_log = {
+ "prefix": int(log_line.group("prefix")),
+ "sequence": int(log_line.group("sequence")),
+ "timestamp": int(log_line.group("timestamp")),
+ "flag": log_line.group("flag"),
+ "message": log_line.group("message"),
+ }
+ elif tag_line:
+ current_log[tag_line.group("key")] = tag_line.group("value")
+
+
+def generate_per_device_error_log():
+ for log in logs:
+ if log.get("DEVICE") and log["prefix"] <= PREFIX_ERROR:
+ if not error_log_per_device.get(log["DEVICE"]):
+ error_log_per_device[log["DEVICE"]] = []
+ error_log_per_device[log["DEVICE"]].append(log)
+
+
+parse_kmsg()
+
+generate_per_device_error_log()
+num_tests = len(error_log_per_device)
+
+ksft.print_header()
+ksft.set_plan(num_tests)
+
+for device in error_log_per_device:
+ for log in error_log_per_device[device]:
+ ksft.print_msg(log["message"])
+ ksft.test_result_fail(device)
+if num_tests == 0:
+ ksft.print_msg("No device error logs found")
+ksft.finished()
--
2.45.2
^ permalink raw reply related [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/3] kselftest: Add test to report device log errors
2024-07-05 23:29 [PATCH v2 0/3] kselftest: Add test to report device log errors Nícolas F. R. A. Prado
` (2 preceding siblings ...)
2024-07-05 23:29 ` [PATCH v2 3/3] kselftest: devices: Add test to detect device error logs Nícolas F. R. A. Prado
@ 2024-07-10 13:11 ` Greg Kroah-Hartman
2024-07-10 21:49 ` Shuah Khan
3 siblings, 1 reply; 12+ messages in thread
From: Greg Kroah-Hartman @ 2024-07-10 13:11 UTC (permalink / raw)
To: Nícolas F. R. A. Prado
Cc: Shuah Khan, kernel, linux-kselftest, linux-kernel, kernelci
On Fri, Jul 05, 2024 at 07:29:53PM -0400, Nícolas F. R. A. Prado wrote:
> Log errors are the most widely used mechanism for reporting issues in
> the kernel. When an error is logged using the device helpers, eg
> dev_err(), it gets metadata attached that identifies the subsystem and
> device where the message is coming from. This series makes use of that
> metadata in a new test to report which devices logged errors.
>
> The first two patches move a test and a helper script to keep things
> organized before this new test is added in the third patch.
>
> It is expected that there might be many false-positive error messages
> throughout the drivers code which will be reported by this test. By
> having this test in the first place and working through the results we
> can address those occurrences by adjusting the loglevel of the messages
> that turn out to not be real errors that require the user's attention.
> It will also motivate additional error messages to be introduced in the
> code to detect real errors where they turn out to be missing, since
> it will be possible to detect said issues automatically.
>
> As an example, below you can see the test result for
> mt8192-asurada-spherion. The single standing issue has been investigated
> and will be addressed in an EC firmware update [1]:
>
> TAP version 13
> 1..1
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `model_name' property: -6
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `energy_full_design' property: -6
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> not ok 1 +power_supply:sbs-8-000b
> Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>
> [1] https://lore.kernel.org/all/cf4d8131-4b63-4c7a-9f27-5a0847c656c4@notapiano
>
> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/3] kselftest: Add test to report device log errors
2024-07-10 13:11 ` [PATCH v2 0/3] kselftest: Add test to report device log errors Greg Kroah-Hartman
@ 2024-07-10 21:49 ` Shuah Khan
2024-07-11 19:53 ` Shuah Khan
0 siblings, 1 reply; 12+ messages in thread
From: Shuah Khan @ 2024-07-10 21:49 UTC (permalink / raw)
To: Greg Kroah-Hartman, Nícolas F. R. A. Prado
Cc: Shuah Khan, kernel, linux-kselftest, linux-kernel, kernelci,
Shuah Khan
On 7/10/24 07:11, Greg Kroah-Hartman wrote:
> On Fri, Jul 05, 2024 at 07:29:53PM -0400, Nícolas F. R. A. Prado wrote:
>> Log errors are the most widely used mechanism for reporting issues in
>> the kernel. When an error is logged using the device helpers, eg
>> dev_err(), it gets metadata attached that identifies the subsystem and
>> device where the message is coming from. This series makes use of that
>> metadata in a new test to report which devices logged errors.
>>
>> The first two patches move a test and a helper script to keep things
>> organized before this new test is added in the third patch.
>>
>> It is expected that there might be many false-positive error messages
>> throughout the drivers code which will be reported by this test. By
>> having this test in the first place and working through the results we
>> can address those occurrences by adjusting the loglevel of the messages
>> that turn out to not be real errors that require the user's attention.
>> It will also motivate additional error messages to be introduced in the
>> code to detect real errors where they turn out to be missing, since
>> it will be possible to detect said issues automatically.
>>
>> As an example, below you can see the test result for
>> mt8192-asurada-spherion. The single standing issue has been investigated
>> and will be addressed in an EC firmware update [1]:
>>
>> TAP version 13
>> 1..1
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `model_name' property: -6
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `energy_full_design' property: -6
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>> not ok 1 +power_supply:sbs-8-000b
>> Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>>
>> [1] https://lore.kernel.org/all/cf4d8131-4b63-4c7a-9f27-5a0847c656c4@notapiano
>>
>> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
>
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Is this dependent on a linux-next?
Didn't apply to linux-kselftest next.
thanks,
-- Shuah
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/3] kselftest: Add test to report device log errors
2024-07-10 21:49 ` Shuah Khan
@ 2024-07-11 19:53 ` Shuah Khan
2024-07-11 21:44 ` Nícolas F. R. A. Prado
0 siblings, 1 reply; 12+ messages in thread
From: Shuah Khan @ 2024-07-11 19:53 UTC (permalink / raw)
To: Greg Kroah-Hartman, Nícolas F. R. A. Prado
Cc: Shuah Khan, kernel, linux-kselftest, linux-kernel, kernelci,
Shuah Khan
On 7/10/24 15:49, Shuah Khan wrote:
> On 7/10/24 07:11, Greg Kroah-Hartman wrote:
>> On Fri, Jul 05, 2024 at 07:29:53PM -0400, Nícolas F. R. A. Prado wrote:
>>> Log errors are the most widely used mechanism for reporting issues in
>>> the kernel. When an error is logged using the device helpers, eg
>>> dev_err(), it gets metadata attached that identifies the subsystem and
>>> device where the message is coming from. This series makes use of that
>>> metadata in a new test to report which devices logged errors.
>>>
>>> The first two patches move a test and a helper script to keep things
>>> organized before this new test is added in the third patch.
>>>
>>> It is expected that there might be many false-positive error messages
>>> throughout the drivers code which will be reported by this test. By
>>> having this test in the first place and working through the results we
>>> can address those occurrences by adjusting the loglevel of the messages
>>> that turn out to not be real errors that require the user's attention.
>>> It will also motivate additional error messages to be introduced in the
>>> code to detect real errors where they turn out to be missing, since
>>> it will be possible to detect said issues automatically.
>>>
>>> As an example, below you can see the test result for
>>> mt8192-asurada-spherion. The single standing issue has been investigated
>>> and will be addressed in an EC firmware update [1]:
>>>
>>> TAP version 13
>>> 1..1
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `model_name' property: -6
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `energy_full_design' property: -6
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>> not ok 1 +power_supply:sbs-8-000b
>>> Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>>>
>>> [1] https://lore.kernel.org/all/cf4d8131-4b63-4c7a-9f27-5a0847c656c4@notapiano
>>>
>>> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
>>
>> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>
> Is this dependent on a linux-next?
>
> Didn't apply to linux-kselftest next.
>
I tried applying these on top of linux-kselftest next which is at
Linux 6.10-rc7 + other patches.
I am not sure what is wrong - first patch applies and the second
and third don't.
git am fails and manual patch application worked for 2/3, same thing
with 3.3 - these should apply cleanly since they don't have obvious
conflicts.
Please clean this up and send me updated series adding Greg's ack.
thanks,
-- Shuah
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/3] kselftest: Add test to report device log errors
2024-07-11 19:53 ` Shuah Khan
@ 2024-07-11 21:44 ` Nícolas F. R. A. Prado
2024-07-11 21:56 ` Shuah Khan
2024-07-12 17:48 ` Bird, Tim
0 siblings, 2 replies; 12+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-07-11 21:44 UTC (permalink / raw)
To: Shuah Khan
Cc: Greg Kroah-Hartman, Shuah Khan, kernel, linux-kselftest,
linux-kernel, kernelci
On Thu, Jul 11, 2024 at 01:53:37PM -0600, Shuah Khan wrote:
> On 7/10/24 15:49, Shuah Khan wrote:
> > On 7/10/24 07:11, Greg Kroah-Hartman wrote:
> > > On Fri, Jul 05, 2024 at 07:29:53PM -0400, Nícolas F. R. A. Prado wrote:
> > > > Log errors are the most widely used mechanism for reporting issues in
> > > > the kernel. When an error is logged using the device helpers, eg
> > > > dev_err(), it gets metadata attached that identifies the subsystem and
> > > > device where the message is coming from. This series makes use of that
> > > > metadata in a new test to report which devices logged errors.
> > > >
> > > > The first two patches move a test and a helper script to keep things
> > > > organized before this new test is added in the third patch.
> > > >
> > > > It is expected that there might be many false-positive error messages
> > > > throughout the drivers code which will be reported by this test. By
> > > > having this test in the first place and working through the results we
> > > > can address those occurrences by adjusting the loglevel of the messages
> > > > that turn out to not be real errors that require the user's attention.
> > > > It will also motivate additional error messages to be introduced in the
> > > > code to detect real errors where they turn out to be missing, since
> > > > it will be possible to detect said issues automatically.
> > > >
> > > > As an example, below you can see the test result for
> > > > mt8192-asurada-spherion. The single standing issue has been investigated
> > > > and will be addressed in an EC firmware update [1]:
> > > >
> > > > TAP version 13
> > > > 1..1
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `model_name' property: -6
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `energy_full_design' property: -6
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > not ok 1 +power_supply:sbs-8-000b
> > > > Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> > > >
> > > > [1] https://lore.kernel.org/all/cf4d8131-4b63-4c7a-9f27-5a0847c656c4@notapiano
> > > >
> > > > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
> > >
> > > Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> >
> > Is this dependent on a linux-next?
> >
> > Didn't apply to linux-kselftest next.
> >
>
> I tried applying these on top of linux-kselftest next which is at
> Linux 6.10-rc7 + other patches.
>
> I am not sure what is wrong - first patch applies and the second
> and third don't.
>
> git am fails and manual patch application worked for 2/3, same thing
> with 3.3 - these should apply cleanly since they don't have obvious
> conflicts.
>
> Please clean this up and send me updated series adding Greg's ack.
Oh, now I see what happened. I recently sent another series that touches the
same file (tools/testing/selftests/devices/test_discoverable_devices.py):
"kselftest: devices: Allow running test on more platforms"
https://lore.kernel.org/all/20240613-kselftest-discoverable-probe-mt8195-kci-v1-1-7b396a9b032d@collabora.com/
That was already merged through the usb tree, and is present on next (on which I
based this series).
In this case I imagine it's best if this series gets picked through the usb
tree, right? Even if I rebase on kselftest's next, there will be conflicts.
Thanks,
Nícolas
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/3] kselftest: Add test to report device log errors
2024-07-11 21:44 ` Nícolas F. R. A. Prado
@ 2024-07-11 21:56 ` Shuah Khan
2024-07-12 8:59 ` Greg Kroah-Hartman
2024-07-12 17:48 ` Bird, Tim
1 sibling, 1 reply; 12+ messages in thread
From: Shuah Khan @ 2024-07-11 21:56 UTC (permalink / raw)
To: Nícolas F. R. A. Prado
Cc: Greg Kroah-Hartman, Shuah Khan, kernel, linux-kselftest,
linux-kernel, kernelci, Shuah Khan
On 7/11/24 15:44, Nícolas F. R. A. Prado wrote:
> On Thu, Jul 11, 2024 at 01:53:37PM -0600, Shuah Khan wrote:
>> On 7/10/24 15:49, Shuah Khan wrote:
>>> On 7/10/24 07:11, Greg Kroah-Hartman wrote:
>>>> On Fri, Jul 05, 2024 at 07:29:53PM -0400, Nícolas F. R. A. Prado wrote:
>>>>> Log errors are the most widely used mechanism for reporting issues in
>>>>> the kernel. When an error is logged using the device helpers, eg
>>>>> dev_err(), it gets metadata attached that identifies the subsystem and
>>>>> device where the message is coming from. This series makes use of that
>>>>> metadata in a new test to report which devices logged errors.
>>>>>
>>>>> The first two patches move a test and a helper script to keep things
>>>>> organized before this new test is added in the third patch.
>>>>>
>>>>> It is expected that there might be many false-positive error messages
>>>>> throughout the drivers code which will be reported by this test. By
>>>>> having this test in the first place and working through the results we
>>>>> can address those occurrences by adjusting the loglevel of the messages
>>>>> that turn out to not be real errors that require the user's attention.
>>>>> It will also motivate additional error messages to be introduced in the
>>>>> code to detect real errors where they turn out to be missing, since
>>>>> it will be possible to detect said issues automatically.
>>>>>
>>>>> As an example, below you can see the test result for
>>>>> mt8192-asurada-spherion. The single standing issue has been investigated
>>>>> and will be addressed in an EC firmware update [1]:
>>>>>
>>>>> TAP version 13
>>>>> 1..1
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `model_name' property: -6
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `energy_full_design' property: -6
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
>>>>> not ok 1 +power_supply:sbs-8-000b
>>>>> Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
>>>>>
>>>>> [1] https://lore.kernel.org/all/cf4d8131-4b63-4c7a-9f27-5a0847c656c4@notapiano
>>>>>
>>>>> Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
>>>>
>>>> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
>>>
>>> Is this dependent on a linux-next?
>>>
>>> Didn't apply to linux-kselftest next.
>>>
>>
>> I tried applying these on top of linux-kselftest next which is at
>> Linux 6.10-rc7 + other patches.
>>
>> I am not sure what is wrong - first patch applies and the second
>> and third don't.
>>
>> git am fails and manual patch application worked for 2/3, same thing
>> with 3.3 - these should apply cleanly since they don't have obvious
>> conflicts.
>>
>> Please clean this up and send me updated series adding Greg's ack.
>
> Oh, now I see what happened. I recently sent another series that touches the
> same file (tools/testing/selftests/devices/test_discoverable_devices.py):
> "kselftest: devices: Allow running test on more platforms"
> https://lore.kernel.org/all/20240613-kselftest-discoverable-probe-mt8195-kci-v1-1-7b396a9b032d@collabora.com/
>
> That was already merged through the usb tree, and is present on next (on which I
> based this series).
>
> In this case I imagine it's best if this series gets picked through the usb
> tree, right? Even if I rebase on kselftest's next, there will be conflicts.
>
I see. No problem. It can go through usb tree
Acked-by: Shuah Khan <skhan@linuxfoundation.org>
thanks,
-- Shuah
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/3] kselftest: Add test to report device log errors
2024-07-11 21:56 ` Shuah Khan
@ 2024-07-12 8:59 ` Greg Kroah-Hartman
0 siblings, 0 replies; 12+ messages in thread
From: Greg Kroah-Hartman @ 2024-07-12 8:59 UTC (permalink / raw)
To: Shuah Khan
Cc: Nícolas F. R. A. Prado, Shuah Khan, kernel, linux-kselftest,
linux-kernel, kernelci
On Thu, Jul 11, 2024 at 03:56:25PM -0600, Shuah Khan wrote:
> On 7/11/24 15:44, Nícolas F. R. A. Prado wrote:
> > On Thu, Jul 11, 2024 at 01:53:37PM -0600, Shuah Khan wrote:
> > > On 7/10/24 15:49, Shuah Khan wrote:
> > > > On 7/10/24 07:11, Greg Kroah-Hartman wrote:
> > > > > On Fri, Jul 05, 2024 at 07:29:53PM -0400, Nícolas F. R. A. Prado wrote:
> > > > > > Log errors are the most widely used mechanism for reporting issues in
> > > > > > the kernel. When an error is logged using the device helpers, eg
> > > > > > dev_err(), it gets metadata attached that identifies the subsystem and
> > > > > > device where the message is coming from. This series makes use of that
> > > > > > metadata in a new test to report which devices logged errors.
> > > > > >
> > > > > > The first two patches move a test and a helper script to keep things
> > > > > > organized before this new test is added in the third patch.
> > > > > >
> > > > > > It is expected that there might be many false-positive error messages
> > > > > > throughout the drivers code which will be reported by this test. By
> > > > > > having this test in the first place and working through the results we
> > > > > > can address those occurrences by adjusting the loglevel of the messages
> > > > > > that turn out to not be real errors that require the user's attention.
> > > > > > It will also motivate additional error messages to be introduced in the
> > > > > > code to detect real errors where they turn out to be missing, since
> > > > > > it will be possible to detect said issues automatically.
> > > > > >
> > > > > > As an example, below you can see the test result for
> > > > > > mt8192-asurada-spherion. The single standing issue has been investigated
> > > > > > and will be addressed in an EC firmware update [1]:
> > > > > >
> > > > > > TAP version 13
> > > > > > 1..1
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `model_name' property: -6
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `energy_full_design' property: -6
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > not ok 1 +power_supply:sbs-8-000b
> > > > > > Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> > > > > >
> > > > > > [1] https://lore.kernel.org/all/cf4d8131-4b63-4c7a-9f27-5a0847c656c4@notapiano
> > > > > >
> > > > > > Signed-off-by: Nícolas F. R. A. Prado <nfraprado@collabora.com>
> > > > >
> > > > > Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > > >
> > > > Is this dependent on a linux-next?
> > > >
> > > > Didn't apply to linux-kselftest next.
> > > >
> > >
> > > I tried applying these on top of linux-kselftest next which is at
> > > Linux 6.10-rc7 + other patches.
> > >
> > > I am not sure what is wrong - first patch applies and the second
> > > and third don't.
> > >
> > > git am fails and manual patch application worked for 2/3, same thing
> > > with 3.3 - these should apply cleanly since they don't have obvious
> > > conflicts.
> > >
> > > Please clean this up and send me updated series adding Greg's ack.
> >
> > Oh, now I see what happened. I recently sent another series that touches the
> > same file (tools/testing/selftests/devices/test_discoverable_devices.py):
> > "kselftest: devices: Allow running test on more platforms"
> > https://lore.kernel.org/all/20240613-kselftest-discoverable-probe-mt8195-kci-v1-1-7b396a9b032d@collabora.com/
> >
> > That was already merged through the usb tree, and is present on next (on which I
> > based this series).
> >
> > In this case I imagine it's best if this series gets picked through the usb
> > tree, right? Even if I rebase on kselftest's next, there will be conflicts.
> >
>
> I see. No problem. It can go through usb tree
>
> Acked-by: Shuah Khan <skhan@linuxfoundation.org>
Ok, taken through the usb tree now, thanks.
greg k-h
^ permalink raw reply [flat|nested] 12+ messages in thread
* RE: [PATCH v2 0/3] kselftest: Add test to report device log errors
2024-07-11 21:44 ` Nícolas F. R. A. Prado
2024-07-11 21:56 ` Shuah Khan
@ 2024-07-12 17:48 ` Bird, Tim
2024-07-12 20:51 ` Nícolas F. R. A. Prado
1 sibling, 1 reply; 12+ messages in thread
From: Bird, Tim @ 2024-07-12 17:48 UTC (permalink / raw)
To: Nícolas F. R. A. Prado, Shuah Khan
Cc: Greg Kroah-Hartman, Shuah Khan, kernel@collabora.com,
linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
kernelci@lists.linux.dev
> -----Original Message-----
> From: Nícolas F. R. A. Prado <nfraprado@collabora.com>
> On Thu, Jul 11, 2024 at 01:53:37PM -0600, Shuah Khan wrote:
> > On 7/10/24 15:49, Shuah Khan wrote:
> > > On 7/10/24 07:11, Greg Kroah-Hartman wrote:
> > > > On Fri, Jul 05, 2024 at 07:29:53PM -0400, Nícolas F. R. A. Prado wrote:
> > > > > Log errors are the most widely used mechanism for reporting issues in
> > > > > the kernel. When an error is logged using the device helpers, eg
> > > > > dev_err(), it gets metadata attached that identifies the subsystem and
> > > > > device where the message is coming from. This series makes use of that
> > > > > metadata in a new test to report which devices logged errors.
> > > > >
> > > > > The first two patches move a test and a helper script to keep things
> > > > > organized before this new test is added in the third patch.
> > > > >
> > > > > It is expected that there might be many false-positive error messages
> > > > > throughout the drivers code which will be reported by this test. By
> > > > > having this test in the first place and working through the results we
> > > > > can address those occurrences by adjusting the loglevel of the messages
> > > > > that turn out to not be real errors that require the user's attention.
> > > > > It will also motivate additional error messages to be introduced in the
> > > > > code to detect real errors where they turn out to be missing, since
> > > > > it will be possible to detect said issues automatically.
> > > > >
> > > > > As an example, below you can see the test result for
> > > > > mt8192-asurada-spherion. The single standing issue has been investigated
> > > > > and will be addressed in an EC firmware update [1]:
> > > > >
> > > > > TAP version 13
> > > > > 1..1
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `model_name' property: -6
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `energy_full_design' property: -6
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > not ok 1 +power_supply:sbs-8-000b
Sorry this feedback is coming late.
The test description only includes the device identifier, and not a description of the testcase.
(See below for why this might be the case).
Maybe a better testcase description might be: "check that power_supply:sbs-8-000b device has no error messages"
I note that there are 3 different error messages, but only one testcase report generated.
It would be a lot of work (probably too much), for this test to interpret the error messages
and convert them into testcase language. But it might be worthwhile to report a test result
for each type of error message generated.
> > > > > Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> > > > >
There are a few interesting issues that this 'test' raises.
It is not so much a test itself, as a way to convert an error log report into KTAP output.
The 'test' has no knowledge of what the testcases are, that are being reported.
That is, the actual test has occurred at runtime, inside the driver or device manager
code, which produced the error string. This 'test' just reports that the device reported
an error.
As currently structured, there is no way for this to report an 'ok' testcase.
I talked with Nicolas on a KernelCI conference call, it is apparently possible to try to find
all possible devices on a system, and report them as 'ok'. But this would produce
a LOT of noise, as all the devices that did not have errors would be reported as 'ok'.
This level of verbosity is probably not desirable. (It's similar to how a compiler does
not report the success of every test it conducts on the source code, as it is being compiled.
Rather a compiler only outputs errors and warnings for test failures.)
In terms of tracking regressions, it would be useful to identify the different things that could
cause an error message to appear or disappear. For example, some errors related to hardware
might change due to flakiness of the hardware. Some errors might be introduced or go away
based on a change in the kernel configuration. This is one case, where have a long list of "ok"
lines would be useful - to disambiguate drivers that were present, but didn't have an error, from
drivers that were not present at all.
I think this test is useful, and am not NAK-ing it. I'm just giving some feedback on the general
approach, and hopefully raising some issues to consider, to make the test more useful.
-- Tim
^ permalink raw reply [flat|nested] 12+ messages in thread
* Re: [PATCH v2 0/3] kselftest: Add test to report device log errors
2024-07-12 17:48 ` Bird, Tim
@ 2024-07-12 20:51 ` Nícolas F. R. A. Prado
0 siblings, 0 replies; 12+ messages in thread
From: Nícolas F. R. A. Prado @ 2024-07-12 20:51 UTC (permalink / raw)
To: Bird, Tim
Cc: Shuah Khan, Greg Kroah-Hartman, Shuah Khan, kernel@collabora.com,
linux-kselftest@vger.kernel.org, linux-kernel@vger.kernel.org,
kernelci@lists.linux.dev
On Fri, Jul 12, 2024 at 05:48:14PM +0000, Bird, Tim wrote:
>
>
> > -----Original Message-----
> > From: Nícolas F. R. A. Prado <nfraprado@collabora.com>
> > On Thu, Jul 11, 2024 at 01:53:37PM -0600, Shuah Khan wrote:
> > > On 7/10/24 15:49, Shuah Khan wrote:
> > > > On 7/10/24 07:11, Greg Kroah-Hartman wrote:
> > > > > On Fri, Jul 05, 2024 at 07:29:53PM -0400, Nícolas F. R. A. Prado wrote:
> > > > > > Log errors are the most widely used mechanism for reporting issues in
> > > > > > the kernel. When an error is logged using the device helpers, eg
> > > > > > dev_err(), it gets metadata attached that identifies the subsystem and
> > > > > > device where the message is coming from. This series makes use of that
> > > > > > metadata in a new test to report which devices logged errors.
> > > > > >
> > > > > > The first two patches move a test and a helper script to keep things
> > > > > > organized before this new test is added in the third patch.
> > > > > >
> > > > > > It is expected that there might be many false-positive error messages
> > > > > > throughout the drivers code which will be reported by this test. By
> > > > > > having this test in the first place and working through the results we
> > > > > > can address those occurrences by adjusting the loglevel of the messages
> > > > > > that turn out to not be real errors that require the user's attention.
> > > > > > It will also motivate additional error messages to be introduced in the
> > > > > > code to detect real errors where they turn out to be missing, since
> > > > > > it will be possible to detect said issues automatically.
> > > > > >
> > > > > > As an example, below you can see the test result for
> > > > > > mt8192-asurada-spherion. The single standing issue has been investigated
> > > > > > and will be addressed in an EC firmware update [1]:
> > > > > >
> > > > > > TAP version 13
> > > > > > 1..1
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `model_name' property: -6
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `energy_full_design' property: -6
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
> > > > > > not ok 1 +power_supply:sbs-8-000b
>
> Sorry this feedback is coming late.
>
> The test description only includes the device identifier, and not a description of the testcase.
> (See below for why this might be the case).
>
> Maybe a better testcase description might be: "check that power_supply:sbs-8-000b device has no error messages"
Indeed that would make it more descriptive, but it would also add a lot of
redundant information. Since all the test does is check whether each device has
error messages, the only unique information for each test case is the device
identifier. In other words I considered the "check that ... device has no error
messages" implied by the test suite.
>
> I note that there are 3 different error messages, but only one testcase report generated.
> It would be a lot of work (probably too much), for this test to interpret the error messages
> and convert them into testcase language. But it might be worthwhile to report a test result
> for each type of error message generated.
I don't know how much value that would bring, but it might be an interesting
idea to try.
Interpreting the messages seems too much, but if we just compare the messages
for uniqueness and use them as is in the test identifiers it would be easy to
do, eg:
not ok 1 +power_supply:sbs-8-000b.power_supply sbs-8-000b: driver failed to report `time_to_empty_now' property: -5
not ok 2 +power_supply:sbs-8-000b.power_supply sbs-8-000b: driver failed to report `energy_full_design' property: -6
not ok 3 +power_supply:sbs-8-000b.power_supply sbs-8-000b: driver failed to report `model_name' property: -6
My concern is that I don't know if this is actually doing a good job of
portraying the issues: in this case it's actually really just one issue that
triggers those different messages, but this makes it look like there are 3
independent issues and the order of the errors is completely lost.
>
> > > > > > Totals: pass:0 fail:1 xfail:0 xpass:0 skip:0 error:0
> > > > > >
>
> There are a few interesting issues that this 'test' raises.
>
> It is not so much a test itself, as a way to convert an error log report into KTAP output.
>
> The 'test' has no knowledge of what the testcases are, that are being reported.
> That is, the actual test has occurred at runtime, inside the driver or device manager
> code, which produced the error string. This 'test' just reports that the device reported
> an error.
Right. I would argue that the test suite as a whole does test something: "Did
any devices on this platform report errors?". And if the answer is yes, then the
test cases will detail this information by simply converting the error log into
KTAP.
But indeed the bulk of the testing is done by the drivers and frameworks and
this 'test' is mostly about structuring and reporting that data in a way that
is easier to identify the problematic devices on the system.
>
> As currently structured, there is no way for this to report an 'ok' testcase.
> I talked with Nicolas on a KernelCI conference call, it is apparently possible to try to find
> all possible devices on a system, and report them as 'ok'. But this would produce
> a LOT of noise, as all the devices that did not have errors would be reported as 'ok'.
> This level of verbosity is probably not desirable. (It's similar to how a compiler does
> not report the success of every test it conducts on the source code, as it is being compiled.
> Rather a compiler only outputs errors and warnings for test failures.)
>
> In terms of tracking regressions, it would be useful to identify the different things that could
> cause an error message to appear or disappear. For example, some errors related to hardware
> might change due to flakiness of the hardware. Some errors might be introduced or go away
> based on a change in the kernel configuration. This is one case, where have a long list of "ok"
> lines would be useful - to disambiguate drivers that were present, but didn't have an error, from
> drivers that were not present at all.
Yes, to this I currently don't see another option... However, if we keep the
"ok" cases hidden/implicit, I think the concern about drivers not being present
at all can be mostly addressed by also running a probe test, like the DT
kselftest [1] or the ACPI kselftest [2]. That way, you would rely on the probe
test to verify whether all the devices expected on the system are there, and on
the error log test to verify whether any of the devices on the system have
issues.
>
> I think this test is useful, and am not NAK-ing it. I'm just giving some feedback on the general
> approach, and hopefully raising some issues to consider, to make the test more useful.
I think more important than merging or not merging the test is getting feedback
from the community on how to improve testing on the kernel, so I definitely
welcome your feedback. Thank you for taking the time to write this.
Thanks,
Nícolas
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/dt/test_unprobed_devices.sh
[2] https://lore.kernel.org/all/20240308144933.337107-1-laura.nao@collabora.com
^ permalink raw reply [flat|nested] 12+ messages in thread
end of thread, other threads:[~2024-07-12 20:51 UTC | newest]
Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-07-05 23:29 [PATCH v2 0/3] kselftest: Add test to report device log errors Nícolas F. R. A. Prado
2024-07-05 23:29 ` [PATCH v2 1/3] kselftest: devices: Move discoverable devices test to subdirectory Nícolas F. R. A. Prado
2024-07-05 23:29 ` [PATCH v2 2/3] kselftest: Move ksft helper module to common directory Nícolas F. R. A. Prado
2024-07-05 23:29 ` [PATCH v2 3/3] kselftest: devices: Add test to detect device error logs Nícolas F. R. A. Prado
2024-07-10 13:11 ` [PATCH v2 0/3] kselftest: Add test to report device log errors Greg Kroah-Hartman
2024-07-10 21:49 ` Shuah Khan
2024-07-11 19:53 ` Shuah Khan
2024-07-11 21:44 ` Nícolas F. R. A. Prado
2024-07-11 21:56 ` Shuah Khan
2024-07-12 8:59 ` Greg Kroah-Hartman
2024-07-12 17:48 ` Bird, Tim
2024-07-12 20:51 ` Nícolas F. R. A. Prado
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox