* Re: [GIT PULL] s390 updates for 5.15 merge window
[not found] <YSzZFgBt6nMvpVgc@osiris>
@ 2021-08-31 2:19 ` Nathan Chancellor
2021-08-31 7:09 ` Christian Borntraeger
0 siblings, 1 reply; 13+ messages in thread
From: Nathan Chancellor @ 2021-08-31 2:19 UTC (permalink / raw)
To: Heiko Carstens
Cc: Linus Torvalds, Vasily Gorbik, Christian Borntraeger, linux-s390,
linux-kernel, llvm
Hi Heiko,
On Mon, Aug 30, 2021 at 03:11:50PM +0200, Heiko Carstens wrote:
> - Enable KCSAN for s390. This comes with a small common code change to fix a
> compile warning. Acked by Marco Elver:
> https://lore.kernel.org/r/20210729142811.1309391-1-hca@linux.ibm.com/
This caught my eye, as we are boot testing KCSAN + KCSAN_KUNIT_TEST in
our CI [1] for x86_64 so it would be nice to enable this for s390 as
well. However, it does not seem like the unit tests pass when booting up
in QEMU, is this expected or am I doing something wrong? The results and
compiler versions are below (the results are the same), they should both
have the commits that are mentioned in the KCSAN message.
GCC 12.0.0 @ d904008df267cbcc01bd6edf98fa0789fb6e94da
[ 131.813482] not ok 1 - test_basic
[ 135.321137] not ok 2 - test_concurrent_races
[ 138.830648] ok 3 - test_novalue_change
[ 142.342562] not ok 4 - test_novalue_change_exception
[ 145.851008] not ok 5 - test_unknown_origin
[ 149.361416] ok 6 - test_write_write_assume_atomic
[ 152.872013] not ok 7 - test_write_write_struct
[ 156.382960] not ok 8 - test_write_write_struct_part
[ 159.890222] ok 9 - test_read_atomic_write_atomic
[ 163.402919] not ok 10 - test_read_plain_atomic_write
[ 166.912931] not ok 11 - test_read_plain_atomic_rmw
[ 170.431915] not ok 12 - test_zero_size_access
[ 173.940959] ok 13 - test_data_race
[ 177.452028] not ok 14 - test_assert_exclusive_writer
[ 180.962840] not ok 15 - test_assert_exclusive_access
[ 184.474686] not ok 16 - test_assert_exclusive_access_writer
[ 187.992282] not ok 17 - test_assert_exclusive_bits_change
[ 191.501869] ok 18 - test_assert_exclusive_bits_nochange
[ 195.013138] not ok 19 - test_assert_exclusive_writer_scoped
[ 199.534212] not ok 20 - test_assert_exclusive_access_scoped
[ 203.053361] ok 21 - test_jiffies_noreport
[ 206.573803] ok 22 - test_seqlock_noreport
[ 210.093508] ok 23 - test_atomic_builtins
[ 210.094014] not ok 1 - kcsan
clang 14.0.0 @ 657bb7262d4a53e903e702d46fdcab57b7085128:
[ 10.341427] not ok 1 - test_basic
[ 13.848960] not ok 2 - test_concurrent_races
[ 17.359671] ok 3 - test_novalue_change
[ 20.869202] not ok 4 - test_novalue_change_exception
[ 24.379067] not ok 5 - test_unknown_origin
[ 27.889492] ok 6 - test_write_write_assume_atomic
[ 31.399572] not ok 7 - test_write_write_struct
[ 34.910833] not ok 8 - test_write_write_struct_part
[ 38.419473] ok 9 - test_read_atomic_write_atomic
[ 41.929642] not ok 10 - test_read_plain_atomic_write
[ 45.439644] not ok 11 - test_read_plain_atomic_rmw
[ 48.950048] not ok 12 - test_zero_size_access
[ 52.459026] ok 13 - test_data_race
[ 55.969806] not ok 14 - test_assert_exclusive_writer
[ 59.480436] not ok 15 - test_assert_exclusive_access
[ 62.990164] not ok 16 - test_assert_exclusive_access_writer
[ 66.499199] not ok 17 - test_assert_exclusive_bits_change
[ 70.009481] ok 18 - test_assert_exclusive_bits_nochange
[ 73.522184] not ok 19 - test_assert_exclusive_writer_scoped
[ 78.030448] not ok 20 - test_assert_exclusive_access_scoped
[ 81.539059] ok 21 - test_jiffies_noreport
[ 85.051769] ok 22 - test_seqlock_noreport
[ 88.572048] ok 23 - test_atomic_builtins
[ 88.572279] not ok 1 - kcsan
[1]: https://github.com/ClangBuiltLinux/continuous-integration2
Cheers,
Nathan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GIT PULL] s390 updates for 5.15 merge window
2021-08-31 2:19 ` [GIT PULL] s390 updates for 5.15 merge window Nathan Chancellor
@ 2021-08-31 7:09 ` Christian Borntraeger
2021-08-31 10:13 ` Heiko Carstens
0 siblings, 1 reply; 13+ messages in thread
From: Christian Borntraeger @ 2021-08-31 7:09 UTC (permalink / raw)
To: Nathan Chancellor, Heiko Carstens
Cc: Linus Torvalds, Vasily Gorbik, linux-s390, linux-kernel, llvm,
qemu-s390x
On 31.08.21 04:19, Nathan Chancellor wrote:
> Hi Heiko,
>
> On Mon, Aug 30, 2021 at 03:11:50PM +0200, Heiko Carstens wrote:
>> - Enable KCSAN for s390. This comes with a small common code change to fix a
>> compile warning. Acked by Marco Elver:
>> https://lore.kernel.org/r/20210729142811.1309391-1-hca@linux.ibm.com/
>
> This caught my eye, as we are boot testing KCSAN + KCSAN_KUNIT_TEST in
> our CI [1] for x86_64 so it would be nice to enable this for s390 as
> well. However, it does not seem like the unit tests pass when booting up
> in QEMU, is this expected or am I doing something wrong? The results and
> compiler versions are below (the results are the same), they should both
> have the commits that are mentioned in the KCSAN message.
Do you have a branch somewhere where you have the s390 build rules as well as
the qemu command line? Maybe its a QEMU TCG issue, dont know. CC qemu-s390x
just in case.
FWIW, if you want access to a virtual s390x let me know.
>
> GCC 12.0.0 @ d904008df267cbcc01bd6edf98fa0789fb6e94da
>
> [ 131.813482] not ok 1 - test_basic
> [ 135.321137] not ok 2 - test_concurrent_races
> [ 138.830648] ok 3 - test_novalue_change
> [ 142.342562] not ok 4 - test_novalue_change_exception
> [ 145.851008] not ok 5 - test_unknown_origin
> [ 149.361416] ok 6 - test_write_write_assume_atomic
> [ 152.872013] not ok 7 - test_write_write_struct
> [ 156.382960] not ok 8 - test_write_write_struct_part
> [ 159.890222] ok 9 - test_read_atomic_write_atomic
> [ 163.402919] not ok 10 - test_read_plain_atomic_write
> [ 166.912931] not ok 11 - test_read_plain_atomic_rmw
> [ 170.431915] not ok 12 - test_zero_size_access
> [ 173.940959] ok 13 - test_data_race
> [ 177.452028] not ok 14 - test_assert_exclusive_writer
> [ 180.962840] not ok 15 - test_assert_exclusive_access
> [ 184.474686] not ok 16 - test_assert_exclusive_access_writer
> [ 187.992282] not ok 17 - test_assert_exclusive_bits_change
> [ 191.501869] ok 18 - test_assert_exclusive_bits_nochange
> [ 195.013138] not ok 19 - test_assert_exclusive_writer_scoped
> [ 199.534212] not ok 20 - test_assert_exclusive_access_scoped
> [ 203.053361] ok 21 - test_jiffies_noreport
> [ 206.573803] ok 22 - test_seqlock_noreport
> [ 210.093508] ok 23 - test_atomic_builtins
> [ 210.094014] not ok 1 - kcsan
>
> clang 14.0.0 @ 657bb7262d4a53e903e702d46fdcab57b7085128:
>
> [ 10.341427] not ok 1 - test_basic
> [ 13.848960] not ok 2 - test_concurrent_races
> [ 17.359671] ok 3 - test_novalue_change
> [ 20.869202] not ok 4 - test_novalue_change_exception
> [ 24.379067] not ok 5 - test_unknown_origin
> [ 27.889492] ok 6 - test_write_write_assume_atomic
> [ 31.399572] not ok 7 - test_write_write_struct
> [ 34.910833] not ok 8 - test_write_write_struct_part
> [ 38.419473] ok 9 - test_read_atomic_write_atomic
> [ 41.929642] not ok 10 - test_read_plain_atomic_write
> [ 45.439644] not ok 11 - test_read_plain_atomic_rmw
> [ 48.950048] not ok 12 - test_zero_size_access
> [ 52.459026] ok 13 - test_data_race
> [ 55.969806] not ok 14 - test_assert_exclusive_writer
> [ 59.480436] not ok 15 - test_assert_exclusive_access
> [ 62.990164] not ok 16 - test_assert_exclusive_access_writer
> [ 66.499199] not ok 17 - test_assert_exclusive_bits_change
> [ 70.009481] ok 18 - test_assert_exclusive_bits_nochange
> [ 73.522184] not ok 19 - test_assert_exclusive_writer_scoped
> [ 78.030448] not ok 20 - test_assert_exclusive_access_scoped
> [ 81.539059] ok 21 - test_jiffies_noreport
> [ 85.051769] ok 22 - test_seqlock_noreport
> [ 88.572048] ok 23 - test_atomic_builtins
> [ 88.572279] not ok 1 - kcsan
>
> [1]: https://github.com/ClangBuiltLinux/continuous-integration2
>
> Cheers,
> Nathan
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GIT PULL] s390 updates for 5.15 merge window
2021-08-31 7:09 ` Christian Borntraeger
@ 2021-08-31 10:13 ` Heiko Carstens
2021-08-31 10:46 ` Marco Elver
0 siblings, 1 reply; 13+ messages in thread
From: Heiko Carstens @ 2021-08-31 10:13 UTC (permalink / raw)
To: Christian Borntraeger
Cc: Nathan Chancellor, Linus Torvalds, Vasily Gorbik, linux-s390,
linux-kernel, llvm, qemu-s390x
On Tue, Aug 31, 2021 at 09:09:10AM +0200, Christian Borntraeger wrote:
>
>
> On 31.08.21 04:19, Nathan Chancellor wrote:
> > Hi Heiko,
> >
> > On Mon, Aug 30, 2021 at 03:11:50PM +0200, Heiko Carstens wrote:
> > > - Enable KCSAN for s390. This comes with a small common code change to fix a
> > > compile warning. Acked by Marco Elver:
> > > https://lore.kernel.org/r/20210729142811.1309391-1-hca@linux.ibm.com/
> >
> > This caught my eye, as we are boot testing KCSAN + KCSAN_KUNIT_TEST in
> > our CI [1] for x86_64 so it would be nice to enable this for s390 as
> > well. However, it does not seem like the unit tests pass when booting up
> > in QEMU, is this expected or am I doing something wrong? The results and
> > compiler versions are below (the results are the same), they should both
> > have the commits that are mentioned in the KCSAN message.
>
> Do you have a branch somewhere where you have the s390 build rules as well as
> the qemu command line? Maybe its a QEMU TCG issue, dont know. CC qemu-s390x
> just in case.
I really don't think this is QEMU related. The test fails are sort of
expected: we've seen KCSAN reports when the kernel boots and wanted to
fix them later.
However I have to admit that I wasn't aware of the KCSAN KUNIT tests,
and wouldn't have sent the s390 KCSAN enablement upstream if I would
have been aware of failing self tests.
We'll fix them, and I let you know if things are supposed to work.
Thanks a lot for making aware of this!
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GIT PULL] s390 updates for 5.15 merge window
2021-08-31 10:13 ` Heiko Carstens
@ 2021-08-31 10:46 ` Marco Elver
2021-08-31 15:02 ` Marco Elver
0 siblings, 1 reply; 13+ messages in thread
From: Marco Elver @ 2021-08-31 10:46 UTC (permalink / raw)
To: Heiko Carstens
Cc: Christian Borntraeger, Nathan Chancellor, Linus Torvalds,
Vasily Gorbik, linux-s390, linux-kernel, llvm, qemu-s390x
On Tue, 31 Aug 2021 at 12:13, Heiko Carstens <hca@linux.ibm.com> wrote:
[...]
> I really don't think this is QEMU related. The test fails are sort of
> expected: we've seen KCSAN reports when the kernel boots and wanted to
> fix them later.
> However I have to admit that I wasn't aware of the KCSAN KUNIT tests,
> and wouldn't have sent the s390 KCSAN enablement upstream if I would
> have been aware of failing self tests.
>
> We'll fix them, and I let you know if things are supposed to work.
>
> Thanks a lot for making aware of this!
Note: Set `CONFIG_KCSAN_REPORT_ONCE_IN_MS=100` (or smaller) instead of
the default to make the test complete faster.
The pattern I see from what Nathan reported is that all test cases
that expect race reports don't observe them ("not ok" cases), and all
those where no races are meant to be reported are fine ("ok" cases).
Without actually seeing the log, I'm guessing that no races are
reported at all, which is certainly not working as intended.
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GIT PULL] s390 updates for 5.15 merge window
2021-08-31 10:46 ` Marco Elver
@ 2021-08-31 15:02 ` Marco Elver
2021-08-31 15:18 ` Heiko Carstens
` (2 more replies)
0 siblings, 3 replies; 13+ messages in thread
From: Marco Elver @ 2021-08-31 15:02 UTC (permalink / raw)
To: Heiko Carstens
Cc: Christian Borntraeger, Nathan Chancellor, Linus Torvalds,
Vasily Gorbik, linux-s390, linux-kernel, llvm, qemu-s390x
On Tue, Aug 31, 2021 at 12:46PM +0200, Marco Elver wrote:
> On Tue, 31 Aug 2021 at 12:13, Heiko Carstens <hca@linux.ibm.com> wrote:
> [...]
> > I really don't think this is QEMU related. The test fails are sort of
> > expected: we've seen KCSAN reports when the kernel boots and wanted to
> > fix them later.
> > However I have to admit that I wasn't aware of the KCSAN KUNIT tests,
> > and wouldn't have sent the s390 KCSAN enablement upstream if I would
> > have been aware of failing self tests.
> >
> > We'll fix them, and I let you know if things are supposed to work.
> >
> > Thanks a lot for making aware of this!
>
> Note: Set `CONFIG_KCSAN_REPORT_ONCE_IN_MS=100` (or smaller) instead of
> the default to make the test complete faster.
>
> The pattern I see from what Nathan reported is that all test cases
> that expect race reports don't observe them ("not ok" cases), and all
> those where no races are meant to be reported are fine ("ok" cases).
> Without actually seeing the log, I'm guessing that no races are
> reported at all, which is certainly not working as intended.
I repro'd, and the problem is part QEMU TCG and a minor problem with
stack_trace_save() on s390:
1. QEMU TCG doesn't seem to want to execute threads concurrently,
resulting in no "value changes" being observed. This is probably just
a limitation of TCG, and if run on a real CPU, shouldn't be a problem.
On QEMU, most test cases will pass with CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n
(There's one left that requires value changes to be observable)
2. stack_trace_save() is subtly broken on s390: it starts the trace in
stack_trace_save() itself. This is incorrect, as the trace should
start with the caller. We reported something similar to arm64, also
because one of our sanitizer tests failed:
https://lkml.kernel.org/r/20210319184106.5688-1-mark.rutland@arm.com
I noticed because stack traces like this:
| read to 0x0000000001309128 of 8 bytes by task 49 on cpu 1:
| print_report+0x48/0x6c0
| kcsan_report_known_origin+0x112/0x200
| kcsan_setup_watchpoint+0x464/0x500
| test_kernel_read+0x2a/0x40
| access_thread+0x84/0xb0
| kthread+0x3aa/0x3d0
| __ret_from_fork+0x58/0x90
| ret_from_fork+0xa/0x30
, which should not be generated because KCSAN uses stack_trace_save(..., 1)
in print_report().
I fixed it with the below, and now most tests pass. Note that, other
debugging tools may also report misleading stack traces without the
stack_trace_save() fix (e.g. certain KFENCE reports).
If you have a better solution for how to fix stack_trace_save() on s390,
please discard my patch.
Thanks,
-- Marco
------ >8 ------
From: Marco Elver <elver@google.com>
Date: Tue, 31 Aug 2021 16:00:03 +0200
Subject: [PATCH] s390/stacktrace: do not include arch_stack_walk() in stack
trace
Callers of stack_trace_save() expect that it does not include itself,
which attempts to exclude itself by skipping + 1. This contract is
broken if arch_stack_walk() still includes itself.
Fix it by skipping the initial entry in s390's arch_stack_walk().
Signed-off-by: Marco Elver <elver@google.com>
---
arch/s390/kernel/stacktrace.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)
diff --git a/arch/s390/kernel/stacktrace.c b/arch/s390/kernel/stacktrace.c
index 101477b3e263..47d1841af03e 100644
--- a/arch/s390/kernel/stacktrace.c
+++ b/arch/s390/kernel/stacktrace.c
@@ -16,11 +16,16 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
{
struct unwind_state state;
unsigned long addr;
+ bool init = true;
unwind_for_each_frame(&state, task, regs, 0) {
addr = unwind_get_return_address(&state);
- if (!addr || !consume_entry(cookie, addr))
+ if (!addr)
+ break;
+
+ if (!init && !consume_entry(cookie, addr))
break;
+ init = false;
}
}
@@ -29,6 +34,7 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
{
struct unwind_state state;
unsigned long addr;
+ bool init = true;
unwind_for_each_frame(&state, task, NULL, 0) {
if (state.stack_info.type != STACK_TYPE_TASK)
@@ -50,8 +56,9 @@ int arch_stack_walk_reliable(stack_trace_consume_fn consume_entry,
return -EINVAL;
#endif
- if (!consume_entry(cookie, addr))
+ if (!init && !consume_entry(cookie, addr))
return -EINVAL;
+ init = false;
}
/* Check for stack corruption */
--
2.33.0.259.gc128427fd7-goog
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [GIT PULL] s390 updates for 5.15 merge window
2021-08-31 15:02 ` Marco Elver
@ 2021-08-31 15:18 ` Heiko Carstens
2021-08-31 17:48 ` Nathan Chancellor
2021-09-01 14:03 ` Vasily Gorbik
2 siblings, 0 replies; 13+ messages in thread
From: Heiko Carstens @ 2021-08-31 15:18 UTC (permalink / raw)
To: Marco Elver
Cc: Christian Borntraeger, Nathan Chancellor, Linus Torvalds,
Vasily Gorbik, linux-s390, linux-kernel, llvm, qemu-s390x
On Tue, Aug 31, 2021 at 05:02:15PM +0200, Marco Elver wrote:
> On Tue, Aug 31, 2021 at 12:46PM +0200, Marco Elver wrote:
> > On Tue, 31 Aug 2021 at 12:13, Heiko Carstens <hca@linux.ibm.com> wrote:
> I repro'd, and the problem is part QEMU TCG and a minor problem with
> stack_trace_save() on s390:
>
> 1. QEMU TCG doesn't seem to want to execute threads concurrently,
> resulting in no "value changes" being observed. This is probably just
> a limitation of TCG, and if run on a real CPU, shouldn't be a problem.
> On QEMU, most test cases will pass with CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n
> (There's one left that requires value changes to be observable)
>
> 2. stack_trace_save() is subtly broken on s390: it starts the trace in
> stack_trace_save() itself. This is incorrect, as the trace should
> start with the caller. We reported something similar to arm64, also
> because one of our sanitizer tests failed:
> https://lkml.kernel.org/r/20210319184106.5688-1-mark.rutland@arm.com
...
> I fixed it with the below, and now most tests pass. Note that, other
> debugging tools may also report misleading stack traces without the
> stack_trace_save() fix (e.g. certain KFENCE reports).
>
> If you have a better solution for how to fix stack_trace_save() on s390,
> please discard my patch.
Ah, with your patch I get the below on a real machine; so it looks
like this really was the only problem for the failing self tests.
Thanks a lot for the patch! Will look into it later and apply if
everything else still works!
[ 5.138547] ok 1 - test_basic
[ 7.538799] ok 2 - test_concurrent_races
[ 9.938524] ok 3 - test_novalue_change
[ 11.620063] ok 4 - test_novalue_change_exception
[ 11.952404] ok 5 - test_unknown_origin
[ 14.349852] ok 6 - test_write_write_assume_atomic
[ 14.681937] ok 7 - test_write_write_struct
[ 15.031046] ok 8 - test_write_write_struct_part
[ 17.429869] ok 9 - test_read_atomic_write_atomic
[ 17.760862] ok 10 - test_read_plain_atomic_write
[ 18.093500] ok 11 - test_read_plain_atomic_rmw
[ 20.490570] ok 12 - test_zero_size_access
[ 22.939270] ok 13 - test_data_race
[ 23.271104] ok 14 - test_assert_exclusive_writer
[ 23.601940] ok 15 - test_assert_exclusive_access
[ 26.000281] ok 16 - test_assert_exclusive_access_writer
[ 26.330718] ok 17 - test_assert_exclusive_bits_change
[ 28.739480] ok 18 - test_assert_exclusive_bits_nochange
[ 28.962218] ok 19 - test_assert_exclusive_writer_scoped
[ 29.071385] ok 20 - test_assert_exclusive_access_scoped
[ 31.479706] ok 21 - test_jiffies_noreport
[ 33.879581] ok 22 - test_seqlock_noreport
[ 36.279853] ok 23 - test_atomic_builtins
[ 36.279893] ok 1 - kcsan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GIT PULL] s390 updates for 5.15 merge window
2021-08-31 15:02 ` Marco Elver
2021-08-31 15:18 ` Heiko Carstens
@ 2021-08-31 17:48 ` Nathan Chancellor
2021-08-31 17:49 ` Christian Borntraeger
2021-09-01 14:03 ` Vasily Gorbik
2 siblings, 1 reply; 13+ messages in thread
From: Nathan Chancellor @ 2021-08-31 17:48 UTC (permalink / raw)
To: Marco Elver
Cc: Heiko Carstens, Christian Borntraeger, Linus Torvalds,
Vasily Gorbik, linux-s390, linux-kernel, llvm, qemu-s390x
On Tue, Aug 31, 2021 at 05:02:15PM +0200, Marco Elver wrote:
> On Tue, Aug 31, 2021 at 12:46PM +0200, Marco Elver wrote:
> > On Tue, 31 Aug 2021 at 12:13, Heiko Carstens <hca@linux.ibm.com> wrote:
> > [...]
> > > I really don't think this is QEMU related. The test fails are sort of
> > > expected: we've seen KCSAN reports when the kernel boots and wanted to
> > > fix them later.
> > > However I have to admit that I wasn't aware of the KCSAN KUNIT tests,
> > > and wouldn't have sent the s390 KCSAN enablement upstream if I would
> > > have been aware of failing self tests.
> > >
> > > We'll fix them, and I let you know if things are supposed to work.
> > >
> > > Thanks a lot for making aware of this!
> >
> > Note: Set `CONFIG_KCSAN_REPORT_ONCE_IN_MS=100` (or smaller) instead of
> > the default to make the test complete faster.
> >
> > The pattern I see from what Nathan reported is that all test cases
> > that expect race reports don't observe them ("not ok" cases), and all
> > those where no races are meant to be reported are fine ("ok" cases).
> > Without actually seeing the log, I'm guessing that no races are
> > reported at all, which is certainly not working as intended.
>
> I repro'd, and the problem is part QEMU TCG and a minor problem with
> stack_trace_save() on s390:
>
> 1. QEMU TCG doesn't seem to want to execute threads concurrently,
> resulting in no "value changes" being observed. This is probably just
> a limitation of TCG, and if run on a real CPU, shouldn't be a problem.
> On QEMU, most test cases will pass with CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n
> (There's one left that requires value changes to be observable)
Is this just a limitation of s390's TCG implementation or in general?
Our CI runs on GitHub Actions, which does not support virtualization so
I believe that all of our tests are being done with TCG and x86_64
passes just fine:
https://github.com/ClangBuiltLinux/continuous-integration2/runs/3473222334?check_suite_focus=true
Good to hear that it is working on bare metal now though, we could still
enable build testing of it at a minimum but it would be nice to see the
tests pass even in QEMU :)
Cheers,
Nathan
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GIT PULL] s390 updates for 5.15 merge window
2021-08-31 17:48 ` Nathan Chancellor
@ 2021-08-31 17:49 ` Christian Borntraeger
0 siblings, 0 replies; 13+ messages in thread
From: Christian Borntraeger @ 2021-08-31 17:49 UTC (permalink / raw)
To: Nathan Chancellor, Marco Elver, Richard Henderson
Cc: Heiko Carstens, Linus Torvalds, Vasily Gorbik, linux-s390,
linux-kernel, llvm, qemu-s390x
On 31.08.21 19:48, Nathan Chancellor wrote:
> On Tue, Aug 31, 2021 at 05:02:15PM +0200, Marco Elver wrote:
>> On Tue, Aug 31, 2021 at 12:46PM +0200, Marco Elver wrote:
>>> On Tue, 31 Aug 2021 at 12:13, Heiko Carstens <hca@linux.ibm.com> wrote:
>>> [...]
>>>> I really don't think this is QEMU related. The test fails are sort of
>>>> expected: we've seen KCSAN reports when the kernel boots and wanted to
>>>> fix them later.
>>>> However I have to admit that I wasn't aware of the KCSAN KUNIT tests,
>>>> and wouldn't have sent the s390 KCSAN enablement upstream if I would
>>>> have been aware of failing self tests.
>>>>
>>>> We'll fix them, and I let you know if things are supposed to work.
>>>>
>>>> Thanks a lot for making aware of this!
>>>
>>> Note: Set `CONFIG_KCSAN_REPORT_ONCE_IN_MS=100` (or smaller) instead of
>>> the default to make the test complete faster.
>>>
>>> The pattern I see from what Nathan reported is that all test cases
>>> that expect race reports don't observe them ("not ok" cases), and all
>>> those where no races are meant to be reported are fine ("ok" cases).
>>> Without actually seeing the log, I'm guessing that no races are
>>> reported at all, which is certainly not working as intended.
>>
>> I repro'd, and the problem is part QEMU TCG and a minor problem with
>> stack_trace_save() on s390:
>>
>> 1. QEMU TCG doesn't seem to want to execute threads concurrently,
>> resulting in no "value changes" being observed. This is probably just
>> a limitation of TCG, and if run on a real CPU, shouldn't be a problem.
>> On QEMU, most test cases will pass with CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n
>> (There's one left that requires value changes to be observable)
>
> Is this just a limitation of s390's TCG implementation or in general?
> Our CI runs on GitHub Actions, which does not support virtualization so
> I believe that all of our tests are being done with TCG and x86_64
> passes just fine:
Maybe Richard knows?
>
> https://github.com/ClangBuiltLinux/continuous-integration2/runs/3473222334?check_suite_focus=true
>
> Good to hear that it is working on bare metal now though, we could still
> enable build testing of it at a minimum but it would be nice to see the
> tests pass even in QEMU :)
>
> Cheers,
> Nathan
>
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [GIT PULL] s390 updates for 5.15 merge window
2021-08-31 15:02 ` Marco Elver
2021-08-31 15:18 ` Heiko Carstens
2021-08-31 17:48 ` Nathan Chancellor
@ 2021-09-01 14:03 ` Vasily Gorbik
2021-09-01 14:05 ` [PATCH] s390/unwind: use current_frame_address() to unwind current task Vasily Gorbik
2 siblings, 1 reply; 13+ messages in thread
From: Vasily Gorbik @ 2021-09-01 14:03 UTC (permalink / raw)
To: Marco Elver
Cc: Heiko Carstens, Christian Borntraeger, Nathan Chancellor,
Linus Torvalds, linux-s390, linux-kernel, llvm, qemu-s390x
On Tue, Aug 31, 2021 at 05:02:15PM +0200, Marco Elver wrote:
> 2. stack_trace_save() is subtly broken on s390: it starts the trace in
> stack_trace_save() itself. This is incorrect, as the trace should
> start with the caller. We reported something similar to arm64, also
> because one of our sanitizer tests failed:
> https://lkml.kernel.org/r/20210319184106.5688-1-mark.rutland@arm.com
Thanks a lot for looking into it and debugging it!
> Fix it by skipping the initial entry in s390's arch_stack_walk().
...
> diff --git a/arch/s390/kernel/stacktrace.c b/arch/s390/kernel/stacktrace.c
> index 101477b3e263..47d1841af03e 100644
> --- a/arch/s390/kernel/stacktrace.c
> +++ b/arch/s390/kernel/stacktrace.c
> @@ -16,11 +16,16 @@ void arch_stack_walk(stack_trace_consume_fn consume_entry, void *cookie,
> {
> struct unwind_state state;
> unsigned long addr;
> + bool init = true;
>
> unwind_for_each_frame(&state, task, regs, 0) {
> addr = unwind_get_return_address(&state);
> - if (!addr || !consume_entry(cookie, addr))
> + if (!addr)
> + break;
> +
> + if (!init && !consume_entry(cookie, addr))
> break;
> + init = false;
> }
I believe we don't need to skip the first unwinder result if task != current
or regs != NULL. Same for arch_stack_walk_reliable.
But after you pinpointed the problem I see that the actual difference
with x86 implementation comes from get_stack_pointer(). I'll send a patch
as reply.
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH] s390/unwind: use current_frame_address() to unwind current task
2021-09-01 14:03 ` Vasily Gorbik
@ 2021-09-01 14:05 ` Vasily Gorbik
2021-09-01 17:51 ` Marco Elver
2021-09-03 23:23 ` Nathan Chancellor
0 siblings, 2 replies; 13+ messages in thread
From: Vasily Gorbik @ 2021-09-01 14:05 UTC (permalink / raw)
To: Marco Elver, Heiko Carstens, Christian Borntraeger
Cc: Nathan Chancellor, Linus Torvalds, linux-s390, linux-kernel, llvm,
qemu-s390x
current_stack_pointer() simply returns current value of %r15. If
current_stack_pointer() caller allocates stack (which is the case in
unwind code) %r15 points to a stack frame allocated for callees, meaning
current_stack_pointer() caller (e.g. stack_trace_save) will end up in
the stacktrace. This is not expected by stack_trace_save*() callers and
causes problems.
current_frame_address() on the other hand returns function stack frame
address, which matches %r15 upon function invocation. Using it in
get_stack_pointer() makes it more aligned with x86 implementation
(according to BACKTRACE_SELF_TEST output) and meets stack_trace_save*()
caller's expectations, notably KCSAN.
Also make sure unwind_start is always inlined.
Reported-by: Nathan Chancellor <nathan@kernel.org>
Suggested-by: Marco Elver <elver@google.com>
Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
---
arch/s390/include/asm/stacktrace.h | 20 ++++++++++----------
arch/s390/include/asm/unwind.h | 8 ++++----
2 files changed, 14 insertions(+), 14 deletions(-)
diff --git a/arch/s390/include/asm/stacktrace.h b/arch/s390/include/asm/stacktrace.h
index 3d8a4b94c620..22c41d7fd95c 100644
--- a/arch/s390/include/asm/stacktrace.h
+++ b/arch/s390/include/asm/stacktrace.h
@@ -34,16 +34,6 @@ static inline bool on_stack(struct stack_info *info,
return addr >= info->begin && addr + len <= info->end;
}
-static __always_inline unsigned long get_stack_pointer(struct task_struct *task,
- struct pt_regs *regs)
-{
- if (regs)
- return (unsigned long) kernel_stack_pointer(regs);
- if (task == current)
- return current_stack_pointer();
- return (unsigned long) task->thread.ksp;
-}
-
/*
* Stack layout of a C stack frame.
*/
@@ -74,6 +64,16 @@ struct stack_frame {
((unsigned long)__builtin_frame_address(0) - \
offsetof(struct stack_frame, back_chain))
+static __always_inline unsigned long get_stack_pointer(struct task_struct *task,
+ struct pt_regs *regs)
+{
+ if (regs)
+ return (unsigned long) kernel_stack_pointer(regs);
+ if (task == current)
+ return current_frame_address();
+ return (unsigned long) task->thread.ksp;
+}
+
/*
* To keep this simple mark register 2-6 as being changed (volatile)
* by the called function, even though register 6 is saved/nonvolatile.
diff --git a/arch/s390/include/asm/unwind.h b/arch/s390/include/asm/unwind.h
index de9006b0cfeb..5ebf534ef753 100644
--- a/arch/s390/include/asm/unwind.h
+++ b/arch/s390/include/asm/unwind.h
@@ -55,10 +55,10 @@ static inline bool unwind_error(struct unwind_state *state)
return state->error;
}
-static inline void unwind_start(struct unwind_state *state,
- struct task_struct *task,
- struct pt_regs *regs,
- unsigned long first_frame)
+static __always_inline void unwind_start(struct unwind_state *state,
+ struct task_struct *task,
+ struct pt_regs *regs,
+ unsigned long first_frame)
{
task = task ?: current;
first_frame = first_frame ?: get_stack_pointer(task, regs);
--
2.25.4
^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [PATCH] s390/unwind: use current_frame_address() to unwind current task
2021-09-01 14:05 ` [PATCH] s390/unwind: use current_frame_address() to unwind current task Vasily Gorbik
@ 2021-09-01 17:51 ` Marco Elver
2021-09-01 18:07 ` Heiko Carstens
2021-09-03 23:23 ` Nathan Chancellor
1 sibling, 1 reply; 13+ messages in thread
From: Marco Elver @ 2021-09-01 17:51 UTC (permalink / raw)
To: Vasily Gorbik
Cc: Heiko Carstens, Christian Borntraeger, Nathan Chancellor,
Linus Torvalds, linux-s390, linux-kernel, llvm, qemu-s390x
On Wed, 1 Sept 2021 at 16:06, Vasily Gorbik <gor@linux.ibm.com> wrote:
> current_stack_pointer() simply returns current value of %r15. If
> current_stack_pointer() caller allocates stack (which is the case in
> unwind code) %r15 points to a stack frame allocated for callees, meaning
> current_stack_pointer() caller (e.g. stack_trace_save) will end up in
> the stacktrace. This is not expected by stack_trace_save*() callers and
> causes problems.
>
> current_frame_address() on the other hand returns function stack frame
> address, which matches %r15 upon function invocation. Using it in
> get_stack_pointer() makes it more aligned with x86 implementation
> (according to BACKTRACE_SELF_TEST output) and meets stack_trace_save*()
> caller's expectations, notably KCSAN.
>
> Also make sure unwind_start is always inlined.
>
> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Suggested-by: Marco Elver <elver@google.com>
> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Tested-by: Marco Elver <elver@google.com>
Thanks!
> ---
> arch/s390/include/asm/stacktrace.h | 20 ++++++++++----------
> arch/s390/include/asm/unwind.h | 8 ++++----
> 2 files changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/arch/s390/include/asm/stacktrace.h b/arch/s390/include/asm/stacktrace.h
> index 3d8a4b94c620..22c41d7fd95c 100644
> --- a/arch/s390/include/asm/stacktrace.h
> +++ b/arch/s390/include/asm/stacktrace.h
> @@ -34,16 +34,6 @@ static inline bool on_stack(struct stack_info *info,
> return addr >= info->begin && addr + len <= info->end;
> }
>
> -static __always_inline unsigned long get_stack_pointer(struct task_struct *task,
> - struct pt_regs *regs)
> -{
> - if (regs)
> - return (unsigned long) kernel_stack_pointer(regs);
> - if (task == current)
> - return current_stack_pointer();
> - return (unsigned long) task->thread.ksp;
> -}
> -
> /*
> * Stack layout of a C stack frame.
> */
> @@ -74,6 +64,16 @@ struct stack_frame {
> ((unsigned long)__builtin_frame_address(0) - \
> offsetof(struct stack_frame, back_chain))
>
> +static __always_inline unsigned long get_stack_pointer(struct task_struct *task,
> + struct pt_regs *regs)
> +{
> + if (regs)
> + return (unsigned long) kernel_stack_pointer(regs);
> + if (task == current)
> + return current_frame_address();
> + return (unsigned long) task->thread.ksp;
> +}
> +
> /*
> * To keep this simple mark register 2-6 as being changed (volatile)
> * by the called function, even though register 6 is saved/nonvolatile.
> diff --git a/arch/s390/include/asm/unwind.h b/arch/s390/include/asm/unwind.h
> index de9006b0cfeb..5ebf534ef753 100644
> --- a/arch/s390/include/asm/unwind.h
> +++ b/arch/s390/include/asm/unwind.h
> @@ -55,10 +55,10 @@ static inline bool unwind_error(struct unwind_state *state)
> return state->error;
> }
>
> -static inline void unwind_start(struct unwind_state *state,
> - struct task_struct *task,
> - struct pt_regs *regs,
> - unsigned long first_frame)
> +static __always_inline void unwind_start(struct unwind_state *state,
> + struct task_struct *task,
> + struct pt_regs *regs,
> + unsigned long first_frame)
> {
> task = task ?: current;
> first_frame = first_frame ?: get_stack_pointer(task, regs);
> --
> 2.25.4
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] s390/unwind: use current_frame_address() to unwind current task
2021-09-01 17:51 ` Marco Elver
@ 2021-09-01 18:07 ` Heiko Carstens
0 siblings, 0 replies; 13+ messages in thread
From: Heiko Carstens @ 2021-09-01 18:07 UTC (permalink / raw)
To: Marco Elver
Cc: Vasily Gorbik, Christian Borntraeger, Nathan Chancellor,
Linus Torvalds, linux-s390, linux-kernel, llvm, qemu-s390x
On Wed, Sep 01, 2021 at 07:51:06PM +0200, Marco Elver wrote:
> On Wed, 1 Sept 2021 at 16:06, Vasily Gorbik <gor@linux.ibm.com> wrote:
> > current_stack_pointer() simply returns current value of %r15. If
> > current_stack_pointer() caller allocates stack (which is the case in
> > unwind code) %r15 points to a stack frame allocated for callees, meaning
> > current_stack_pointer() caller (e.g. stack_trace_save) will end up in
> > the stacktrace. This is not expected by stack_trace_save*() callers and
> > causes problems.
> >
> > current_frame_address() on the other hand returns function stack frame
> > address, which matches %r15 upon function invocation. Using it in
> > get_stack_pointer() makes it more aligned with x86 implementation
> > (according to BACKTRACE_SELF_TEST output) and meets stack_trace_save*()
> > caller's expectations, notably KCSAN.
> >
> > Also make sure unwind_start is always inlined.
> >
> > Reported-by: Nathan Chancellor <nathan@kernel.org>
> > Suggested-by: Marco Elver <elver@google.com>
> > Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
>
> Tested-by: Marco Elver <elver@google.com>
>
> Thanks!
>
> > ---
> > arch/s390/include/asm/stacktrace.h | 20 ++++++++++----------
> > arch/s390/include/asm/unwind.h | 8 ++++----
> > 2 files changed, 14 insertions(+), 14 deletions(-)
Applied, thanks!
^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [PATCH] s390/unwind: use current_frame_address() to unwind current task
2021-09-01 14:05 ` [PATCH] s390/unwind: use current_frame_address() to unwind current task Vasily Gorbik
2021-09-01 17:51 ` Marco Elver
@ 2021-09-03 23:23 ` Nathan Chancellor
1 sibling, 0 replies; 13+ messages in thread
From: Nathan Chancellor @ 2021-09-03 23:23 UTC (permalink / raw)
To: Vasily Gorbik
Cc: Marco Elver, Heiko Carstens, Christian Borntraeger,
Linus Torvalds, linux-s390, linux-kernel, llvm, qemu-s390x
On Wed, Sep 01, 2021 at 04:05:59PM +0200, Vasily Gorbik wrote:
> current_stack_pointer() simply returns current value of %r15. If
> current_stack_pointer() caller allocates stack (which is the case in
> unwind code) %r15 points to a stack frame allocated for callees, meaning
> current_stack_pointer() caller (e.g. stack_trace_save) will end up in
> the stacktrace. This is not expected by stack_trace_save*() callers and
> causes problems.
>
> current_frame_address() on the other hand returns function stack frame
> address, which matches %r15 upon function invocation. Using it in
> get_stack_pointer() makes it more aligned with x86 implementation
> (according to BACKTRACE_SELF_TEST output) and meets stack_trace_save*()
> caller's expectations, notably KCSAN.
>
> Also make sure unwind_start is always inlined.
>
> Reported-by: Nathan Chancellor <nathan@kernel.org>
> Suggested-by: Marco Elver <elver@google.com>
> Signed-off-by: Vasily Gorbik <gor@linux.ibm.com>
Sorry for the late response and I see that this has already been applied
but I took this for a spin and all of the tests pass with clang-14 in
QEMU. Thank you for the quick fix so that we can get this turned on in
CI :)
[ 10.362073] ok 1 - test_basic
[ 13.870386] ok 2 - test_concurrent_races
[ 17.379643] ok 3 - test_novalue_change
[ 17.393315] ok 4 - test_novalue_change_exception
[ 17.409815] ok 5 - test_unknown_origin
[ 20.914289] ok 6 - test_write_write_assume_atomic
[ 20.982545] ok 7 - test_write_write_struct
[ 21.106135] ok 8 - test_write_write_struct_part
[ 24.622205] ok 9 - test_read_atomic_write_atomic
[ 24.662048] ok 10 - test_read_plain_atomic_write
[ 24.775291] ok 11 - test_read_plain_atomic_rmw
[ 28.294457] ok 12 - test_zero_size_access
[ 31.829529] ok 13 - test_data_race
[ 31.867174] ok 14 - test_assert_exclusive_writer
[ 31.929184] ok 15 - test_assert_exclusive_access
[ 35.446281] ok 16 - test_assert_exclusive_access_writer
[ 35.540228] ok 17 - test_assert_exclusive_bits_change
[ 39.052271] ok 18 - test_assert_exclusive_bits_nochange
[ 39.097020] ok 19 - test_assert_exclusive_writer_scoped
[ 39.152914] ok 20 - test_assert_exclusive_access_scoped
[ 42.675158] ok 21 - test_jiffies_noreport
[ 46.192453] ok 22 - test_seqlock_noreport
[ 49.712712] ok 23 - test_atomic_builtins
[ 49.746428] ok 24 - test_1bit_value_change
[ 49.753316] ok 1 - kcsan
Tested-by: Nathan Chancellor <nathan@kernel.org>
> ---
> arch/s390/include/asm/stacktrace.h | 20 ++++++++++----------
> arch/s390/include/asm/unwind.h | 8 ++++----
> 2 files changed, 14 insertions(+), 14 deletions(-)
>
> diff --git a/arch/s390/include/asm/stacktrace.h b/arch/s390/include/asm/stacktrace.h
> index 3d8a4b94c620..22c41d7fd95c 100644
> --- a/arch/s390/include/asm/stacktrace.h
> +++ b/arch/s390/include/asm/stacktrace.h
> @@ -34,16 +34,6 @@ static inline bool on_stack(struct stack_info *info,
> return addr >= info->begin && addr + len <= info->end;
> }
>
> -static __always_inline unsigned long get_stack_pointer(struct task_struct *task,
> - struct pt_regs *regs)
> -{
> - if (regs)
> - return (unsigned long) kernel_stack_pointer(regs);
> - if (task == current)
> - return current_stack_pointer();
> - return (unsigned long) task->thread.ksp;
> -}
> -
> /*
> * Stack layout of a C stack frame.
> */
> @@ -74,6 +64,16 @@ struct stack_frame {
> ((unsigned long)__builtin_frame_address(0) - \
> offsetof(struct stack_frame, back_chain))
>
> +static __always_inline unsigned long get_stack_pointer(struct task_struct *task,
> + struct pt_regs *regs)
> +{
> + if (regs)
> + return (unsigned long) kernel_stack_pointer(regs);
> + if (task == current)
> + return current_frame_address();
> + return (unsigned long) task->thread.ksp;
> +}
> +
> /*
> * To keep this simple mark register 2-6 as being changed (volatile)
> * by the called function, even though register 6 is saved/nonvolatile.
> diff --git a/arch/s390/include/asm/unwind.h b/arch/s390/include/asm/unwind.h
> index de9006b0cfeb..5ebf534ef753 100644
> --- a/arch/s390/include/asm/unwind.h
> +++ b/arch/s390/include/asm/unwind.h
> @@ -55,10 +55,10 @@ static inline bool unwind_error(struct unwind_state *state)
> return state->error;
> }
>
> -static inline void unwind_start(struct unwind_state *state,
> - struct task_struct *task,
> - struct pt_regs *regs,
> - unsigned long first_frame)
> +static __always_inline void unwind_start(struct unwind_state *state,
> + struct task_struct *task,
> + struct pt_regs *regs,
> + unsigned long first_frame)
> {
> task = task ?: current;
> first_frame = first_frame ?: get_stack_pointer(task, regs);
> --
> 2.25.4
^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2021-09-03 23:24 UTC | newest]
Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
[not found] <YSzZFgBt6nMvpVgc@osiris>
2021-08-31 2:19 ` [GIT PULL] s390 updates for 5.15 merge window Nathan Chancellor
2021-08-31 7:09 ` Christian Borntraeger
2021-08-31 10:13 ` Heiko Carstens
2021-08-31 10:46 ` Marco Elver
2021-08-31 15:02 ` Marco Elver
2021-08-31 15:18 ` Heiko Carstens
2021-08-31 17:48 ` Nathan Chancellor
2021-08-31 17:49 ` Christian Borntraeger
2021-09-01 14:03 ` Vasily Gorbik
2021-09-01 14:05 ` [PATCH] s390/unwind: use current_frame_address() to unwind current task Vasily Gorbik
2021-09-01 17:51 ` Marco Elver
2021-09-01 18:07 ` Heiko Carstens
2021-09-03 23:23 ` Nathan Chancellor
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox