* [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Joel Fernandes @ 2023-09-07 13:17 UTC (permalink / raw)
To: rcu; +Cc: Paul E. McKenney
Hi,
I just started seeing this on 6.5 stable. It is new, and this is the first occurrence:
TREE04 no success message, 234 successful version messages
WARNING: TREE04 GP HANG at 14 torture stat 2
[ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253 f0x0 ->state 0x2 cpu 6
[ 38.388342] Call Trace:
[ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637 f0x2 ->state 0x2 cpu 6
[ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501 f0x0 ->state 0x2 cpu 6
[ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505 f0x0 ->state 0x2 cpu 6
[ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781 f0x0 ->state 0x2 cpu 6
[ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544 f0x0 ->state 0x2 cpu 6
[ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941 f0x0 ->state 0x2 cpu 6
[..]
All logs:
http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
thanks,
- Joel
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Paul E. McKenney @ 2023-09-07 14:34 UTC (permalink / raw)
To: Joel Fernandes; +Cc: rcu
On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
> Hi,
> Just started seeing this on 6.5 stable. It is new and first occurrence:
>
> TREE04 no success message, 234 successful version messages
> WARNING: TREE04 GP HANG at 14 torture stat 2
> [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253
> f0x0 ->state 0x2 cpu 6
> [ 38.388342] Call Trace:
> [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637
> f0x2 ->state 0x2 cpu 6
> [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501
> f0x0 ->state 0x2 cpu 6
> [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505
> f0x0 ->state 0x2 cpu 6
> [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781
> f0x0 ->state 0x2 cpu 6
> [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544
> f0x0 ->state 0x2 cpu 6
> [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941
> f0x0 ->state 0x2 cpu 6
> [..]
>
> All logs:
> http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
Huh. Does this happen for you in v6.5 mainline?
Both the code under test (full-state polled grace periods) and the
rcutorture test code are fairly new, so there is some reason for general
suspicion. ;-)
Thanx, Paul
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Joel Fernandes @ 2023-09-07 20:03 UTC (permalink / raw)
To: paulmck; +Cc: Joel Fernandes, rcu
> On Sep 7, 2023, at 12:23 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
>> Hi,
>> Just started seeing this on 6.5 stable. It is new and first occurrence:
>>
>> TREE04 no success message, 234 successful version messages
>> WARNING: TREE04 GP HANG at 14 torture stat 2
>> [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253
>> f0x0 ->state 0x2 cpu 6
>> [ 38.388342] Call Trace:
>> [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637
>> f0x2 ->state 0x2 cpu 6
>> [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501
>> f0x0 ->state 0x2 cpu 6
>> [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505
>> f0x0 ->state 0x2 cpu 6
>> [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781
>> f0x0 ->state 0x2 cpu 6
>> [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544
>> f0x0 ->state 0x2 cpu 6
>> [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941
>> f0x0 ->state 0x2 cpu 6
>> [..]
>>
>> All logs:
>> http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
>
> Huh. Does this happen for you in v6.5 mainline?
>
> Both the code under test (full-state polled grace periods) and the
> rcutorture test code are fairly new, so there is some reason for general
> suspicion. ;-)
Ah. I never saw it on either 6.5 mainline or stable till today. Even on stable
I only ever saw it this once. On mainline I have not seen it yet but I do test
stable much more since I have been on stable maintenance duty ;-).
I will keep an eye on it. This also happens quite early, per that timestamp. Thanks,
- Joel
>
> Thanx, Paul
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Joel Fernandes @ 2023-09-08 0:51 UTC (permalink / raw)
To: paulmck; +Cc: Joel Fernandes, rcu
On Thu, Sep 7, 2023 at 4:03 PM Joel Fernandes <joel@joelfernandes.org> wrote:
>
>
>
> > On Sep 7, 2023, at 12:23 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
> >> Hi,
> >> Just started seeing this on 6.5 stable. It is new and first occurrence:
> >>
> >> TREE04 no success message, 234 successful version messages
> >> WARNING: TREE04 GP HANG at 14 torture stat 2
> >> [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253
> >> f0x0 ->state 0x2 cpu 6
> >> [ 38.388342] Call Trace:
> >> [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637
> >> f0x2 ->state 0x2 cpu 6
> >> [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501
> >> f0x0 ->state 0x2 cpu 6
> >> [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505
> >> f0x0 ->state 0x2 cpu 6
> >> [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781
> >> f0x0 ->state 0x2 cpu 6
> >> [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544
> >> f0x0 ->state 0x2 cpu 6
> >> [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941
> >> f0x0 ->state 0x2 cpu 6
> >> [..]
> >>
> >> All logs:
> >> http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
> >
> > Huh. Does this happen for you in v6.5 mainline?
> >
> > Both the code under test (full-state polled grace periods) and the
> > rcutorture test code are fairly new, so there is some reason for general
> > suspicion. ;-)
>
> Ah. I never saw it on either 6.5 mainline or stable till today. Even on stable
> I only ever saw it this once. On mainline I have not seen it yet but I do test
> stable much more since I have been on stable maintenance duty ;-).
I did a couple of long runs and I am not able to reproduce it anymore. :-/
thanks,
- Joel
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Paul E. McKenney @ 2023-09-08 8:27 UTC (permalink / raw)
To: Joel Fernandes; +Cc: Joel Fernandes, rcu
On Thu, Sep 07, 2023 at 08:51:43PM -0400, Joel Fernandes wrote:
> On Thu, Sep 7, 2023 at 4:03 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> >
> >
> >
> > > On Sep 7, 2023, at 12:23 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
> > >> Hi,
> > >> Just started seeing this on 6.5 stable. It is new and first occurrence:
> > >>
> > >> TREE04 no success message, 234 successful version messages
> > >> WARNING: TREE04 GP HANG at 14 torture stat 2
> > >> [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253
> > >> f0x0 ->state 0x2 cpu 6
> > >> [ 38.388342] Call Trace:
> > >> [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637
> > >> f0x2 ->state 0x2 cpu 6
> > >> [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501
> > >> f0x0 ->state 0x2 cpu 6
> > >> [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505
> > >> f0x0 ->state 0x2 cpu 6
> > >> [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781
> > >> f0x0 ->state 0x2 cpu 6
> > >> [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544
> > >> f0x0 ->state 0x2 cpu 6
> > >> [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941
> > >> f0x0 ->state 0x2 cpu 6
> > >> [..]
> > >>
> > >> All logs:
> > >> http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
> > >
> > > Huh. Does this happen for you in v6.5 mainline?
> > >
> > > Both the code under test (full-state polled grace periods) and the
> > > rcutorture test code are fairly new, so there is some reason for general
> > > suspicion. ;-)
> >
> > Ah. I never saw it on either 6.5 mainline or stable till today. Even on stable
> > I only ever saw it this once. On mainline I have not seen it yet but I do test
> > stable much more since I have been on stable maintenance duty ;-).
>
> I did a couple of long runs and I am not able to reproduce it anymore. :-/
I know that feeling!
Thanx, Paul
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Zhouyi Zhou @ 2023-09-08 10:28 UTC (permalink / raw)
To: paulmck; +Cc: Joel Fernandes, rcu
On Fri, Sep 8, 2023 at 1:59 AM Paul E. McKenney <paulmck@kernel.org> wrote:
>
> On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
> > Hi,
> > Just started seeing this on 6.5 stable. It is new and first occurrence:
> >
> > TREE04 no success message, 234 successful version messages
> > WARNING: TREE04 GP HANG at 14 torture stat 2
> > [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253
> > f0x0 ->state 0x2 cpu 6
> > [ 38.388342] Call Trace:
> > [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637
> > f0x2 ->state 0x2 cpu 6
> > [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501
> > f0x0 ->state 0x2 cpu 6
> > [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505
> > f0x0 ->state 0x2 cpu 6
> > [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781
> > f0x0 ->state 0x2 cpu 6
> > [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544
> > f0x0 ->state 0x2 cpu 6
> > [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941
> > f0x0 ->state 0x2 cpu 6
> > [..]
> >
> > All logs:
> > http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
>
> Huh. Does this happen for you in v6.5 mainline?
Hi, I have started torture.sh in a KVM environment (with nested KVM
enabled) on my Intel i7-1165G7 laptop. The results can be examined at
runtime:
http://154.220.3.120:8080/test/linux-stable/tools/testing/selftests/rcutorture/res/2023.09.08-10.23.47-torture/
I hope I can be of some help.
Thanks
Zhouyi
>
> Both the code under test (full-state polled grace periods) and the
> rcutorture test code are fairly new, so there is some reason for general
> suspicion. ;-)
>
> Thanx, Paul
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Frederic Weisbecker @ 2023-09-08 11:41 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: Joel Fernandes, Joel Fernandes, rcu
On Fri, Sep 08, 2023 at 01:27:06AM -0700, Paul E. McKenney wrote:
> On Thu, Sep 07, 2023 at 08:51:43PM -0400, Joel Fernandes wrote:
> > On Thu, Sep 7, 2023 at 4:03 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> > >
> > >
> > >
> > > > On Sep 7, 2023, at 12:23 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > > >
> > > > On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
> > > >> Hi,
> > > >> Just started seeing this on 6.5 stable. It is new and first occurrence:
> > > >>
> > > >> TREE04 no success message, 234 successful version messages
> > > >> WARNING: TREE04 GP HANG at 14 torture stat 2
> > > >> [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253
> > > >> f0x0 ->state 0x2 cpu 6
> > > >> [ 38.388342] Call Trace:
> > > >> [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637
> > > >> f0x2 ->state 0x2 cpu 6
> > > >> [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501
> > > >> f0x0 ->state 0x2 cpu 6
> > > >> [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505
> > > >> f0x0 ->state 0x2 cpu 6
> > > >> [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781
> > > >> f0x0 ->state 0x2 cpu 6
> > > >> [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544
> > > >> f0x0 ->state 0x2 cpu 6
> > > >> [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941
> > > >> f0x0 ->state 0x2 cpu 6
> > > >> [..]
> > > >>
> > > >> All logs:
> > > >> http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
> > > >
> > > > Huh. Does this happen for you in v6.5 mainline?
> > > >
> > > > Both the code under test (full-state polled grace periods) and the
> > > > rcutorture test code are fairly new, so there is some reason for general
> > > > suspicion. ;-)
> > >
> > > Ah. I never saw it on either 6.5 mainline or stable till today. Even on stable
> > > I only ever saw it this once. On mainline I have not seen it yet but I do test
> > > stable much more since I have been on stable maintenance duty ;-).
> >
> > I did a couple of long runs and I am not able to reproduce it anymore. :-/
>
> I know that feeling!
Same here, this is after all the reason why we keep the tick dependency within
the hotplug process without really knowing why :o)
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Joel Fernandes @ 2023-09-08 13:32 UTC (permalink / raw)
To: Frederic Weisbecker; +Cc: Paul E. McKenney, Joel Fernandes, rcu
On Fri, Sep 8, 2023 at 7:41 AM Frederic Weisbecker <frederic@kernel.org> wrote:
>
> On Fri, Sep 08, 2023 at 01:27:06AM -0700, Paul E. McKenney wrote:
> > On Thu, Sep 07, 2023 at 08:51:43PM -0400, Joel Fernandes wrote:
> > > On Thu, Sep 7, 2023 at 4:03 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> > > >
> > > >
> > > >
> > > > > On Sep 7, 2023, at 12:23 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > >
> > > > > On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
> > > > >> Hi,
> > > > >> Just started seeing this on 6.5 stable. It is new and first occurrence:
> > > > >>
> > > > >> TREE04 no success message, 234 successful version messages
> > > > >> WARNING: TREE04 GP HANG at 14 torture stat 2
> > > > >> [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253
> > > > >> f0x0 ->state 0x2 cpu 6
> > > > >> [ 38.388342] Call Trace:
> > > > >> [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637
> > > > >> f0x2 ->state 0x2 cpu 6
> > > > >> [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501
> > > > >> f0x0 ->state 0x2 cpu 6
> > > > >> [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505
> > > > >> f0x0 ->state 0x2 cpu 6
> > > > >> [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781
> > > > >> f0x0 ->state 0x2 cpu 6
> > > > >> [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544
> > > > >> f0x0 ->state 0x2 cpu 6
> > > > >> [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941
> > > > >> f0x0 ->state 0x2 cpu 6
> > > > >> [..]
> > > > >>
> > > > >> All logs:
> > > > >> http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
> > > > >
> > > > > Huh. Does this happen for you in v6.5 mainline?
> > > > >
> > > > > Both the code under test (full-state polled grace periods) and the
> > > > > rcutorture test code are fairly new, so there is some reason for general
> > > > > suspicion. ;-)
> > > >
> > > > Ah. I never saw it on either 6.5 mainline or stable till today. Even on stable
> > > > I only ever saw it this once. On mainline I have not seen it yet but I do test
> > > > stable much more since I have been on stable maintenance duty ;-).
> > >
> > > I did a couple of long runs and I am not able to reproduce it anymore. :-/
> >
> > I know that feeling!
>
> Same here, this is after all the reason why we keep the tick dependency within
> the hotplug process without really knowing why :o)
Heh. I have also been running into another intermittent one, the boost
failure, which happens once in 10-15 runs or so.
I was thinking of running the following configuration on an automated,
regular basis to at least provide a better clue from the lucky run that
catches an issue. The catch is that it might change timing enough to
hide bugs. I could also make it submit logs automatically to the list
on such occurrences, but one step at a time and all that. I do need to
add (hopefully less noisy) tick/timer-related trace events.
# Define the bootargs array
bootargs=(
    "ftrace_dump_on_oops"
    "panic_on_warn=1"
    "sysctl.kernel.panic_on_rcu_stall=1"
    "sysctl.kernel.max_rcu_stall_to_panic=1"
    "trace_buf_size=10K"
    "traceoff_on_warning=1"
    "panic_print=0x1f" # To dump held locks, mem and other info.
)
# Define the trace events array passed to bootargs.
trace_events=(
    "sched:sched_switch"
    "sched:sched_waking"
    "rcu:rcu_callback"
    "rcu:rcu_fqs"
    "rcu:rcu_quiescent_state_report"
    "rcu:rcu_grace_period"
)
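As a rough sketch of how the two arrays above might be folded into a single kernel command line: the joining logic, the variable names trace_event_list and kernel_cmdline, and the printed kvm.sh invocation (including the choice of TREE04) are assumptions for illustration, not taken from the actual CI script. The trace events are joined with commas because that is the list format the trace_event= boot parameter accepts.

```bash
#!/usr/bin/env bash
# Sketch only: combine the boot arguments and trace events into one string.

bootargs=(
    "ftrace_dump_on_oops"
    "panic_on_warn=1"
    "sysctl.kernel.panic_on_rcu_stall=1"
    "sysctl.kernel.max_rcu_stall_to_panic=1"
    "trace_buf_size=10K"
    "traceoff_on_warning=1"
    "panic_print=0x1f"
)
trace_events=(
    "sched:sched_switch"
    "sched:sched_waking"
    "rcu:rcu_callback"
    "rcu:rcu_fqs"
    "rcu:rcu_quiescent_state_report"
    "rcu:rcu_grace_period"
)

# Join the trace events with commas. IFS is changed only inside the
# subshell, so the rest of the script keeps the default word splitting.
trace_event_list=$(IFS=,; echo "${trace_events[*]}")

# Join the boot arguments with spaces and append the trace_event= list.
kernel_cmdline="${bootargs[*]} trace_event=${trace_event_list}"

# Print the command instead of running it; the scenario name is hypothetical.
echo "tools/testing/selftests/rcutorture/bin/kvm.sh --configs TREE04 --bootargs \"${kernel_cmdline}\""
```

Printing the command first makes it easy to check the resulting string before committing a long torture run to it.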
Thanks.
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Zhouyi Zhou @ 2023-09-08 23:33 UTC (permalink / raw)
To: paulmck; +Cc: Joel Fernandes, rcu
On Fri, Sep 8, 2023 at 6:28 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>
> On Fri, Sep 8, 2023 at 1:59 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> >
> > On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
> > > Hi,
> > > Just started seeing this on 6.5 stable. It is new and first occurrence:
> > >
> > > TREE04 no success message, 234 successful version messages
> > > WARNING: TREE04 GP HANG at 14 torture stat 2
> > > [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253
> > > f0x0 ->state 0x2 cpu 6
> > > [ 38.388342] Call Trace:
> > > [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637
> > > f0x2 ->state 0x2 cpu 6
> > > [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501
> > > f0x0 ->state 0x2 cpu 6
> > > [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505
> > > f0x0 ->state 0x2 cpu 6
> > > [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781
> > > f0x0 ->state 0x2 cpu 6
> > > [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544
> > > f0x0 ->state 0x2 cpu 6
> > > [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941
> > > f0x0 ->state 0x2 cpu 6
> > > [..]
> > >
> > > All logs:
> > > http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
> >
> > Huh. Does this happen for you in v6.5 mainline?
> Hi, I am started torture.sh in a kvm environment (with nested kvm
> enable) in my Intel i7-1165G7 laptop, which can be examined at
> runtime:
> http://154.220.3.120:8080/test/linux-stable/tools/testing/selftests/rcutorture/res/2023.09.08-10.23.47-torture/
Again, I can't reproduce the bug in my environment; I will try more times.
The git head is 3766ec12cf89.
System stability is a deep subject, and there is much for me to
learn from the community.
Thanks
Zhouyi
>
> Hope I can be of some beneficial
> Thanks
> Zhouyi
> >
> > Both the code under test (full-state polled grace periods) and the
> > rcutorture test code are fairly new, so there is some reason for general
> > suspicion. ;-)
> >
> > Thanx, Paul
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Zhouyi Zhou @ 2023-09-09 0:10 UTC (permalink / raw)
To: paulmck; +Cc: Joel Fernandes, rcu
On Sat, Sep 9, 2023 at 7:33 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
>
> On Fri, Sep 8, 2023 at 6:28 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> >
> > On Fri, Sep 8, 2023 at 1:59 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> > >
> > > On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
> > > > Hi,
> > > > Just started seeing this on 6.5 stable. It is new and first occurrence:
> > > >
> > > > TREE04 no success message, 234 successful version messages
> > > > WARNING: TREE04 GP HANG at 14 torture stat 2
> > > > [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253
> > > > f0x0 ->state 0x2 cpu 6
> > > > [ 38.388342] Call Trace:
> > > > [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637
> > > > f0x2 ->state 0x2 cpu 6
> > > > [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501
> > > > f0x0 ->state 0x2 cpu 6
> > > > [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505
> > > > f0x0 ->state 0x2 cpu 6
> > > > [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781
> > > > f0x0 ->state 0x2 cpu 6
> > > > [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544
> > > > f0x0 ->state 0x2 cpu 6
> > > > [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941
> > > > f0x0 ->state 0x2 cpu 6
> > > > [..]
> > > >
> > > > All logs:
> > > > http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
> > >
> > > Huh. Does this happen for you in v6.5 mainline?
> > Hi, I am started torture.sh in a kvm environment (with nested kvm
> > enable) in my Intel i7-1165G7 laptop, which can be examined at
> > runtime:
> > http://154.220.3.120:8080/test/linux-stable/tools/testing/selftests/rcutorture/res/2023.09.08-10.23.47-torture/
> again I can't reproduce the bug in my environment, I will try it more times.
Besides my laptop, I also started the test on a PPC VM at the Open Source
Lab of Oregon State University:
http://140.211.169.189/stable/linux/tools/testing/selftests/rcutorture/res/2023.09.09-00.07.55-torture/
> the git head is 3766ec12cf89
>
> System stability is a profound knowledge, there is too much for me to
> learn from the community.
> Thanks
> Zhouyi
> >
> > Hope I can be of some beneficial
> > Thanks
> > Zhouyi
> > >
> > > Both the code under test (full-state polled grace periods) and the
> > > rcutorture test code are fairly new, so there is some reason for general
> > > suspicion. ;-)
> > >
> > > Thanx, Paul
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Joel Fernandes @ 2023-09-16 1:09 UTC (permalink / raw)
To: Paul E. McKenney; +Cc: rcu
On Thu, Sep 07, 2023 at 07:34:44AM -0700, Paul E. McKenney wrote:
> On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
> > Hi,
> > Just started seeing this on 6.5 stable. It is new and first occurrence:
> >
> > TREE04 no success message, 234 successful version messages
> > WARNING: TREE04 GP HANG at 14 torture stat 2
> > [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253
> > f0x0 ->state 0x2 cpu 6
> > [ 38.388342] Call Trace:
> > [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637
> > f0x2 ->state 0x2 cpu 6
> > [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501
> > f0x0 ->state 0x2 cpu 6
> > [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505
> > f0x0 ->state 0x2 cpu 6
> > [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781
> > f0x0 ->state 0x2 cpu 6
> > [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544
> > f0x0 ->state 0x2 cpu 6
> > [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941
> > f0x0 ->state 0x2 cpu 6
> > [..]
> >
> > All logs:
> > http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/
>
> Huh. Does this happen for you in v6.5 mainline?
>
> Both the code under test (full-state polled grace periods) and the
> rcutorture test code are fairly new, so there is some reason for general
> suspicion. ;-)
I happened to hit this again, but this time on 6.1 stable and TREE05.
Here are some logs:
http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.1.y/139/artifact/tools/testing/selftests/rcutorture/res/2023.09.15-04.02.48/TREE05/
I am planning to look closer soon. Thanks,
- Joel
Thread overview: 11+ messages
2023-09-07 13:17 [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL Joel Fernandes
2023-09-07 14:34 ` Paul E. McKenney
2023-09-07 20:03 ` Joel Fernandes
2023-09-08 0:51 ` Joel Fernandes
2023-09-08 8:27 ` Paul E. McKenney
2023-09-08 11:41 ` Frederic Weisbecker
2023-09-08 13:32 ` Joel Fernandes
2023-09-08 10:28 ` Zhouyi Zhou
2023-09-08 23:33 ` Zhouyi Zhou
2023-09-09 0:10 ` Zhouyi Zhou
2023-09-16 1:09 ` Joel Fernandes