* [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Joel Fernandes @ 2023-09-07 13:17 UTC
To: rcu; +Cc: Paul E. McKenney

Hi,
Just started seeing this on 6.5 stable. It is new and first occurrence:

TREE04 no success message, 234 successful version messages
WARNING: TREE04 GP HANG at 14 torture stat 2
[ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253 f0x0 ->state 0x2 cpu 6
[ 38.388342] Call Trace:
[ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637 f0x2 ->state 0x2 cpu 6
[ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501 f0x0 ->state 0x2 cpu 6
[ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505 f0x0 ->state 0x2 cpu 6
[ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781 f0x0 ->state 0x2 cpu 6
[ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544 f0x0 ->state 0x2 cpu 6
[ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941 f0x0 ->state 0x2 cpu 6
[..]

All logs:
http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/

thanks,

 - Joel
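For readers following along, here is a minimal sketch (not part of the original report) of pulling the timestamp, writer state, and `g` value out of stall lines like the ones above. It assumes the `g` field is rcutorture's grace-period sequence number, which the report itself does not state; the function name `stall_gp_fields` is hypothetical.

```shell
#!/bin/bash
# Hypothetical helper: read console output on stdin and print
# "<timestamp> <writer state> <g value>" for each "Writer stall state" line.
# Assumption: the number after "g" is the grace-period sequence number.
stall_gp_fields() {
    sed -n 's/^\[ *\([0-9.]*\)] ??? Writer stall state \([A-Z_]*\)([0-9]*) g\([0-9]*\) .*/\1 \2 \3/p'
}

# Sample lines copied from the report; the "Call Trace:" line is skipped.
sample='[ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253 f0x0 ->state 0x2 cpu 6
[ 38.388342] Call Trace:
[ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637 f0x2 ->state 0x2 cpu 6'

out=$(printf '%s\n' "$sample" | stall_gp_fields)
printf '%s\n' "$out"
```

If the assumption about `g` holds, a value that keeps growing between samples while the state stays at RTWS_COND_SYNC_FULL would suggest that the writer is stuck even though grace periods are still advancing. That reading is an inference, not something confirmed in the thread.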
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Paul E. McKenney @ 2023-09-07 14:34 UTC
To: Joel Fernandes; +Cc: rcu

On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote:
> Hi,
> Just started seeing this on 6.5 stable. It is new and first occurrence:
>
> TREE04 no success message, 234 successful version messages
> WARNING: TREE04 GP HANG at 14 torture stat 2
> [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253 f0x0 ->state 0x2 cpu 6
> [ 38.388342] Call Trace:
> [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637 f0x2 ->state 0x2 cpu 6
> [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501 f0x0 ->state 0x2 cpu 6
> [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505 f0x0 ->state 0x2 cpu 6
> [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781 f0x0 ->state 0x2 cpu 6
> [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544 f0x0 ->state 0x2 cpu 6
> [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941 f0x0 ->state 0x2 cpu 6
> [..]
>
> All logs:
> http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/

Huh.  Does this happen for you in v6.5 mainline?

Both the code under test (full-state polled grace periods) and the
rcutorture test code are fairly new, so there is some reason for general
suspicion.  ;-)

							Thanx, Paul
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Joel Fernandes @ 2023-09-07 20:03 UTC
To: paulmck; +Cc: Joel Fernandes, rcu

> On Sep 7, 2023, at 12:23 PM, Paul E. McKenney <paulmck@kernel.org> wrote:
>
> Huh. Does this happen for you in v6.5 mainline?
>
> Both the code under test (full-state polled grace periods) and the
> rcutorture test code are fairly new, so there is some reason for general
> suspicion. ;-)

Ah. I never saw it on either 6.5 mainline or stable till today. Even on stable
I only ever saw it this once. On mainline I have not seen it yet but I do test
stable much more since I have been on stable maintenance duty ;-).

I will keep an eye on it.. this also happens quite early per that time stamp.

thanks,

 - Joel
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Joel Fernandes @ 2023-09-08 0:51 UTC
To: paulmck; +Cc: Joel Fernandes, rcu

On Thu, Sep 7, 2023 at 4:03 PM Joel Fernandes <joel@joelfernandes.org> wrote:
> Ah. I never saw it on either 6.5 mainline or stable till today. Even on stable
> I only ever saw it this once. On mainline I have not seen it yet but I do test
> stable much more since I have been on stable maintenance duty ;-).

I did a couple of long runs and I am not able to reproduce it anymore. :-/

thanks,

 - Joel
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Paul E. McKenney @ 2023-09-08 8:27 UTC
To: Joel Fernandes; +Cc: Joel Fernandes, rcu

On Thu, Sep 07, 2023 at 08:51:43PM -0400, Joel Fernandes wrote:
> I did a couple of long runs and I am not able to reproduce it anymore. :-/

I know that feeling!

							Thanx, Paul
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Frederic Weisbecker @ 2023-09-08 11:41 UTC
To: Paul E. McKenney; +Cc: Joel Fernandes, Joel Fernandes, rcu

On Fri, Sep 08, 2023 at 01:27:06AM -0700, Paul E. McKenney wrote:
> On Thu, Sep 07, 2023 at 08:51:43PM -0400, Joel Fernandes wrote:
> > I did a couple of long runs and I am not able to reproduce it anymore. :-/
>
> I know that feeling!

Same here, this is after all the reason why we keep the tick dependency within
the hotplug process without really knowing why :o)
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Joel Fernandes @ 2023-09-08 13:32 UTC
To: Frederic Weisbecker; +Cc: Paul E. McKenney, Joel Fernandes, rcu

On Fri, Sep 8, 2023 at 7:41 AM Frederic Weisbecker <frederic@kernel.org> wrote:
> Same here, this is after all the reason why we keep the tick dependency within
> the hotplug process without really knowing why :o)

Heh. I have been running into another intermittent one as well which is the
boost failure and that happens once in 10-15 runs or so.

I was thinking of running the following configuration on an automated regular
basis to at least provide a better clue on the lucky run that catches an
issue. But then the issue is it would change timing enough to maybe hide bugs.
I could also make it submit logs automatically to the list on such
occurrences, but one step at a time and all that. I do need to add (hopefully
less noisy) tick/timer related trace events.

# Define the bootargs array
bootargs=(
    "ftrace_dump_on_oops"
    "panic_on_warn=1"
    "sysctl.kernel.panic_on_rcu_stall=1"
    "sysctl.kernel.max_rcu_stall_to_panic=1"
    "trace_buf_size=10K"
    "traceoff_on_warning=1"
    "panic_print=0x1f"      # To dump held locks, mem and other info.
)
# Define the trace events array passed to bootargs.
trace_events=(
    "sched:sched_switch"
    "sched:sched_waking"
    "rcu:rcu_callback"
    "rcu:rcu_fqs"
    "rcu:rcu_quiescent_state_report"
    "rcu:rcu_grace_period"
)

Thanks.
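The message above lists the two arrays but not how they would be combined. A minimal sketch of one way to do it follows, assuming the events are passed through the kernel's `trace_event=` boot parameter as a comma-separated list and the result is handed to rcutorture's `kvm.sh` via `--bootargs`. The `join_by` helper and the commented invocation are illustrative assumptions, not taken from the thread.

```shell
#!/bin/bash
# Arrays as given in the message above.
bootargs=(
    "ftrace_dump_on_oops"
    "panic_on_warn=1"
    "sysctl.kernel.panic_on_rcu_stall=1"
    "sysctl.kernel.max_rcu_stall_to_panic=1"
    "trace_buf_size=10K"
    "traceoff_on_warning=1"
    "panic_print=0x1f"
)
trace_events=(
    "sched:sched_switch"
    "sched:sched_waking"
    "rcu:rcu_callback"
    "rcu:rcu_fqs"
    "rcu:rcu_quiescent_state_report"
    "rcu:rcu_grace_period"
)

# Hypothetical helper: join the remaining arguments with the single-character
# separator given as the first argument.
join_by() { local IFS="$1"; shift; echo "$*"; }

# Space-separated boot arguments, plus the events as one trace_event= list.
cmdline="$(join_by ' ' "${bootargs[@]}") trace_event=$(join_by ',' "${trace_events[@]}")"
printf '%s\n' "$cmdline"

# Assumed invocation (not from the thread; adjust the scenario as needed):
# tools/testing/selftests/rcutorture/bin/kvm.sh --configs TREE04 --bootargs "$cmdline"
```

Keeping the events in their own array makes it easy to add or drop tracepoints, such as the tick/timer events mentioned above, without touching the rest of the command line.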
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Zhouyi Zhou @ 2023-09-08 10:28 UTC
To: paulmck; +Cc: Joel Fernandes, rcu

On Fri, Sep 8, 2023 at 1:59 AM Paul E. McKenney <paulmck@kernel.org> wrote:
> Huh. Does this happen for you in v6.5 mainline?

Hi,
I have started torture.sh in a KVM environment (with nested KVM enabled) on
my Intel i7-1165G7 laptop; the results can be examined at runtime:
http://154.220.3.120:8080/test/linux-stable/tools/testing/selftests/rcutorture/res/2023.09.08-10.23.47-torture/

I hope I can be of some help.

Thanks
Zhouyi
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Zhouyi Zhou @ 2023-09-08 23:33 UTC
To: paulmck; +Cc: Joel Fernandes, rcu

On Fri, Sep 8, 2023 at 6:28 PM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> Hi,
> I have started torture.sh in a KVM environment (with nested KVM enabled) on
> my Intel i7-1165G7 laptop; the results can be examined at runtime:
> http://154.220.3.120:8080/test/linux-stable/tools/testing/selftests/rcutorture/res/2023.09.08-10.23.47-torture/

Again, I can't reproduce the bug in my environment; I will try it more times.
The git head is 3766ec12cf89.

System stability is a profound subject; there is too much for me to
learn from the community.

Thanks
Zhouyi
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Zhouyi Zhou @ 2023-09-09 0:10 UTC
To: paulmck; +Cc: Joel Fernandes, rcu

On Sat, Sep 9, 2023 at 7:33 AM Zhouyi Zhou <zhouzhouyi@gmail.com> wrote:
> Again, I can't reproduce the bug in my environment; I will try it more times.

Besides my laptop, I also started the test on a PPC VM at the Open Source Lab
of Oregon State University:
http://140.211.169.189/stable/linux/tools/testing/selftests/rcutorture/res/2023.09.09-00.07.55-torture/

> The git head is 3766ec12cf89.
* Re: [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL
From: Joel Fernandes @ 2023-09-16 1:09 UTC
To: Paul E. McKenney; +Cc: rcu

On Thu, Sep 07, 2023 at 07:34:44AM -0700, Paul E. McKenney wrote:
> Huh. Does this happen for you in v6.5 mainline?
>
> Both the code under test (full-state polled grace periods) and the
> rcutorture test code are fairly new, so there is some reason for general
> suspicion. ;-)

I happened to hit this again, but this time on 6.1 stable and TREE05. Here are
some logs:
http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.1.y/139/artifact/tools/testing/selftests/rcutorture/res/2023.09.15-04.02.48/TREE05/

I am planning to look closer soon.

thanks,

 - Joel
Thread overview: 11+ messages (end of thread, newest 2023-09-16 1:10 UTC)

2023-09-07 13:17 [BUG] TREE04 hang on 6.5.y stable: Writer stall state RTWS_COND_SYNC_FULL Joel Fernandes
2023-09-07 14:34 ` Paul E. McKenney
2023-09-07 20:03   ` Joel Fernandes
2023-09-08  0:51     ` Joel Fernandes
2023-09-08  8:27       ` Paul E. McKenney
2023-09-08 11:41         ` Frederic Weisbecker
2023-09-08 13:32           ` Joel Fernandes
2023-09-08 10:28   ` Zhouyi Zhou
2023-09-08 23:33     ` Zhouyi Zhou
2023-09-09  0:10       ` Zhouyi Zhou
2023-09-16  1:09   ` Joel Fernandes