From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9180AEE49A4 for ; Sun, 10 Sep 2023 20:14:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231190AbjIJUOy (ORCPT ); Sun, 10 Sep 2023 16:14:54 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48456 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229688AbjIJUOx (ORCPT ); Sun, 10 Sep 2023 16:14:53 -0400 Received: from mail-io1-xd2e.google.com (mail-io1-xd2e.google.com [IPv6:2607:f8b0:4864:20::d2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4D310135 for ; Sun, 10 Sep 2023 13:14:47 -0700 (PDT) Received: by mail-io1-xd2e.google.com with SMTP id ca18e2360f4ac-7926a450a0aso132612339f.3 for ; Sun, 10 Sep 2023 13:14:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=joelfernandes.org; s=google; t=1694376886; x=1694981686; darn=vger.kernel.org; h=content-transfer-encoding:content-disposition:mime-version :message-id:subject:cc:to:from:date:from:to:cc:subject:date :message-id:reply-to; bh=kW2AVFLGV9R7FswF+izo/4pYtp2R55/798Jh6ta9G+0=; b=QcVXe52od3aGak/NQCMr5pANakwtCEbcF8QLHnCEwronZxHOhgHh8XQdDkINv3WAo+ 81IiRZQ4HgoKTYI5Vm/CDYOGQ/ATtGE8JRJ3YlHTL0LX7KbpDohyemPN+t+15AMS7F1I 0cpwLvD3soWw0j9Y9fY7vClCxOybF421UoUQY= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694376886; x=1694981686; h=content-transfer-encoding:content-disposition:mime-version :message-id:subject:cc:to:from:date:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=kW2AVFLGV9R7FswF+izo/4pYtp2R55/798Jh6ta9G+0=; b=T2P/Q1VAmiG/c1X2Cm4cATJ9GZ4N7OW9B6zkTKuR1KxleZEwlNp+PB+hb8eFCmVs18 RHBCGkX2SANjBxg/MbaHTVOj82SORmvsDhsl9k5ZYQ9Fm1bGD/OBOU3ru5BKv8lPqVkk MhS9A8FfR783yfFRwN8dud1uLi3eDEIr4hVQYA82aJ/d3UPAf9HOX7yVTsyMBbgyrzLF 6icLWE2VTkDsVk9H/gt1SNs8hMZOqNO+cfg4AcH2LyBluq8yc/7vv4ZYtPQPPWv5wfQX 1epsja0fzamiHqb/n+2P7H1us5jhDN0HMgnSCc9IoyjJzXMnKB8mVCoq8bxGOiCMNj11 Mmig== X-Gm-Message-State: AOJu0Yw8OsyQMpInhHf1BxUkbqqQcn4FNJ9peADYGb8qM0ev/KulKutO /+MQytXVcixpxErw2KpitkaG6DUxxayTIARadLg= X-Google-Smtp-Source: AGHT+IGlyeRwtTn4R7VWe9f/Ufv+z+LL5ct46+UxnwdxLhlvgXlvs3saIKz8T+NShWjnIirG6y1bUQ== X-Received: by 2002:a6b:f316:0:b0:786:cc36:360c with SMTP id m22-20020a6bf316000000b00786cc36360cmr10020149ioh.8.1694376886447; Sun, 10 Sep 2023 13:14:46 -0700 (PDT) Received: from localhost (156.190.123.34.bc.googleusercontent.com. [34.123.190.156]) by smtp.gmail.com with ESMTPSA id h8-20020a0566380f0800b0042b76deb22fsm1736253jas.92.2023.09.10.13.14.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 10 Sep 2023 13:14:45 -0700 (PDT) Date: Sun, 10 Sep 2023 20:14:45 +0000 From: Joel Fernandes To: Frederic Weisbecker Cc: "Paul E. McKenney" , rcu@vger.kernel.org Subject: [BUG] Random intermittent boost failures (Was Re: [BUG] TREE04..) Message-ID: <20230910201445.GA1605059@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: rcu@vger.kernel.org On Fri, Sep 08, 2023 at 09:32:35AM -0400, Joel Fernandes wrote: > On Fri, Sep 8, 2023 at 7:41 AM Frederic Weisbecker wrote: > > > > On Fri, Sep 08, 2023 at 01:27:06AM -0700, Paul E. McKenney wrote: > > > On Thu, Sep 07, 2023 at 08:51:43PM -0400, Joel Fernandes wrote: > > > > On Thu, Sep 7, 2023 at 4:03 PM Joel Fernandes wrote: > > > > > > > > > > > > > > > > > > > > > On Sep 7, 2023, at 12:23 PM, Paul E. McKenney wrote: > > > > > > > > > > > > On Thu, Sep 07, 2023 at 09:17:15AM -0400, Joel Fernandes wrote: > > > > > >> Hi, > > > > > >> Just started seeing this on 6.5 stable. It is new and first occurrence: > > > > > >> > > > > > >> TREE04 no success message, 234 successful version messages > > > > > >> [033mWARNING: [mTREE04 GP HANG at 14 torture stat 2 > > > > > >> [ 38.371120] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g1253 > > > > > >> f0x0 ->state 0x2 cpu 6 > > > > > >> [ 38.388342] Call Trace: > > > > > >> [ 53.741039] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g3637 > > > > > >> f0x2 ->state 0x2 cpu 6 > > > > > >> [ 69.093462] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g5501 > > > > > >> f0x0 ->state 0x2 cpu 6 > > > > > >> [ 84.450028] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g10505 > > > > > >> f0x0 ->state 0x2 cpu 6 > > > > > >> [ 99.815871] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g13781 > > > > > >> f0x0 ->state 0x2 cpu 6 > > > > > >> [ 115.166476] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g16544 > > > > > >> f0x0 ->state 0x2 cpu 6 > > > > > >> [ 130.550116] ??? Writer stall state RTWS_COND_SYNC_FULL(10) g18941 > > > > > >> f0x0 ->state 0x2 cpu 6 > > > > > >> [..] > > > > > >> > > > > > >> All logs: > > > > > >> http://box.joelfernandes.org:9080/job/rcutorture_stable/job/linux-6.5.y/17/artifact/tools/testing/selftests/rcutorture/res/2023.09.07-04.10.25/TREE04/ > > > > > > > > > > > > Huh. Does this happen for you in v6.5 mainline? > > > > > > > > > > > > Both the code under test (full-state polled grace periods) and the > > > > > > rcutorture test code are fairly new, so there is some reason for general > > > > > > suspicion. ;-) > > > > > > > > > > Ah. I never saw it on either 6.5 mainline or stable till today. Even on stable > > > > > I only ever saw it this once. On mainline I have not seen it yet but I do test > > > > > stable much more since I have been on stable maintenance duty ;-). > > > > > > > > I did a couple of long runs and I am not able to reproduce it anymore. :-/ > > > > > > I know that feeling! > > > > Same here, this is after all the reason why we keep the tick dependency within > > the hotplug process without really knowing why :o) > > Heh. I have been running into another intermittent one as well which > is the boost failure and that happens once in 10-15 runs or so. > > I was thinking of running the following configuration on an automated > regular basis to at least provide a better clue on the lucky run that > catches an issue. But then the issue is it would change timing enough > to maybe hide bugs. I could also make it submit logs automatically to > the list on such occurrences, but one step at a time and all that. I > do need to add (hopefully less noisy) tick/timer related trace events. > > # Define the bootargs array > bootargs=( > "ftrace_dump_on_oops" > "panic_on_warn=1" > "sysctl.kernel.panic_on_rcu_stall=1" > "sysctl.kernel.max_rcu_stall_to_panic=1" > "trace_buf_size=10K" > "traceoff_on_warning=1" > "panic_print=0x1f" # To dump held locks, mem and other info. > ) > # Define the trace events array passed to bootargs. > trace_events=( > "sched:sched_switch" > "sched:sched_waking" > "rcu:rcu_callback" > "rcu:rcu_fqs" > "rcu:rcu_quiescent_state_report" > "rcu:rcu_grace_period" > ) So some insight on this boost failure. Just before the boost failures are reported, I see the migration thread interferring with the rcu_preempt thread (aka GP kthread). See trace below. Of note is that the rcu_preempt thread is runnable while context switching, which means its execution is interferred. The rcu_preempt thread is at RT prio 2 as can be seen. So some open-ended questions: what exactly does the migration thread want, this is something related to CPU hotplug? And if the migration thread had to run, why did the rcu_preempt thread not get pushed to another CPU by the scheduler? We have 16 vCPUs for this test. IMHO, this might be unrelated to boost but just that the boost test detects a stall in RCU 'sooner'. I am planning to add more tracing to the migration thread to trace down, but I'm afraid I am hitting some limits of what the console is capable of in terms of accepting trace dumping. I am seeing a lot of printk messages dropped. Maybe if I reduce RCU stall detector threshold to 4 seconds or so, I might see this even without boost testing, I feel. migratio-66 8d..2. 1544321393us : sched_switch: prev_comm=migration/8 prev_pid=66 prev_prio=0 prev_state=S ==> next_comm=swapper/8 next_pid=0 next_prio=120 -0 8d..2. 1544321890us : sched_switch: prev_comm=swapper/8 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=13 next_prio=97 rcu_pree-13 8d..4. 1544321892us : sched_waking: comm=migration/8 pid=66 prio=0 target_cpu=008 rcu_pree-13 8d..2. 1544321895us : sched_switch: prev_comm=rcu_preempt prev_pid=13 prev_prio=97 prev_state=R ==> next_comm=migration/8 next_pid=66 next_prio=0 migratio-66 8d..2. 1544321903us : sched_switch: prev_comm=migration/8 prev_pid=66 prev_prio=0 prev_state=S ==> next_comm=swapper/8 next_pid=0 next_prio=120 -0 8d..2. 1544322121us : sched_switch: prev_comm=swapper/8 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=13 next_prio=97 rcu_pree-13 8d..4. 1544322123us : sched_waking: comm=migration/8 pid=66 prio=0 target_cpu=008 rcu_pree-13 8d..2. 1544322126us : sched_switch: prev_comm=rcu_preempt prev_pid=13 prev_prio=97 prev_state=R ==> next_comm=migration/8 next_pid=66 next_prio=0 migratio-66 8d..2. 1544322134us : sched_switch: prev_comm=migration/8 prev_pid=66 prev_prio=0 prev_state=S ==> next_comm=swapper/8 next_pid=0 next_prio=120 -0 8d..2. 1544322495us : sched_switch: prev_comm=swapper/8 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=13 next_prio=97 rcu_pree-13 8d..4. 1544322497us : sched_waking: comm=migration/8 pid=66 prio=0 target_cpu=008 rcu_pree-13 8d..2. 1544322500us : sched_switch: prev_comm=rcu_preempt prev_pid=13 prev_prio=97 prev_state=R ==> next_comm=migration/8 next_pid=66 next_prio=0 migratio-66 8d..2. 1544322508us : sched_switch: prev_comm=migration/8 prev_pid=66 prev_prio=0 prev_state=S ==> next_comm=swapper/8 next_pid=0 next_prio=120 -0 8d..2. 1544322978us : sched_switch: prev_comm=swapper/8 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=13 next_prio=97 rcu_pree-13 8d..4. 1544322980us : sched_waking: comm=migration/8 pid=66 prio=0 target_cpu=008 rcu_pree-13 8d..2. 1544322984us : sched_switch: prev_comm=rcu_preempt prev_pid=13 prev_prio=97 prev_state=R ==> next_comm=migration/8 next_pid=66 next_prio=0 migratio-66 8d..2. 1544322992us : sched_switch: prev_comm=migration/8 prev_pid=66 prev_prio=0 prev_state=S ==> next_comm=swapper/8 next_pid=0 next_prio=120 -0 8d..2. 1544323301us : sched_switch: prev_comm=swapper/8 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=13 next_prio=97 rcu_pree-13 8d..4. 1544323303us : sched_waking: comm=migration/8 pid=66 prio=0 target_cpu=008 rcu_pree-13 8d..2. 1544323306us : sched_switch: prev_comm=rcu_preempt prev_pid=13 prev_prio=97 prev_state=R ==> next_comm=migration/8 next_pid=66 next_prio=0 migratio-66 8d..2. 1544323314us : sched_switch: prev_comm=migration/8 prev_pid=66 prev_prio=0 prev_state=S ==> next_comm=swapper/8 next_pid=0 next_prio=120 -0 8d..2. 1544323673us : sched_switch: prev_comm=swapper/8 prev_pid=0 prev_prio=120 prev_state=R ==> next_comm=rcu_preempt next_pid=13 next_prio=97 rcu_pree-13 8d..4. 1544323675us : sched_waking: comm=migration/8 pid=66 prio=0 target_cpu=008 rcu_pree-13 8d..2. 1544323678us : sched_switch: prev_comm=rcu_preempt prev_pid=13 prev_prio=97 prev_state=R ==> next_comm=migration/8 next_pid=66 next_prio=0 migratio-66 8d..2. 1544323686us : sched_switch: prev_comm=migration/8 prev_pid=66 prev_prio=0 prev_state=S ==> next_comm=swapper/8 next_pid=0 next_prio=120 rcu_tort-1726 12..... 1544323807us : rcu_torture_boost_failed: Boost inversion thread ->rt_priority 1 gp_state 1242792 jiffies 2503 rcu_tort-1723 1..... 1544323840us : rcu_torture_boost_failed: Boost failed rcu_tort-1743 4..... 1544323844us : rcu_torture_boost_failed: Boost failed rcu_tort-1715 3..... 1544323848us : rcu_torture_boost_failed: Boost failed rcu_tort-1713 7..... 1544323852us : rcu_torture_boost_failed: Boost failed (At this point a message says tracing was disabled due to a boost failure warning due to my use of traceoff_on_warning=1) Side note: I was talking to Steve the other day that it'd be nice to extract the trace buffer in a better way than console. Perhaps we can do a core dump from rcutorture? I think 'crash' can already extract ftrace [1] from core dumps and Qemu can generate core dumps via QMP. [1] https://access.redhat.com/solutions/239433 thanks, - Joel