From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751859AbdK1VIQ (ORCPT ); Tue, 28 Nov 2017 16:08:16 -0500 Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:47124 "EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751402AbdK1VIP (ORCPT ); Tue, 28 Nov 2017 16:08:15 -0500 Date: Tue, 28 Nov 2017 13:08:10 -0800 From: "Paul E. McKenney" To: Thomas Gleixner Cc: kernel test robot , LKML , lkp@01.org Subject: Re: [lkp-robot] [torture] b151f93a71: INFO:rcu_preempt_detected_stalls_on_CPUs/tasks Reply-To: paulmck@linux.vnet.ibm.com References: <20171126084203.GE21779@yexl-desktop> <20171127215735.GP3624@linux.vnet.ibm.com> <20171128164144.GS3624@linux.vnet.ibm.com> <20171128170759.GU3624@linux.vnet.ibm.com> <20171128173554.GV3624@linux.vnet.ibm.com> <20171128204619.GA16740@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20171128204619.GA16740@linux.vnet.ibm.com> User-Agent: Mutt/1.5.21 (2010-09-15) X-TM-AS-GCONF: 00 x-cbid: 17112821-0052-0000-0000-000002891C92 X-IBM-SpamModules-Scores: X-IBM-SpamModules-Versions: BY=3.00008121; HX=3.00000241; KW=3.00000007; PH=3.00000004; SC=3.00000242; SDB=6.00952585; UDB=6.00481237; IPR=6.00732666; BA=6.00005719; NDR=6.00000001; ZLA=6.00000005; ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000; ZU=6.00000002; MB=3.00018233; XFM=3.00000015; UTC=2017-11-28 21:08:12 X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused x-cbparentid: 17112821-0053-0000-0000-000052C1D267 Message-Id: <20171128210810.GA22726@linux.vnet.ibm.com> X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,, definitions=2017-11-28_11:,, signatures=0 X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0 priorityscore=1501 malwarescore=0 suspectscore=0 phishscore=0 bulkscore=0 spamscore=0 clxscore=1015 lowpriorityscore=0 impostorscore=0 adultscore=0 classifier=spam adjust=0 reason=mlx scancount=1 engine=8.0.1-1709140000 definitions=main-1711280283 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Nov 28, 2017 at 12:46:19PM -0800, Paul E. McKenney wrote: > On Tue, Nov 28, 2017 at 09:35:54AM -0800, Paul E. McKenney wrote: > > On Tue, Nov 28, 2017 at 06:10:08PM +0100, Thomas Gleixner wrote: > > > On Tue, 28 Nov 2017, Paul E. McKenney wrote: > > > > On Tue, Nov 28, 2017 at 05:47:35PM +0100, Thomas Gleixner wrote: > > > > diff --git a/kernel/time/timer.c b/kernel/time/timer.c > > > > index db774b0f217e..a3321bb565db 100644 > > > > --- a/kernel/time/timer.c > > > > +++ b/kernel/time/timer.c > > > > @@ -1803,7 +1803,7 @@ signed long __sched schedule_timeout(signed long timeout) > > > > idx = timer_get_idx(&timer.timer); > > > > idx_now = calc_wheel_index(j, base->clk); > > > > raw_spin_unlock_irqrestore(&base->lock, flags); > > > > - pr_info("%s: Waylayed timer idx: %u idx_now: %u\n", __func__, idx, idx_now); > > > > + pr_info("%s: Waylayed timer base->clk: %#lx jiffies: %#lx base->next_expiry: %#lx timer->flags: %#x timer->expires %#lx idx: %u idx_now: %u\n", __func__, base->clk, j, base->next_expiry, timer.timer.flags, timer.timer.expires, idx, idx_now); > > > > > > Please print idx and idx_now as hex values. It's simpler to decode that way. > > > > Here you go! Starting tests at this end, focusing on TREE01 and TREE04. > > BTW, TREE04 doesn't do any CPU hotplug, providing a counterexample to > > my long-held assumption that this only happened in the presence of CPU > > hotplug operations. > > And here is output with changes discussed on IRC. TREE04 managed to > have not one but two overlapping RCU CPU stall warnings, one for RCU-bh > and the second for RCU-sched. TREE04 and TREE04. HZ=1000. And here is the full patch, in all its lack of aesthetic appeal. Thanx, Paul ------------------------------------------------------------------------ diff --git a/kernel/time/timer.c b/kernel/time/timer.c index ffebcf878fba..fbfdc572ceaa 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -1755,8 +1755,13 @@ static void process_timeout(struct timer_list *t) */ signed long __sched schedule_timeout(signed long timeout) { + struct timer_base *base; struct process_timer timer; unsigned long expire; + unsigned long flags; + unsigned long i; + unsigned int idx, idx_now; + unsigned long j; switch (timeout) { @@ -1793,6 +1798,17 @@ signed long __sched schedule_timeout(signed long timeout) timer_setup_on_stack(&timer.timer, process_timeout, 0); __mod_timer(&timer.timer, expire, 0); schedule(); + j = jiffies; + if (timeout < 5 && time_after(j, expire + 8 * HZ) && timer_pending(&timer.timer)) { + base = lock_timer_base(&timer.timer, &flags); + idx = timer_get_idx(&timer.timer); + idx_now = calc_wheel_index(j, base->clk); + raw_spin_unlock_irqrestore(&base->lock, flags); + pr_info("%s: Waylayed timer base->clk: %#lx jiffies: %#lx base->next_expiry: %#lx timer->flags: %#x timer->expires %#lx idx: %x idx_now: %x base->pending_map ", __func__, base->clk, j, base->next_expiry, timer.timer.flags, timer.timer.expires, idx, idx_now); + for (i = 0; i < WHEEL_SIZE / sizeof(base->pending_map[0]) / 8; i++) + pr_cont("%16lx", base->pending_map[i]); + pr_cont("\n"); + } del_singleshot_timer_sync(&timer.timer); /* Remove the timer from the object tracker */