From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Fri, 16 Jun 2017 10:36:58 -0700
From: "Paul E. McKenney"
To: Tejun Heo
Cc: jiangshanlai@gmail.com, linux-kernel@vger.kernel.org
Subject: Re: WARN_ON_ONCE() in process_one_work()?
Reply-To: paulmck@linux.vnet.ibm.com
References: <20170501165747.GA993@linux.vnet.ibm.com>
 <20170501183807.GA7054@linux.vnet.ibm.com>
 <20170501184402.GB8921@htj.duckdns.org>
 <20170501185819.GJ3956@linux.vnet.ibm.com>
 <20170505171159.GA10296@linux.vnet.ibm.com>
 <20170613205837.GB7359@htj.duckdns.org>
 <20170613223103.GX3721@linux.vnet.ibm.com>
 <20170614151548.GA14462@linux.vnet.ibm.com>
 <20170615153857.GA27788@linux.vnet.ibm.com>
In-Reply-To: <20170615153857.GA27788@linux.vnet.ibm.com>
Message-Id: <20170616173658.GA451@linux.vnet.ibm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Jun 15, 2017 at 08:38:57AM -0700, Paul E. McKenney wrote:
> On Wed, Jun 14, 2017 at 08:15:48AM -0700, Paul E. McKenney wrote:
> > On Tue, Jun 13, 2017 at 03:31:03PM -0700, Paul E. McKenney wrote:
> > > On Tue, Jun 13, 2017 at 04:58:37PM -0400, Tejun Heo wrote:
> > > > Hello, Paul.
> > > >
> > > > On Fri, May 05, 2017 at 10:11:59AM -0700, Paul E. McKenney wrote:
> > > > > Just following up... I have hit this bug a couple of times over the
> > > > > past few days. Anything I can do to help?
> > > >
> > > > My apologies for dropping the ball on this. I've gone over the hot
> > > > plug code in workqueue several times but can't really find how this
> > > > would happen. Can you please apply the following patch and see what
> > > > it says when the problem happens?
> > >
> > > I have fired it up, thank you!
> > >
> > > Last time I saw one failure in 21 hours of test runs, so I have kicked
> > > of 42 one-hour test runs. Will see what happens tomorrow morning,
> > > Pacific Time.
> >
> > And none of the 42 runs resulted in a workqueue splat. I will try again
> > this evening, Pacific Time.
> >
> > Who knows, maybe your diagnostic patch is the fix. ;-)
>
> And this time, we did get something! Here is the printk() output:
>
> [ 2126.863410] XXX workfn=vmstat_update pool->cpu/flags=1/0x0 curcpu=2 online=0-2,7 active=0,2,7
>
> Please see below for the full splat from dmesg.
>
> Please let me know if you need additional email. My test ID is KSIC
> 2017.06.14-15:50:08/TREE07.14, just to help me find it in my large pile
> of test results. ;-)

And no test failures from yesterday evening. So it looks like we get
somewhere on the order of one failure per 138 hours of TREE07 rcutorture
runtime with your printk() in the mix.
Was the above output from your printk() of any help?

							Thanx, Paul