Date: Fri, 30 Jun 2017 05:52:17 -0700
From: "Paul E. McKenney"
To: Will Deacon
Cc: kernel test robot, Oleg Nesterov, Andrew Morton, Peter Zijlstra,
	Alan Stern, Andrea Parri, Linus Torvalds, LKML, lkp@01.org
Subject: Re: [task_work] 46a4746d9a: inconsistent{IN-HARDIRQ-W}->{HARDIRQ-ON-W}usage
Reply-To: paulmck@linux.vnet.ibm.com
References: <20170630061920.GA61856@inn.lkp.intel.com> <20170630084558.GA9726@arm.com>
In-Reply-To: <20170630084558.GA9726@arm.com>
Message-Id: <20170630125217.GV2393@linux.vnet.ibm.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jun 30, 2017 at 09:45:58AM +0100, Will Deacon wrote:
> On Fri, Jun 30, 2017 at 02:19:20PM +0800, kernel test robot wrote:
> >
> > FYI, we noticed the following commit:
> >
> > commit: 46a4746d9a364a9b0267c19be0f8419e9b72ad37 ("task_work: Replace spin_unlock_wait() with lock/unlock pair")
> > https://git.kernel.org/cgit/linux/kernel/git/paulmck/linux-rcu.git spin_unlock_wait_no.2017.06.29c
> >
> > in testcase: boot
> >
> > on test machine: qemu-system-x86_64 -enable-kvm -cpu host -smp 2 -m 1G
> >
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >
> >
> > +-------------------------------------------------+------------+------------+
> > |                                                 | ee4c0fbd46 | 46a4746d9a |
> > +-------------------------------------------------+------------+------------+
> > | boot_successes                                  | 6          | 0          |
> > | boot_failures                                   | 0          | 10         |
> > | inconsistent{IN-HARDIRQ-W}->{HARDIRQ-ON-W}usage | 0          | 8          |
> > | inconsistent{IN-SOFTIRQ-W}->{SOFTIRQ-ON-W}usage | 0          | 2          |
> > +-------------------------------------------------+------------+------------+
> >
> >
> >
> > [    4.784726] WARNING: inconsistent lock state
> > [    4.785206] 4.12.0-rc4-00090-g46a4746 #86 Not tainted
> > [    4.785733] --------------------------------
> > [    4.786203] inconsistent {IN-HARDIRQ-W} -> {HARDIRQ-ON-W} usage.
> > [    4.786815] modprobe/143 [HC0[0]:SC0[0]:HE1:SE1] takes:
> > [    4.787377]  (&p->pi_lock){?.-.-.}, at: [] task_work_run+0x6e/0xa8
> > [    4.788202] {IN-HARDIRQ-W} state was registered at:
> > [    4.788711]   __lock_acquire+0x3a9/0xed4
> > [    4.789151]   lock_acquire+0x125/0x1be
> > [    4.789571]   _raw_spin_lock_irqsave+0x49/0x84
> > [    4.790048]   try_to_wake_up+0x35/0x25b
>
> D'oh... so that's another difference between spin_unlock_wait and spin_lock;
> spin_unlock. The former doesn't care about being interrupted, since there's
> no scope for deadlock when you're not actually taking the lock.
>
> So the easy fix here is to use the irqsave/irqrestore variants in
> task_work_run, but it does mean we need to be a little bit careful when
> doing the conversion.

Indeed, very stupid mistake on my part.  Hurray for 0day Test Robot!  ;-)

I will recheck the others.

							Thanx, Paul