From mboxrd@z Thu Jan 1 00:00:00 1970
Date: Thu, 1 Nov 2018 16:21:34 -0700
From: "Paul E. McKenney"
To: a.p.zijlstra@chello.nl, tglx@linutronix.de, bigeasy@linutronix.de
Cc: linux-rt-users@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: rcu: Frob softirq test
Reply-To: paulmck@linux.ibm.com
Message-Id: <20181101232134.GA11875@linux.ibm.com>
User-Agent: Mutt/1.5.21 (2010-09-15)
X-Mailing-List: linux-kernel@vger.kernel.org

> With RT_FULL we get the below wreckage:

The code that this applies to has itself been fully frobbed as of the
current merge window.  I believe that it should now work in -rt as is,
but who knows?
;-)

							Thanx, Paul

> [ 126.060484] =======================================================
> [ 126.060486] [ INFO: possible circular locking dependency detected ]
> [ 126.060489] 3.0.1-rt10+ #30
> [ 126.060490] -------------------------------------------------------
> [ 126.060492] irq/24-eth0/1235 is trying to acquire lock:
> [ 126.060495]  (&(lock)->wait_lock#2){+.+...}, at: [] rt_mutex_slowunlock+0x16/0x55
> [ 126.060503]
> [ 126.060504] but task is already holding lock:
> [ 126.060506]  (&p->pi_lock){-...-.}, at: [] try_to_wake_up+0x35/0x429
> [ 126.060511]
> [ 126.060511] which lock already depends on the new lock.
> [ 126.060513]
> [ 126.060514]
> [ 126.060514] the existing dependency chain (in reverse order) is:
> [ 126.060516]
> [ 126.060516] -> #1 (&p->pi_lock){-...-.}:
> [ 126.060519]        [] lock_acquire+0x145/0x18a
> [ 126.060524]        [] _raw_spin_lock_irqsave+0x4b/0x85
> [ 126.060527]        [] task_blocks_on_rt_mutex+0x36/0x20f
> [ 126.060531]        [] rt_mutex_slowlock+0xd1/0x15a
> [ 126.060534]        [] rt_mutex_lock+0x2d/0x2f
> [ 126.060537]        [] rcu_boost+0xad/0xde
> [ 126.060541]        [] rcu_boost_kthread+0x7d/0x9b
> [ 126.060544]        [] kthread+0x99/0xa1
> [ 126.060547]        [] kernel_thread_helper+0x4/0x10
> [ 126.060551]
> [ 126.060552] -> #0 (&(lock)->wait_lock#2){+.+...}:
> [ 126.060555]        [] __lock_acquire+0x1157/0x1816
> [ 126.060558]        [] lock_acquire+0x145/0x18a
> [ 126.060561]        [] _raw_spin_lock+0x40/0x73
> [ 126.060564]        [] rt_mutex_slowunlock+0x16/0x55
> [ 126.060566]        [] rt_mutex_unlock+0x27/0x29
> [ 126.060569]        [] rcu_read_unlock_special+0x17e/0x1c4
> [ 126.060573]        [] __rcu_read_unlock+0x48/0x89
> [ 126.060576]        [] select_task_rq_rt+0xc7/0xd5
> [ 126.060580]        [] try_to_wake_up+0x175/0x429
> [ 126.060583]        [] wake_up_process+0x15/0x17
> [ 126.060585]        [] wakeup_softirqd+0x24/0x26
> [ 126.060590]        [] irq_exit+0x49/0x55
> [ 126.060593]        [] smp_apic_timer_interrupt+0x8a/0x98
> [ 126.060597]        [] apic_timer_interrupt+0x13/0x20
> [ 126.060600]        [] irq_forced_thread_fn+0x1b/0x44
> [ 126.060603]        [] 
irq_thread+0xde/0x1af
> [ 126.060606]        [] kthread+0x99/0xa1
> [ 126.060608]        [] kernel_thread_helper+0x4/0x10
> [ 126.060611]
> [ 126.060612] other info that might help us debug this:
> [ 126.060614]
> [ 126.060615]  Possible unsafe locking scenario:
> [ 126.060616]
> [ 126.060617]        CPU0                    CPU1
> [ 126.060619]        ----                    ----
> [ 126.060620]   lock(&p->pi_lock);
> [ 126.060623]                          lock(&(lock)->wait_lock);
> [ 126.060625]                          lock(&p->pi_lock);
> [ 126.060627]   lock(&(lock)->wait_lock);
> [ 126.060629]
> [ 126.060629]  *** DEADLOCK ***
> [ 126.060630]
> [ 126.060632] 1 lock held by irq/24-eth0/1235:
> [ 126.060633]  #0:  (&p->pi_lock){-...-.}, at: [] try_to_wake_up+0x35/0x429
> [ 126.060638]
> [ 126.060638] stack backtrace:
> [ 126.060641] Pid: 1235, comm: irq/24-eth0 Not tainted 3.0.1-rt10+ #30
> [ 126.060643] Call Trace:
> [ 126.060644]  [] print_circular_bug+0x289/0x29a
> [ 126.060651]  [] __lock_acquire+0x1157/0x1816
> [ 126.060655]  [] ? trace_hardirqs_off_caller+0x1f/0x99
> [ 126.060658]  [] ? rt_mutex_slowunlock+0x16/0x55
> [ 126.060661]  [] lock_acquire+0x145/0x18a
> [ 126.060664]  [] ? rt_mutex_slowunlock+0x16/0x55
> [ 126.060668]  [] _raw_spin_lock+0x40/0x73
> [ 126.060671]  [] ? rt_mutex_slowunlock+0x16/0x55
> [ 126.060674]  [] ? rcu_report_qs_rsp+0x87/0x8c
> [ 126.060677]  [] rt_mutex_slowunlock+0x16/0x55
> [ 126.060680]  [] ? rcu_read_unlock_special+0x9b/0x1c4
> [ 126.060683]  [] rt_mutex_unlock+0x27/0x29
> [ 126.060687]  [] rcu_read_unlock_special+0x17e/0x1c4
> [ 126.060690]  [] __rcu_read_unlock+0x48/0x89
> [ 126.060693]  [] select_task_rq_rt+0xc7/0xd5
> [ 126.060696]  [] ? select_task_rq_rt+0x27/0xd5
> [ 126.060701]  [] ? clockevents_program_event+0x8e/0x90
> [ 126.060704]  [] try_to_wake_up+0x175/0x429
> [ 126.060708]  [] ? 
tick_program_event+0x1f/0x21
> [ 126.060711]  [] wake_up_process+0x15/0x17
> [ 126.060715]  [] wakeup_softirqd+0x24/0x26
> [ 126.060718]  [] irq_exit+0x49/0x55
> [ 126.060721]  [] smp_apic_timer_interrupt+0x8a/0x98
> [ 126.060724]  [] apic_timer_interrupt+0x13/0x20
> [ 126.060726]  [] ? migrate_disable+0x75/0x12d
> [ 126.060733]  [] ? local_bh_disable+0xe/0x1f
> [ 126.060736]  [] ? local_bh_disable+0x1d/0x1f
> [ 126.060739]  [] irq_forced_thread_fn+0x1b/0x44
> [ 126.060742]  [] ? _raw_spin_unlock_irq+0x3b/0x59
> [ 126.060745]  [] irq_thread+0xde/0x1af
> [ 126.060748]  [] ? irq_thread_fn+0x3a/0x3a
> [ 126.060751]  [] ? irq_finalize_oneshot+0xd1/0xd1
> [ 126.060754]  [] ? irq_finalize_oneshot+0xd1/0xd1
> [ 126.060757]  [] kthread+0x99/0xa1
> [ 126.060761]  [] kernel_thread_helper+0x4/0x10
> [ 126.060764]  [] ? finish_task_switch+0x87/0x10a
> [ 126.060768]  [] ? retint_restore_args+0xe/0xe
> [ 126.060771]  [] ? __init_kthread_worker+0x8c/0x8c
> [ 126.060774]  [] ? gs_change+0xb/0xb
>
> Because irq_exit() does:
>
> void irq_exit(void)
> {
> 	account_system_vtime(current);
> 	trace_hardirq_exit();
> 	sub_preempt_count(IRQ_EXIT_OFFSET);
> 	if (!in_interrupt() && local_softirq_pending())
> 		invoke_softirq();
>
> 	...
> }
>
> This triggers a wakeup, which uses RCU.  Now, if the interrupted task
> has t->rcu_read_unlock_special set, the RCU usage from the wakeup will
> end up in rcu_read_unlock_special().  rcu_read_unlock_special() will
> test for in_irq(), which will fail because we just decremented
> preempt_count with IRQ_EXIT_OFFSET, and for in_serving_softirq(),
> which for PREEMPT_RT_FULL reads:
>
> int in_serving_softirq(void)
> {
> 	int res;
>
> 	preempt_disable();
> 	res = __get_cpu_var(local_softirq_runner) == current;
> 	preempt_enable();
> 	return res;
> }
>
> which will thus also fail, resulting in the above wreckage.
>
> The 'somewhat' ugly solution is to open-code the preempt_count() test
> in rcu_read_unlock_special().
>
> Also, we're not at all sure how ->rcu_read_unlock_special gets set
> here... 
so this is very likely a bandaid and more thought is required.
>
> Cc: Paul E. McKenney
> Signed-off-by: Peter Zijlstra
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 564e3927e7b0..429a2f144e19 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -524,7 +524,7 @@ static void rcu_read_unlock_special(struct task_struct *t)
> 	}
>
> 	/* Hardware IRQ handlers cannot block, complain if they get here. */
> -	if (in_irq() || in_serving_softirq()) {
> +	if (preempt_count() & (HARDIRQ_MASK | SOFTIRQ_OFFSET)) {
> 		lockdep_rcu_suspicious(__FILE__, __LINE__,
> 			"rcu_read_unlock() from irq or softirq with blocking in critical section!!!\n");
> 		pr_alert("->rcu_read_unlock_special: %#x (b: %d, enq: %d nq: %d)\n",