From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932765AbdIRQvP (ORCPT ); Mon, 18 Sep 2017 12:51:15 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:42358
	"EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S932340AbdIRQvO (ORCPT );
	Mon, 18 Sep 2017 12:51:14 -0400
Date: Mon, 18 Sep 2017 09:51:10 -0700
From: "Paul E. McKenney"
To: peterz@infradead.org, mingo@redhat.com
Cc: linux-kernel@vger.kernel.org, bigeasy@linutronix.de, tglx@linutronix.de
Subject: native_smp_send_reschedule() splat from rt_mutex_lock()?
Reply-To: paulmck@linux.vnet.ibm.com
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-GCONF: 00
x-cbid: 17091816-0044-0000-0000-000003919576
X-IBM-SpamModules-Scores:
X-IBM-SpamModules-Versions: BY=3.00007757; HX=3.00000241; KW=3.00000007;
	PH=3.00000004; SC=3.00000230; SDB=6.00918871; UDB=6.00461614;
	IPR=6.00699098; BA=6.00005595; NDR=6.00000001; ZLA=6.00000005;
	ZF=6.00000009; ZB=6.00000000; ZP=6.00000000; ZH=6.00000000;
	ZU=6.00000002; MB=3.00017196; XFM=3.00000015;
	UTC=2017-09-18 16:51:11
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 17091816-0045-0000-0000-000007C09774
Message-Id: <20170918165110.GA9975@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,,
	definitions=2017-09-18_06:,, signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
	spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0
	bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1
	engine=8.0.1-1707230000 definitions=main-1709180240
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Hello!

Just moved ahead to v4.14-rc1, and I am seeing a native_smp_send_reschedule()
splat from rt_mutex_lock():

[11072.586518] sched: Unexpected reschedule of offline CPU#6!
[11072.587578] ------------[ cut here ]------------
[11072.588563] WARNING: CPU: 0 PID: 59 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:128 native_smp_send_reschedule+0x37/0x40
[11072.591543] Modules linked in:
[11072.591543] CPU: 0 PID: 59 Comm: rcub/10 Not tainted 4.14.0-rc1+ #1
[11072.592572] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Ubuntu-1.8.2-1ubuntu1 04/01/2014
[11072.594602] task: ffff9928de772640 task.stack: ffff9f580031c000
[11072.596655] RIP: 0010:native_smp_send_reschedule+0x37/0x40
[11072.597599] RSP: 0000:ffff9f580031fd10 EFLAGS: 00010082
[11072.598572] RAX: 000000000000002e RBX: ffff9928dd3fd940 RCX: 0000000000000004
[11072.599693] RDX: 0000000080000004 RSI: 0000000000000086 RDI: 00000000ffffffff
[11072.601602] RBP: ffff9f580031fd10 R08: 000000000008f316 R09: 0000000000007e52
[11072.603563] R10: 0000000000000001 R11: ffffffffb957c2cd R12: 0000000000000006
[11072.604610] R13: ffff9928de772640 R14: 0000000000000061 R15: ffff9928deb991c0
[11072.606537] FS: 0000000000000000(0000) GS:ffff9928dea00000(0000) knlGS:0000000000000000
[11072.607654] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11072.608646] CR2: 0000000009728b40 CR3: 000000001b640000 CR4: 00000000000006f0
[11072.610596] Call Trace:
[11072.611531] resched_curr+0x61/0xd0
[11072.611531] switched_to_rt+0x8f/0xa0
[11072.612647] rt_mutex_setprio+0x25c/0x410
[11072.613591] task_blocks_on_rt_mutex+0x1b3/0x1f0
[11072.614601] rt_mutex_slowlock+0xa9/0x1e0
[11072.615567] rt_mutex_lock+0x29/0x30
[11072.615567] rcu_boost_kthread+0x127/0x3c0
[11072.616618] kthread+0x104/0x140
[11072.617641] ? rcu_report_unblock_qs_rnp+0x90/0x90
[11072.618565] ? kthread_create_on_node+0x40/0x40
[11072.619509] ret_from_fork+0x22/0x30
[11072.620593] Code: f0 00 0f 92 c0 84 c0 74 14 48 8b 05 84 67 c5 00 be fd 00 00 00 ff 90 a0 00 00 00 5d c3 89 fe 48 c7 c7 70 c4 fc b7 e8 05 b3 06 00 <0f> ff 5d c3 0f 1f 44 00 00 8b 05 f2 d4 13 02 85 c0 75 38 55 48

In theory, I could work around this by excluding CPU-hotplug operations
while doing RCU priority boosting, but in practice I am very much hoping
that there is a more reasonable solution out there...

							Thanx, Paul
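
A rough user-space sketch of the "exclude CPU-hotplug operations while boosting" workaround mentioned above, purely as an illustration. Everything in it is hypothetical: the `*_model` names, the reader counter, and the eight-CPU array are stand-ins invented for this toy, not kernel symbols, and the offline attempt is simply refused here where a real exclusion primitive would presumably block instead. Whether such exclusion would actually address the splat in the kernel is not established by this message.

```c
#include <stdbool.h>
#include <stdio.h>

#define NR_CPUS_MODEL 8	/* hypothetical CPU count for the toy model */

/* Hypothetical stand-ins for kernel state; all CPUs start online. */
static bool cpu_online_model[NR_CPUS_MODEL] = {
	true, true, true, true, true, true, true, true
};
static int hotplug_readers;		/* boost operations currently in flight */
static int offline_resched_warnings;	/* times the modeled warning fired */

/* Enter and leave a region during which no CPU may change state. */
static void hotplug_exclude_begin(void)
{
	hotplug_readers++;
}

static void hotplug_exclude_end(void)
{
	hotplug_readers--;
}

/*
 * Try to take a CPU offline.  The toy refuses (returns false) while any
 * boost is in flight; a real lock would make the caller wait instead.
 */
static bool cpu_offline_model(int cpu)
{
	if (hotplug_readers > 0)
		return false;
	cpu_online_model[cpu] = false;
	return true;
}

/*
 * Modeled reschedule request: complain and give up if the target CPU is
 * offline, mirroring the message seen in the log above.
 */
static bool send_reschedule_model(int cpu)
{
	if (!cpu_online_model[cpu]) {
		fprintf(stderr,
			"sched: Unexpected reschedule of offline CPU#%d!\n",
			cpu);
		offline_resched_warnings++;
		return false;
	}
	return true;	/* a kernel would send the reschedule IPI here */
}
```

In this toy, bracketing the boost with `hotplug_exclude_begin()` and `hotplug_exclude_end()` means `cpu_offline_model(6)` cannot succeed in between, so a `send_reschedule_model(6)` issued inside the bracket never sees an offline target. Without the bracket, the CPU can go offline first and the modeled warning fires.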