From mboxrd@z Thu Jan 1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932146AbcF3R6x (ORCPT ); Thu, 30 Jun 2016 13:58:53 -0400
Received: from mx0a-001b2d01.pphosted.com ([148.163.156.1]:16962
	"EHLO mx0a-001b2d01.pphosted.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S1752400AbcF3R6v (ORCPT );
	Thu, 30 Jun 2016 13:58:51 -0400
X-IBM-Helo: d03dlp02.boulder.ibm.com
X-IBM-MailFrom: paulmck@linux.vnet.ibm.com
Date: Thu, 30 Jun 2016 10:58:45 -0700
From: "Paul E. McKenney"
To: peterz@infradead.org, fweisbec@gmail.com, tglx@linutronix.de
Cc: linux-kernel@vger.kernel.org, rgkernel@gmail.com
Subject: [PATCH RFC] sched: Make wake_up_nohz_cpu() handle CPUs going offline
Reply-To: paulmck@linux.vnet.ibm.com
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.5.21 (2010-09-15)
X-TM-AS-MML: disable
X-Content-Scanned: Fidelis XPS MAILER
x-cbid: 16063017-0020-0000-0000-00000938D960
X-IBM-AV-DETECTION: SAVI=unused REMOTE=unused XFE=unused
x-cbparentid: 16063017-0021-0000-0000-0000534BB4D3
Message-Id: <20160630175845.GA10269@linux.vnet.ibm.com>
X-Proofpoint-Virus-Version: vendor=fsecure engine=2.50.10432:,,
	definitions=2016-06-30_08:,, signatures=0
X-Proofpoint-Spam-Details: rule=outbound_notspam policy=outbound score=0
	spamscore=0 suspectscore=0 malwarescore=0 phishscore=0 adultscore=0
	bulkscore=0 classifier=spam adjust=0 reason=mlx scancount=1
	engine=8.0.1-1604210000 definitions=main-1606300170
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Both timers and hrtimers are maintained on the outgoing CPU until
CPU_DEAD time, at which point they are migrated to a surviving CPU.
If a mod_timer() executes between CPU_DYING and CPU_DEAD time, x86
systems will splat in native_smp_send_reschedule() when attempting to
wake up the just-now-offlined CPU, as shown below from a NO_HZ_FULL
kernel:

[ 7976.741556] WARNING: CPU: 0 PID: 661 at /home/paulmck/public_git/linux-rcu/arch/x86/kernel/smp.c:125 native_smp_send_reschedule+0x39/0x40
[ 7976.741595] Modules linked in:
[ 7976.741595] CPU: 0 PID: 661 Comm: rcu_torture_rea Not tainted 4.7.0-rc2+ #1
[ 7976.741595] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Bochs 01/01/2011
[ 7976.741595]  0000000000000000 ffff88000002fcc8 ffffffff8138ab2e 0000000000000000
[ 7976.741595]  0000000000000000 ffff88000002fd08 ffffffff8105cabc 0000007d1fd0ee18
[ 7976.741595]  0000000000000001 ffff88001fd16d40 ffff88001fd0ee00 ffff88001fd0ee00
[ 7976.741595] Call Trace:
[ 7976.741595]  [] dump_stack+0x67/0x99
[ 7976.741595]  [] __warn+0xcc/0xf0
[ 7976.741595]  [] warn_slowpath_null+0x18/0x20
[ 7976.741595]  [] native_smp_send_reschedule+0x39/0x40
[ 7976.741595]  [] wake_up_nohz_cpu+0x82/0x190
[ 7976.741595]  [] internal_add_timer+0x7a/0x80
[ 7976.741595]  [] mod_timer+0x187/0x2b0
[ 7976.741595]  [] rcu_torture_reader+0x33d/0x380
[ 7976.741595]  [] ? sched_torture_read_unlock+0x30/0x30
[ 7976.741595]  [] ? rcu_bh_torture_read_lock+0x80/0x80
[ 7976.741595]  [] kthread+0xdf/0x100
[ 7976.741595]  [] ret_from_fork+0x1f/0x40
[ 7976.741595]  [] ? kthread_create_on_node+0x200/0x200

However, in this case, the wakeup is redundant, because the timer
migration will reprogram timer hardware as needed.  Note that the fact
that preemption is disabled does not avoid the splat, as the offline
operation has already passed both the synchronize_sched() and the
stop_machine() that would be blocked by disabled preemption.

This commit therefore modifies wake_up_nohz_cpu() to avoid attempting
to wake up offline CPUs.
It also adds a comment stating that the caller must tolerate lost
wakeups when the target CPU is going offline, and suggesting the
CPU_DEAD notifier as a recovery mechanism.

Signed-off-by: Paul E. McKenney
Cc: Peter Zijlstra
Cc: Frederic Weisbecker
Cc: Thomas Gleixner

---

 core.c |    7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7f2cae4620c7..08502966e7df 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -590,9 +590,14 @@ static bool wake_up_full_nohz_cpu(int cpu)
 	return false;
 }
 
+/*
+ * Wake up the specified CPU.  If the CPU is going offline, it is the
+ * caller's responsibility to deal with the lost wakeup, for example,
+ * by hooking into the CPU_DEAD notifier like timers and hrtimers do.
+ */
 void wake_up_nohz_cpu(int cpu)
 {
-	if (!wake_up_full_nohz_cpu(cpu))
+	if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu))
 		wake_up_idle_cpu(cpu);
 }
 
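
For readers who want to poke at the logic outside the kernel, here is a
rough userspace sketch of the guard added above.  It is only a model
under stated assumptions, not kernel code: the model_* arrays, the splat
counter, wake_up_nohz_cpu_unguarded(), model_reset(), and
run_model_demo() are all hypothetical, and the bodies of cpu_online(),
wake_up_full_nohz_cpu(), and wake_up_idle_cpu() are trivial stand-ins
that share only their names with the real functions.  An attempted
wakeup of an offline CPU is counted as a "splat" in place of the WARN
in native_smp_send_reschedule().

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MODEL_NR_CPUS 4

/* Hypothetical model state; none of this exists in the kernel. */
static bool model_online[MODEL_NR_CPUS];
static bool model_nohz_full[MODEL_NR_CPUS];
static int model_full_kicks[MODEL_NR_CPUS];
static int model_idle_wakeups[MODEL_NR_CPUS];
static int model_splats;

/* Stand-in for the kernel's cpu_online(). */
static bool cpu_online(int cpu)
{
	return model_online[cpu];
}

/* Stand-in: "kick" the CPU only if it is marked nohz_full. */
static bool wake_up_full_nohz_cpu(int cpu)
{
	if (model_nohz_full[cpu]) {
		model_full_kicks[cpu]++;
		return true;
	}
	return false;
}

/* Stand-in: waking an offline CPU is counted instead of WARNing. */
static void wake_up_idle_cpu(int cpu)
{
	if (!model_online[cpu]) {
		model_splats++;
		return;
	}
	model_idle_wakeups[cpu]++;
}

/* Shape of the function before the patch: no online check. */
void wake_up_nohz_cpu_unguarded(int cpu)
{
	if (!wake_up_full_nohz_cpu(cpu))
		wake_up_idle_cpu(cpu);
}

/* Shape of the function after the patch: offline CPUs are skipped. */
void wake_up_nohz_cpu(int cpu)
{
	if (cpu_online(cpu) && !wake_up_full_nohz_cpu(cpu))
		wake_up_idle_cpu(cpu);
}

/* All CPUs online, CPU 1 marked nohz_full, all counters zeroed. */
void model_reset(void)
{
	memset(model_nohz_full, 0, sizeof(model_nohz_full));
	memset(model_full_kicks, 0, sizeof(model_full_kicks));
	memset(model_idle_wakeups, 0, sizeof(model_idle_wakeups));
	for (int cpu = 0; cpu < MODEL_NR_CPUS; cpu++)
		model_online[cpu] = true;
	model_nohz_full[1] = true;
	model_splats = 0;
}

/* Take CPU 2 offline, then compare unguarded and guarded wakeups. */
int run_model_demo(void)
{
	model_reset();
	model_online[2] = false;

	wake_up_nohz_cpu_unguarded(2);	/* models the pre-patch splat */
	assert(model_splats == 1);

	wake_up_nohz_cpu(2);		/* guarded: silently skipped */
	assert(model_splats == 1);
	assert(model_idle_wakeups[2] == 0);

	wake_up_nohz_cpu(0);		/* online, not nohz_full */
	assert(model_idle_wakeups[0] == 1);

	wake_up_nohz_cpu(1);		/* online, nohz_full */
	assert(model_full_kicks[1] == 1);
	assert(model_idle_wakeups[1] == 0);

	return model_splats;
}
```

Calling run_model_demo() from a small main() exercises all four cases.
The skipped wakeup for CPU 2 is the "lost wakeup" that the new comment
asks callers to handle, which timers and hrtimers do by migrating to a
surviving CPU at CPU_DEAD time.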