From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1756020Ab2ARMia (ORCPT ); Wed, 18 Jan 2012 07:38:30 -0500
Received: from e23smtp07.au.ibm.com ([202.81.31.140]:53678 "EHLO
	e23smtp07.au.ibm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1752939Ab2ARMi3 (ORCPT ); Wed, 18 Jan 2012 07:38:29 -0500
Message-ID: <4F16BD39.6070201@linux.vnet.ibm.com>
Date: Wed, 18 Jan 2012 18:08:17 +0530
From: "Srivatsa S. Bhat"
User-Agent: Mozilla/5.0 (X11; Linux i686; rv:7.0) Gecko/20110927 Thunderbird/7.0
MIME-Version: 1.0
To: Sergey Senozhatsky
CC: Ingo Molnar, Peter Zijlstra, Paul Turner, Suresh Siddha,
	Mike Galbraith, linux-kernel@vger.kernel.org, Marcos Souza
Subject: Re: [PATCH] Sched fair: check that ilb cpu is online during nohz_balancer_kick()
References: <20120118121049.GA4258@swordfish.minsk.epam.com>
In-Reply-To: <20120118121049.GA4258@swordfish.minsk.epam.com>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
x-cbid: 12011802-0260-0000-0000-000000657786
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/18/2012 05:40 PM, Sergey Senozhatsky wrote:

> Sched fair: check that ilb cpu is online during nohz_balancer_kick()
>
> find_new_ilb() may return offlined cpu if trigger_load_balance() occurs while machine
> suspending or resuming, hitting native_smp_send_reschedule() assertion failure:
>
> [ 108.473465] Disabling non-boot CPUs ...
> [ 108.477308] CPU 1 is now offline
> [ 108.477497] ------------[ cut here ]------------
> [ 108.477523] WARNING: at arch/x86/kernel/smp.c:120 native_smp_send_reschedule+0x25/0x56()
> [ 108.477724] Call Trace:
> [ 108.477736] [] warn_slowpath_common+0x7e/0x96
> [ 108.477772] [] warn_slowpath_null+0x15/0x17
> [ 108.477795] [] native_smp_send_reschedule+0x25/0x56
> [ 108.477823] [] trigger_load_balance+0x6ac/0x72e
> [ 108.477847] [] ? trigger_load_balance+0x2ab/0x72e
> [ 108.477874] [] scheduler_tick+0xe2/0xeb
> [ 108.477899] [] update_process_times+0x60/0x70
> [ 108.477926] [] tick_sched_timer+0x6d/0x96
> [ 108.477951] [] __run_hrtimer+0x1c2/0x3a1
> [ 108.477974] [] ? tick_nohz_handler+0xdf/0xdf
> [ 108.477999] [] hrtimer_interrupt+0xe6/0x1b0
> [ 108.478023] [] smp_apic_timer_interrupt+0x80/0x93
> [ 108.478051] [] apic_timer_interrupt+0x73/0x80
> [ 108.478072] [] ? slab_cpuup_callback+0xa8/0xdb
> [ 108.478108] [] notifier_call_chain+0x86/0xb3
> [ 108.478133] [] ? spp_getpage+0x5f/0x5f
> [ 108.478157] [] __raw_notifier_call_chain+0x9/0xb
> [ 108.478182] [] __cpu_notify+0x1b/0x2d
> [ 108.478204] [] cpu_notify_nofail+0xe/0x16
> [ 108.478227] [] _cpu_down+0x130/0x249
> [ 108.478249] [] ? printk+0x4c/0x4e
> [ 108.478271] [] disable_nonboot_cpus+0x5a/0xfc
> [ 108.478297] [] suspend_devices_and_enter+0x19a/0x407
> [ 108.478323] [] enter_state+0x124/0x169
> [ 108.478346] [] state_store+0xb7/0x101
> [ 108.478373] [] kobj_attr_store+0x17/0x19
> [ 108.478399] [] sysfs_write_file+0x103/0x13f
> [ 108.478425] [] vfs_write+0xad/0x13d
> [ 108.478447] [] sys_write+0x45/0x6c
> [ 108.478469] [] system_call_fastpath+0x16/0x1b
> [ 108.478492] ---[ end trace 991823fa9b0a0b79 ]---
>
> Check that returned by find_new_ilb() cpu is online.
>
> Signed-off-by: Sergey Senozhatsky
>
> ---
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 84adb2d..070b8e0 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -4826,7 +4826,7 @@ unlock:
>  	rcu_read_unlock();
>
>  out_done:
> -	if (ilb < nr_cpu_ids && idle_cpu(ilb))
> +	if (likely(cpu_online(ilb)) && ilb < nr_cpu_ids && idle_cpu(ilb))
>  		return ilb;
>
>  	return nr_cpu_ids;
>

You can do even better than that by doing:

	if (likely(cpu_active(ilb)) && ilb < nr_cpu_ids && idle_cpu(ilb))

This would prevent us from sending IPIs to CPUs that are about to go
offline as well (apart from those that are already offline).
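
To make the difference between the two checks concrete, here is a small
userspace model. It is only a sketch, not kernel code: struct cpu_state,
test_bit_model(), ilb_if_online() and ilb_if_active() are hypothetical
stand-ins for the kernel's cpumask helpers, and NR_CPU_IDS is an assumed
CPU count. It assumes that a CPU being taken down is still marked online
but is no longer marked active. The model also tests the range first so
that the mask lookup never sees an out-of-range index; that ordering is a
choice made for this sketch, not something either patch above does.

```c
/*
 * Userspace model only: the masks and helpers below are hypothetical
 * stand-ins for the kernel's cpumask API, not the real implementation.
 */
#include <stdbool.h>

#define NR_CPU_IDS 4u	/* assumed CPU count for this model */

struct cpu_state {
	unsigned int online_mask;	/* bit n set: CPU n is online */
	unsigned int active_mask;	/* bit n set: CPU n is online and not going down */
	unsigned int idle_mask;		/* bit n set: CPU n is idle */
};

/* Caller must ensure cpu < NR_CPU_IDS so the shift stays in range. */
static bool test_bit_model(unsigned int mask, unsigned int cpu)
{
	return (mask >> cpu) & 1u;
}

/* Models the cpu_online() variant: accept any in-range, online, idle CPU. */
static unsigned int ilb_if_online(const struct cpu_state *s, unsigned int ilb)
{
	if (ilb < NR_CPU_IDS && test_bit_model(s->online_mask, ilb) &&
	    test_bit_model(s->idle_mask, ilb))
		return ilb;

	return NR_CPU_IDS;
}

/* Models the cpu_active() variant: also reject CPUs that are going down. */
static unsigned int ilb_if_active(const struct cpu_state *s, unsigned int ilb)
{
	if (ilb < NR_CPU_IDS && test_bit_model(s->active_mask, ilb) &&
	    test_bit_model(s->idle_mask, ilb))
		return ilb;

	return NR_CPU_IDS;
}
```

With CPUs 0 and 1 online and idle, and CPU 1 already cleared from the
active mask, ilb_if_online() still hands back CPU 1 while ilb_if_active()
returns NR_CPU_IDS, which is the "about to go offline" case described
above.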

However, I would prefer an approach where we fix the nohz.idle_cpus_mask
so that it doesn't contain entries corresponding to offline CPUs. This
would not only fix the root cause of the problem but would also make
find_new_ilb() return useful values more often (that is, values other
than nr_cpu_ids).

Suresh has posted a patch in that direction here:
http://thread.gmane.org/gmane.linux.kernel/1237745/focus=1240001
(That patch didn't help, though...)

Note also that this warning was introduced in the 3.3 merge window; we
didn't hit it in 3.2. So it would be good to fix the root cause, provided
it is worth the effort (considering both the additional code complexity
and the coding effort needed).

Regards,
Srivatsa S. Bhat
IBM Linux Technology Center