From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754159Ab2GRWJc (ORCPT ); Wed, 18 Jul 2012 18:09:32 -0400 Received: from mail-pb0-f46.google.com ([209.85.160.46]:52476 "EHLO mail-pb0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751434Ab2GRWJ3 (ORCPT ); Wed, 18 Jul 2012 18:09:29 -0400 Date: Wed, 18 Jul 2012 15:09:09 -0700 From: Mandeep Singh Baines To: "Srivatsa S. Bhat" Cc: mandeep.baines@gmail.com, Thomas Gleixner , Ingo Molnar , "H. Peter Anvin" , linux-kernel@vger.kernel.org, Mandeep Singh Baines , x86@kernel.org, Tejun Heo , Andrew Morton , Stephen Rothwell , Christoph Lameter , Olof Johansson Subject: Re: [PATCH] x86, mm: only wait for flushes from online cpus Message-ID: <20120718220909.GF3465@google.com> References: <1340229784-3686-1-git-send-email-msb@chromium.org> <5007252E.4070608@linux.vnet.ibm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <5007252E.4070608@linux.vnet.ibm.com> X-Operating-System: Linux/2.6.38.8-gg868 (x86_64) User-Agent: Mutt/1.5.20 (2009-06-14) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Srivatsa S. Bhat (srivatsa.bhat@linux.vnet.ibm.com) wrote: > On 06/21/2012 03:33 AM, mandeep.baines@gmail.com wrote: > > From: Mandeep Singh Baines > > > > A cpu in the mm_cpumask could go offline before we send the > > invalidate IPI causing us to wait forever. > > > > Bug Trace: > > > > <4>[10222.234548] WARNING: at ../../arch/x86/kernel/apic/ipi.c:113 default_send_IPI_mask_logical+0x58/0x73() > > <5>[10222.234633] Pid: 23605, comm: lmt-udev Tainted: G C 3.2.7 #1 > > <5>[10222.234639] Call Trace: > > <5>[10222.234651] [<8102e666>] warn_slowpath_common+0x68/0x7d > > <5>[10222.234661] [<81016c36>] ? default_send_IPI_mask_logical+0x58/0x73 > > <5>[10222.234670] [<8102e68f>] warn_slowpath_null+0x14/0x18 > > <5>[10222.234678] [<81016c36>] default_send_IPI_mask_logical+0x58/0x73 > > <5>[10222.234687] [<8101eec2>] flush_tlb_others_ipi+0x86/0xba > > <5>[10222.234696] [<8101f0bb>] flush_tlb_mm+0x5e/0x62 > > <5>[10222.234703] [<8101e36c>] pud_populate+0x2c/0x31 > > <5>[10222.234711] [<8101e409>] pgd_alloc+0x98/0xc7 > > <5>[10222.234719] [<8102c881>] mm_init.isra.38+0xcc/0xf3 > > <5>[10222.234727] [<8102cbc2>] dup_mm+0x68/0x34e > > <5>[10222.234736] [<8139bbae>] ? _cond_resched+0xd/0x21 > > <5>[10222.234745] [<810a5b7c>] ? kmem_cache_alloc+0x26/0xe2 > > <5>[10222.234753] [<8102d421>] ? copy_process+0x556/0xda6 > > <5>[10222.234761] [<8102d641>] copy_process+0x776/0xda6 > > <5>[10222.234770] [<8102dd5e>] do_fork+0xcb/0x1d4 > > <5>[10222.234778] [<810a8c96>] ? do_sync_write+0xd3/0xd3 > > <5>[10222.234786] [<810a94ab>] ? vfs_read+0x95/0xa2 > > <5>[10222.234795] [<81008850>] sys_clone+0x20/0x25 > > <5>[10222.234804] [<8139d8c5>] ptregs_clone+0x15/0x30 > > <5>[10222.234812] [<8139d7f7>] ? sysenter_do_call+0x12/0x26 > > <4>[10222.234818] ---[ end trace 31e095600f50fd48 ]--- > > <3>[10234.880183] BUG: soft lockup - CPU#0 stuck for 11s! [lmt-udev:23605] > > > > Addresses http://crosbug.com/31737 > > > > Signed-off-by: Mandeep Singh Baines > > Cc: Thomas Gleixner > > Cc: Ingo Molnar > > Cc: "H. Peter Anvin" > > Cc: x86@kernel.org > > Cc: Tejun Heo > > Cc: Andrew Morton > > Cc: Stephen Rothwell > > Cc: Christoph Lameter > > Cc: Olof Johansson > > --- > > arch/x86/mm/tlb.c | 7 ++++++- > > 1 files changed, 6 insertions(+), 1 deletions(-) > > > > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c > > index 5e57e11..010090d 100644 > > --- a/arch/x86/mm/tlb.c > > +++ b/arch/x86/mm/tlb.c > > @@ -194,8 +194,13 @@ static void flush_tlb_others_ipi(const struct cpumask *cpumask, > > apic->send_IPI_mask(to_cpumask(f->flush_cpumask), > > INVALIDATE_TLB_VECTOR_START + sender); > > > > This function is always called with preempt_disabled() right? > In that case, _while_ this function is running, a CPU cannot go offline > because of stop_machine(). (I understand that it might go offline in between > calculating that cpumask and calling preempt_disable() - which is the race > you are trying to handle). > Ah. Good point. A cpu cannot be remove from the cpu_online_mask while preemption is disabled because stop_machine() can't run until preemption is enabled. ./kernel/cpu.c: err = __stop_machine(take_cpu_down, &tcd_param, cpumask_of(cpu)); > So, why not take the offline cpus out of the way even before sending that IPI? > That way, we need not modify the while loop below. > > > - while (!cpumask_empty(to_cpumask(f->flush_cpumask))) > > + while (!cpumask_empty(to_cpumask(f->flush_cpumask))) { > > + /* Only wait for online cpus */ > > + cpumask_and(to_cpumask(f->flush_cpumask), > > + to_cpumask(f->flush_cpumask), > > + cpu_online_mask); > > cpu_relax(); > > + } > > } > > > > f->flush_mm = NULL; > > > > That is, how about something like this: > Acked-off-by: Mandeep Singh Baines Do you mind re-sending you're patch with a proper sign-off. Thanks and regards, Mandeep > diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c > index 5e57e11..9d387a9 100644 > --- a/arch/x86/mm/tlb.c > +++ b/arch/x86/mm/tlb.c > @@ -186,7 +186,11 @@ static void flush_tlb_others_ipi(const struct cpumask *cpumask, > > f->flush_mm = mm; > f->flush_va = va; > - if (cpumask_andnot(to_cpumask(f->flush_cpumask), cpumask, cpumask_of(smp_processor_id()))) { > + > + cpumask_and(to_cpumask(f->flush_cpumask), cpumask, cpu_online_mask); > + cpumask_clear_cpu(smp_processor_id(), to_cpumask(f->flush_cpumask)); > + > + if (!cpumask_empty(to_cpumask(f->flush_cpumask))) { > /* > * We have to send the IPI only to > * CPUs affected. > > > > Regards, > Srivatsa S. Bhat > IBM Linux Technology Center >