From: "H. Peter Anvin" <hpa@kernel.org>
To: Dave Jones <davej@redhat.com>,
Vegard Nossum <vegard.nossum@gmail.com>,
Andi Kleen <andi@firstfloor.org>,
the arch/x86 maintainers <x86@kernel.org>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Cc: Rusty Russell <rusty@rustcorp.com.au>
Subject: Re: latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0()
Date: Thu, 21 Aug 2008 19:13:12 -0700 [thread overview]
Message-ID: <48AE20B8.9000204@kernel.org> (raw)
In-Reply-To: <20080822003659.GA7581@redhat.com>
[-- Attachment #1: Type: text/plain, Size: 1957 bytes --]
Dave Jones wrote:
> >
> > Hm. What you say is true, but this one in particular has nothing to do
> > with oprofile! It has something to do with reading /dev/cpu/*/msr
> > while hot-unplugging cpu1:
> >
> > [<c011733e>] msr_read+0x6e/0xa0
> > [<c01a87b4>] vfs_read+0x94/0x130
> >
> > I wasn't using oprofile when this happened. So I think it should also
> > be considered a separate issue. Though yes -- CPU hotplug in general
> > tends to break a lot of things.
>
> From my reading of the msr code, we check that the cpu is online in ->open,
> but we never check it again, and also, we make no guarantees that it
> won't go away before we ->read or even ->close it.
>
> Would adding a get_cpu/put_cpu across the open/close solve this?
> Peter?
>
A get_cpu/put_cpu across the whole open..close sequence would seem to
be, ahem, rude, since userspace could hold it for an arbitrary amount of
time (plus, there is no guarantee that they are invoked on the same CPU.)
The cpuid driver has the same problem, obviously.
get_online_cpus() and put_online_cpus() around the call to
{rd,wr}msr_safe_on_cpu() should work; and the CPU hotplug documentation
seems to claim that we can just disable preemption around those calls,
which is exactly what get_cpu()..put_cpu() does, so I guess
get_cpu()..put_cpu() here is fine. Now, the big question is: should
this really be done in the MSR/CPUID drivers, or should it be done in
smp_call_function_single(), which is the generic code invoked by this?
It seems to be that doing it in smp_call_function_single() would be more
correct as it's already protected by get_cpu()..put_cpu() and a
cpu_online() test in there should not be expensive in comparison to the
whole rest of the code.
You may want to see if this patch fixes the problem; it does *NOT* have
the correct error behaviour (some of the intervening layers don't
propagate errors), but it should make the fault go away.
-hpa
[-- Attachment #2: smp-unplug-fault.patch --]
[-- Type: text/x-patch, Size: 1157 bytes --]
diff --git a/kernel/smp.c b/kernel/smp.c
index 782e2b9..f362a85 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -210,8 +210,10 @@ int smp_call_function_single(int cpu, void (*func) (void *info), void *info,
{
struct call_single_data d;
unsigned long flags;
- /* prevent preemption and reschedule on another processor */
+ /* prevent preemption and reschedule on another processor,
+ as well as CPU removal */
int me = get_cpu();
+ int err = 0;
/* Can deadlock when called with interrupts disabled */
WARN_ON(irqs_disabled());
@@ -220,7 +222,7 @@ int smp_call_function_single(int cpu, void (*func) (void *info), void *info,
local_irq_save(flags);
func(info);
local_irq_restore(flags);
- } else {
+ } else if ((unsigned)cpu < NR_CPUS && cpu_online(cpu)) {
struct call_single_data *data = NULL;
if (!wait) {
@@ -236,10 +238,12 @@ int smp_call_function_single(int cpu, void (*func) (void *info), void *info,
data->func = func;
data->info = info;
generic_exec_single(cpu, data);
+ } else {
+ err = -ENXIO; /* CPU not online */
}
put_cpu();
- return 0;
+ return err;
}
EXPORT_SYMBOL(smp_call_function_single);
next prev parent reply other threads:[~2008-08-22 2:14 UTC|newest]
Thread overview: 30+ messages / expand[flat|nested] mbox.gz Atom feed top
2008-08-19 19:51 latest -git: WARNING: at arch/x86/kernel/ipi.c:123 send_IPI_mask_bitmask+0xc3/0xe0() Vegard Nossum
2008-08-20 1:39 ` Andi Kleen
2008-08-20 6:26 ` Vegard Nossum
2008-08-22 0:36 ` Dave Jones
2008-08-22 2:13 ` H. Peter Anvin [this message]
2008-08-22 2:28 ` Andi Kleen
2008-08-22 6:24 ` H. Peter Anvin
2008-08-22 9:35 ` Andi Kleen
2008-08-22 16:41 ` H. Peter Anvin
2008-08-23 6:42 ` Jeremy Fitzhardinge
2008-08-23 6:44 ` H. Peter Anvin
2008-08-22 11:13 ` adobriyan
2008-08-24 9:20 ` Vegard Nossum
2008-08-24 16:43 ` H. Peter Anvin
2008-08-24 17:17 ` H. Peter Anvin
2008-08-24 17:22 ` Vegard Nossum
2008-08-24 17:45 ` Vegard Nossum
2008-08-24 17:59 ` H. Peter Anvin
2008-08-24 18:13 ` Dave Jones
2008-08-25 18:31 ` Vegard Nossum
2008-08-25 18:38 ` Dave Jones
2008-08-25 18:36 ` Andi Kleen
2008-08-25 18:54 ` Dave Jones
2008-08-25 19:39 ` Andi Kleen
2008-08-25 19:50 ` Dave Jones
2008-08-25 20:36 ` Andi Kleen
2008-08-25 20:47 ` Dave Jones
2008-08-25 21:24 ` Arjan van de Ven
2008-08-25 19:08 ` H. Peter Anvin
2008-08-25 19:13 ` Dave Jones
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=48AE20B8.9000204@kernel.org \
--to=hpa@kernel.org \
--cc=andi@firstfloor.org \
--cc=davej@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=rusty@rustcorp.com.au \
--cc=vegard.nossum@gmail.com \
--cc=x86@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox