From: "Martin J. Bligh" <mbligh@aracnet.com>
To: Andrew Morton <akpm@osdl.org>
Cc: rddunlap@osdl.org, hari@in.ibm.com, linux-kernel@vger.kernel.org,
apw@shadowen.org, jamesclv@us.ibm.com
Subject: Re: BUG_ON(!cpus_equal(cpumask, tmp));
Date: Tue, 30 Mar 2004 17:51:13 -0800 [thread overview]
Message-ID: <276260000.1080697873@flay> (raw)
In-Reply-To: <20040330173620.6fa69482.akpm@osdl.org>
>> I think we're assuming that we don't have to because the problem is fixed
>> by the "cpus_and(tmp, cpumask, cpu_online_map)" in flush_tlb_others so we
>> don't have to. Except it's racy, and doesn't work.
>
> And it's a kludge, to work around dangling references to a CPU which has
> gone away.
Yes ;-)
>> It would seem to me that your suggestion would fix it. But isn't locking
>> cpu_online_map both simpler and (most importantly) more generic? I can't
>> imagine that we don't use this elsewhere ... suppose for instance we took
>> a timer interrupt, causing a scheduler rebalance, and moved a process to
>> an offline CPU at that point? Isn't any user of smp_call_function also racy?
>
> If we have to add any fastpath locking to cope with CPU removal or reboot
> then it's time to make CONFIG_HOTPLUG_CPU dependent upon CONFIG_BROKEN.
Yeah, but as we've proved, it's not just hotplug, it's shutdown. And I don't
think we can make that depend on CONFIG_BROKEN ;-) I don't see a *read*
side RCU lock as an impostion on the fastpath (for reading cpu_online_map),
and I don't care if writing to cpu_online_map is slower. A spinlock would
be crappy, yes ...
> yes, cpu_online_map should be viewed as a reference to the going-away CPU
> for smp_call_function purposes. However the CPU takedown code appears to
> do the right thing: it removes the cpu from cpu_online_map first, then does
> the stop_machine() thing which should ensure that all other CPUs have
> completed any cross-CPU call which they were doing, yes?
Andy almost managed to convince me that the smp_call_function stuff is safe,
based on call_lock exclusion. Except that we count that cpu stuff outside
it ... but that's trivial to fix, we just move it inside the lock (patch
below - untested, but trivial).
He also pointed out that we could fairly easily fix the tlb stuff by
taking the tlb lock before taking a cpu offline. Which still doesn't
make me desperately comfortable ... but then he's smarter than me ;-)
To me it comes down to ... do we want to lock the damned thing, or fix
all the callers to be really, really careful?
diff -purN -X /home/mbligh/.diff.exclude virgin/arch/i386/kernel/smp.c smp_call_function/arch/i386/kernel/smp.c
--- virgin/arch/i386/kernel/smp.c 2004-03-11 14:33:36.000000000 -0800
+++ smp_call_function/arch/i386/kernel/smp.c 2004-03-30 17:43:34.000000000 -0800
@@ -514,10 +514,7 @@ int smp_call_function (void (*func) (voi
*/
{
struct call_data_struct data;
- int cpus = num_online_cpus()-1;
-
- if (!cpus)
- return 0;
+ int cpus;
data.func = func;
data.info = info;
@@ -527,6 +524,10 @@ int smp_call_function (void (*func) (voi
atomic_set(&data.finished, 0);
spin_lock(&call_lock);
+ cpus = num_online_cpus()-1;
+ if (!cpus)
+ return 0;
+
call_data = &data;
mb();
next prev parent reply other threads:[~2004-03-31 1:51 UTC|newest]
Thread overview: 22+ messages / expand[flat|nested] mbox.gz Atom feed top
2004-03-29 15:39 BUG_ON(!cpus_equal(cpumask, tmp)); Martin J. Bligh
2004-03-30 0:21 ` Andrew Morton
2004-03-30 0:25 ` Andrew Morton
2004-03-30 13:28 ` Hariprasad Nellitheertha
2004-03-30 23:17 ` Randy.Dunlap
2004-03-31 0:22 ` Martin J. Bligh
2004-03-31 0:39 ` Andrew Morton
2004-03-31 0:57 ` Martin J. Bligh
2004-03-31 1:11 ` Andrew Morton
2004-03-31 1:24 ` Martin J. Bligh
2004-03-31 1:36 ` Andrew Morton
2004-03-31 1:51 ` Martin J. Bligh [this message]
2004-03-31 4:43 ` Hariprasad Nellitheertha
2004-04-01 0:31 ` Andy Whitcroft
2004-04-01 5:04 ` Srivatsa Vaddagiri
2004-04-01 11:38 ` Andy Whitcroft
2004-04-02 18:33 ` Randy.Dunlap
2004-04-01 8:42 ` Paul Jackson
2004-04-01 13:57 ` Hariprasad Nellitheertha
2004-04-03 1:45 ` Andy Whitcroft
2004-03-31 1:01 ` Andy Whitcroft
-- strict thread matches above, loose matches on Subject: below --
2004-01-02 23:51 Martin J. Bligh
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=276260000.1080697873@flay \
--to=mbligh@aracnet.com \
--cc=akpm@osdl.org \
--cc=apw@shadowen.org \
--cc=hari@in.ibm.com \
--cc=jamesclv@us.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=rddunlap@osdl.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.