public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: "Martin J. Bligh" <mbligh@aracnet.com>
To: Andrew Morton <akpm@osdl.org>
Cc: rddunlap@osdl.org, hari@in.ibm.com, linux-kernel@vger.kernel.org,
	apw@shadowen.org, jamesclv@us.ibm.com
Subject: Re: BUG_ON(!cpus_equal(cpumask, tmp));
Date: Tue, 30 Mar 2004 17:51:13 -0800	[thread overview]
Message-ID: <276260000.1080697873@flay> (raw)
In-Reply-To: <20040330173620.6fa69482.akpm@osdl.org>

>> I think we're assuming that we don't have to because the problem is fixed 
>> by the "cpus_and(tmp, cpumask, cpu_online_map)" in flush_tlb_others so we 
>> don't have to. Except it's racy, and doesn't work.
> 
> And it's a kludge, to work around dangling references to a CPU which has
> gone away.

Yes ;-)
 
>> It would seem to me that your suggestion would fix it. But isn't locking
>> cpu_online_map both simpler and (most importantly) more generic? I can't 
>> imagine that we don't use this elsewhere ... suppose for instance we took 
>> a timer interrupt, causing a scheduler rebalance, and moved a process to 
>> an offline CPU at that point? Isn't any user of smp_call_function also racy?
> 
> If we have to add any fastpath locking to cope with CPU removal or reboot
> then it's time to make CONFIG_HOTPLUG_CPU dependent upon CONFIG_BROKEN.

Yeah, but as we've proved, it's not just hotplug, it's shutdown. And I don't
think we can make that depend on CONFIG_BROKEN ;-) I don't see a *read*
side RCU lock as an impostion on the fastpath (for reading cpu_online_map),
and I don't care if writing to cpu_online_map is slower. A spinlock would
be crappy, yes ...

> yes, cpu_online_map should be viewed as a reference to the going-away CPU
> for smp_call_function purposes.  However the CPU takedown code appears to
> do the right thing: it removes the cpu from cpu_online_map first, then does
> the stop_machine() thing which should ensure that all other CPUs have
> completed any cross-CPU call which they were doing, yes?

Andy almost managed to convince me that the smp_call_function stuff is safe,
based on call_lock exclusion. Except that we count that cpu stuff outside
it ... but that's trivial to fix, we just move it inside the lock (patch
below - untested, but trivial).

He also pointed out that we could fairly easily fix the tlb stuff by 
taking the tlb lock before taking a cpu offline. Which still doesn't
make me desperately comfortable ... but then he's smarter than me ;-)
To me it comes down to ... do we want to lock the damned thing, or fix
all the callers to be really, really careful? 

diff -purN -X /home/mbligh/.diff.exclude virgin/arch/i386/kernel/smp.c smp_call_function/arch/i386/kernel/smp.c
--- virgin/arch/i386/kernel/smp.c	2004-03-11 14:33:36.000000000 -0800
+++ smp_call_function/arch/i386/kernel/smp.c	2004-03-30 17:43:34.000000000 -0800
@@ -514,10 +514,7 @@ int smp_call_function (void (*func) (voi
  */
 {
 	struct call_data_struct data;
-	int cpus = num_online_cpus()-1;
-
-	if (!cpus)
-		return 0;
+	int cpus;
 
 	data.func = func;
 	data.info = info;
@@ -527,6 +524,10 @@ int smp_call_function (void (*func) (voi
 		atomic_set(&data.finished, 0);
 
 	spin_lock(&call_lock);
+	cpus = num_online_cpus()-1;
+	if (!cpus)
+		return 0;
+
 	call_data = &data;
 	mb();
 	


  reply	other threads:[~2004-03-31  1:51 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-03-29 15:39 BUG_ON(!cpus_equal(cpumask, tmp)); Martin J. Bligh
2004-03-30  0:21 ` Andrew Morton
2004-03-30  0:25   ` Andrew Morton
2004-03-30 13:28     ` Hariprasad Nellitheertha
2004-03-30 23:17       ` Randy.Dunlap
2004-03-31  0:22         ` Martin J. Bligh
2004-03-31  0:39           ` Andrew Morton
2004-03-31  0:57             ` Martin J. Bligh
2004-03-31  1:11               ` Andrew Morton
2004-03-31  1:24                 ` Martin J. Bligh
2004-03-31  1:36                   ` Andrew Morton
2004-03-31  1:51                     ` Martin J. Bligh [this message]
2004-03-31  4:43                       ` Hariprasad Nellitheertha
2004-04-01  0:31                         ` Andy Whitcroft
2004-04-01  5:04                           ` Srivatsa Vaddagiri
2004-04-01 11:38                             ` Andy Whitcroft
2004-04-02 18:33                               ` Randy.Dunlap
2004-04-01  8:42                         ` Paul Jackson
2004-04-01 13:57                           ` Hariprasad Nellitheertha
2004-04-03  1:45                             ` Andy Whitcroft
2004-03-31  1:01           ` Andy Whitcroft
  -- strict thread matches above, loose matches on Subject: below --
2004-01-02 23:51 Martin J. Bligh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=276260000.1080697873@flay \
    --to=mbligh@aracnet.com \
    --cc=akpm@osdl.org \
    --cc=apw@shadowen.org \
    --cc=hari@in.ibm.com \
    --cc=jamesclv@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=rddunlap@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox