public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Hariprasad Nellitheertha <hari@in.ibm.com>
To: "Martin J. Bligh" <mbligh@aracnet.com>
Cc: Andrew Morton <akpm@osdl.org>,
	rddunlap@osdl.org, linux-kernel@vger.kernel.org,
	apw@shadowen.org, jamesclv@us.ibm.com
Subject: Re: BUG_ON(!cpus_equal(cpumask, tmp));
Date: Wed, 31 Mar 2004 10:13:04 +0530	[thread overview]
Message-ID: <20040331044304.GA5167@in.ibm.com> (raw)
In-Reply-To: <276260000.1080697873@flay>

There are actually two different problems here, and the BUG_ON is
hit in both cases.

1) In INIT_MM(), we now do this

        .cpu_vm_mask    = CPU_MASK_ALL,

With this, if we enter flush_tlb_others with init_mm, all bits except the one
corresponding to the current cpu are set. For example, on my UNI machine
running an SMP kernel with NR_CPUS=8, cpumask is 0xfe. The BUG_ON is hit 
even if we are doing nothing related to shutdown or cpu-offlining (like
when we are just loading a new kernel into memory using kexec).

2) The issue of a race between taking CPUs offline and the tlbflush code. The
discussions have been centered around this issue.

The patch I sent across, though, is completely targetted towards issue 1.

Regards, Hari

On Tue, Mar 30, 2004 at 05:51:13PM -0800, Martin J. Bligh wrote:
> >> I think we're assuming that we don't have to because the problem is fixed 
> >> by the "cpus_and(tmp, cpumask, cpu_online_map)" in flush_tlb_others so we 
> >> don't have to. Except it's racy, and doesn't work.
> > 
> > And it's a kludge, to work around dangling references to a CPU which has
> > gone away.
> 
> Yes ;-)
>  
> >> It would seem to me that your suggestion would fix it. But isn't locking
> >> cpu_online_map both simpler and (most importantly) more generic? I can't 
> >> imagine that we don't use this elsewhere ... suppose for instance we took 
> >> a timer interrupt, causing a scheduler rebalance, and moved a process to 
> >> an offline CPU at that point? Isn't any user of smp_call_function also racy?
> > 
> > If we have to add any fastpath locking to cope with CPU removal or reboot
> > then it's time to make CONFIG_HOTPLUG_CPU dependent upon CONFIG_BROKEN.
> 
> Yeah, but as we've proved, it's not just hotplug, it's shutdown. And I don't
> think we can make that depend on CONFIG_BROKEN ;-) I don't see a *read*
> side RCU lock as an impostion on the fastpath (for reading cpu_online_map),
> and I don't care if writing to cpu_online_map is slower. A spinlock would
> be crappy, yes ...
> 
> > yes, cpu_online_map should be viewed as a reference to the going-away CPU
> > for smp_call_function purposes.  However the CPU takedown code appears to
> > do the right thing: it removes the cpu from cpu_online_map first, then does
> > the stop_machine() thing which should ensure that all other CPUs have
> > completed any cross-CPU call which they were doing, yes?
> 
> Andy almost managed to convince me that the smp_call_function stuff is safe,
> based on call_lock exclusion. Except that we count that cpu stuff outside
> it ... but that's trivial to fix, we just move it inside the lock (patch
> below - untested, but trivial).
> 
> He also pointed out that we could fairly easily fix the tlb stuff by 
> taking the tlb lock before taking a cpu offline. Which still doesn't
> make me desperately comfortable ... but then he's smarter than me ;-)
> To me it comes down to ... do we want to lock the damned thing, or fix
> all the callers to be really, really careful? 
> 
> diff -purN -X /home/mbligh/.diff.exclude virgin/arch/i386/kernel/smp.c smp_call_function/arch/i386/kernel/smp.c
> --- virgin/arch/i386/kernel/smp.c	2004-03-11 14:33:36.000000000 -0800
> +++ smp_call_function/arch/i386/kernel/smp.c	2004-03-30 17:43:34.000000000 -0800
> @@ -514,10 +514,7 @@ int smp_call_function (void (*func) (voi
>   */
>  {
>  	struct call_data_struct data;
> -	int cpus = num_online_cpus()-1;
> -
> -	if (!cpus)
> -		return 0;
> +	int cpus;
>  
>  	data.func = func;
>  	data.info = info;
> @@ -527,6 +524,10 @@ int smp_call_function (void (*func) (voi
>  		atomic_set(&data.finished, 0);
>  
>  	spin_lock(&call_lock);
> +	cpus = num_online_cpus()-1;
> +	if (!cpus)
> +		return 0;
> +
>  	call_data = &data;
>  	mb();
>  	
> 

-- 
Hariprasad Nellitheertha
Linux Technology Center
India Software Labs
IBM India, Bangalore

  reply	other threads:[~2004-03-31  4:43 UTC|newest]

Thread overview: 22+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-03-29 15:39 BUG_ON(!cpus_equal(cpumask, tmp)); Martin J. Bligh
2004-03-30  0:21 ` Andrew Morton
2004-03-30  0:25   ` Andrew Morton
2004-03-30 13:28     ` Hariprasad Nellitheertha
2004-03-30 23:17       ` Randy.Dunlap
2004-03-31  0:22         ` Martin J. Bligh
2004-03-31  0:39           ` Andrew Morton
2004-03-31  0:57             ` Martin J. Bligh
2004-03-31  1:11               ` Andrew Morton
2004-03-31  1:24                 ` Martin J. Bligh
2004-03-31  1:36                   ` Andrew Morton
2004-03-31  1:51                     ` Martin J. Bligh
2004-03-31  4:43                       ` Hariprasad Nellitheertha [this message]
2004-04-01  0:31                         ` Andy Whitcroft
2004-04-01  5:04                           ` Srivatsa Vaddagiri
2004-04-01 11:38                             ` Andy Whitcroft
2004-04-02 18:33                               ` Randy.Dunlap
2004-04-01  8:42                         ` Paul Jackson
2004-04-01 13:57                           ` Hariprasad Nellitheertha
2004-04-03  1:45                             ` Andy Whitcroft
2004-03-31  1:01           ` Andy Whitcroft
  -- strict thread matches above, loose matches on Subject: below --
2004-01-02 23:51 Martin J. Bligh

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20040331044304.GA5167@in.ibm.com \
    --to=hari@in.ibm.com \
    --cc=akpm@osdl.org \
    --cc=apw@shadowen.org \
    --cc=jamesclv@us.ibm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mbligh@aracnet.com \
    --cc=rddunlap@osdl.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox