Re: idle issues running sembench on 128 cpus

From: Thomas Gleixner <tglx@linutronix.de>
To: Andi Kleen <andi@firstfloor.org>
Cc: Dave Kleikamp <dkleikamp@gmail.com>,
	Chris Mason <chris.mason@oracle.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	linux-kernel@vger.kernel.org, lenb@kernel.org,
	paulmck@us.ibm.com
Subject: Re: idle issues running sembench on 128 cpus
Date: Thu, 5 May 2011 01:29:49 +0200 (CEST)
Message-ID: <alpine.LFD.2.02.1105050110440.3005@ionos>
In-Reply-To: <20110504230349.GC2925@one.firstfloor.org>

On Thu, 5 May 2011, Andi Kleen wrote:
> > No, it does not even need refcounting. We can access it outside of the
> 
> Ok.
> 
> > lock as this is atomic context called on the cpu which is about to go
> > idle and therefore the device cannot go away. Easy and straightforward
> > fix.
> 
> Ok. Patch appended. Looks good?

Mostly. See below.
 
> BTW why must the lock be irqsave?

Good question. Probably safety first paranoia :)

Indeed that code should only be called from irq disabled regions, so
we could avoid the irqsave there. Otherwise that needs to be irqsave
for obvious reasons.

> > > But yes it would be still good to fix Nehalem too.
> > > 
> > > One fix would be to make all the masks hierarchical,
> > > similar to what RCU does. Perhaps even some code 
> > > could be shared with RCU on that because it's a very
> > > similar problem.
> > 
> > In theory. It's not about the mask. The mask is uninteresting. It's
> > about the expiry time, which we have to protect. There is nothing
> > hierarchical about that. It all boils down to _ONE_ single functional
> 
> The mask can be used to see if another thread on this core is still
> running. If yes you don't need that. Right now Linux doesn't 
> know that, but it could be taught. The only problem is that once
> the other guy goes idle too their timeouts have to be merged.
> 
> This would cut contention in half.

That makes sense, but merging the timeouts race free will be a real
PITA.

> Also if it's HPET you could actually use multiple independent HPET channels.
> I remember us discussing this a long time ago... Not sure if it's worth
> it, but it may be a small relief.

Multiple broadcast devices. That still sounds horrible :)
 
> > device and you don't want to miss out your deadline just because you
> > decided to be extra clever. RCU does not care much whether you run the
> > callbacks a tick later or not. Time and timekeeping does.
> 
> You can at least check lockless if someone else has a <= timeout, right?

Might be worth a try. Need some sleep to remember why I discarded that
idea long ago.
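
For reference, the lockless check being discussed could look roughly like
the userspace sketch below. It is an illustration under assumptions, not
the kernel code: bc_next_event, bc_lock and bc_arm() are hypothetical
stand-ins for the broadcast device's programmed expiry, for
tick_broadcast_lock and for the arming path, and C11 atomics replace the
kernel primitives. It only covers the case where the programmed expiry
moves earlier. It says nothing about what happens when the expiry is
moved later (for example after the broadcast event has fired), and it
does not claim to explain why the idea was discarded.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define BC_NO_EVENT UINT64_MAX	/* hypothetical "nothing programmed" marker */

/* Hypothetical stand-in for the broadcast device's programmed expiry (ns). */
static _Atomic uint64_t bc_next_event = BC_NO_EVENT;
/* Hypothetical stand-in for tick_broadcast_lock: a trivial spinlock. */
static atomic_flag bc_lock = ATOMIC_FLAG_INIT;

static void bc_lock_acquire(void)
{
	while (atomic_flag_test_and_set_explicit(&bc_lock, memory_order_acquire))
		;	/* spin until the holder releases the flag */
}

static void bc_lock_release(void)
{
	atomic_flag_clear_explicit(&bc_lock, memory_order_release);
}

/*
 * Arm the shared expiry for a CPU that is about to go idle.
 * Returns true if the lock was taken, false if the lockless check
 * found an earlier-or-equal expiry already programmed.
 */
static bool bc_arm(uint64_t my_expiry)
{
	assert(my_expiry != BC_NO_EVENT);

	/* Lockless fast path: someone else already has a <= timeout. */
	if (atomic_load_explicit(&bc_next_event, memory_order_acquire) <= my_expiry)
		return false;

	bc_lock_acquire();
	/* Recheck under the lock; another CPU may have raced with us. */
	if (my_expiry < atomic_load_explicit(&bc_next_event, memory_order_relaxed))
		atomic_store_explicit(&bc_next_event, my_expiry, memory_order_release);
	bc_lock_release();

	return true;
}
```

In this sketch a caller with a later deadline than the one already
programmed never touches the lock, which is the contention saving being
asked about. Only callers that need to pull the expiry forward serialize.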

> -Andi
> 
> ---
> 
> Move C3 stop test outside lock
> 
> Avoid taking locks in the idle path for systems where the timer
> doesn't stop in C3.
> 
> Signed-off-by: Andi Kleen <ak@linux.intel.com>
> 
> diff --git a/kernel/time/tick-broadcast.c b/kernel/time/tick-broadcast.c
> index da800ff..9cf0415 100644
> --- a/kernel/time/tick-broadcast.c
> +++ b/kernel/time/tick-broadcast.c
> @@ -456,23 +456,22 @@ void tick_broadcast_oneshot_control(unsigned long reason)
>  	unsigned long flags;
>  	int cpu;
>  
> -	raw_spin_lock_irqsave(&tick_broadcast_lock, flags);
> -
>  	/*
>  	 * Periodic mode does not care about the enter/exit of power
>  	 * states
>  	 */
>  	if (tick_broadcast_device.mode == TICKDEV_MODE_PERIODIC)
> -		goto out;
> +		return;
>  
> +	cpu = raw_smp_processor_id();

Why raw_ ? As I said above this should always be called with irqs
disabled.

If that ever gets called from an irq-enabled, preemptible and
migratable context, then we open up a very narrow but ugly-to-debug
race window, as we can look at the wrong per-cpu device.

>  	bc = tick_broadcast_device.evtdev;
> -	cpu = smp_processor_id();
>  	td = &per_cpu(tick_cpu_device, cpu);
>  	dev = td->evtdev;

Thanks,

	tglx
