public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Thomas Gleixner <tglx@linutronix.de>
To: Dave Kleikamp <dkleikamp@gmail.com>
Cc: Chris Mason <chris.mason@oracle.com>,
	Peter Zijlstra <a.p.zijlstra@chello.nl>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	linux-kernel@vger.kernel.org
Subject: Re: idle issues running sembench on 128 cpus
Date: Thu, 5 May 2011 00:04:41 +0200 (CEST)	[thread overview]
Message-ID: <alpine.LFD.2.02.1105042348350.3005@ionos> (raw)
In-Reply-To: <4DC1C95B.4040706@gmail.com>

Dave,

On Wed, 4 May 2011, Dave Kleikamp wrote:

> Thomas,
> I've been looking at performance running sembench on a 128-cpu system and I'm
> running into some issues in the idle loop.
> 
> Initially, I was seeing a lot of contention on the clockevents_lock in
> clockevents_notify(). Assuming it is only protecting clockevents_chain, and
> not the handlers themselves, I changed this to an rwlock (with thoughts of
> using rcu if successful).
> 
> This didn't help, but exposed an underlying problem with high contention on
> tick_broadcast_lock in tick_broadcast_oneshot_control(). I think with this
> many cpus, tick_handle_oneshot_broadcast() is holding that lock a long time,
> causing the idle cpus to spin on the lock.
> 
> I am able to avoid this problem with either kernel parameter, "idle=mwait" or
> "processor.max_cstate=1". Similarly, defining CONFIG_INTEL_IDLE=y and using
> the kernel parameter intel_idle.max_cstate=1 exposes a different spinlock,
> pm_qos_lock, but I found this patch which fixes that contention:
> https://lists.linux-foundation.org/pipermail/linux-pm/2011-February/030266.html
> https://patchwork.kernel.org/patch/550721/
> 
> Of course, we'd like to find a way to reduce the spinlock contention and not
> resort to prohibiting the cpus from entering C3 state at all. I don't see a
> simple fix, and want to know if you've seen anything like this before and
> given it any thought.
> 
> I also don't know if it makes sense to be able to tune the cpuidle governors
> to add more resistance to enter the C3 state, or even being able to switch to
> a performance governor at runtime, similar to cpufreq.
> 
> I'd like to hear your thoughts before I dive any deeper into this.

Tick broadcasting for more than a few CPU's simply does not scale and
never will.

There is no real way to avoid the global lock if all what you have is
_ONE_ global working event device and N cpus which try to work around
their f*cked up local apics when deeper C-States are entered.

The same problem is with the TSC which stops in deeper C-States. You
just don't see lock contention because we rely on the HW serialization
of HPET or PM_TIMER which is a huge bottleneck when you try to do
timekeeping related stuff high frequency on more than a handful of
cores at the same time. Just benchmark a tight loop of gettimeofday()
or clock_gettime() on such a machine with and without max_cstate=1 on
the kernel command line.

We could perhaps get away w/o the locking for the NOHZ=n and HIGHRES=n
case, but I doubt that you want to have that given that you don't want
to restrict C-States either. C-states do not make much sense without
NOHZ=y at least.

We tried to beat sense into unnamed HW manufacturers for years and it
took just a little bit more than a decade that they started to act on
it :(

Thanks,

	tglx

  reply	other threads:[~2011-05-04 22:04 UTC|newest]

Thread overview: 13+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-05-04 21:47 idle issues running sembench on 128 cpus Dave Kleikamp
2011-05-04 22:04 ` Thomas Gleixner [this message]
2011-05-04 22:07 ` Andi Kleen
2011-05-04 22:34   ` Thomas Gleixner
2011-05-04 23:03     ` Andi Kleen
2011-05-04 23:29       ` Thomas Gleixner
2011-05-04 23:42         ` Andi Kleen
2011-05-04 23:47           ` Thomas Gleixner
2011-05-04 23:49             ` Andi Kleen
2011-05-04 23:51               ` Thomas Gleixner
2011-05-04 23:48           ` idle issues running sembench on 128 cpus II Andi Kleen
2011-05-05 15:24             ` Dave Kleikamp
2011-05-05 13:58         ` idle issues running sembench on 128 cpus Thomas Gleixner

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=alpine.LFD.2.02.1105042348350.3005@ionos \
    --to=tglx@linutronix.de \
    --cc=a.p.zijlstra@chello.nl \
    --cc=chris.mason@oracle.com \
    --cc=dkleikamp@gmail.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=tim.c.chen@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox