linux-pm.vger.kernel.org archive mirror
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Jacob Pan <jacob.jun.pan@linux.intel.com>
Cc: Arjan van de Ven <arjan@linux.intel.com>,
	Peter Zijlstra <peterz@infradead.org>,
	lenb@kernel.org, rjw@rjwysocki.net,
	Eliezer Tamir <eliezer.tamir@linux.intel.com>,
	Chris Leech <christopher.leech@intel.com>,
	David Miller <davem@davemloft.net>,
	rui.zhang@intel.com, Mike Galbraith <bitbucket@online.de>,
	Ingo Molnar <mingo@kernel.org>,
	hpa@zytor.com, Thomas Gleixner <tglx@linutronix.de>,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	"Rafael J. Wysocki" <rafael.j.wysocki@intel.com>
Subject: Re: [PATCH 3/7] idle, thermal, acpi: Remove home grown idle implementations
Date: Thu, 21 Nov 2013 20:20:36 -0800
Message-ID: <20131122042036.GL4138@linux.vnet.ibm.com>
In-Reply-To: <20131121161005.34150ab2@ultegra>

On Thu, Nov 21, 2013 at 04:10:05PM -0800, Jacob Pan wrote:
> On Thu, 21 Nov 2013 12:07:17 -0800
> "Paul E. McKenney" <paulmck@linux.vnet.ibm.com> wrote:
> 
> > On Thu, Nov 21, 2013 at 11:45:20AM -0800, Arjan van de Ven wrote:
> > > On 11/21/2013 11:19 AM, Paul E. McKenney wrote:
> > > >On Thu, Nov 21, 2013 at 08:21:03AM -0800, Arjan van de Ven wrote:
> > > >>On 11/21/2013 8:07 AM, Paul E. McKenney wrote:
> > > >>>As long as RCU has some reliable way to identify an idle task, I
> > > >>>am good.  But I have to ask -- why can't idle injection
> > > >>>coordinate with the existing idle tasks rather than temporarily
> > > >>>making alternative idle tasks?
> > > >>
> > > >>it's not a real idle. that's the whole problem of the situation.
> > > >>to the rest of the OS, this is being BUSY (busy saving power using
> > > >>a CPU instruction, but it might as well have been an mdelay()
> > > >>operation) and it's also what end users expect; they want to be
> > > > >>able to see where their performance (read: cpu time in "top") is
> > > >>going.
> > > >
> > > >My concern is keeping RCU's books straight.  Suppose that there is
> > > >a need to call for idle in the middle of a preemptible RCU
> > > >read-side critical section.  Now, if that call for idle involves a
> > > >context switch, all is well -- RCU will see the task as still
> > > >being in its RCU read-side critical section, which means that it
> > > >is OK for RCU to see the CPU as idle.
> > > >
> > > >However, if there is no context switch and RCU sees the CPU as
> > > >idle, preemptible RCU could prematurely end the grace period.  If
> > > >there is no context switch and RCU sees the CPU as non-idle for
> > > >too long, we start getting RCU CPU stall warning splats.
> > > >
> > > >Another approach would be to only inject idle when the CPU is not
> > > >doing anything that could possibly be in an RCU read-side critical
> > > >section.  But things might get a bit hot in case of an overly
> > > >long RCU read-side critical section.
> > > >
> > > >One approach that might work would be to hook into RCU's
> > > >context-switch code going in and coming out, then telling RCU that
> > > >the CPU is idle, even though top and friends see it as non-idle.
> > > >This last is in fact similar to how RCU handles userspace
> > > >execution for NO_HZ_FULL.
> > > >
> > > 
> > > so powerclamp and such are not "idle".
> > > They are "busy" from everything except the lowest level of the CPU
> > > hardware. once you start thinking of them as idle, all hell breaks
> > > > loose in terms of implications (including sysadmin visibility
> > > etc).... (hence some of the explosions in this thread as well).
> > > 
> > > but it's not "idle".
> > > 
> > > it's "put the cpu in a low power state for a specified amount of
> > > time". sure it uses the same instruction to do so that the idle
> > > loop uses.
> > > 
> > > (now to make it messy, the current driver does a bunch of things
> > > similar to the idle loop which is a mess and fair to be complained
> > > about)
> > 
> > Then from an RCU viewpoint, they need to be short in duration.
> > Otherwise you risk getting CPU stall-warning explosions from RCU.  ;-)
> > 
> > 							Thanx, Paul
> > 
> currently powerclamp allows idle injection durations between 6 and 25ms.
> I guess that is short considering the stall check is in seconds?
> 	return till_stall_check * HZ + RCU_STALL_DELAY_DELTA;

The 6ms to 25ms range should be just fine as far as normal RCU grace
periods are concerned.  However, it does mean that expedited grace
periods could be delayed: They normally take a few tens of microseconds,
but if they were unlucky enough to show up during an idle injection,
they would be magnified by two to three orders of magnitude, which is
not pretty.
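
To put rough numbers on the paragraph above, here is a minimal sketch of the
arithmetic.  The 6ms and 25ms bounds come from this thread; the helper names,
the 30-microsecond "normal" expedited latency, and the 21-second stall timeout
used in the usage note are illustrative assumptions, not measurements or
values taken from the patch set.

```c
#include <assert.h>
#include <stdint.h>

/* Injection bounds quoted in the thread, in microseconds. */
#define INJECT_MIN_US   6000u   /* 6 ms */
#define INJECT_MAX_US  25000u   /* 25 ms */

/*
 * Hypothetical helper: worst-case slowdown factor for an expedited grace
 * period that normally takes normal_us but has to wait out one full idle
 * injection of inject_us first.  Integer division rounds down.
 */
static uint32_t expedited_slowdown(uint32_t inject_us, uint32_t normal_us)
{
	assert(normal_us != 0);
	return (inject_us + normal_us) / normal_us;
}

/*
 * Hypothetical helper: how many back-to-back injections of inject_us fit
 * into a stall-warning timeout given in seconds.  A large result means a
 * single injection is nowhere near long enough to trigger a stall warning.
 */
static uint32_t injections_per_stall_timeout(uint32_t inject_us,
					     uint32_t stall_timeout_s)
{
	assert(inject_us != 0);
	return (stall_timeout_s * 1000000u) / inject_us;
}
```

With an assumed 30us expedited grace period, expedited_slowdown() gives a
factor of roughly 200 for a 6ms injection and roughly 830 for a 25ms one; with
10us it gives roughly 600 and 2500.  That is the "two to three orders of
magnitude" range.  By contrast, with an assumed 21-second stall timeout,
injections_per_stall_timeout(INJECT_MAX_US, 21) shows that hundreds of
maximum-length injections fit inside one stall-check interval.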

Hence my suggestion of hooking into RCU on idle-injection start and end
so that RCU considers that time period to be idle, just as it does for
user-mode execution on NO_HZ_FULL kernels.  I therefore still don't see
this approach as a problem, and I must confess that I still don't
understand what Arjan doesn't like about it.
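
The idea of hooking into RCU on idle-injection start and end can be modeled
with a small userspace toy.  Everything in it is hypothetical: struct
cpu_state, inject_idle_begin(), inject_idle_end(), and gp_cpus_to_wait_for()
are made-up names for illustration and are not kernel APIs.  The toy also
ignores the case of an injection landing inside a preemptible RCU read-side
critical section, which is the case that needs the context-switch hooks
discussed earlier in the thread.

```c
#include <assert.h>
#include <stdbool.h>

#define NR_CPUS 4

/* Toy per-CPU bookkeeping, zero-initialized: all CPUs start out "watched". */
struct cpu_state {
	bool rcu_eqs;		/* mock RCU ignores this CPU while true */
	bool accounted_busy;	/* what "top"-style accounting would show */
};

static struct cpu_state cpus[NR_CPUS];

/* Injection start: accounting stays busy, mock RCU treats the CPU as idle. */
static void inject_idle_begin(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpus[cpu].accounted_busy = true;
	cpus[cpu].rcu_eqs = true;
}

/* Injection end: mock RCU resumes watching the CPU. */
static void inject_idle_end(int cpu)
{
	assert(cpu >= 0 && cpu < NR_CPUS);
	cpus[cpu].rcu_eqs = false;
}

/* Number of CPUs a (mock) grace period would still have to wait for. */
static int gp_cpus_to_wait_for(void)
{
	int cpu, n = 0;

	for (cpu = 0; cpu < NR_CPUS; cpu++)
		if (!cpus[cpu].rcu_eqs)
			n++;
	return n;
}
```

In this toy, calling inject_idle_begin(1) drops gp_cpus_to_wait_for() from 4
to 3 while cpus[1].accounted_busy remains true, so a grace period would not
have to wait out the injection even though the CPU still shows up as busy to
the sysadmin.  Calling inject_idle_end(1) brings the count back to 4.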

							Thanx, Paul

> BTW, by forcing intel_idle to use the deepest C-states for the idle
> injection thread, the efficiency problem is gone. I am surprised that
> cpuidle would not pick the deepest C-states, given that the powerclamp
> driver is asking for 6ms of idle time and the wakeup latencies are in
> the usec range.
> Anyway, from what I have tested so far, powerclamp with this patchset
> works as well as the code before.
> 
> 
> [Jacob Pan]
> 


