From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Ingo Molnar <mingo@elte.hu>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Tim Chen <tim.c.chen@linux.intel.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Hugh Dickins <hughd@google.com>,
	KOSAKI Motohiro <kosaki.motohiro@jp.fujitsu.com>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>,
	David Miller <davem@davemloft.net>,
	Martin Schwidefsky <schwidefsky@de.ibm.com>,
	Russell King <rmk@arm.linux.org.uk>,
	Paul Mundt <lethal@linux-sh.org>, Jeff Dike <jdike@addtoit.com>,
	Richard Weinberger <richard@nod.at>,
	Tony Luck <tony.luck@intel.com>,
	KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>,
	Mel Gorman <mel@csn.ul.ie>, Nick Piggin <npiggin@kernel.dk>,
	Namhyung Kim <namhyung@gmail.com>,
	ak@linux.intel.com, shaohua.li@intel.com, alex.shi@intel.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	"Rafael J. Wysocki" <rjw@sisk.pl>
Subject: Re: [GIT PULL] Re: REGRESSION: Performance regressions from switching anon_vma->lock to mutex
Date: Thu, 16 Jun 2011 10:16:44 -0700
Message-ID: <20110616171644.GK2582@linux.vnet.ibm.com>
In-Reply-To: <20110616070335.GA7661@elte.hu>

On Thu, Jun 16, 2011 at 09:03:35AM +0200, Ingo Molnar wrote:
> 
> * Linus Torvalds <torvalds@linux-foundation.org> wrote:
> 
> > 
> > 
> > Ingo Molnar <mingo@elte.hu> wrote:
> > >
> > > I have this fix queued up currently:
> > >
> > >  09223371deac: rcu: Use softirq to address performance regression
> > 
> > I really don't think that is even close to enough.
> 
> Yeah.
> 
> > It still does all the callbacks in the threads, and according to 
> > Peter, about half the rcu time in the threads remained..
> 
> You are right - things that are a few percent on a 24 core machine 
> will definitely go exponentially worse on larger boxen. We'll get rid 
> of the kthreads entirely.

I did indeed at one time have access to larger test systems than I
do now, and I clearly need to fix that.  :-/

> The funny thing about this workload is that context-switches are 
> really a fastpath here and we are using anonymous IRQ-triggered 
> softirqs embedded in random task contexts as a workaround for that.

The other thing that the IRQ-triggered softirqs do is to get the callbacks
invoked in cases where a CPU-bound user thread is never context switching.
Of course, one alternative might be to set_need_resched() to force entry
into the scheduler as needed.

> [ I think we'll have to revisit this issue and do it properly:
>   quiescent state is mostly defined by context-switches here, so we
>   could do the RCU callbacks from the task that turns a CPU
>   quiescent, right in the scheduler context-switch path - perhaps
>   with an option for SCHED_FIFO tasks to *not* do GC.

I considered this approach for TINY_RCU, but dropped it in favor of
reducing the interlocking between the scheduler and RCU callbacks.
Might be worth revisiting, though.  If SCHED_FIFO tasks omit RCU callback
invocation, then there will need to be some override for CPUs with lots
of SCHED_FIFO load, probably similar to RCU's current blimit stuff.
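
A rough userspace sketch of what such an override might look like, under
stated assumptions: the names blimit and qhimark are borrowed from RCU's
existing tunables purely for illustration, while struct cb, cb_queue,
cb_invoke() and the skip flag are hypothetical and do not correspond to
kernel code.  A task that would normally skip callback invocation still
drains the queue once the backlog passes the high-water mark.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical callback node, loosely modeled on a singly linked callback list. */
struct cb {
	struct cb *next;
	void (*func)(struct cb *);
};

/* Hypothetical per-CPU queue: head pointer, tail pointer-to-pointer, length. */
struct cb_queue {
	struct cb *head;
	struct cb **tail;
	size_t len;
};

static void cb_queue_init(struct cb_queue *q)
{
	q->head = NULL;
	q->tail = &q->head;
	q->len = 0;
}

static void cb_enqueue(struct cb_queue *q, struct cb *c, void (*func)(struct cb *))
{
	c->next = NULL;
	c->func = func;
	*q->tail = c;
	q->tail = &c->next;
	q->len++;
}

/* Sample callback that only counts invocations. */
static int invoked;
static void count_cb(struct cb *c)
{
	(void)c;
	invoked++;
}

/*
 * Invoke at most blimit callbacks per call.  If the backlog exceeds
 * qhimark, the batch limit and the skip request are both overridden
 * and the whole queue is drained.  skip stands in for "the current
 * task (say, SCHED_FIFO) would rather not run callbacks".
 * Returns the number of callbacks invoked.
 */
static size_t cb_invoke(struct cb_queue *q, size_t blimit, size_t qhimark, int skip)
{
	int overloaded = q->len > qhimark;
	size_t n = 0;

	assert(blimit > 0);
	if (skip && !overloaded)
		return 0;
	while (q->head && (overloaded || n < blimit)) {
		struct cb *c = q->head;

		/* Unlink before invoking, since the callback may free c. */
		q->head = c->next;
		if (!q->head)
			q->tail = &q->head;
		q->len--;
		c->func(c);
		n++;
	}
	return n;
}
```

The point of the override is that a CPU running mostly SCHED_FIFO load
cannot defer callbacks indefinitely: once the queue grows past qhimark,
cb_invoke() ignores both skip and blimit.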

>   That could possibly be more cache-efficient than softirq execution,
>   as we'll process a still-hot pool of callbacks instead of doing
>   them only once per timer tick. It will also make the RCU GC
>   behavior HZ independent. ]

Well, the callbacks will normally be cache-cold in any case due to the
grace-period delay, but on the other hand, both tick-independence and
the ability to shield a given CPU from RCU callback execution might be
quite useful.  The tick currently does the following for RCU:

1.	Informs RCU of user-mode execution (rcu_sched and rcu_bh
	quiescent state).

2.	Informs RCU of non-dyntick idle mode (again, rcu_sched and
	rcu_bh quiescent state).

3.	Kicks the current CPU's RCU core processing as needed in
	response to actions from other CPUs.

Frederic's work avoiding ticks in long-running user-mode tasks
might take care of #1, and it should be possible to make use of
the current dyntick-idle APIs to deal with #2.  Replacing #3
efficiently will take some thought.

> In any case the proxy kthread model clearly sucked, no argument about 
> that.

Indeed, I lost track of the global nature of real-time scheduling.  :-(

Whatever does the boosting will need to have process context and
can be subject to delays, so that pretty much needs to be a kthread.
But it will context-switch quite rarely, so should not be a problem.

							Thanx, Paul
