From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, peterz@infradead.org
Subject: Re: Fw: [lkp-developer] [sched,rcu] cf7a2dca60: [No primary change] +186% will-it-scale.time.involuntary_context_switches
Date: Wed, 14 Dec 2016 08:48:27 -0800 [thread overview]
Message-ID: <20161214164827.GL3924@linux.vnet.ibm.com> (raw)
In-Reply-To: <20161214161540.GP25573@dhcp22.suse.cz>
On Wed, Dec 14, 2016 at 05:15:41PM +0100, Michal Hocko wrote:
> On Wed 14-12-16 03:06:09, Paul E. McKenney wrote:
> > On Wed, Dec 14, 2016 at 10:54:25AM +0100, Michal Hocko wrote:
> > > On Tue 13-12-16 07:14:08, Paul E. McKenney wrote:
> > > > Just FYI for the moment...
> > > >
> > > > So even with the slowed-down checking, making cond_resched() do what
> > > > cond_resched_rcu_qs() does results in a smallish but quite measurable
> > > > degradation according to 0day.
> > >
> > > So if I understand those results properly, the reason seems to be the
> > > increased involuntary context switches, right? Or am I misreading the
> > > data?
> > > I am looking at your "sched,rcu: Make cond_resched() provide RCU
> > > quiescent state" in linux-next and I am wondering whether rcu_all_qs has
> > > to be called unconditionally and not only when should_resched failed few
> > > times? I guess you have discussed that with Peter already but do not
> > > remember the outcome.
> >
> > My first thought is to wait for the grace period to age further before
> > checking, the idea being to avoid increasing cond_resched() overhead
> > any further. But if that doesn't work, then yes, I may have to look at
> > adding more checks to cond_resched().
>
> This might be really naive but would something like the following work?
> The overhead should be pretty much negligible, I guess. Ideally the pcp
> variable could be set somewhere from check_cpu_stall() but I couldn't
> wrap my head around that code to see how exactly.
My concern (perhaps misplaced) with this approach is that there are
quite a few tight loops containing cond_resched(). So I would still
need to throttle the resulting grace-period acceleration to keep the
context switches down to a dull roar.
Thanx, Paul
> ---
> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> index ac81e4063b40..1c005c5304a3 100644
> --- a/include/linux/rcutiny.h
> +++ b/include/linux/rcutiny.h
> @@ -243,6 +243,10 @@ static inline void rcu_all_qs(void)
> barrier(); /* Avoid RCU read-side critical sections leaking across. */
> }
>
> +static inline void cond_resched_rcu_check(void)
> +{
> +}
> +
> /* RCUtree hotplug events */
> #define rcutree_prepare_cpu NULL
> #define rcutree_online_cpu NULL
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 63a4e4cf40a5..176f6e386379 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -110,6 +110,18 @@ extern int rcu_scheduler_active __read_mostly;
> bool rcu_is_watching(void);
>
> void rcu_all_qs(void);
> +#ifndef CONFIG_PREEMPT
> +DECLARE_PER_CPU(int, rcu_needs_qs);
> +
> +static inline void cond_resched_rcu_check(void)
> +{
> + /* Make sure we do not miss rcu_all_qs at least every now and then */
> + if (this_cpu_inc_return(rcu_needs_qs) > 10) {
> + this_cpu_write(rcu_needs_qs, 0);
> + rcu_all_qs();
> + }
> +}
> +#endif
>
> /* RCUtree hotplug events */
> int rcutree_prepare_cpu(unsigned int cpu);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 69a5611a7e7c..783c74ae9930 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -268,6 +268,9 @@ void rcu_bh_qs(void)
> }
>
> static DEFINE_PER_CPU(int, rcu_sched_qs_mask);
> +#ifndef CONFIG_PREEMPT
> +DEFINE_PER_CPU(int, rcu_needs_qs);
> +#endif
>
> static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
> .dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 154fd689fe02..a58844be2ef1 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4905,6 +4905,8 @@ int __sched _cond_resched(void)
> if (should_resched(0)) {
> preempt_schedule_common();
> return 1;
> + } else {
> + cond_resched_rcu_check();
> }
> return 0;
> }
> --
> Michal Hocko
> SUSE Labs
>
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to majordomo@kvack.org. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"dont@kvack.org"> email@kvack.org </a>
WARNING: multiple messages have this Message-ID (diff)
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Michal Hocko <mhocko@kernel.org>
Cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org, peterz@infradead.org
Subject: Re: Fw: [lkp-developer] [sched,rcu] cf7a2dca60: [No primary change] +186% will-it-scale.time.involuntary_context_switches
Date: Wed, 14 Dec 2016 08:48:27 -0800 [thread overview]
Message-ID: <20161214164827.GL3924@linux.vnet.ibm.com> (raw)
In-Reply-To: <20161214161540.GP25573@dhcp22.suse.cz>
On Wed, Dec 14, 2016 at 05:15:41PM +0100, Michal Hocko wrote:
> On Wed 14-12-16 03:06:09, Paul E. McKenney wrote:
> > On Wed, Dec 14, 2016 at 10:54:25AM +0100, Michal Hocko wrote:
> > > On Tue 13-12-16 07:14:08, Paul E. McKenney wrote:
> > > > Just FYI for the moment...
> > > >
> > > > So even with the slowed-down checking, making cond_resched() do what
> > > > cond_resched_rcu_qs() does results in a smallish but quite measurable
> > > > degradation according to 0day.
> > >
> > > So if I understand those results properly, the reason seems to be the
> > > increased involuntary context switches, right? Or am I misreading the
> > > data?
> > > I am looking at your "sched,rcu: Make cond_resched() provide RCU
> > > quiescent state" in linux-next and I am wondering whether rcu_all_qs has
> > > to be called unconditionally and not only when should_resched failed few
> > > times? I guess you have discussed that with Peter already but do not
> > > remember the outcome.
> >
> > My first thought is to wait for the grace period to age further before
> > checking, the idea being to avoid increasing cond_resched() overhead
> > any further. But if that doesn't work, then yes, I may have to look at
> > adding more checks to cond_resched().
>
> This might be really naive but would something like the following work?
> The overhead should be pretty much negligible, I guess. Ideally the pcp
> variable could be set somewhere from check_cpu_stall() but I couldn't
> wrap my head around that code to see how exactly.
My concern (perhaps misplaced) with this approach is that there are
quite a few tight loops containing cond_resched(). So I would still
need to throttle the resulting grace-period acceleration to keep the
context switches down to a dull roar.
Thanx, Paul
> ---
> diff --git a/include/linux/rcutiny.h b/include/linux/rcutiny.h
> index ac81e4063b40..1c005c5304a3 100644
> --- a/include/linux/rcutiny.h
> +++ b/include/linux/rcutiny.h
> @@ -243,6 +243,10 @@ static inline void rcu_all_qs(void)
> barrier(); /* Avoid RCU read-side critical sections leaking across. */
> }
>
> +static inline void cond_resched_rcu_check(void)
> +{
> +}
> +
> /* RCUtree hotplug events */
> #define rcutree_prepare_cpu NULL
> #define rcutree_online_cpu NULL
> diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
> index 63a4e4cf40a5..176f6e386379 100644
> --- a/include/linux/rcutree.h
> +++ b/include/linux/rcutree.h
> @@ -110,6 +110,18 @@ extern int rcu_scheduler_active __read_mostly;
> bool rcu_is_watching(void);
>
> void rcu_all_qs(void);
> +#ifndef CONFIG_PREEMPT
> +DECLARE_PER_CPU(int, rcu_needs_qs);
> +
> +static inline void cond_resched_rcu_check(void)
> +{
> + /* Make sure we do not miss rcu_all_qs at least every now and then */
> + if (this_cpu_inc_return(rcu_needs_qs) > 10) {
> + this_cpu_write(rcu_needs_qs, 0);
> + rcu_all_qs();
> + }
> +}
> +#endif
>
> /* RCUtree hotplug events */
> int rcutree_prepare_cpu(unsigned int cpu);
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 69a5611a7e7c..783c74ae9930 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -268,6 +268,9 @@ void rcu_bh_qs(void)
> }
>
> static DEFINE_PER_CPU(int, rcu_sched_qs_mask);
> +#ifndef CONFIG_PREEMPT
> +DEFINE_PER_CPU(int, rcu_needs_qs);
> +#endif
>
> static DEFINE_PER_CPU(struct rcu_dynticks, rcu_dynticks) = {
> .dynticks_nesting = DYNTICK_TASK_EXIT_IDLE,
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 154fd689fe02..a58844be2ef1 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -4905,6 +4905,8 @@ int __sched _cond_resched(void)
> if (should_resched(0)) {
> preempt_schedule_common();
> return 1;
> + } else {
> + cond_resched_rcu_check();
> }
> return 0;
> }
> --
> Michal Hocko
> SUSE Labs
>
next prev parent reply other threads:[~2016-12-14 16:48 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2016-12-13 15:14 Fw: [lkp-developer] [sched,rcu] cf7a2dca60: [No primary change] +186% will-it-scale.time.involuntary_context_switches Paul E. McKenney
2016-12-13 15:14 ` Paul E. McKenney
2016-12-14 9:54 ` Michal Hocko
2016-12-14 9:54 ` Michal Hocko
2016-12-14 11:06 ` Paul E. McKenney
2016-12-14 11:06 ` Paul E. McKenney
2016-12-14 16:15 ` Michal Hocko
2016-12-14 16:15 ` Michal Hocko
2016-12-14 16:48 ` Paul E. McKenney [this message]
2016-12-14 16:48 ` Paul E. McKenney
2016-12-14 17:39 ` Michal Hocko
2016-12-14 17:39 ` Michal Hocko
2017-01-04 0:55 ` Paul E. McKenney
2017-01-04 0:55 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20161214164827.GL3924@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=mhocko@kernel.org \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.