From: Jacob Pan <jacob.jun.pan@linux.intel.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>,
LKML <linux-kernel@vger.kernel.org>,
Arjan van de Ven <arjan@linux.intel.com>,
Paul Turner <pjt@google.com>, Len Brown <len.brown@intel.com>,
Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
Tim Chen <tim.c.chen@linux.intel.com>,
Andi Kleen <andi.kleen@intel.com>,
Rafael Wysocki <rafael.j.wysocki@intel.com>,
jacob.jun.pan@linux.intel.com
Subject: Re: [RFC PATCH 3/3] sched: introduce synchronized idle injection
Date: Thu, 5 Nov 2015 15:36:25 -0800 [thread overview]
Message-ID: <20151105153625.25fbfe69@icelake> (raw)
In-Reply-To: <20151105135952.GE11639@twins.programming.kicks-ass.net>
On Thu, 5 Nov 2015 14:59:52 +0100
Peter Zijlstra <peterz@infradead.org> wrote:
> On Tue, Nov 03, 2015 at 02:31:20PM +0100, Peter Zijlstra wrote:
> > > @@ -5136,6 +5148,16 @@ pick_next_task_fair(struct rq *rq, struct
> > > task_struct *prev) struct task_struct *p;
> > > int new_tasks;
> > >
> > > +#ifdef CONFIG_CFS_IDLE_INJECT
> > > + if (cfs_rq->force_throttled &&
> > > + !idle_cpu(cpu_of(rq)) &&
> > > + !unlikely(local_softirq_pending())) {
> > > + /* forced idle, pick no task */
> > > + trace_sched_cfs_idle_inject(cpu_of(rq), 1);
> > > + update_curr(cfs_rq);
> > > + return NULL;
> > > + }
> > > +#endif
> > > again:
> > > #ifdef CONFIG_FAIR_GROUP_SCHED
> > > if (!cfs_rq->nr_running)
> >
> > So this is horrible...
>
> So this isn't ideal either (I rather liked the previous approach of a
> random task assuming idle, but tglx hated that). This should at least
> not touch extra cachelines in the hot paths, although it does add a
> few extra instructions :/
>
> Very limited testing didn't show anything horrible.
>
I did some testing with the code below, it shows random
[ 150.442597] NOHZ: local_softirq_pending 02
[ 153.032673] NOHZ: local_softirq_pending 202
[ 153.203785] NOHZ: local_softirq_pending 202
[ 153.206486] NOHZ: local_softirq_pending 282
I recalled that was why i checked for local_softirq_pending in the
initial patch, still trying to find out how we can avoid that. These
also causes non stop sched ticks in the inner idle loop.
> Your throttle would:
>
> raw_spin_lock_irqsave(&rq->lock, flags);
> rq->cfs.forced_idle = true;
> resched = rq->cfs.runnable;
> rq->cfs.runnable = false;
> raw_spin_unlock_irqrestore(&rq->lock, flags);
> if (resched)
> resched_cpu(cpu_of(rq));
>
> And your unthrottle:
>
> raw_spin_lock_irqsave(&rq->lock, flags);
> rq->cfs.forced_idle = false;
> resched = rq->cfs.runnable = !!rq->cfs.nr_running;
> raw_spin_unlock_irqrestore(&rq->lock, flags);
> if (resched)
> resched_cpu(cpu_of(rq));
>
> ---
> kernel/sched/fair.c | 13 +++++++++----
> kernel/sched/sched.h | 1 +
> 2 files changed, 10 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> index 824aa9f..1f0c809 100644
> --- a/kernel/sched/fair.c
> +++ b/kernel/sched/fair.c
> @@ -2341,7 +2341,8 @@ account_entity_enqueue(struct cfs_rq *cfs_rq,
> struct sched_entity *se) list_add(&se->group_node, &rq->cfs_tasks);
> }
> #endif
> - cfs_rq->nr_running++;
> + if (!cfs_rq->nr_running++ && !cfs_rq->forced_idle)
> + cfs_rq->runnable = true;
> }
>
> static void
> @@ -2354,7 +2355,8 @@ account_entity_dequeue(struct cfs_rq *cfs_rq,
> struct sched_entity *se) account_numa_dequeue(rq_of(cfs_rq),
> task_of(se)); list_del_init(&se->group_node);
> }
> - cfs_rq->nr_running--;
> + if (!--cfs_rq->nr_running && !cfs_rq->forced_idle)
> + cfs_rq->runnable = false;
> }
>
> #ifdef CONFIG_FAIR_GROUP_SCHED
> @@ -5204,7 +5206,7 @@ pick_next_task_fair(struct rq *rq, struct
> task_struct *prev)
> again:
> #ifdef CONFIG_FAIR_GROUP_SCHED
> - if (!cfs_rq->nr_running)
> + if (!cfs_rq->runnable)
> goto idle;
>
> if (prev->sched_class != &fair_sched_class)
> @@ -5283,7 +5285,7 @@ simple:
> cfs_rq = &rq->cfs;
> #endif
>
> - if (!cfs_rq->nr_running)
> + if (!cfs_rq->runnable)
> goto idle;
>
> put_prev_task(rq, prev);
> @@ -5302,6 +5304,9 @@ simple:
> return p;
>
> idle:
> + if (cfs_rq->forced_idle)
> + return NULL;
> +
> /*
> * This is OK, because current is on_cpu, which avoids it
> being picked
> * for load-balance and preemption/IRQs are still disabled
> avoiding diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index efd3bfc..33d355d 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -347,6 +347,7 @@ struct cfs_bandwidth { };
> struct cfs_rq {
> struct load_weight load;
> unsigned int nr_running, h_nr_running;
> + unsigned int runnable, forced_idle;
>
> u64 exec_clock;
> u64 min_vruntime;
[Jacob Pan]
next prev parent reply other threads:[~2015-11-05 23:37 UTC|newest]
Thread overview: 43+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-11-03 0:10 [RFC PATCH 0/3] CFS idle injection Jacob Pan
2015-11-03 0:10 ` [RFC PATCH 1/3] ktime: add a roundup function Jacob Pan
2015-11-03 0:10 ` [RFC PATCH 2/3] timer: relax tick stop in idle entry Jacob Pan
2015-11-03 0:10 ` [RFC PATCH 3/3] sched: introduce synchronized idle injection Jacob Pan
2015-11-03 13:31 ` Peter Zijlstra
2015-11-03 14:16 ` Jacob Pan
2015-11-03 16:45 ` Jacob Pan
2015-11-05 10:09 ` Peter Zijlstra
2015-11-05 14:22 ` Arjan van de Ven
2015-11-05 14:33 ` Peter Zijlstra
2015-11-05 14:48 ` Peter Zijlstra
2015-11-05 15:28 ` Arjan van de Ven
2015-11-05 16:52 ` Peter Zijlstra
2015-11-05 18:55 ` Thomas Gleixner
2015-11-05 18:47 ` Thomas Gleixner
2015-11-05 15:32 ` Jacob Pan
2015-11-05 16:06 ` Arjan van de Ven
2015-11-05 19:27 ` Jacob Pan
2015-11-05 19:32 ` Jacob Pan
2015-11-05 13:59 ` Peter Zijlstra
2015-11-05 23:36 ` Jacob Pan [this message]
2015-11-06 7:45 ` Peter Zijlstra
2015-11-06 23:49 ` Jacob Pan
2015-11-10 0:19 ` Jacob Pan
2015-11-04 6:06 ` [RFC PATCH 0/3] CFS " Eduardo Valentin
2015-11-04 16:58 ` Jacob Pan
2015-11-04 17:05 ` Srinivas Pandruvada
2015-11-04 18:43 ` Eduardo Valentin
2015-11-05 10:12 ` Peter Zijlstra
2015-11-06 16:50 ` Punit Agrawal
2015-11-06 19:18 ` Jacob Pan
2015-11-09 11:56 ` Punit Agrawal
2015-11-09 14:15 ` Peter Zijlstra
2015-11-09 14:43 ` Jacob Pan
2015-11-10 10:07 ` Juri Lelli
2015-11-10 10:34 ` Peter Zijlstra
2015-11-10 10:56 ` Juri Lelli
2015-11-09 14:36 ` Jacob Pan
2015-11-06 18:30 ` Dietmar Eggemann
2015-11-06 19:10 ` Jacob Pan
2015-11-06 21:55 ` Dietmar Eggemann
2015-11-09 21:23 ` Jacob Pan
2015-11-09 21:45 ` Peter Zijlstra
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20151105153625.25fbfe69@icelake \
--to=jacob.jun.pan@linux.intel.com \
--cc=andi.kleen@intel.com \
--cc=arjan@linux.intel.com \
--cc=len.brown@intel.com \
--cc=linux-kernel@vger.kernel.org \
--cc=peterz@infradead.org \
--cc=pjt@google.com \
--cc=rafael.j.wysocki@intel.com \
--cc=srinivas.pandruvada@linux.intel.com \
--cc=tglx@linutronix.de \
--cc=tim.c.chen@linux.intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.