From: Mel Gorman <mgorman@techsingularity.net>
To: Hillf Danton <hdanton@sina.com>
Cc: Peter Zijlstra <peterz@infradead.org>,
Dave Chinner <david@fromorbit.com>,
Ingo Molnar <mingo@redhat.com>, Tejun Heo <tj@kernel.org>,
Vincent Guittot <vincent.guittot@linaro.org>,
linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched, fair: Allow a per-cpu kthread waking a task to stack on the same CPU
Date: Tue, 28 Jan 2020 10:32:16 +0000 [thread overview]
Message-ID: <20200128103216.GA3466@techsingularity.net> (raw)
In-Reply-To: <20200128100643.3016-1-hdanton@sina.com>
On Tue, Jan 28, 2020 at 06:06:43PM +0800, Hillf Danton wrote:
>
> On Mon, 27 Jan 2020 14:36:08 +0000 Mel Gorman wrote:
> > Commit 8ab39f11d974 ("xfs: prevent CIL push holdoff in log
> > recovery") changed from using bound workqueues to using unbound
> > workqueues. Functionally this makes sense but it was observed at the time
> > that the dbench performance dropped quite a lot and CPU migrations were
> > excessively high even when there are plenty of idle CPUs.
> >
> > The pattern of the task migration is straight-forward. With XFS, an IO
> > issuer may delegate work to a kworker which wakes on the same CPU. On
> > completion of the work, it wakes the task, finds that the previous CPU
> > is busy (because the kworker is still running on it) and migrates the
> > task to the next idle CPU. The task ends up migrating around all CPUs
> > sharing a LLC at high frequency. This has negative implications both in
> > commication costs and power management. mpstat confirmed that at low
> > thread counts that all CPUs sharing an LLC has low level of activity.
> >
> > The impact of this problem is related to the number of CPUs sharing an LLC.
> >
>
> Are you trying to fix a problem of cache affinity?
>
No, I'm simply stating that the search space for select_idle_sibling
matters.
> > diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
> > index fe4e0d775375..76df439aff76 100644
> > --- a/kernel/sched/fair.c
> > +++ b/kernel/sched/fair.c
> > @@ -5912,6 +5912,19 @@ static int select_idle_sibling(struct task_struct *p, int prev, int target)
> > (available_idle_cpu(prev) || sched_idle_cpu(prev)))
> > return prev;
> >
> > + /*
> > + * Allow a per-cpu kthread to stack with the wakee if the
> > + * kworker thread and the tasks previous CPU are the same.
> > + * The assumption is that the wakee queued work for the
> > + * per-cpu kthread that is now complete and the wakeup is
> > + * essentially a sync wakeup.
> > + */
> > + if (is_per_cpu_kthread(current) &&
> > + prev == smp_processor_id() &&
>
> Looks like cache affinity is not your target.
It's not, at least not LLC. L1 is a partial consideration.
> Wondering why it does not work to select a cpu sharing cache with prev
> if strong relation exists between waker and wakee.
>
Functionally it works, it's just slow. There is a cost to migration and
a cost to exiting from idle state and ramping up the CPU frequency.
> > + this_rq()->nr_running <= 1) {
> > + return prev;
> > + }
> > +
> > /* Check a recently used CPU as a potential idle candidate: */
> > recent_used_cpu = p->recent_used_cpu;
> > if (recent_used_cpu != prev &&
> > diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> > index 1a88dc8ad11b..5876e6ba5903 100644
> > --- a/kernel/sched/sched.h
> > +++ b/kernel/sched/sched.h
> > @@ -2479,3 +2479,16 @@ static inline void membarrier_switch_mm(struct rq *rq,
> > {
> > }
> > #endif
> > +
> > +#ifdef CONFIG_SMP
> > +static inline bool is_per_cpu_kthread(struct task_struct *p)
> > +{
> > + if (!(p->flags & PF_KTHREAD))
> > + return false;
>
> Suspect you need PF_KTHREAD instead of PF_WQ_WORKER. Here is a
> small helper and feel free to pick it up if it makes a sense.
>
Did you mean to switch that around? Either way, I moved an existing helper
that is already used to detect this particular situation. While it works
when it's made specific to a workqueue and open-coding it, there was no
clear reason to narrow the conditions further.
--
Mel Gorman
SUSE Labs
next parent reply other threads:[~2020-01-28 10:32 UTC|newest]
Thread overview: 15+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <20200128100643.3016-1-hdanton@sina.com>
2020-01-28 10:32 ` Mel Gorman [this message]
[not found] ` <20200128130837.11136-1-hdanton@sina.com>
2020-01-28 13:41 ` [PATCH] sched, fair: Allow a per-cpu kthread waking a task to stack on the same CPU Mel Gorman
2020-01-27 14:36 Mel Gorman
2020-01-27 22:32 ` Dave Chinner
2020-01-28 1:19 ` Mel Gorman
2020-01-28 9:10 ` Mel Gorman
2020-01-29 17:38 ` Peter Zijlstra
2020-01-29 22:00 ` Dave Chinner
2020-01-30 0:50 ` Mel Gorman
2020-01-30 0:43 ` Mel Gorman
2020-01-30 8:06 ` Peter Zijlstra
2020-01-30 8:55 ` Mel Gorman
2020-01-28 14:24 ` Mel Gorman
2020-01-28 22:21 ` Dave Chinner
2020-01-29 10:53 ` Mel Gorman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20200128103216.GA3466@techsingularity.net \
--to=mgorman@techsingularity.net \
--cc=david@fromorbit.com \
--cc=hdanton@sina.com \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=tj@kernel.org \
--cc=vincent.guittot@linaro.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.