From: Mike Galbraith <efault@gmx.de>
To: Joel Fernandes <joelaf@google.com>
Cc: kernel test robot <xiaolong.ye@intel.com>,
LKML <linux-kernel@vger.kernel.org>,
Peter Zijlstra <peterz@infradead.org>,
Josef Bacik <jbacik@fb.com>, Juri Lelli <Juri.Lelli@arm.com>,
Brendan Jackman <brendan.jackman@arm.com>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Matt Fleming <matt@codeblueprint.co.uk>,
Rik van Riel <riel@redhat.com>, Ingo Molnar <mingo@redhat.com>,
lkp@01.org
Subject: Re: [lkp-robot] [sched/fair] 6d46bd3d97: netperf.Throughput_tps -11.3% regression
Date: Mon, 11 Sep 2017 10:03:25 +0200 [thread overview]
Message-ID: <1505117005.6062.68.camel@gmx.de> (raw)
In-Reply-To: <CAJWu+oryGEK7=_R2AU6HaFCJcuS1DjLPgjYGLD7c_gWmGpiO=Q@mail.gmail.com>
On Sun, 2017-09-10 at 23:32 -0700, Joel Fernandes wrote:
> Hi Mike,
> Thanks a lot for sharing the history of this.
>
> On Sun, Sep 10, 2017 at 7:55 PM, Mike Galbraith <efault@gmx.de> wrote:
> > On Sun, 2017-09-10 at 09:53 -0700, Joel Fernandes wrote:
> >>
> >> Anyone know what in the netperf test triggers use of the sync flag?
> >
> > homer:..kernel/linux-master # git grep wake_up_interruptible_sync_poll net
> > net/core/sock.c: wake_up_interruptible_sync_poll(&wq->wait, POLLIN | POLLPRI |
> > net/core/sock.c: wake_up_interruptible_sync_poll(&wq->wait, POLLOUT |
> > net/sctp/socket.c: wake_up_interruptible_sync_poll(&wq->wait, POLLIN |
> > net/smc/smc_rx.c: wake_up_interruptible_sync_poll(&wq->wait, POLLIN | POLLPRI |
> > net/tipc/socket.c: wake_up_interruptible_sync_poll(&wq->wait, POLLOUT |
> > net/tipc/socket.c: wake_up_interruptible_sync_poll(&wq->wait, POLLIN |
> > net/unix/af_unix.c: wake_up_interruptible_sync_poll(&wq->wait,
> > net/unix/af_unix.c: wake_up_interruptible_sync_poll(&u->peer_wait,
> >
> > The same as metric tons of other stuff.
> >
> > Once upon a time, we had avg_overlap to help decide whether to wake
> > core affine or not, on top of the wake_affine() imbalance constraint,
> > but instrumentation showed it to be too error prone, so it had to die.
> > These days, an affine wakeup generally means cache affine, and the
> > sync hint gives you a wee bit more chance of migration near to tasty
> > hot data being approved.
> >
> > The sync hint was born back in the bad old days, when communicating
> > tasks not sharing L2 may as well have been talking over two tin cans
> > and a limp string. These days, things are oodles better, but truly
> > synchronous stuff could still benefit from core affinity (up to hugely
> > for very fast/light stuff) if it weren't for all the caveats that can
> > lead to tossing concurrency opportunities out the window.
>
> Cool, thanks. For this test I suspect its the other way? I think the
> reason why regresses is that the 'nr_running < 2' check is too weak of
> a check to prevent sync in all bad situations ('bad' being pulling a
> task to a crowded CPU). Could we maybe be having a situation for this
> test where if the blocked load a CPU is high (many tasks recently were
> running on it and went to sleep), then the nr_running < 2 is a false
> positive and in such a scenario we listened to the sync flag when we
> shouldn't have?
nr_running has ~little to do with synchronous baton passing, or rather
with overlap being too small to overcome costs (L2 miss/cstate..). If
you run TCP_RR, pipe-test or ilk (tiny ball ping-pong), you should see
gorgeous numbers, and may find yourself thinking sync is a wonder
elixir for network-land, but 'taint necessarily so in the real world.
The only way to win the "Rob Peter to pay Paul" game is through the use
of knowledge, acquisition of which is expensive. Try resurrecting the
cheap avg_overlap, you'll love it.. for a little while.
> To make the load check more meaningful, I am thinking if using
> wake_affine()'s balance check is a better thing to do than the
> 'nr_running < 2' check I used in this patch. Then again, since commit
> 3fed382b46baac ("sched/numa: Implement NUMA node level wake_affine()",
> wake_affine() doesn't do balance check for CPUs within a socket so
> probably bringing back something like the *old* wake_affine that
> checked load between different CPUs within a socket is needed to avoid
> a potentially disastrous sync decision? The commit I refer to was
> added with the reason that select_idle_sibling was selecting cores
> anywhere within a socket, but with my patch we're more specifically
> selecting the waker's CPU on passing the sync flag. Could you share
> your thoughts about this?
Mucking about with wake_affine() in core affinity context would be a
step backward methinks. The sync hint has not meant CPU affine for
quite some time now.
If you can come up with a sync hint modification that is win/win,
super, but I suspect you'll find it a lot easier to create a new
'really really' flag or make your governor more responsive.
-Mike
next prev parent reply other threads:[~2017-09-11 8:04 UTC|newest]
Thread overview: 25+ messages / expand[flat|nested] mbox.gz Atom feed top
2017-08-27 1:02 [PATCH RFC/RFT] sched/fair: Improve the behavior of sync flag Joel Fernandes
2017-08-27 5:44 ` Mike Galbraith
2017-08-27 6:08 ` Mike Galbraith
2017-08-27 6:39 ` Joel Fernandes
2017-08-27 7:16 ` Mike Galbraith
2017-08-27 18:07 ` Mike Galbraith
2017-08-28 5:27 ` Joel Fernandes
2017-08-28 6:10 ` Mike Galbraith
2017-08-28 6:47 ` Mike Galbraith
2017-08-28 16:20 ` Joel Fernandes
2017-08-28 17:17 ` Mike Galbraith
2017-08-27 6:19 ` Joel Fernandes
[not found] ` <CAJWu+ooAPiuS+C7Gos4+8G9+DAvQL8X7=63D8U=yVLJkywbF7Q@mail.gmail.com>
2017-08-27 6:57 ` Mike Galbraith
2017-09-10 13:40 ` [lkp-robot] [sched/fair] 6d46bd3d97: netperf.Throughput_tps -11.3% regression kernel test robot
2017-09-10 16:53 ` Joel Fernandes
2017-09-11 2:55 ` Mike Galbraith
2017-09-11 6:32 ` Joel Fernandes
2017-09-11 8:03 ` Mike Galbraith [this message]
2017-09-14 15:56 ` Rik van Riel
2017-09-15 4:06 ` Mike Galbraith
2017-09-17 6:42 ` Joel Fernandes
2017-09-17 16:47 ` Mike Galbraith
2017-09-17 21:41 ` Joel Fernandes
2017-09-18 5:30 ` Mike Galbraith
2017-09-24 23:46 ` Joel Fernandes
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1505117005.6062.68.camel@gmx.de \
--to=efault@gmx.de \
--cc=Juri.Lelli@arm.com \
--cc=brendan.jackman@arm.com \
--cc=dietmar.eggemann@arm.com \
--cc=jbacik@fb.com \
--cc=joelaf@google.com \
--cc=linux-kernel@vger.kernel.org \
--cc=lkp@01.org \
--cc=matt@codeblueprint.co.uk \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=riel@redhat.com \
--cc=xiaolong.ye@intel.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox