From: Matt Fleming <matt@codeblueprint.co.uk>
To: Atish Patra <atish.patra@oracle.com>
Cc: linux-kernel@vger.kernel.org, joelaf@google.com, jbacik@fb.com,
mingo@redhat.com, peterz@infradead.org, efault@gmx.de,
urezki@gmail.com
Subject: Re: [PATCH RFC v2] sched: Minimize the idle cpu selection race window.
Date: Wed, 7 Feb 2018 13:59:02 +0000
Message-ID: <20180207135902.GA3505@codeblueprint.co.uk>
In-Reply-To: <1512500947-24444-2-git-send-email-atish.patra@oracle.com>
On Tue, 05 Dec, at 01:09:07PM, Atish Patra wrote:
> Currently, multiple tasks can wake up on the same cpu from the
> select_idle_sibling() path if they wake up simultaneously
> and last ran on the same llc. This happens because an idle cpu
> is not updated until the idle task is scheduled out. Any task waking
> during that period may potentially select that cpu as a wakeup
> candidate.
>
> Introduce a per cpu variable that is set as soon as a cpu is
> selected for wakeup for any task. This prevents other tasks
> from selecting the same cpu again. Note: This does not close the race
> window but minimizes it to accessing the per-cpu variable. If two
> wakee tasks access the per cpu variable at the same time, they may
> select the same cpu again. But it minimizes the race window
> considerably.
>
> Here are some performance numbers:
I ran this patch through some tests here on the SUSE performance grid
and there's a definite regression for Mike's personal favourite
benchmark, tbench.
Here are the results: vanilla 4.15-rc9 on the left, -rc9 plus this
patch on the right.

tbench4
                         4.15.0-rc9          4.15.0-rc9
                            vanilla   sched-minimize-idle-cpu-window
Min   mb/sec-1      484.50 ( 0.00%)     463.03 (-4.43%)
Min   mb/sec-2      961.43 ( 0.00%)     959.35 (-0.22%)
Min   mb/sec-4     1789.60 ( 0.00%)    1760.21 (-1.64%)
Min   mb/sec-8     3518.51 ( 0.00%)    3471.47 (-1.34%)
Min   mb/sec-16    5521.12 ( 0.00%)    5409.77 (-2.02%)
Min   mb/sec-32    7268.61 ( 0.00%)    7491.29 ( 3.06%)
Min   mb/sec-64   14413.45 ( 0.00%)   14347.69 (-0.46%)
Min   mb/sec-128  13501.84 ( 0.00%)   13413.82 (-0.65%)
Min   mb/sec-192  13237.02 ( 0.00%)   13231.43 (-0.04%)
Hmean mb/sec-1      505.20 ( 0.00%)     485.81 (-3.84%)
Hmean mb/sec-2      973.12 ( 0.00%)     970.67 (-0.25%)
Hmean mb/sec-4     1835.22 ( 0.00%)    1788.54 (-2.54%)
Hmean mb/sec-8     3529.35 ( 0.00%)    3487.20 (-1.19%)
Hmean mb/sec-16    5531.16 ( 0.00%)    5437.43 (-1.69%)
Hmean mb/sec-32    7627.96 ( 0.00%)    8021.26 ( 5.16%)
Hmean mb/sec-64   14441.20 ( 0.00%)   14395.08 (-0.32%)
Hmean mb/sec-128  13620.40 ( 0.00%)   13569.17 (-0.38%)
Hmean mb/sec-192  13265.26 ( 0.00%)   13263.98 (-0.01%)
Max   mb/sec-1      510.30 ( 0.00%)     489.89 (-4.00%)
Max   mb/sec-2      989.45 ( 0.00%)     976.10 (-1.35%)
Max   mb/sec-4     1845.65 ( 0.00%)    1795.50 (-2.72%)
Max   mb/sec-8     3574.03 ( 0.00%)    3547.56 (-0.74%)
Max   mb/sec-16    5556.99 ( 0.00%)    5564.80 ( 0.14%)
Max   mb/sec-32    7678.18 ( 0.00%)    8098.63 ( 5.48%)
Max   mb/sec-64   14463.07 ( 0.00%)   14437.58 (-0.18%)
Max   mb/sec-128  13659.67 ( 0.00%)   13602.65 (-0.42%)
Max   mb/sec-192  13612.01 ( 0.00%)   13832.98 ( 1.62%)

There's a nice little performance bump around the 32-client mark.
Incidentally, my test machine has 2 NUMA nodes with 24 cpus (12 cores,
2 threads) each. So 32 clients is the point at which things no longer
fit on a single node.
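
For anyone reading the tbench table, the percentages appear to be a plain relative difference against the vanilla column, with positive meaning more throughput. A minimal sketch under that assumption (the function name `throughput_delta` is made up for illustration and is not part of mmtests):

```python
def throughput_delta(baseline: float, patched: float) -> float:
    """Relative change in percent versus the baseline.

    Assumption: higher mb/sec is better, so a positive result is a gain.
    """
    return round((patched - baseline) / baseline * 100.0, 2)


# Hmean mb/sec-1 and Hmean mb/sec-32 rows from the tbench table
print(throughput_delta(505.20, 485.81))
print(throughput_delta(7627.96, 8021.26))
```

The two calls reproduce the -3.84% and 5.16% figures reported for the 1-client and 32-client Hmean rows.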
It doesn't look like the regression is caused by the schedule() path
being slightly longer (i.e. it's not a latency issue) because schbench
results show improvements for the low-end:

schbench
                             4.15.0-rc9          4.15.0-rc9
                                vanilla   sched-minimize-idle-cpu-window
Lat 50.00th-qrtle-1      46.00 ( 0.00%)      36.00 (21.74%)
Lat 75.00th-qrtle-1      49.00 ( 0.00%)      37.00 (24.49%)
Lat 90.00th-qrtle-1      52.00 ( 0.00%)      38.00 (26.92%)
Lat 95.00th-qrtle-1      56.00 ( 0.00%)      41.00 (26.79%)
Lat 99.00th-qrtle-1      61.00 ( 0.00%)      46.00 (24.59%)
Lat 99.50th-qrtle-1      63.00 ( 0.00%)      48.00 (23.81%)
Lat 99.90th-qrtle-1      77.00 ( 0.00%)      64.00 (16.88%)
Lat 50.00th-qrtle-2      41.00 ( 0.00%)      41.00 ( 0.00%)
Lat 75.00th-qrtle-2      47.00 ( 0.00%)      46.00 ( 2.13%)
Lat 90.00th-qrtle-2      50.00 ( 0.00%)      49.00 ( 2.00%)
Lat 95.00th-qrtle-2      53.00 ( 0.00%)      52.00 ( 1.89%)
Lat 99.00th-qrtle-2      58.00 ( 0.00%)      57.00 ( 1.72%)
Lat 99.50th-qrtle-2      60.00 ( 0.00%)      59.00 ( 1.67%)
Lat 99.90th-qrtle-2      72.00 ( 0.00%)      69.00 ( 4.17%)
Lat 50.00th-qrtle-4      46.00 ( 0.00%)      45.00 ( 2.17%)
Lat 75.00th-qrtle-4      49.00 ( 0.00%)      48.00 ( 2.04%)
Lat 90.00th-qrtle-4      52.00 ( 0.00%)      51.00 ( 1.92%)
Lat 95.00th-qrtle-4      55.00 ( 0.00%)      53.00 ( 3.64%)
Lat 99.00th-qrtle-4      61.00 ( 0.00%)      59.00 ( 3.28%)
Lat 99.50th-qrtle-4      63.00 ( 0.00%)      61.00 ( 3.17%)
Lat 99.90th-qrtle-4      69.00 ( 0.00%)      74.00 (-7.25%)
Lat 50.00th-qrtle-8      48.00 ( 0.00%)      50.00 (-4.17%)
Lat 75.00th-qrtle-8      52.00 ( 0.00%)      54.00 (-3.85%)
Lat 90.00th-qrtle-8      54.00 ( 0.00%)      58.00 (-7.41%)
Lat 95.00th-qrtle-8      57.00 ( 0.00%)      61.00 (-7.02%)
Lat 99.00th-qrtle-8      64.00 ( 0.00%)      68.00 (-6.25%)
Lat 99.50th-qrtle-8      67.00 ( 0.00%)      72.00 (-7.46%)
Lat 99.90th-qrtle-8      81.00 ( 0.00%)      81.00 ( 0.00%)
Lat 50.00th-qrtle-16     50.00 ( 0.00%)      47.00 ( 6.00%)
Lat 75.00th-qrtle-16     59.00 ( 0.00%)      57.00 ( 3.39%)
Lat 90.00th-qrtle-16     66.00 ( 0.00%)      65.00 ( 1.52%)
Lat 95.00th-qrtle-16     69.00 ( 0.00%)      68.00 ( 1.45%)
Lat 99.00th-qrtle-16     76.00 ( 0.00%)      75.00 ( 1.32%)
Lat 99.50th-qrtle-16     79.00 ( 0.00%)      79.00 ( 0.00%)
Lat 99.90th-qrtle-16     86.00 ( 0.00%)      89.00 (-3.49%)
Lat 50.00th-qrtle-23     52.00 ( 0.00%)      52.00 ( 0.00%)
Lat 75.00th-qrtle-23     65.00 ( 0.00%)      65.00 ( 0.00%)
Lat 90.00th-qrtle-23     75.00 ( 0.00%)      74.00 ( 1.33%)
Lat 95.00th-qrtle-23     81.00 ( 0.00%)      79.00 ( 2.47%)
Lat 99.00th-qrtle-23     95.00 ( 0.00%)      90.00 ( 5.26%)
Lat 99.50th-qrtle-23  12624.00 ( 0.00%)    1050.00 (91.68%)
Lat 99.90th-qrtle-23  15184.00 ( 0.00%)   13872.00 ( 8.64%)

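
Note that the sign convention in the schbench table is the opposite of the tbench one: lower latency is better, so a positive percentage means the patched kernel was faster. A minimal sketch, assuming the figures are a relative difference against vanilla with the sign flipped (the function name `latency_gain` is made up for illustration and is not part of mmtests):

```python
def latency_gain(baseline: float, patched: float) -> float:
    """Relative latency reduction in percent versus the baseline.

    Assumption: lower latency is better, so the sign is flipped and a
    positive result means the patched kernel improved on vanilla.
    """
    return round((baseline - patched) / baseline * 100.0, 2)


# 50.00th-qrtle-1, 99.90th-qrtle-4 and 99.50th-qrtle-23 rows from the table
print(latency_gain(46.00, 36.00))
print(latency_gain(69.00, 74.00))
print(latency_gain(12624.00, 1050.00))
```

These reproduce the 21.74%, -7.25% and 91.68% entries, which suggests the large 99.50th-qrtle-23 figure is the jump from 12624 to 1050 and not a formatting glitch.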
If you'd like to run these tests on your own machines they're all
available at https://github.com/gormanm/mmtests.git.