All of lore.kernel.org
 help / color / mirror / Atom feed
* [RFC] Scheduler: DMA Engine regression because of sched/fair changes
@ 2022-01-12 15:26 Alexander Fomichev
  2022-01-12 17:05 ` Mel Gorman
  2022-01-16  9:55 ` [RFC] Scheduler: DMA Engine regression because of sched/fair changes Thorsten Leemhuis
  0 siblings, 2 replies; 12+ messages in thread
From: Alexander Fomichev @ 2022-01-12 15:26 UTC (permalink / raw)
  To: linux-kernel, dmaengine; +Cc: Mel Gorman, linux

CC: Mel Gorman <mgorman@suse.de>
CC: linux@yadro.com

Hi all,

There's a huge regression found, which affects Intel Xeon's DMA Engine
performance between v4.14 LTS and modern kernels. In certain
circumstances the speed in dmatest is more than 6 times lower.

	- Hardware -
I did testing on 2 systems:
1) Intel(R) Xeon(R) Gold 6132 CPU @ 2.60GHz (Supermicro X11DAi-N)
2) Intel(R) Xeon(R) Bronze 3204 CPU @ 1.90GHz (YADRO Vegman S220)

	- Measurement -
The dmatest result speed decreases with almost any test settings.
Although the most significant impact is revealed with 64K transfers. The
following parameters were used:

modprobe dmatest iterations=1000 timeout=2000 test_buf_size=0x100000 transfer_size=0x10000 norandom=1
echo "dma0chan0" > /sys/module/dmatest/parameters/channel
echo 1 > /sys/module/dmatest/parameters/run

Every test csse was performed at least 3 times. All detailed results are
below.

	- Analysis -
Bisecting revealed 2 different bad commits for those 2 systems, but both
change the same function/condition in the same file.
For the system (1) the bad commit is:
[7332dec055f2457c386032f7e9b2991eb05c2a0a] sched/fair: Only immediately migrate tasks due to interrupts if prev and target CPUs share cache
For the system (2) the bad commit is:
[806486c377e33ab662de6d47902e9e2a32b79368] sched/fair: Do not migrate if the prev_cpu is idle

	- Additional check -
Attempting to revert the changes above, a dirty patch for the (current)
kernel v5.16.0-rc5 was tested too:

diff --git a/kernel/sched/fair.c b/kernel/sched/fair.c
index 6f16dfb74246..0a58cc00b1b8 100644
--- a/kernel/sched/fair.c
+++ b/kernel/sched/fair.c
@@ -5931,8 +5931,8 @@ wake_affine_idle(int this_cpu, int prev_cpu, int sync)
         * a cpufreq perspective, it's better to have higher utilisation
         * on one CPU.
         */
-       if (available_idle_cpu(this_cpu) && cpus_share_cache(this_cpu, prev_cpu))
-               return available_idle_cpu(prev_cpu) ? prev_cpu : this_cpu;
+       if (available_idle_cpu(this_cpu))
+               return this_cpu;

        if (sync && cpu_rq(this_cpu)->nr_running == 1)
                return this_cpu;

Please, take a look if this makes sense. But with this patch applied the
performance of DMA Engine restores.

	- Dmatest results TL;DR -

System (1) before bad commit:
---------------------
[  519.894642] dmatest: Added 1 threads using dma0chan0
[  525.383021] dmatest: Started 1 threads using dma0chan0
[  528.521915] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 98367.10 iops 6295494 KB/s (0)
[  544.851751] dmatest: Added 1 threads using dma0chan0
[  546.460064] dmatest: Started 1 threads using dma0chan0
[  549.609504] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 100310.96 iops 6419901 KB/s (0)
[  562.178365] dmatest: Added 1 threads using dma0chan0
[  563.852534] dmatest: Started 1 threads using dma0chan0
[  567.004898] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 98580.44 iops 6309148 KB/s (0)
---------------------

System (1) on HEAD=bad commit:
---------------------
[  149.555401] dmatest: Added 1 threads using dma0chan0
[  154.162444] dmatest: Started 1 threads using dma0chan0
[  157.490868] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 26653.87 iops 1705847 KB/s (0)
[  176.783450] dmatest: Added 1 threads using dma0chan0
[  178.428518] dmatest: Started 1 threads using dma0chan0
[  181.606531] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 14194.86 iops 908471 KB/s (0)
[  192.125218] dmatest: Added 1 threads using dma0chan0
[  194.060029] dmatest: Started 1 threads using dma0chan0
[  197.235265] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 14757.09 iops 944454 KB/s (0)
---------------------

Systen (1) on v5.16.0-rc5:
---------------------
[ 1430.860170] dmatest: Added 1 threads using dma0chan0
[ 1437.367447] dmatest: Started 1 threads using dma0chan0
[ 1442.756660] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 24837.31 iops 1589588 KB/s (0)
[ 1561.614191] dmatest: Added 1 threads using dma0chan0
[ 1562.816375] dmatest: Started 1 threads using dma0chan0
[ 1566.619614] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 13666.05 iops 874627 KB/s (0)
[ 1585.019601] dmatest: Added 1 threads using dma0chan0
[ 1587.585741] dmatest: Started 1 threads using dma0chan0
[ 1591.386816] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 13521.91 iops 865402 KB/s (0)
---------------------

System (1) on v5.16.0-rc5 with dirty patch:
---------------------
[  733.571508] dmatest: Added 1 threads using dma0chan0
[  746.050800] dmatest: Started 1 threads using dma0chan0
[  749.765600] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 87260.03 iops 5584642 KB/s (0)
[  915.051955] dmatest: Added 1 threads using dma0chan0
[  916.550732] dmatest: Started 1 threads using dma0chan0
[  920.267525] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 88464.25 iops 5661712 KB/s (0)
[  936.781273] dmatest: Added 1 threads using dma0chan0
[  939.528616] dmatest: Started 1 threads using dma0chan0
[  943.247694] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 88833.61 iops 5685351 KB/s (0)
---------------------

System (2) before bad commit:
---------------------
[  481.309411] dmatest: Added 1 threads using dma0chan0
[  491.197425] dmatest: Started 1 threads using dma0chan0
[  497.047315] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 78988.94 iops 5055292 KB/s (0)
[  506.057101] dmatest: Added 1 threads using dma0chan0
[  508.939426] dmatest: Started 1 threads using dma0chan0
[  514.788823] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 77754.44 iops 4976284 KB/s (0)
[  531.894587] dmatest: Added 1 threads using dma0chan0
[  534.053360] dmatest: Started 1 threads using dma0chan0
[  539.906424] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 76988.21 iops 4927246 KB/s (0)
---------------------

System (2) on HEAD=bad commit:
---------------------
[44522.892995] dmatest: Added 1 threads using dma0chan0
[44526.193331] dmatest: Started 1 threads using dma0chan0
[44532.043932] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 80360.01 iops 5143040 KB/s (0)
[44561.121118] dmatest: Added 1 threads using dma0chan0
[44562.868428] dmatest: Started 1 threads using dma0chan0
[44568.808577] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 16080.53 iops 1029154 KB/s (0)
[44728.597409] dmatest: Added 1 threads using dma0chan0
[44730.301566] dmatest: Started 1 threads using dma0chan0
[44736.259009] dmatest: dma0chan0-copy0: summary 1000 tests, 0 failures 16091.91 iops 1029882 KB/s (0)
---------------------

Thanks for reading.

-- 
Regards,
  Alexander Fomichev

^ permalink raw reply related	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2022-03-06 11:19 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2022-01-12 15:26 [RFC] Scheduler: DMA Engine regression because of sched/fair changes Alexander Fomichev
2022-01-12 17:05 ` Mel Gorman
2022-01-17  8:19   ` Alexander Fomichev
2022-01-17 10:27     ` Mel Gorman
2022-01-17 17:44       ` Alexander Fomichev
     [not found]       ` <20220118020448.2399-1-hdanton@sina.com>
2022-01-18 10:05         ` Mel Gorman
2022-01-19 12:55         ` Alexander Fomichev
     [not found]         ` <20220121101217.2849-1-hdanton@sina.com>
2022-01-21 13:46           ` Alexander Fomichev
     [not found]           ` <20220122233314.2999-1-hdanton@sina.com>
2022-01-28 16:50             ` Alexander Fomichev
2022-02-23 15:24               ` Thorsten Leemhuis
2022-03-06 11:19                 ` [RFC] Scheduler: DMA Engine regression because of sched/fair changes #forregzbot Thorsten Leemhuis
2022-01-16  9:55 ` [RFC] Scheduler: DMA Engine regression because of sched/fair changes Thorsten Leemhuis

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.