linux-pm.vger.kernel.org archive mirror
* [Bug 215135] New: proposed cpufreq driver amd-pstate regresses wrt acpi-cpufreq on some AMD EPYC Zen3
@ 2021-11-25 13:41 bugzilla-daemon
  2021-11-25 13:47 ` [Bug 215135] " bugzilla-daemon
                   ` (2 more replies)
  0 siblings, 3 replies; 4+ messages in thread
From: bugzilla-daemon @ 2021-11-25 13:41 UTC
  To: linux-pm

https://bugzilla.kernel.org/show_bug.cgi?id=215135

            Bug ID: 215135
           Summary: proposed cpufreq driver amd-pstate regresses wrt
                    acpi-cpufreq on some AMD EPYC Zen3
           Product: Power Management
           Version: 2.5
    Kernel Version: not merged
          Hardware: All
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: cpufreq
          Assignee: linux-pm@vger.kernel.org
          Reporter: ggherdovich@suse.cz
        Regression: No

Reference: 
https://lore.kernel.org/lkml/20211029130241.1984459-2-ray.huang@amd.com/
[PATCH v3 00/21] cpufreq: introduce a new AMD CPU frequency control mechanism

Note: this is not-yet-merged code. This bugzilla entry is to track progress in
the performance optimization of the "amd-pstate" cpufreq driver, proposed in
the patchset linked above. The bug should be assigned to the patch author,
Huang Rui <ray.huang@amd.com>.

I've tested this driver, and the results are somewhat underwhelming.
The test machine is a two-socket server with two AMD EPYC 7713 processors
(family:model:stepping 25:1:1), 128 cores/256 threads, 256G of memory and SSD
storage. On this system, the amd-pstate driver works only in "shared memory
support" mode, not in "full MSR support" mode, meaning that frequency switches
are triggered from a workqueue instead of scheduler context (!fast_switch).
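
When reproducing these comparisons, it helps to confirm which driver/governor
pair a run actually used. A minimal sketch, assuming the standard cpufreq
sysfs attributes scaling_driver and scaling_governor are present; the helper
name read_cpufreq_policy is hypothetical, and this does not tell you whether
the driver is in shared-memory or MSR mode:

```python
from pathlib import Path


def read_cpufreq_policy(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """Return (driver, governor) for one CPU from the cpufreq sysfs files.

    sysfs_root is a parameter only so the function can be pointed at a
    fake directory tree for testing; on a real system the default applies.
    """
    policy_dir = Path(sysfs_root) / f"cpu{cpu}" / "cpufreq"
    driver = (policy_dir / "scaling_driver").read_text().strip()
    governor = (policy_dir / "scaling_governor").read_text().strip()
    return driver, governor


if __name__ == "__main__":
    try:
        driver, governor = read_cpufreq_policy()
        print(f"driver={driver} governor={governor}")
    except OSError as err:
        # No cpufreq support, or not running on Linux.
        print(f"cpufreq sysfs not readable: {err}")
```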

Dbench sees some ludicrous improvements in both performance and performance
per watt; likewise netperf sees some modest improvements, but that's about
the only good news. Both schedutil and ondemand do worse on tbench and
hackbench with amd-pstate than with acpi-cpufreq. I don't have data for
ondemand/amd-pstate on kernbench and gitsource, but schedutil regresses on
both.

Here are the tables, then some questions & discussion points.

Tilde (~) means the result is the same as the baseline (that is, the ratio is
close to 1).
"Sugov" means "schedutil governor", "perfgov" means "performance governor".

             :       acpi-cpufreq       :        amd-pstate        :
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
             : ondemand  sugov  perfgov : ondemand  sugov  perfgov : better if
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                              PERFORMANCE RATIOS
dbench       : 1.00      ~      0.33    : 0.37      0.35   0.36    : lower
netperf      : 1.00      0.97   ~       : 1.03      1.04   ~       : higher
tbench       : 1.00      1.04   1.06    : 0.83      0.40   1.05    : higher
hackbench    : 1.00      ~      1.03    : 1.09      1.42   1.03    : lower
kernbench    : 1.00      0.96   0.97    : N/A       1.08   ~       : lower
gitsource    : 1.00      0.67   0.69    : N/A       0.79   0.67    : lower
 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
                         PERFORMANCE-PER-WATT RATIOS
dbench4      : 1.00      ~      3.37    : 2.68      3.12   3.03    : higher
netperf      : 1.00      0.96   ~       : 1.09      1.06   ~       : higher
tbench4      : 1.00      1.03   1.06    : 0.76      0.34   1.04    : higher
hackbench    : 1.00      ~      0.95    : 0.88      0.65   0.96    : higher
kernbench    : 1.00      1.06   1.05    : N/A       0.93   1.05    : higher
gitsource    : 1.00      1.53   1.50    : N/A       1.33   1.55    : higher


How to read the table: each number is the ratio between the result of some
governor/driver combination and the result of ondemand/acpi-cpufreq, which is
the baseline (first column). When the "better if" column says "higher", a
ratio larger than 1 indicates an improvement and a ratio smaller than 1 a
regression; when it says "lower", the opposite holds.
Example: hackbench with sugov/amd-pstate is 42% slower than with
ondemand/acpi-cpufreq (top table). At the same time, it's also 35% less
efficient (bottom table).
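
The reading rule above can be written down as a small helper. This is only an
illustrative sketch: the function name and the sign convention (positive
means better than baseline) are my own choices, not part of the test suite
that produced the tables.

```python
def change_vs_baseline(ratio, better_if):
    """Signed percentage change relative to the baseline.

    ratio     -- result / baseline result, as printed in the tables
    better_if -- "higher" or "lower", from the last table column
    Returns a positive number for an improvement, negative for a regression.
    """
    if better_if not in ("higher", "lower"):
        raise ValueError(f"unexpected 'better if' value: {better_if!r}")
    change = (ratio - 1.0) * 100.0
    # For "lower is better" metrics a ratio above 1 is a regression,
    # so the sign is flipped.
    return change if better_if == "higher" else -change


if __name__ == "__main__":
    # hackbench, sugov/amd-pstate: 1.42 in the performance table ("lower"),
    # 0.65 in the performance-per-watt table ("higher").
    print(f"performance: {change_vs_baseline(1.42, 'lower'):+.0f}%")
    print(f"perf/watt:   {change_vs_baseline(0.65, 'higher'):+.0f}%")
```

Both calls give a negative value, matching the "42% slower" and "35% less
efficient" reading in the example.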

CPU information of this dual-socket server:

    CPU(s):                          256
    On-line CPU(s) list:             0-255
    Thread(s) per core:              2
    Core(s) per socket:              64
    Socket(s):                       2
    NUMA node(s):                    2
    Vendor ID:                       AuthenticAMD
    CPU family:                      25
    Model:                           1
    Model name:                      AMD EPYC 7713 64-Core Processor
    Stepping:                        1

The results posted above are, with the exception of gitsource, the average
over several values of a scaling parameter, which generally is the number of
threads or processes used. The tests are performed from low load (e.g. a
single thread) all the way up to some multiple of the number of hardware
threads. For example, for tbench we varied the number of clients:

  low load -> 1 2 4 8 16 32 64 128 256 512 1024 <- 4x the number of cpus.
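
For reference, the sequence of client counts above can be generated like
this. It is a sketch that assumes the load doubles at each step and stops at
4x the CPU count, as in the line above; the function name and the "multiple"
parameter are hypothetical.

```python
def client_counts(num_cpus, multiple=4):
    """Powers of two from 1 up to multiple * num_cpus, inclusive."""
    counts = []
    n = 1
    while n <= num_cpus * multiple:
        counts.append(n)
        n *= 2
    return counts


if __name__ == "__main__":
    # 256 hardware threads on the dual-socket EPYC 7713 machine.
    print(" ".join(str(n) for n in client_counts(256)))
```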

-- 
You may reply to this email to add a comment.

You are receiving this mail because:
You are the assignee for the bug.


* [Bug 215135] proposed cpufreq driver amd-pstate regresses wrt acpi-cpufreq on some AMD EPYC Zen3
  2021-11-25 13:41 [Bug 215135] New: proposed cpufreq driver amd-pstate regresses wrt acpi-cpufreq on some AMD EPYC Zen3 bugzilla-daemon
@ 2021-11-25 13:47 ` bugzilla-daemon
  2022-01-18  9:54 ` bugzilla-daemon
  2022-06-24  1:24 ` bugzilla-daemon
  2 siblings, 0 replies; 4+ messages in thread
From: bugzilla-daemon @ 2021-11-25 13:47 UTC
  To: linux-pm

https://bugzilla.kernel.org/show_bug.cgi?id=215135

--- Comment #1 from Giovanni Gherdovich (ggherdovich@suse.cz) ---
Created attachment 299709
  --> https://bugzilla.kernel.org/attachment.cgi?id=299709&action=edit
data for tbench, 128 clients

The attachment contains data for tbench, 128 clients. These are the
throughputs reached in the various configurations; acpi-cpufreq/ondemand is
the baseline, higher is better.

acpi-cpufreq-ondemand  23092.2  MB/sec  ( 1.00)
acpi-cpufreq-perfgov   28880.1  MB/sec  ( 1.25)
acpi-cpufreq-sugov     31474    MB/sec  ( 1.36)

amd-pstate-ondemand    23820.4  MB/sec  ( 1.03)
amd-pstate-perfgov     27336.9  MB/sec  ( 1.18)
amd-pstate-sugov        6275.4  MB/sec  ( 0.27)
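
The ratios in parentheses are simply each throughput divided by the
acpi-cpufreq/ondemand baseline. A short sketch that recomputes them from the
numbers above (the variable names are mine, the data is copied from the
table):

```python
# tbench throughput in MB/sec, 128 clients, copied from the table above.
throughput = {
    "acpi-cpufreq-ondemand": 23092.2,
    "acpi-cpufreq-perfgov": 28880.1,
    "acpi-cpufreq-sugov": 31474.0,
    "amd-pstate-ondemand": 23820.4,
    "amd-pstate-perfgov": 27336.9,
    "amd-pstate-sugov": 6275.4,
}

baseline = throughput["acpi-cpufreq-ondemand"]

# Ratio to baseline, rounded to two decimals as in the table.
ratios = {name: round(value / baseline, 2) for name, value in throughput.items()}

if __name__ == "__main__":
    for name, ratio in ratios.items():
        print(f"{name:<22} {throughput[name]:>8.1f}  MB/sec  ({ratio:5.2f})")
```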


* [Bug 215135] proposed cpufreq driver amd-pstate regresses wrt acpi-cpufreq on some AMD EPYC Zen3
  2021-11-25 13:41 [Bug 215135] New: proposed cpufreq driver amd-pstate regresses wrt acpi-cpufreq on some AMD EPYC Zen3 bugzilla-daemon
  2021-11-25 13:47 ` [Bug 215135] " bugzilla-daemon
@ 2022-01-18  9:54 ` bugzilla-daemon
  2022-06-24  1:24 ` bugzilla-daemon
  2 siblings, 0 replies; 4+ messages in thread
From: bugzilla-daemon @ 2022-01-18  9:54 UTC
  To: linux-pm

https://bugzilla.kernel.org/show_bug.cgi?id=215135

Joe (jinzhou.su@amd.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |jinzhou.su@amd.com

--- Comment #2 from Joe (jinzhou.su@amd.com) ---
Hello Giovanni,

Thanks for testing the amd-pstate driver. We have finally set up the same
machine locally; here is the lscpu info:

Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              256
On-line CPU(s) list: 0-255
Thread(s) per core:  2
Core(s) per socket:  64
Socket(s):           2
NUMA node(s):        2
Vendor ID:           AuthenticAMD
CPU family:          25
Model:               1

Here is the tbench test result.

acpi-cpufreq-ondemand  17628  MB/sec  ( 1.00)
acpi-cpufreq-perfgov   25317  MB/sec  ( 1.43)
acpi-cpufreq-sugov     21369  MB/sec  ( 1.21)

amd-pstate-ondemand    17913  MB/sec  ( 1.02)
amd-pstate-perfgov     25865  MB/sec  ( 1.47)
amd-pstate-sugov       22359  MB/sec  ( 1.26)

From our point of view, amd-pstate performs slightly better than the
acpi-cpufreq driver in tbench4. The results are based on 50 runs of the test
for each policy.


* [Bug 215135] proposed cpufreq driver amd-pstate regresses wrt acpi-cpufreq on some AMD EPYC Zen3
  2021-11-25 13:41 [Bug 215135] New: proposed cpufreq driver amd-pstate regresses wrt acpi-cpufreq on some AMD EPYC Zen3 bugzilla-daemon
  2021-11-25 13:47 ` [Bug 215135] " bugzilla-daemon
  2022-01-18  9:54 ` bugzilla-daemon
@ 2022-06-24  1:24 ` bugzilla-daemon
  2 siblings, 0 replies; 4+ messages in thread
From: bugzilla-daemon @ 2022-06-24  1:24 UTC
  To: linux-pm

https://bugzilla.kernel.org/show_bug.cgi?id=215135

Zhang Rui (rui.zhang@intel.com) changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |rui.zhang@intel.com
           Hardware|All                         |AMD

