linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH 0/4] powerpc/smp: Shared processor sched optimizations
@ 2023-08-30 10:52 Srikar Dronamraju
From: Srikar Dronamraju @ 2023-08-30 10:52 UTC
  To: Michael Ellerman
  Cc: Juergen Gross, Nathan Lynch, Valentin Schneider,
	Srikar Dronamraju, Paul E McKenney, Peter Zijlstra, linux-kernel,
	virtualization, Nicholas Piggin, linuxppc-dev, Josh Poimboeuf

PowerVM systems configured in shared processor mode have some unique
challenges. Some device-tree properties are missing on a shared
processor system. Hence some sched domains may not make sense for shared
processor systems.

Most shared processor systems are over-provisioned. The underlying PowerVM
hypervisor schedules at Big Core granularity. The most recent Power
processors support two almost independent cores within each Big Core. In a
lightly loaded condition, it helps overall system performance if we pack
work onto fewer Big Cores.

System Configuration
type=Shared mode=Capped smt=8 lcpu=128 mem=1066732224 kB cpus=96 ent=40.00
So *40 Entitled cores / 128 Virtual processors* scenario.

lscpu
Architecture:                    ppc64le
Byte Order:                      Little Endian
CPU(s):                          1024
On-line CPU(s) list:             0-1023
Model name:                      POWER10 (architected), altivec supported
Model:                           2.0 (pvr 0080 0200)
Thread(s) per core:              8
Core(s) per socket:              16
Socket(s):                       8
Hypervisor vendor:               pHyp
Virtualization type:             para
L1d cache:                       8 MiB (256 instances)
L1i cache:                       12 MiB (256 instances)
NUMA node(s):                    8
NUMA node0 CPU(s): 0-7,64-71,128-135,192-199,256-263,320-327,384-391,448-455,512-519,576-583,640-647,704-711,768-775,832-839,896-903,960-967
NUMA node1 CPU(s): 8-15,72-79,136-143,200-207,264-271,328-335,392-399,456-463,520-527,584-591,648-655,712-719,776-783,840-847,904-911,968-975
NUMA node2 CPU(s): 16-23,80-87,144-151,208-215,272-279,336-343,400-407,464-471,528-535,592-599,656-663,720-727,784-791,848-855,912-919,976-983
NUMA node3 CPU(s): 24-31,88-95,152-159,216-223,280-287,344-351,408-415,472-479,536-543,600-607,664-671,728-735,792-799,856-863,920-927,984-991
NUMA node4 CPU(s): 32-39,96-103,160-167,224-231,288-295,352-359,416-423,480-487,544-551,608-615,672-679,736-743,800-807,864-871,928-935,992-999
NUMA node5 CPU(s): 40-47,104-111,168-175,232-239,296-303,360-367,424-431,488-495,552-559,616-623,680-687,744-751,808-815,872-879,936-943,1000-1007
NUMA node6 CPU(s): 48-55,112-119,176-183,240-247,304-311,368-375,432-439,496-503,560-567,624-631,688-695,752-759,816-823,880-887,944-951,1008-1015
NUMA node7 CPU(s): 56-63,120-127,184-191,248-255,312-319,376-383,440-447,504-511,568-575,632-639,696-703,760-767,824-831,888-895,952-959,1016-1023

ebizzy -t 40 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min       Max       Median    Avg        Stddev     %Change
v6.5       5  4664647   5148125   5130549   5043050.2  211756.06
+patch     5  4769453   5220808   5137476   5040333.8  193586.43  -0.0538642

From lparstat (when the workload stabilized)
Kernel  %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw       phint
v6.5    6.23   0.00  0.00   93.77  40.06  100.15  6.23   55.92  138699651  100
+patch  6.26   0.01  0.00   93.73  21.15  52.87   6.27   74.78  71743299   148
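For readers checking the tables, here is a small sketch of how the two derived columns appear to be computed. It assumes %Change is the relative change of the Avg column and that %entc is physc relative to the entitlement (ent=40.00 in the configuration line); both formulas are inferred from the numbers shown here, and the function names are made up for illustration.

```c
#include <stdio.h>

/* Relative change of the patched average over the baseline average, in percent. */
double percent_change(double baseline_avg, double patched_avg)
{
	return (patched_avg - baseline_avg) / baseline_avg * 100.0;
}

/* Share of the entitlement consumed: physical cores used over entitled cores, in percent. */
double entitled_capacity_pct(double physc, double ent)
{
	return physc / ent * 100.0;
}

/* Print one comparison row from the two averages and the patched physc value. */
void print_row(double base_avg, double patch_avg, double physc, double ent)
{
	printf("%%Change=%.4f %%entc=%.2f\n",
	       percent_change(base_avg, patch_avg),
	       entitled_capacity_pct(physc, ent));
}
```

Calling print_row(5043050.2, 5040333.8, 21.15, 40.0) corresponds to the +patch row of the -t 40 run above: throughput is essentially flat while about half of the entitlement is consumed.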

ebizzy -t 80 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min       Max       Median    Avg        Stddev     %Change
v6.5       5  8735907   9121401   8986218   8967125.6  152793.38
+patch     5  9636679   9990229   9765958   9770081.8  143913.29  8.95444

From lparstat (when the workload stabilized)
Kernel  %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw      phint
v6.5    12.40  0.01  0.00   87.60  71.05  177.62  12.40  24.61  98047012  85
+patch  12.47  0.02  0.00   87.50  41.06  102.65  12.50  54.90  77821678  158

ebizzy -t 160 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min       Max       Median    Avg        Stddev     %Change
v6.5       5  12378356  12946633  12780732  12682369   266135.73
+patch     5  16756702  17676670  17406971  17341585   346054.89  36.7377

From lparstat (when the workload stabilized)
Kernel  %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw       phint
v6.5    24.56  0.09  0.15   75.19  77.42  193.55  24.65  17.94  135625276  98
+patch  24.78  0.03  0.00   75.19  78.33  195.83  24.81  17.17  107826112  215
-------------------------------------------------------------------------

System Configuration
type=Shared mode=Capped smt=8 lcpu=40 mem=1066732672 kB cpus=96 ent=40.00
So *40 Entitled cores / 40 Virtual processors* scenario.

lscpu
Architecture:                    ppc64le
Byte Order:                      Little Endian
CPU(s):                          320
On-line CPU(s) list:             0-319
Model name:                      POWER10 (architected), altivec supported
Model:                           2.0 (pvr 0080 0200)
Thread(s) per core:              8
Core(s) per socket:              10
Socket(s):                       4
Hypervisor vendor:               pHyp
Virtualization type:             para
L1d cache:                       2.5 MiB (80 instances)
L1i cache:                       3.8 MiB (80 instances)
NUMA node(s):                    4
NUMA node0 CPU(s):               0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295
NUMA node1 CPU(s):               8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303
NUMA node2 CPU(s):               16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279,304-311
NUMA node3 CPU(s):               24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287,312-319

ebizzy -t 40 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel    N  Min       Max       Median    Avg        Stddev     %Change
v6.5      5  4966196   5148045   5078348   5072977.4  66572.122
+patch    5  5035210   5232882   5158456   5151734    78906.893  1.55247

From lparstat (when the workload stabilized)
Kernel  %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw     phint
v6.5    12.58  0.02  0.00   87.41  40.00  100.00  12.59  55.97  1029603  82
+patch  12.58  0.02  0.00   87.40  21.16  52.90   12.60  74.82  1188571  657

ebizzy -t 80 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel    N  Min       Max       Median    Avg        Stddev     %Change
v6.5      5  10081713  10162128  10145721  10128119   35603.196
+patch    5  9928483   10430256  10338097  10218466   221155.16  0.892041

From lparstat (when the workload stabilized)
Kernel  %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw     phint
v6.5    25.02  0.06  0.00   74.93  40.00  100.00  25.07  55.99  1530297  92
+patch  25.03  0.04  0.00   74.93  40.00  100.00  25.07  55.99  2475875  667

ebizzy -t 160 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel    N  Min       Max       Median    Avg        Stddev     %Change
v6.5      5  9064802   9169798   9115250   9123968.2  44901.261
+patch    5  9064533   9235200   9072374   9119558.2  76260.411  -0.0483342

From lparstat (when the workload stabilized)
Kernel  %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw     phint
v6.5    49.94  0.03  0.00   50.03  40.06  100.15  49.97  55.99  2058879  93
+patch  49.94  0.03  0.00   50.03  40.06  100.15  49.97  55.99  2058879  93
-------------------------------------------------------------------------

Observation:
We see an improvement in ebizzy throughput even with lower core
utilization (almost half the core utilization) in low utilization
scenarios, while still retaining throughput in mid and higher utilization
scenarios.
Note: The numbers are for the Uncapped + no-noise case. In the Capped and/or
noise case, due to contention on the cores, the numbers are expected to
improve further.

Srikar Dronamraju (4):
  powerpc/smp: Cache CPU has Asymmetric SMP
  powerpc/smp: Move shared_processor static key to smp.h
  powerpc/smp: Enable Asym packing for cores on shared processor
  powerpc/smp: Disable MC domain for shared processor

 arch/powerpc/include/asm/paravirt.h | 12 -----------
 arch/powerpc/include/asm/smp.h      | 14 +++++++++++++
 arch/powerpc/kernel/smp.c           | 31 +++++++++++++++++++----------
 3 files changed, 35 insertions(+), 22 deletions(-)

-- 
2.41.0



* [PATCH 1/4] powerpc/smp: Cache CPU has Asymmetric SMP
From: Srikar Dronamraju @ 2023-08-30 10:52 UTC
  To: Michael Ellerman
  Cc: Juergen Gross, Nathan Lynch, Valentin Schneider,
	Srikar Dronamraju, Paul E McKenney, Peter Zijlstra, linux-kernel,
	virtualization, Nicholas Piggin, linuxppc-dev, Josh Poimboeuf

Currently the CPU feature flag is checked whenever powerpc_smt_flags() gets
called. This is unnecessary overhead. CPU_FTR_ASYM_SMT is set based
on the processor, and all processors will either have it set or have it
unset.

Hence check for the feature flag only once and cache the result for
subsequent use. This commit helps avoid a branch in powerpc_smt_flags().

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/smp.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index fbbb695bae3d..c7d1484ed230 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -987,18 +987,13 @@ static int __init init_thread_group_cache_map(int cpu, int cache_property)
 }
 
 static bool shared_caches;
+static int asym_pack_flag;
 
 #ifdef CONFIG_SCHED_SMT
 /* cpumask of CPUs with asymmetric SMT dependency */
 static int powerpc_smt_flags(void)
 {
-	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
-
-	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
-		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
-		flags |= SD_ASYM_PACKING;
-	}
-	return flags;
+	return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES | asym_pack_flag;
 }
 #endif
 
@@ -1676,6 +1671,11 @@ static void __init fixup_topology(void)
 {
 	int i;
 
+	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
+		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
+		asym_pack_flag = SD_ASYM_PACKING;
+	}
+
 #ifdef CONFIG_SCHED_SMT
 	if (has_big_cores) {
 		pr_info("Big cores detected but using small core scheduling\n");
-- 
2.41.0
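
The caching idea in this patch can be modelled in userspace as below. This is only a sketch, not kernel code: the flag values are illustrative (the kernel's real SD_* values differ), and init_topology_flags() and smt_flags() are hypothetical stand-ins for fixup_topology() and powerpc_smt_flags().

```c
#include <stdbool.h>

/* Illustrative bit values; the kernel's real SD_* values differ. */
#define SD_SHARE_CPUCAPACITY   0x1
#define SD_SHARE_PKG_RESOURCES 0x2
#define SD_ASYM_PACKING        0x4

/* Cached once at init; stays 0 when the feature is absent. */
static int asym_pack_flag;

/* Stand-in for fixup_topology(): probe the feature a single time. */
void init_topology_flags(bool cpu_has_asym_smt)
{
	asym_pack_flag = cpu_has_asym_smt ? SD_ASYM_PACKING : 0;
}

/* Stand-in for powerpc_smt_flags(): no feature test, just OR in the cached bit. */
int smt_flags(void)
{
	return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES | asym_pack_flag;
}
```

The only conditional is in the init path, so every later call to the flags function is a plain OR of constants and one cached value.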



* [PATCH 2/4] powerpc/smp: Move shared_processor static key to smp.h
From: Srikar Dronamraju @ 2023-08-30 10:52 UTC
  To: Michael Ellerman
  Cc: Juergen Gross, Nathan Lynch, Valentin Schneider,
	Srikar Dronamraju, Paul E McKenney, Peter Zijlstra, linux-kernel,
	virtualization, Nicholas Piggin, linuxppc-dev, Josh Poimboeuf

The ability to detect whether the system is running in shared processor
mode is helpful in a few more generic cases, not just in
paravirtualization.
For example: at boot time, different scheduler/topology flags may be
set based on the processor mode. Hence move it to a more generic file.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 arch/powerpc/include/asm/paravirt.h | 12 ------------
 arch/powerpc/include/asm/smp.h      | 14 ++++++++++++++
 2 files changed, 14 insertions(+), 12 deletions(-)

diff --git a/arch/powerpc/include/asm/paravirt.h b/arch/powerpc/include/asm/paravirt.h
index f5ba1a3c41f8..3bc8cff97f9a 100644
--- a/arch/powerpc/include/asm/paravirt.h
+++ b/arch/powerpc/include/asm/paravirt.h
@@ -14,13 +14,6 @@
 #include <asm/kvm_guest.h>
 #include <asm/cputhreads.h>
 
-DECLARE_STATIC_KEY_FALSE(shared_processor);
-
-static inline bool is_shared_processor(void)
-{
-	return static_branch_unlikely(&shared_processor);
-}
-
 #ifdef CONFIG_PARAVIRT_TIME_ACCOUNTING
 extern struct static_key paravirt_steal_enabled;
 extern struct static_key paravirt_steal_rq_enabled;
@@ -71,11 +64,6 @@ static inline void yield_to_any(void)
 	plpar_hcall_norets_notrace(H_CONFER, -1, 0);
 }
 #else
-static inline bool is_shared_processor(void)
-{
-	return false;
-}
-
 static inline u32 yield_count_of(int cpu)
 {
 	return 0;
diff --git a/arch/powerpc/include/asm/smp.h b/arch/powerpc/include/asm/smp.h
index aaaa576d0e15..08631b2a4528 100644
--- a/arch/powerpc/include/asm/smp.h
+++ b/arch/powerpc/include/asm/smp.h
@@ -34,6 +34,20 @@ extern bool coregroup_enabled;
 extern int cpu_to_chip_id(int cpu);
 extern int *chip_id_lookup_table;
 
+#ifdef CONFIG_PPC_SPLPAR
+DECLARE_STATIC_KEY_FALSE(shared_processor);
+
+static inline bool is_shared_processor(void)
+{
+	return static_branch_unlikely(&shared_processor);
+}
+#else
+static inline bool is_shared_processor(void)
+{
+	return false;
+}
+#endif
+
 DECLARE_PER_CPU(cpumask_var_t, thread_group_l1_cache_map);
 DECLARE_PER_CPU(cpumask_var_t, thread_group_l2_cache_map);
 DECLARE_PER_CPU(cpumask_var_t, thread_group_l3_cache_map);
-- 
2.41.0
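
The header pattern moved by this patch (a real helper under the config option, a constant stub otherwise) can be sketched in userspace as follows. The assumptions are loud ones: MODEL_SPLPAR is a hypothetical macro standing in for CONFIG_PPC_SPLPAR, a plain bool stands in for the kernel's static key, and processor_mode_name() is an invented caller.

```c
#include <stdbool.h>

/* Build with -DMODEL_SPLPAR to model CONFIG_PPC_SPLPAR=y (hypothetical macro). */
#ifdef MODEL_SPLPAR
/* Plain bool standing in for the kernel's static key; set during platform setup. */
static bool shared_processor;

static inline bool is_shared_processor(void)
{
	return shared_processor;
}
#else
/* Stub: callers need no #ifdef of their own and always see "dedicated". */
static inline bool is_shared_processor(void)
{
	return false;
}
#endif

/* A caller that compiles unchanged under either configuration. */
const char *processor_mode_name(void)
{
	return is_shared_processor() ? "shared" : "dedicated";
}
```

Keeping both variants behind one name is what lets generic code such as the topology setup call the helper without caring whether shared processor support is built in.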



* [PATCH 3/4] powerpc/smp: Enable Asym packing for cores on shared processor
From: Srikar Dronamraju @ 2023-08-30 10:52 UTC
  To: Michael Ellerman
  Cc: Juergen Gross, Nathan Lynch, Valentin Schneider,
	Srikar Dronamraju, Paul E McKenney, Peter Zijlstra, linux-kernel,
	virtualization, Nicholas Piggin, linuxppc-dev, Josh Poimboeuf

If there are shared processor LPARs, the underlying hypervisor can have more
virtual cores to handle than actual physical cores.

Starting with POWER9, a core has 2 nearly independent thread groups.
On shared processor LPARs, it helps to pack threads onto fewer
cores so that overall system performance and utilization
improve. PowerVM schedules at a core level. Hence packing to fewer
cores helps.

For example: Let's say there are two 8-core shared LPARs that are
actually sharing an 8-core shared physical pool, each running 8 threads.
Then consolidating the 8 threads onto 4 cores on each LPAR would help
them perform better.

To achieve this, enable the SD_ASYM_PACKING flag at the CACHE, MC and DIE
levels.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/smp.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index c7d1484ed230..51403640440c 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1005,7 +1005,12 @@ static int powerpc_smt_flags(void)
  */
 static int powerpc_shared_cache_flags(void)
 {
-	return SD_SHARE_PKG_RESOURCES;
+	return SD_SHARE_PKG_RESOURCES | asym_pack_flag;
+}
+
+static int powerpc_shared_proc_flags(void)
+{
+	return asym_pack_flag;
 }
 
 /*
@@ -1044,8 +1049,8 @@ static struct sched_domain_topology_level powerpc_topology[] = {
 	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
 #endif
 	{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
-	{ cpu_mc_mask, SD_INIT_NAME(MC) },
-	{ cpu_cpu_mask, SD_INIT_NAME(DIE) },
+	{ cpu_mc_mask, powerpc_shared_proc_flags, SD_INIT_NAME(MC) },
+	{ cpu_cpu_mask, powerpc_shared_proc_flags, SD_INIT_NAME(DIE) },
 	{ NULL, },
 };
 
@@ -1671,7 +1676,9 @@ static void __init fixup_topology(void)
 {
 	int i;
 
-	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
+	if (is_shared_processor()) {
+		asym_pack_flag = SD_ASYM_PACKING;
+	} else if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
 		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
 		asym_pack_flag = SD_ASYM_PACKING;
 	}
-- 
2.41.0
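
A rough userspace model of the flag selection in this patch is shown below. It is a sketch under assumptions: the bit values are illustrative rather than the kernel's, and pick_asym_pack_flag(), shared_cache_flags() and shared_proc_flags() are hypothetical names that mirror the logic of fixup_topology(), powerpc_shared_cache_flags() and powerpc_shared_proc_flags() in the diff.

```c
#include <stdbool.h>

/* Illustrative bit values; the kernel's real SD_* values differ. */
#define SD_SHARE_PKG_RESOURCES 0x2
#define SD_ASYM_PACKING        0x4

/* Models the decision in fixup_topology(): shared mode or the CPU feature enables packing. */
int pick_asym_pack_flag(bool shared_processor, bool cpu_has_asym_smt)
{
	if (shared_processor || cpu_has_asym_smt)
		return SD_ASYM_PACKING;
	return 0;
}

/* CACHE level: shared package resources plus the cached packing bit. */
int shared_cache_flags(int asym_pack_flag)
{
	return SD_SHARE_PKG_RESOURCES | asym_pack_flag;
}

/* MC and DIE levels: only the cached packing bit. */
int shared_proc_flags(int asym_pack_flag)
{
	return asym_pack_flag;
}
```

In this model, as in the diff, the packing bit reaches the CACHE, MC and DIE levels whenever it is set, including when it was set through CPU_FTR_ASYM_SMT on a dedicated system.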



* [PATCH 4/4] powerpc/smp: Disable MC domain for shared processor
From: Srikar Dronamraju @ 2023-08-30 10:52 UTC
  To: Michael Ellerman
  Cc: Juergen Gross, Nathan Lynch, Valentin Schneider,
	Srikar Dronamraju, Paul E McKenney, Peter Zijlstra, linux-kernel,
	virtualization, Nicholas Piggin, linuxppc-dev, Josh Poimboeuf

Like L2-cache info, coregroup information, which is used to determine MC
sched domains, is only present on dedicated LPARs, i.e. PowerVM doesn't
export coregroup information for shared processor LPARs. Hence disable
creating MC domains on shared LPAR systems.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/smp.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 51403640440c..48b8161179a8 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1036,6 +1036,10 @@ static struct cpumask *cpu_coregroup_mask(int cpu)
 
 static bool has_coregroup_support(void)
 {
+	/* Coregroup identification not available on shared systems */
+	if (is_shared_processor())
+		return false;
+
 	return coregroup_enabled;
 }
 
-- 
2.41.0
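
The gating added here reduces to a two-input decision, sketched below as a userspace function. This is not the kernel function: the real has_coregroup_support() takes no arguments and reads is_shared_processor() and coregroup_enabled directly, so the parameters and the function name are assumptions made to keep the example self-contained.

```c
#include <stdbool.h>

/* Models has_coregroup_support(): shared mode overrides whatever the firmware reported. */
bool coregroup_support_model(bool shared_processor, bool coregroup_enabled)
{
	/* Coregroup identification is not available on shared systems. */
	if (shared_processor)
		return false;

	return coregroup_enabled;
}
```

Only the dedicated case with coregroup information present yields true, which is the one case where an MC domain is meaningful.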

