* [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations
@ 2023-11-09  5:49 Srikar Dronamraju
  2023-11-09  5:49 ` [PATCH v4 1/5] powerpc/smp: Enable Asym packing for cores on shared processor Srikar Dronamraju
                   ` (6 more replies)
  0 siblings, 7 replies; 15+ messages in thread
From: Srikar Dronamraju @ 2023-11-09  5:49 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Srikar Dronamraju, Paul E. McKenney, Peter Zijlstra, linux-kernel,
	Rohan McLure, Nicholas Piggin, linuxppc-dev, Josh Poimboeuf

PowerVM systems configured in shared processor mode have some unique
challenges. Some device-tree properties are missing on a shared
processor, hence some sched domains may not make sense for shared
processor systems.

Most shared processor systems are over-provisioned. The underlying
PowerVM hypervisor schedules at Big Core granularity. The most recent
Power processors support two almost independent cores per Big Core.
Under light load, overall system performance improves if we pack work
onto a smaller number of Big Cores.

System Configuration
type=Shared mode=Uncapped smt=8 lcpu=96 mem=1066409344 kB cpus=96 ent=64.00
So this is the *64 Entitled cores / 96 Virtual processors* scenario.

lscpu
Architecture:                       ppc64le
Byte Order:                         Little Endian
CPU(s):                             768
On-line CPU(s) list:                0-767
Model name:                         POWER10 (architected), altivec supported
Model:                              2.0 (pvr 0080 0200)
Thread(s) per core:                 8
Core(s) per socket:                 16
Socket(s):                          6
Hypervisor vendor:                  pHyp
Virtualization type:                para
L1d cache:                          6 MiB (192 instances)
L1i cache:                          9 MiB (192 instances)
NUMA node(s):                       6
NUMA node0 CPU(s):                  0-7,32-39,80-87,128-135,176-183,224-231,272-279,320-327,368-375,416-423,464-471,512-519,560-567,608-615,656-663,704-711,752-759
NUMA node1 CPU(s):                  8-15,40-47,88-95,136-143,184-191,232-239,280-287,328-335,376-383,424-431,472-479,520-527,568-575,616-623,664-671,712-719,760-767
NUMA node4 CPU(s):                  64-71,112-119,160-167,208-215,256-263,304-311,352-359,400-407,448-455,496-503,544-551,592-599,640-647,688-695,736-743
NUMA node5 CPU(s):                  16-23,48-55,96-103,144-151,192-199,240-247,288-295,336-343,384-391,432-439,480-487,528-535,576-583,624-631,672-679,720-727
NUMA node6 CPU(s):                  72-79,120-127,168-175,216-223,264-271,312-319,360-367,408-415,456-463,504-511,552-559,600-607,648-655,696-703,744-751
NUMA node7 CPU(s):                  24-31,56-63,104-111,152-159,200-207,248-255,296-303,344-351,392-399,440-447,488-495,536-543,584-591,632-639,680-687,728-735

ebizzy -t 32 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min      Max      Median   Avg        Stddev     %Change
6.6.0-rc3  5  3840178  4059268  3978042  3973936.6  84264.456
+patch     5  3768393  3927901  3874994  3854046    71532.926  -3.01692

From lparstat (when the workload stabilized)
Kernel     %user  %sys  %wait  %idle  physc  %entc  lbusy  app    vcsw       phint
6.6.0-rc3  4.16   0.00  0.00   95.84  26.06  40.72  4.16   69.88  276906989  578
+patch     4.16   0.00  0.00   95.83  17.70  27.66  4.17   78.26  70436663   119

ebizzy -t 128 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N Min      Max      Median   Avg        Stddev     %Change
6.6.0-rc3  5 5520692  5981856  5717709  5727053.2  176093.2
+patch     5 5305888  6259610  5854590  5843311    375917.03  2.02998

From lparstat (when the workload stabilized)
Kernel     %user  %sys  %wait  %idle  physc  %entc  lbusy  app    vcsw       phint
6.6.0-rc3  16.66  0.00  0.00   83.33  45.49  71.08  16.67  50.50  288778533  581
+patch     16.65  0.00  0.00   83.35  30.15  47.11  16.65  65.76  85196150   133

ebizzy -t 512 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N  Min       Max       Median    Avg       Stddev     %Change
6.6.0-rc3  5  19563921  20049955  19701510  19728733  198295.18
+patch     5  19455992  20176445  19718427  19832017  304094.05  0.523521

From lparstat (when the workload stabilized)
Kernel     %user  %sys  %wait  %idle  physc  %entc   lbusy  app   vcsw       phint
6.6.0-rc3  66.44  0.01  0.00   33.55  94.14  147.09  66.45  1.33  313345175  621
+patch     66.44  0.01  0.00   33.55  94.15  147.11  66.45  1.33  109193889  309

System Configuration
type=Shared mode=Uncapped smt=8 lcpu=40 mem=1067539392 kB cpus=96 ent=40.00
So this is the *40 Entitled cores / 40 Virtual processors* scenario.

lscpu
Architecture:                       ppc64le
Byte Order:                         Little Endian
CPU(s):                             320
On-line CPU(s) list:                0-319
Model name:                         POWER10 (architected), altivec supported
Model:                              2.0 (pvr 0080 0200)
Thread(s) per core:                 8
Core(s) per socket:                 10
Socket(s):                          4
Hypervisor vendor:                  pHyp
Virtualization type:                para
L1d cache:                          2.5 MiB (80 instances)
L1i cache:                          3.8 MiB (80 instances)
NUMA node(s):                       4
NUMA node0 CPU(s):                  0-7,32-39,64-71,96-103,128-135,160-167,192-199,224-231,256-263,288-295
NUMA node1 CPU(s):                  8-15,40-47,72-79,104-111,136-143,168-175,200-207,232-239,264-271,296-303
NUMA node4 CPU(s):                  16-23,48-55,80-87,112-119,144-151,176-183,208-215,240-247,272-279,304-311
NUMA node5 CPU(s):                  24-31,56-63,88-95,120-127,152-159,184-191,216-223,248-255,280-287,312-319

ebizzy -t 32 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N   Min      Max      Median   Avg        Stddev     %Change
6.6.0-rc3  5   3535518  3864532  3745967  3704233.2  130216.76
+patch     5   3608385  3708026  3649379  3651596.6  37862.163  -1.42099

From lparstat (when the workload stabilized)
Kernel     %user  %sys  %wait  %idle  physc  %entc  lbusy  app    vcsw     phint
6.6.0-rc3  10.00  0.01  0.00   89.99  22.98  57.45  10.01  41.01  1135139  262
+patch     10.00  0.00  0.00   90.00  16.95  42.37  10.00  47.05  925561   19

ebizzy -t 64 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N   Min      Max      Median   Avg        Stddev     %Change
6.6.0-rc3  5   4434984  4957281  4548786  4591298.2  211770.2
+patch     5   4461115  4835167  4544716  4607795.8  151474.85  0.359323

From lparstat (when the workload stabilized)
Kernel     %user  %sys  %wait  %idle  physc  %entc  lbusy  app    vcsw     phint
6.6.0-rc3  20.01  0.00  0.00   79.99  38.22  95.55  20.01  25.77  1287553  265
+patch     19.99  0.00  0.00   80.01  25.55  63.88  19.99  38.44  1077341  20

ebizzy -t 256 -S 200 (5 iterations) Records per second. (Higher is better)
Kernel     N   Min      Max      Median   Avg        Stddev     %Change
6.6.0-rc3  5   8850648  8982659  8951911  8936869.2  52278.031
+patch     5   8751038  9060510  8981409  8942268.4  117070.6   0.0604149

From lparstat (when the workload stabilized)
Kernel     %user  %sys  %wait  %idle  physc  %entc   lbusy  app    vcsw     phint
6.6.0-rc3  80.02  0.01  0.01   19.96  40.00  100.00  80.03  24.00  1597665  276
+patch     80.02  0.01  0.01   19.96  40.00  100.00  80.03  23.99  1383921  63

Observation:
We see an improvement in ebizzy throughput even with much lower core
utilization (almost half the core utilization) in low-utilization
scenarios, while still retaining throughput in mid and higher
utilization scenarios.
Note: The numbers are for the Uncapped + no-noise case. In the Capped
and/or noisy case, due to contention on the cores, the numbers are
expected to improve further.

Note: The numbers include the patch (sched/fair: Enable group_asym_packing in find_idlest_group)
https://lore.kernel.org/all/20231018155036.2314342-1-srikar@linux.vnet.ibm.com/

Changelog
v3 (https://lore.kernel.org/all/20231026101843.56784-1-srikar@linux.vnet.ibm.com) ->v4:
1. SPLPAR-specific Asym packing only for MC and DIE domains.
2. Changes due to rebase (DIE became PKG)

v2 (https://lore.kernel.org/all/20231018163751.2423181-1-srikar@linux.vnet.ibm.com) ->v3:
1. Handle comments from Peter Zijlstra / Michael Ellerman
2. Use __ro_after_init attribute instead of read_mostly
3. Use cpu_has_feature static_key instead of a new one.
4. Build topology dynamically patch added to this patchset.

v1 (https://lore.kernel.org/all/20230830105244.62477-1-srikar@linux.vnet.ibm.com) -> v2:
1. Last two patches were added in this version
2. This version uses static keys

Cc: Christophe Leroy <christophe.leroy@csgroup.eu>
Cc: Josh Poimboeuf <jpoimboe@kernel.org>
Cc: linux-kernel@vger.kernel.org
Cc: linuxppc-dev@lists.ozlabs.org
Cc: Mark Rutland <mark.rutland@arm.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Nicholas Piggin <npiggin@gmail.com>
Cc: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Rohan McLure <rmclure@linux.ibm.com>
Cc: Valentin Schneider <vschneid@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>

Srikar Dronamraju (5):
  powerpc/smp: Enable Asym packing for cores on shared processor
  powerpc/smp: Disable MC domain for shared processor
  powerpc/smp: Add __ro_after_init attribute
  powerpc/smp: Avoid asym packing within thread_group of a core
  powerpc/smp: Dynamically build Powerpc topology

 arch/powerpc/kernel/smp.c | 136 +++++++++++++++++++++-----------------
 1 file changed, 76 insertions(+), 60 deletions(-)


base-commit: efdcf91a2158294ea1af97e7d592c00e7a97c5b5
-- 
2.31.1


^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH v4 1/5] powerpc/smp: Enable Asym packing for cores on shared processor
  2023-11-09  5:49 [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations Srikar Dronamraju
@ 2023-11-09  5:49 ` Srikar Dronamraju
  2023-11-15  5:27   ` Aneesh Kumar K.V
  2023-11-15  6:35   ` Aneesh Kumar K.V
  2023-11-09  5:49 ` [PATCH v4 2/5] powerpc/smp: Disable MC domain for " Srikar Dronamraju
                   ` (5 subsequent siblings)
  6 siblings, 2 replies; 15+ messages in thread
From: Srikar Dronamraju @ 2023-11-09  5:49 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, Christophe Leroy
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Srikar Dronamraju, Paul E. McKenney, Peter Zijlstra,
	ndesaulniers@google.com, linux-kernel, Rohan McLure, linuxppc-dev,
	Josh Poimboeuf

On shared processor LPARs, the underlying hypervisor can have more
virtual cores to handle than actual physical cores.

Starting with Power 9, a big core (aka SMT8 core) has 2 nearly
independent thread groups. On shared processor LPARs, it helps to
pack threads onto a smaller number of cores so that overall system
performance and utilization improve. PowerVM schedules at a big core
level. Hence packing to fewer cores helps.

For example: let's say there are two 8-core shared LPARs that are
actually sharing an 8-core physical pool, each running 8 threads.
Consolidating the 8 threads to 4 cores on each LPAR would help them
perform better. This is because each LPAR will then get 100% of the
time to run its applications and no switching will be required by
the Hypervisor.

To achieve this, enable SD_ASYM_PACKING flag at CACHE, MC and DIE level
when the system is running in shared processor mode and has big cores.
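
For reference, once SD_ASYM_PACKING is set on a sched domain, the
scheduler picks between two CPUs by comparing their
arch_asym_cpu_priority() values. The helper below is quoted from
kernel/sched/sched.h only to illustrate the mechanism this series hooks
into (it is not part of this patch):

	static inline bool sched_asym_prefer(int a, int b)
	{
		return arch_asym_cpu_priority(a) > arch_asym_cpu_priority(b);
	}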

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog:
v3 -> v4:
- Dont use splpar_asym_pack with SMT
- Conflict resolution due to rebase
	(DIE changed to PKG)
v2 -> v3:
- Handle comments from Michael Ellerman.
- Rework using existing cpu_has_features static key
v1->v2: Using Jump label instead of a variable.

 arch/powerpc/kernel/smp.c | 37 +++++++++++++++++++++++++++++--------
 1 file changed, 29 insertions(+), 8 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index ab691c89d787..69a3262024f1 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -993,16 +993,20 @@ static bool shared_caches;
 /* cpumask of CPUs with asymmetric SMT dependency */
 static int powerpc_smt_flags(void)
 {
-	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
+	if (!cpu_has_feature(CPU_FTR_ASYM_SMT))
+		return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
 
-	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
-		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
-		flags |= SD_ASYM_PACKING;
-	}
-	return flags;
+	return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES | SD_ASYM_PACKING;
 }
 #endif
 
+/*
+ * On shared processor LPARs scheduled on a big core (which has two or more
+ * independent thread groups per core), prefer lower numbered CPUs, so
+ * that the workload consolidates onto fewer cores.
+ */
+static __ro_after_init DEFINE_STATIC_KEY_FALSE(splpar_asym_pack);
+
 /*
  * P9 has a slightly odd architecture where pairs of cores share an L2 cache.
  * This topology makes it *much* cheaper to migrate tasks between adjacent cores
@@ -1011,9 +1015,20 @@ static int powerpc_smt_flags(void)
  */
 static int powerpc_shared_cache_flags(void)
 {
+	if (static_branch_unlikely(&splpar_asym_pack))
+		return SD_SHARE_PKG_RESOURCES | SD_ASYM_PACKING;
+
 	return SD_SHARE_PKG_RESOURCES;
 }
 
+static int powerpc_shared_proc_flags(void)
+{
+	if (static_branch_unlikely(&splpar_asym_pack))
+		return SD_ASYM_PACKING;
+
+	return 0;
+}
+
 /*
  * We can't just pass cpu_l2_cache_mask() directly because
  * returns a non-const pointer and the compiler barfs on that.
@@ -1050,8 +1065,8 @@ static struct sched_domain_topology_level powerpc_topology[] = {
 	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
 #endif
 	{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
-	{ cpu_mc_mask, SD_INIT_NAME(MC) },
-	{ cpu_cpu_mask, SD_INIT_NAME(PKG) },
+	{ cpu_mc_mask, powerpc_shared_proc_flags, SD_INIT_NAME(MC) },
+	{ cpu_cpu_mask, powerpc_shared_proc_flags, SD_INIT_NAME(PKG) },
 	{ NULL, },
 };
 
@@ -1686,7 +1701,13 @@ static void __init fixup_topology(void)
 {
 	int i;
 
+	if (is_shared_processor() && has_big_cores)
+		static_branch_enable(&splpar_asym_pack);
+
 #ifdef CONFIG_SCHED_SMT
+	if (cpu_has_feature(CPU_FTR_ASYM_SMT))
+		pr_info_once("Enabling Asymmetric SMT scheduling\n");
+
 	if (has_big_cores) {
 		pr_info("Big cores detected but using small core scheduling\n");
 		powerpc_topology[smt_idx].mask = smallcore_smt_mask;
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 2/5] powerpc/smp: Disable MC domain for shared processor
  2023-11-09  5:49 [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations Srikar Dronamraju
  2023-11-09  5:49 ` [PATCH v4 1/5] powerpc/smp: Enable Asym packing for cores on shared processor Srikar Dronamraju
@ 2023-11-09  5:49 ` Srikar Dronamraju
  2023-11-09  5:49 ` [PATCH v4 3/5] powerpc/smp: Add __ro_after_init attribute Srikar Dronamraju
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Srikar Dronamraju @ 2023-11-09  5:49 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, Christophe Leroy
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Srikar Dronamraju, Paul E. McKenney, Peter Zijlstra,
	ndesaulniers@google.com, linux-kernel, Rohan McLure, linuxppc-dev,
	Josh Poimboeuf

Like L2-cache info, the coregroup information used to determine MC
sched domains is only present on dedicated LPARs, i.e. PowerVM doesn't
export coregroup information for shared processor LPARs. Hence disable
creating MC domains on shared LPAR systems.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/smp.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 69a3262024f1..1dae4e9ba42d 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1052,6 +1052,10 @@ static struct cpumask *cpu_coregroup_mask(int cpu)
 
 static bool has_coregroup_support(void)
 {
+	/* Coregroup identification not available on shared systems */
+	if (is_shared_processor())
+		return 0;
+
 	return coregroup_enabled;
 }
 
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 3/5] powerpc/smp: Add __ro_after_init attribute
  2023-11-09  5:49 [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations Srikar Dronamraju
  2023-11-09  5:49 ` [PATCH v4 1/5] powerpc/smp: Enable Asym packing for cores on shared processor Srikar Dronamraju
  2023-11-09  5:49 ` [PATCH v4 2/5] powerpc/smp: Disable MC domain for " Srikar Dronamraju
@ 2023-11-09  5:49 ` Srikar Dronamraju
  2023-11-09  5:49 ` [PATCH v4 4/5] powerpc/smp: Avoid asym packing within thread_group of a core Srikar Dronamraju
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Srikar Dronamraju @ 2023-11-09  5:49 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, Christophe Leroy
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Srikar Dronamraju, Paul E. McKenney, Peter Zijlstra,
	ndesaulniers@google.com, linux-kernel, Rohan McLure, linuxppc-dev,
	Josh Poimboeuf

Some variables are only updated at boot time, so add the
__ro_after_init attribute to such variables.

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog:
v2 -> v3:
Use __ro_after_init instead of __read_mostly
Suggested by: Peter Zijlstra and Michael Ellerman

 arch/powerpc/kernel/smp.c | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 1dae4e9ba42d..65a6f988374a 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -77,10 +77,10 @@ static DEFINE_PER_CPU(int, cpu_state) = { 0 };
 #endif
 
 struct task_struct *secondary_current;
-bool has_big_cores;
-bool coregroup_enabled;
-bool thread_group_shares_l2;
-bool thread_group_shares_l3;
+bool has_big_cores __ro_after_init;
+bool coregroup_enabled __ro_after_init;
+bool thread_group_shares_l2 __ro_after_init;
+bool thread_group_shares_l3 __ro_after_init;
 
 DEFINE_PER_CPU(cpumask_var_t, cpu_sibling_map);
 DEFINE_PER_CPU(cpumask_var_t, cpu_smallcore_map);
@@ -987,7 +987,7 @@ static int __init init_thread_group_cache_map(int cpu, int cache_property)
 	return 0;
 }
 
-static bool shared_caches;
+static bool shared_caches __ro_after_init;
 
 #ifdef CONFIG_SCHED_SMT
 /* cpumask of CPUs with asymmetric SMT dependency */
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 4/5] powerpc/smp: Avoid asym packing within thread_group of a core
  2023-11-09  5:49 [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations Srikar Dronamraju
                   ` (2 preceding siblings ...)
  2023-11-09  5:49 ` [PATCH v4 3/5] powerpc/smp: Add __ro_after_init attribute Srikar Dronamraju
@ 2023-11-09  5:49 ` Srikar Dronamraju
  2023-11-09  5:49 ` [PATCH v4 5/5] powerpc/smp: Dynamically build Powerpc topology Srikar Dronamraju
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 15+ messages in thread
From: Srikar Dronamraju @ 2023-11-09  5:49 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, Christophe Leroy
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Srikar Dronamraju, Paul E. McKenney, Peter Zijlstra,
	ndesaulniers@google.com, linux-kernel, Rohan McLure, linuxppc-dev,
	Josh Poimboeuf

The PowerVM hypervisor schedules at a core granularity. However, each
core can have more than one thread group. For better utilization on a
shared processor, it is preferable for the scheduler to pack to the
lowest numbered core. However, there is no benefit in moving a thread
between two thread groups of the same core.
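
As a hypothetical illustration (CPU numbers assumed; threads_per_core
== 8, i.e. one SMT8 big core per 8 consecutive CPU ids), the priority
mapping added below behaves as:

	arch_asym_cpu_priority(0);	/* ->  0: core 0 */
	arch_asym_cpu_priority(7);	/* ->  0: core 0, other thread group */
	arch_asym_cpu_priority(8);	/* -> -1: core 1, lower priority */

All threads of a big core share one priority, so asym packing still
prefers lower numbered cores but never shuffles tasks between the two
thread groups of a single core.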

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
 arch/powerpc/kernel/smp.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index 65a6f988374a..a84931c37246 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -1763,6 +1763,19 @@ void __init smp_cpus_done(unsigned int max_cpus)
 	set_sched_topology(powerpc_topology);
 }
 
+/*
+ * For asym packing, by default lower numbered CPU has higher priority.
+ * On shared processors, pack to lower numbered core. However avoid moving
+ * between thread_groups within the same core.
+ */
+int arch_asym_cpu_priority(int cpu)
+{
+	if (static_branch_unlikely(&splpar_asym_pack))
+		return -cpu / threads_per_core;
+
+	return -cpu;
+}
+
 #ifdef CONFIG_HOTPLUG_CPU
 int __cpu_disable(void)
 {
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* [PATCH v4 5/5] powerpc/smp: Dynamically build Powerpc topology
  2023-11-09  5:49 [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations Srikar Dronamraju
                   ` (3 preceding siblings ...)
  2023-11-09  5:49 ` [PATCH v4 4/5] powerpc/smp: Avoid asym packing within thread_group of a core Srikar Dronamraju
@ 2023-11-09  5:49 ` Srikar Dronamraju
  2023-11-15  5:54 ` [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations Aneesh Kumar K.V
  2023-12-11  2:56 ` Srikar Dronamraju
  6 siblings, 0 replies; 15+ messages in thread
From: Srikar Dronamraju @ 2023-11-09  5:49 UTC (permalink / raw)
  To: Michael Ellerman, Nicholas Piggin, Christophe Leroy
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Srikar Dronamraju, Paul E. McKenney, Peter Zijlstra,
	ndesaulniers@google.com, linux-kernel, Rohan McLure, linuxppc-dev,
	Josh Poimboeuf

Currently there are four powerpc-specific sched topology levels. These
are all statically defined. However, not all of these topologies are
used by all powerpc systems.

To avoid unnecessary degenerations by the scheduler, masks and flags
are compared. However, if the sched topologies are built dynamically,
the code is simpler and there is a greater chance of avoiding
degenerations.

Note:
Even x86 builds its sched topologies dynamically, and the proposed
changes are very similar to the way x86 builds its topologies.
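
For example, on a shared processor LPAR with big cores, no shared
caches and no coregroup support (a hypothetical configuration; the
actual result depends on the features detected at boot), the
dynamically built array would contain just:

	{ smallcore_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
	{ cpu_cpu_mask, powerpc_shared_proc_flags, SD_INIT_NAME(PKG) },
	{ NULL, },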

Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
---
Changelog:
v3 -> v4:
- Conflict resolution due to rebase
	(DIE changed to PKG)

 arch/powerpc/kernel/smp.c | 78 ++++++++++++++-------------------------
 1 file changed, 28 insertions(+), 50 deletions(-)

diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
index a84931c37246..6631659cfb38 100644
--- a/arch/powerpc/kernel/smp.c
+++ b/arch/powerpc/kernel/smp.c
@@ -93,15 +93,6 @@ EXPORT_PER_CPU_SYMBOL(cpu_l2_cache_map);
 EXPORT_PER_CPU_SYMBOL(cpu_core_map);
 EXPORT_SYMBOL_GPL(has_big_cores);
 
-enum {
-#ifdef CONFIG_SCHED_SMT
-	smt_idx,
-#endif
-	cache_idx,
-	mc_idx,
-	die_idx,
-};
-
 #define MAX_THREAD_LIST_SIZE	8
 #define THREAD_GROUP_SHARE_L1   1
 #define THREAD_GROUP_SHARE_L2_L3 2
@@ -1064,16 +1055,6 @@ static const struct cpumask *cpu_mc_mask(int cpu)
 	return cpu_coregroup_mask(cpu);
 }
 
-static struct sched_domain_topology_level powerpc_topology[] = {
-#ifdef CONFIG_SCHED_SMT
-	{ cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT) },
-#endif
-	{ shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE) },
-	{ cpu_mc_mask, powerpc_shared_proc_flags, SD_INIT_NAME(MC) },
-	{ cpu_cpu_mask, powerpc_shared_proc_flags, SD_INIT_NAME(PKG) },
-	{ NULL, },
-};
-
 static int __init init_big_cores(void)
 {
 	int cpu;
@@ -1701,9 +1682,11 @@ void start_secondary(void *unused)
 	BUG();
 }
 
-static void __init fixup_topology(void)
+static struct sched_domain_topology_level powerpc_topology[6];
+
+static void __init build_sched_topology(void)
 {
-	int i;
+	int i = 0;
 
 	if (is_shared_processor() && has_big_cores)
 		static_branch_enable(&splpar_asym_pack);
@@ -1714,36 +1697,33 @@ static void __init fixup_topology(void)
 
 	if (has_big_cores) {
 		pr_info("Big cores detected but using small core scheduling\n");
-		powerpc_topology[smt_idx].mask = smallcore_smt_mask;
+		powerpc_topology[i++] = (struct sched_domain_topology_level){
+			smallcore_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT)
+		};
+	} else {
+		powerpc_topology[i++] = (struct sched_domain_topology_level){
+			cpu_smt_mask, powerpc_smt_flags, SD_INIT_NAME(SMT)
+		};
 	}
 #endif
+	if (shared_caches) {
+		powerpc_topology[i++] = (struct sched_domain_topology_level){
+			shared_cache_mask, powerpc_shared_cache_flags, SD_INIT_NAME(CACHE)
+		};
+	}
+	if (has_coregroup_support()) {
+		powerpc_topology[i++] = (struct sched_domain_topology_level){
+			cpu_mc_mask, powerpc_shared_proc_flags, SD_INIT_NAME(MC)
+		};
+	}
+	powerpc_topology[i++] = (struct sched_domain_topology_level){
+		cpu_cpu_mask, powerpc_shared_proc_flags, SD_INIT_NAME(PKG)
+	};
 
-	if (!has_coregroup_support())
-		powerpc_topology[mc_idx].mask = powerpc_topology[cache_idx].mask;
-
-	/*
-	 * Try to consolidate topology levels here instead of
-	 * allowing scheduler to degenerate.
-	 * - Dont consolidate if masks are different.
-	 * - Dont consolidate if sd_flags exists and are different.
-	 */
-	for (i = 1; i <= die_idx; i++) {
-		if (powerpc_topology[i].mask != powerpc_topology[i - 1].mask)
-			continue;
-
-		if (powerpc_topology[i].sd_flags && powerpc_topology[i - 1].sd_flags &&
-				powerpc_topology[i].sd_flags != powerpc_topology[i - 1].sd_flags)
-			continue;
-
-		if (!powerpc_topology[i - 1].sd_flags)
-			powerpc_topology[i - 1].sd_flags = powerpc_topology[i].sd_flags;
+	/* There must be one trailing NULL entry left.  */
+	BUG_ON(i >= ARRAY_SIZE(powerpc_topology) - 1);
 
-		powerpc_topology[i].mask = powerpc_topology[i + 1].mask;
-		powerpc_topology[i].sd_flags = powerpc_topology[i + 1].sd_flags;
-#ifdef CONFIG_SCHED_DEBUG
-		powerpc_topology[i].name = powerpc_topology[i + 1].name;
-#endif
-	}
+	set_sched_topology(powerpc_topology);
 }
 
 void __init smp_cpus_done(unsigned int max_cpus)
@@ -1758,9 +1738,7 @@ void __init smp_cpus_done(unsigned int max_cpus)
 		smp_ops->bringup_done();
 
 	dump_numa_cpu_topology();
-
-	fixup_topology();
-	set_sched_topology(powerpc_topology);
+	build_sched_topology();
 }
 
 /*
-- 
2.31.1


^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/5] powerpc/smp: Enable Asym packing for cores on shared processor
  2023-11-09  5:49 ` [PATCH v4 1/5] powerpc/smp: Enable Asym packing for cores on shared processor Srikar Dronamraju
@ 2023-11-15  5:27   ` Aneesh Kumar K.V
  2023-11-15  5:42     ` Srikar Dronamraju
  2023-11-15  6:35   ` Aneesh Kumar K.V
  1 sibling, 1 reply; 15+ messages in thread
From: Aneesh Kumar K.V @ 2023-11-15  5:27 UTC (permalink / raw)
  To: Srikar Dronamraju, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Srikar Dronamraju, Paul E. McKenney, Peter Zijlstra,
	ndesaulniers@google.com, linux-kernel, Rohan McLure, linuxppc-dev,
	Josh Poimboeuf

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:

> If there are shared processor LPARs, underlying Hypervisor can have more
> virtual cores to handle than actual physical cores.
>
> Starting with Power 9, a big core (aka SMT8 core) has 2 nearly
> independent thread groups. On a shared processors LPARs, it helps to
> pack threads to lesser number of cores so that the overall system
> performance and utilization improves. PowerVM schedules at a big core
> level. Hence packing to fewer cores helps.
>
> For example: Lets says there are two 8-core Shared LPARs that are
> actually sharing a 8 Core shared physical pool, each running 8 threads
> each. Then Consolidating 8 threads to 4 cores on each LPAR would help
> them to perform better. This is because each of the LPAR will get
> 100% time to run applications and there will no switching required by
> the Hypervisor.
>

Will this patch consolidate things to first 8 threads or just the one
Big core? /me continues to look at other patches and wonder whether 4/5
should come before this? 


>
> To achieve this, enable SD_ASYM_PACKING flag at CACHE, MC and DIE level
> when the system is running in shared processor mode and has big cores.
>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>


-aneesh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/5] powerpc/smp: Enable Asym packing for cores on shared processor
  2023-11-15  5:27   ` Aneesh Kumar K.V
@ 2023-11-15  5:42     ` Srikar Dronamraju
  0 siblings, 0 replies; 15+ messages in thread
From: Srikar Dronamraju @ 2023-11-15  5:42 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Paul E. McKenney, Peter Zijlstra, ndesaulniers@google.com,
	linux-kernel, Rohan McLure, Nicholas Piggin, linuxppc-dev,
	Josh Poimboeuf

* Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [2023-11-15 10:57:08]:

> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> 
> > If there are shared processor LPARs, underlying Hypervisor can have more
> > virtual cores to handle than actual physical cores.
> >
> > Starting with Power 9, a big core (aka SMT8 core) has 2 nearly
> > independent thread groups. On a shared processors LPARs, it helps to
> > pack threads to lesser number of cores so that the overall system
> > performance and utilization improves. PowerVM schedules at a big core
> > level. Hence packing to fewer cores helps.
> >
> > For example: Lets says there are two 8-core Shared LPARs that are
> > actually sharing a 8 Core shared physical pool, each running 8 threads
> > each. Then Consolidating 8 threads to 4 cores on each LPAR would help
> > them to perform better. This is because each of the LPAR will get
> > 100% time to run applications and there will no switching required by
> > the Hypervisor.
> >
> 
> Will this patch consolidate things to first 8 threads or just the one
> Big core? /me continues to look at other patches and wonder whether 4/5
> should come before this? 

It will consolidate 1 thread per small core aka SMT domain or 2 threads per
Big core. If the load is such that there are more unbound threads than SMT
domains, asym packing will not kick in.

4/5 would make sense only once we enable asym_packing above SMT domain.
> 
> 
> >
> > To achieve this, enable SD_ASYM_PACKING flag at CACHE, MC and DIE level
> > when the system is running in shared processor mode and has big cores.
> >
> > Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> 
> 
> -aneesh

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations
  2023-11-09  5:49 [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations Srikar Dronamraju
                   ` (4 preceding siblings ...)
  2023-11-09  5:49 ` [PATCH v4 5/5] powerpc/smp: Dynamically build Powerpc topology Srikar Dronamraju
@ 2023-11-15  5:54 ` Aneesh Kumar K.V
  2023-11-15  6:16   ` Srikar Dronamraju
  2023-12-11  2:56 ` Srikar Dronamraju
  6 siblings, 1 reply; 15+ messages in thread
From: Aneesh Kumar K.V @ 2023-11-15  5:54 UTC (permalink / raw)
  To: Srikar Dronamraju, Michael Ellerman
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Srikar Dronamraju, Paul E. McKenney, Peter Zijlstra, linux-kernel,
	Nicholas Piggin, Rohan McLure, linuxppc-dev, Josh Poimboeuf

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:

> PowerVM systems configured in shared processors mode have some unique
> challenges. Some device-tree properties will be missing on a shared
> processor. Hence some sched domains may not make sense for shared processor
> systems.
>
> Most shared processor systems are over-provisioned. Underlying PowerVM
> Hypervisor would schedule at a Big Core granularity. The most recent power
> processors support two almost independent cores. In a lightly loaded
> condition, it helps the overall system performance if we pack to lesser
> number of Big Cores.
>

Is this good to do if the systems are not over-provisioned? What will be
the performance impact in that case with and without the change?

-aneesh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations
  2023-11-15  5:54 ` [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations Aneesh Kumar K.V
@ 2023-11-15  6:16   ` Srikar Dronamraju
  0 siblings, 0 replies; 15+ messages in thread
From: Srikar Dronamraju @ 2023-11-15  6:16 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Paul E. McKenney, Peter Zijlstra, linux-kernel, Nicholas Piggin,
	Rohan McLure, linuxppc-dev, Josh Poimboeuf

* Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [2023-11-15 11:24:59]:

> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> 
> > PowerVM systems configured in shared processors mode have some unique
> > challenges. Some device-tree properties will be missing on a shared
> > processor. Hence some sched domains may not make sense for shared processor
> > systems.
> >
> > Most shared processor systems are over-provisioned. Underlying PowerVM
> > Hypervisor would schedule at a Big Core granularity. The most recent power
> > processors support two almost independent cores. In a lightly loaded
> > condition, it helps the overall system performance if we pack to lesser
> > number of Big Cores.
> >
> 
> Is this good to do if the systems are not over-provisioned? What will be
> the performance impact in that case with and without the change?
> 

We are consolidating 1 thread per thread group (aka SMT domain).
Since each thread group is supposed to be independent, including having
private L1/L2/L3 caches, we expect minimal impact in the
non-over-provisioned scenario.

In the over-utilization scenario, the changes in this patchset will not even kick in.

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/5] powerpc/smp: Enable Asym packing for cores on shared processor
  2023-11-09  5:49 ` [PATCH v4 1/5] powerpc/smp: Enable Asym packing for cores on shared processor Srikar Dronamraju
  2023-11-15  5:27   ` Aneesh Kumar K.V
@ 2023-11-15  6:35   ` Aneesh Kumar K.V
  2023-11-15 11:35     ` Srikar Dronamraju
  1 sibling, 1 reply; 15+ messages in thread
From: Aneesh Kumar K.V @ 2023-11-15  6:35 UTC (permalink / raw)
  To: Srikar Dronamraju, Michael Ellerman, Nicholas Piggin,
	Christophe Leroy
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Srikar Dronamraju, Paul E. McKenney, Peter Zijlstra,
	ndesaulniers@google.com, linux-kernel, Rohan McLure, linuxppc-dev,
	Josh Poimboeuf

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:

> If there are shared processor LPARs, underlying Hypervisor can have more
> virtual cores to handle than actual physical cores.
>
> Starting with Power 9, a big core (aka SMT8 core) has 2 nearly
> independent thread groups. On a shared processors LPARs, it helps to
> pack threads to lesser number of cores so that the overall system
> performance and utilization improves. PowerVM schedules at a big core
> level. Hence packing to fewer cores helps.
>
> For example: Lets says there are two 8-core Shared LPARs that are
> actually sharing a 8 Core shared physical pool, each running 8 threads
> each. Then Consolidating 8 threads to 4 cores on each LPAR would help
> them to perform better. This is because each of the LPAR will get
> 100% time to run applications and there will no switching required by
> the Hypervisor.
>
> To achieve this, enable SD_ASYM_PACKING flag at CACHE, MC and DIE level
> when the system is running in shared processor mode and has big cores.
>
> Signed-off-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> ---
> Changelog:
> v3 -> v4:
> - Dont use splpar_asym_pack with SMT
> - Conflict resolution due to rebase
> 	(DIE changed to PKG)
> v2 -> v3:
> - Handle comments from Michael Ellerman.
> - Rework using existing cpu_has_features static key
> v1->v2: Using Jump label instead of a variable.
>
>  arch/powerpc/kernel/smp.c | 37 +++++++++++++++++++++++++++++--------
>  1 file changed, 29 insertions(+), 8 deletions(-)
>
> diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> index ab691c89d787..69a3262024f1 100644
> --- a/arch/powerpc/kernel/smp.c
> +++ b/arch/powerpc/kernel/smp.c
> @@ -993,16 +993,20 @@ static bool shared_caches;
>  /* cpumask of CPUs with asymmetric SMT dependency */
>  static int powerpc_smt_flags(void)
>  {
> -	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
> +	if (!cpu_has_feature(CPU_FTR_ASYM_SMT))
> +		return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
>  
> -	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
> -		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
> -		flags |= SD_ASYM_PACKING;
> -	}
> -	return flags;
> +	return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES | SD_ASYM_PACKING;
>  }
>  #endif
>

The only relevant change there is dropping the printk_once(). The rest
of the changes are not needed?

-aneesh

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 1/5] powerpc/smp: Enable Asym packing for cores on shared processor
  2023-11-15  6:35   ` Aneesh Kumar K.V
@ 2023-11-15 11:35     ` Srikar Dronamraju
  0 siblings, 0 replies; 15+ messages in thread
From: Srikar Dronamraju @ 2023-11-15 11:35 UTC (permalink / raw)
  To: Aneesh Kumar K.V
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Paul E. McKenney, Peter Zijlstra, ndesaulniers@google.com,
	linux-kernel, Rohan McLure, Nicholas Piggin, linuxppc-dev,
	Josh Poimboeuf

* Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com> [2023-11-15 12:05:22]:

> Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> 
> >
> >  arch/powerpc/kernel/smp.c | 37 +++++++++++++++++++++++++++++--------
> >  1 file changed, 29 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/powerpc/kernel/smp.c b/arch/powerpc/kernel/smp.c
> > index ab691c89d787..69a3262024f1 100644
> > --- a/arch/powerpc/kernel/smp.c
> > +++ b/arch/powerpc/kernel/smp.c
> > @@ -993,16 +993,20 @@ static bool shared_caches;
> >  /* cpumask of CPUs with asymmetric SMT dependency */
> >  static int powerpc_smt_flags(void)
> >  {
> > -	int flags = SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
> > +	if (!cpu_has_feature(CPU_FTR_ASYM_SMT))
> > +		return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES;
> >  
> > -	if (cpu_has_feature(CPU_FTR_ASYM_SMT)) {
> > -		printk_once(KERN_INFO "Enabling Asymmetric SMT scheduling\n");
> > -		flags |= SD_ASYM_PACKING;
> > -	}
> > -	return flags;
> > +	return SD_SHARE_CPUCAPACITY | SD_SHARE_PKG_RESOURCES | SD_ASYM_PACKING;
> >  }
> >  #endif
> >
> 
> Only relevant change there is dropping printk_once(). Rest of the
> changes are not needed?
> 
> -aneesh

If you are looking at just this hunk, then yes, it's moving the printk_once
to another function.

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations
  2023-11-09  5:49 [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations Srikar Dronamraju
                   ` (5 preceding siblings ...)
  2023-11-15  5:54 ` [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations Aneesh Kumar K.V
@ 2023-12-11  2:56 ` Srikar Dronamraju
  2023-12-11 10:45   ` Michael Ellerman
  2023-12-13 11:20   ` Aneesh Kumar K.V
  6 siblings, 2 replies; 15+ messages in thread
From: Srikar Dronamraju @ 2023-12-11  2:56 UTC (permalink / raw)
  To: Michael Ellerman
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Paul E. McKenney, Peter Zijlstra, linux-kernel, Nicholas Piggin,
	Rohan McLure, linuxppc-dev, Josh Poimboeuf

* Srikar Dronamraju <srikar@linux.vnet.ibm.com> [2023-11-09 11:19:28]:

Hi Michael,

> PowerVM systems configured in shared processors mode have some unique
> challenges. Some device-tree properties will be missing on a shared
> processor. Hence some sched domains may not make sense for shared processor
> systems.
> 
> 

Did you get a chance to look at this patchset?
Do you see this getting pulled into your merge branch?
I haven't seen any comments that require a change to the current patchset.

-- 
Thanks and Regards
Srikar Dronamraju

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations
  2023-12-11  2:56 ` Srikar Dronamraju
@ 2023-12-11 10:45   ` Michael Ellerman
  2023-12-13 11:20   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 15+ messages in thread
From: Michael Ellerman @ 2023-12-11 10:45 UTC (permalink / raw)
  To: Srikar Dronamraju
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Paul E. McKenney, Peter Zijlstra, linux-kernel, Nicholas Piggin,
	Rohan McLure, linuxppc-dev, Josh Poimboeuf

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:
> * Srikar Dronamraju <srikar@linux.vnet.ibm.com> [2023-11-09 11:19:28]:
>
> Hi Michael,
>
>> PowerVM systems configured in shared processors mode have some unique
>> challenges. Some device-tree properties will be missing on a shared
>> processor. Hence some sched domains may not make sense for shared processor
>> systems.
>> 
>> 
>
> Did you get a chance to look at this patchset?
> Do you see this getting pulled into your merge branch?
> I havent seen any comments that requires a change from the current patchset.

I was assuming you'd send another version to at least incorporate the
clarifications you posted.

And I wasn't really sure the discussion about the printk_once() change
was resolved. Anyway I'll check with Aneesh.

cheers

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH v4 0/5] powerpc/smp: Topology and shared processor optimizations
  2023-12-11  2:56 ` Srikar Dronamraju
  2023-12-11 10:45   ` Michael Ellerman
@ 2023-12-13 11:20   ` Aneesh Kumar K.V
  1 sibling, 0 replies; 15+ messages in thread
From: Aneesh Kumar K.V @ 2023-12-13 11:20 UTC (permalink / raw)
  To: Srikar Dronamraju, Michael Ellerman
  Cc: Mark Rutland, Valentin Schneider, Vincent Guittot,
	Paul E. McKenney, Peter Zijlstra, linux-kernel, Nicholas Piggin,
	Rohan McLure, linuxppc-dev, Josh Poimboeuf

Srikar Dronamraju <srikar@linux.vnet.ibm.com> writes:

> * Srikar Dronamraju <srikar@linux.vnet.ibm.com> [2023-11-09 11:19:28]:
>
> Hi Michael,
>
>> PowerVM systems configured in shared processors mode have some unique
>> challenges. Some device-tree properties will be missing on a shared
>> processor. Hence some sched domains may not make sense for shared processor
>> systems.
>> 
>> 
>
> Did you get a chance to look at this patchset?
> Do you see this getting pulled into your merge branch?
> I havent seen any comments that requires a change from the current patchset.
>

It would be helpful if you could include the details mentioned in your
reply in the commit message. Specifically, provide information about
the over-provisioned config, and if you plan to send another update,
please remove the additional changes in the printk_once section.

Reviewed-by: Aneesh Kumar K.V (IBM) <aneesh.kumar@kernel.org>

Thank you.
-aneesh

^ permalink raw reply	[flat|nested] 15+ messages in thread
