linux-kernel.vger.kernel.org archive mirror
* [PATCH v4 2/2] sched/nohz: Optimize get_nohz_timer_target()
@ 2019-06-28  8:51 Wanpeng Li
  2019-06-28  8:51 ` [PATCH v4 1/2] sched/isolation: Prefer housekeeping cpu in local node Wanpeng Li
  0 siblings, 1 reply; 4+ messages in thread
From: Wanpeng Li @ 2019-06-28  8:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Peter Zijlstra, Ingo Molnar, Frederic Weisbecker,
	Thomas Gleixner

From: Wanpeng Li <wanpengli@tencent.com>

On a machine where CPU 0 is used for housekeeping and the other 39 CPUs in
the same socket are in nohz_full mode, ftrace shows a large amount of time
being burned in the loop that searches for the nearest busy housekeeping CPU:

  2)               |       get_nohz_timer_target() {
  2)   0.240 us    |         housekeeping_test_cpu();
  2)   0.458 us    |         housekeeping_test_cpu();

  ...

  2)   0.292 us    |         housekeeping_test_cpu();
  2)   0.240 us    |         housekeeping_test_cpu();
  2)   0.227 us    |         housekeeping_any_cpu();
  2) + 43.460 us   |       }
  
This patch optimizes the search by looking for the nearest busy housekeeper
only among the CPUs in the housekeeping cpumask, which reduces the worst-case
search time from ~44us to < 10us in my testing. In addition, the last iterated
busy housekeeper would be an arbitrary candidate, whereas the current CPU is a
better fallback if it is a housekeeper.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Cc: Ingo Molnar <mingo@redhat.com> 
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
v1 -> v2:
 * current CPU is a better fallback if it is a housekeeper

 kernel/sched/core.c | 19 ++++++++++++-------
 1 file changed, 12 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 102dfcf..04a0f6a 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -539,27 +539,32 @@ void resched_cpu(int cpu)
  */
 int get_nohz_timer_target(void)
 {
-	int i, cpu = smp_processor_id();
+	int i, cpu = smp_processor_id(), default_cpu = -1;
 	struct sched_domain *sd;
 
-	if (!idle_cpu(cpu) && housekeeping_cpu(cpu, HK_FLAG_TIMER))
-		return cpu;
+	if (housekeeping_cpu(cpu, HK_FLAG_TIMER)) {
+		if (!idle_cpu(cpu))
+			return cpu;
+		default_cpu = cpu;
+	}
 
 	rcu_read_lock();
 	for_each_domain(cpu, sd) {
-		for_each_cpu(i, sched_domain_span(sd)) {
+		for_each_cpu_and(i, sched_domain_span(sd),
+			housekeeping_cpumask(HK_FLAG_TIMER)) {
 			if (cpu == i)
 				continue;
 
-			if (!idle_cpu(i) && housekeeping_cpu(i, HK_FLAG_TIMER)) {
+			if (!idle_cpu(i)) {
 				cpu = i;
 				goto unlock;
 			}
 		}
 	}
 
-	if (!housekeeping_cpu(cpu, HK_FLAG_TIMER))
-		cpu = housekeeping_any_cpu(HK_FLAG_TIMER);
+	if (default_cpu == -1)
+		default_cpu = housekeeping_any_cpu(HK_FLAG_TIMER);
+	cpu = default_cpu;
 unlock:
 	rcu_read_unlock();
 	return cpu;
-- 
1.8.3.1
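
The effect of switching from for_each_cpu() to for_each_cpu_and() in the patch
above can be illustrated with a small user-space model. This is only a sketch
under stated assumptions: CPU masks are modelled as a single 64-bit word, and
search_all_cpus(), search_housekeepers() and model_test_cpu() are hypothetical
names that do not exist in the kernel. The "visited" counter stands in for the
number of loop bodies executed per search.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS 64	/* assumption: at most 64 CPUs, one bit each */

/* Return nonzero if @cpu is set in @mask. */
int model_test_cpu(uint64_t mask, int cpu)
{
	return (int)((mask >> cpu) & 1ULL);
}

/*
 * Old scheme: visit every CPU in @span and test housekeeping membership
 * per CPU. Returns the first busy housekeeper other than @self, or -1.
 */
int search_all_cpus(uint64_t span, uint64_t hk, uint64_t busy,
		    int self, int *visited)
{
	assert(self >= 0 && self < MODEL_NR_CPUS);
	*visited = 0;
	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		if (!model_test_cpu(span, i) || i == self)
			continue;
		(*visited)++;
		if (model_test_cpu(busy, i) && model_test_cpu(hk, i))
			return i;
	}
	return -1;
}

/*
 * New scheme: restrict the candidates to span & hk up front, so that
 * isolated CPUs never reach the loop body.
 */
int search_housekeepers(uint64_t span, uint64_t hk, uint64_t busy,
			int self, int *visited)
{
	uint64_t candidates = span & hk;

	assert(self >= 0 && self < MODEL_NR_CPUS);
	*visited = 0;
	for (int i = 0; i < MODEL_NR_CPUS; i++) {
		if (!model_test_cpu(candidates, i) || i == self)
			continue;
		(*visited)++;
		if (model_test_cpu(busy, i))
			return i;
	}
	return -1;
}
```

With 40 CPUs in the span, CPU 0 as the only (idle) housekeeper and the search
started from CPU 2, the first function runs its loop body for 39 CPUs while the
second runs it once. The model still scans all 64 bit positions; it only
counts candidate visits and says nothing about real kernel timings.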



* [PATCH v4 1/2] sched/isolation: Prefer housekeeping cpu in local node
  2019-06-28  8:51 [PATCH v4 2/2] sched/nohz: Optimize get_nohz_timer_target() Wanpeng Li
@ 2019-06-28  8:51 ` Wanpeng Li
  2019-07-08  4:05   ` Wanpeng Li
  2019-07-25 16:18   ` [tip:sched/core] sched/isolation: Prefer housekeeping CPU " tip-bot for Wanpeng Li
  0 siblings, 2 replies; 4+ messages in thread
From: Wanpeng Li @ 2019-06-28  8:51 UTC (permalink / raw)
  To: linux-kernel
  Cc: Ingo Molnar, Peter Zijlstra, Ingo Molnar, Frederic Weisbecker,
	Thomas Gleixner, Srikar Dronamraju

From: Wanpeng Li <wanpengli@tencent.com>

In a real product setup there will be housekeeping CPUs in each node. It is
preferable to do housekeeping from the local node, and to fall back to the
global online cpumask only if no housekeeping CPU is found in the local node.

Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Ingo Molnar <mingo@redhat.com> 
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Frederic Weisbecker <frederic@kernel.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
---
v3 -> v4:
 * have a static function for sched_numa_find_closest 
 * cleanup sched_numa_find_closest comments
v2 -> v3:
 * add sched_numa_find_closest comments
v1 -> v2:
 * introduce sched_numa_find_closest

 kernel/sched/isolation.c | 12 ++++++++++--
 kernel/sched/sched.h     |  8 +++++---
 kernel/sched/topology.c  | 20 ++++++++++++++++++++
 3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index 7b9e1e0..191f751 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -16,9 +16,17 @@ static unsigned int housekeeping_flags;
 
 int housekeeping_any_cpu(enum hk_flags flags)
 {
-	if (static_branch_unlikely(&housekeeping_overridden))
-		if (housekeeping_flags & flags)
+	int cpu;
+
+	if (static_branch_unlikely(&housekeeping_overridden)) {
+		if (housekeeping_flags & flags) {
+			cpu = sched_numa_find_closest(housekeeping_mask, smp_processor_id());
+			if (cpu < nr_cpu_ids)
+				return cpu;
+
 			return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+		}
+	}
 	return smp_processor_id();
 }
 EXPORT_SYMBOL_GPL(housekeeping_any_cpu);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index 802b1f3..ec65d90 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1261,16 +1261,18 @@ enum numa_topology_type {
 extern enum numa_topology_type sched_numa_topology_type;
 extern int sched_max_numa_distance;
 extern bool find_numa_distance(int distance);
-#endif
-
-#ifdef CONFIG_NUMA
 extern void sched_init_numa(void);
 extern void sched_domains_numa_masks_set(unsigned int cpu);
 extern void sched_domains_numa_masks_clear(unsigned int cpu);
+extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
 #else
 static inline void sched_init_numa(void) { }
 static inline void sched_domains_numa_masks_set(unsigned int cpu) { }
 static inline void sched_domains_numa_masks_clear(unsigned int cpu) { }
+static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
+{
+	return nr_cpu_ids;
+}
 #endif
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index f751ce0..4eea2c9 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1724,6 +1724,26 @@ void sched_domains_numa_masks_clear(unsigned int cpu)
 	}
 }
 
+/*
+ * sched_numa_find_closest() - given the NUMA topology, find the cpu
+ *                             closest to @cpu from @cpus.
+ * cpus: cpumask to find a cpu from
+ * cpu: cpu to be close to
+ *
+ * returns: cpu, or nr_cpu_ids when nothing found.
+ */
+int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
+{
+	int i, j = cpu_to_node(cpu);
+
+	for (i = 0; i < sched_domains_numa_levels; i++) {
+		cpu = cpumask_any_and(cpus, sched_domains_numa_masks[i][j]);
+		if (cpu < nr_cpu_ids)
+			return cpu;
+	}
+	return nr_cpu_ids;
+}
+
 #endif /* CONFIG_NUMA */
 
 static int __sdt_alloc(const struct cpumask *cpu_map)
-- 
2.7.4
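
The level-by-level lookup done by sched_numa_find_closest() in the patch above
can be sketched in user space as follows. Everything here is an assumption made
for illustration: the 8-CPU, 2-node topology, the level_masks table (standing
in for sched_domains_numa_masks), and the model_* function names are all
hypothetical. The model picks the lowest-numbered matching CPU, whereas
cpumask_any_and() only promises to return some CPU in the intersection.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_NR_CPUS	8
#define MODEL_NR_NODES	2
#define MODEL_NR_LEVELS	2

/*
 * level_masks[level][node]: CPUs reachable from @node within that distance
 * level. Assumed topology: node 0 = CPUs 0-3, node 1 = CPUs 4-7.
 */
static const uint64_t level_masks[MODEL_NR_LEVELS][MODEL_NR_NODES] = {
	{ 0x0F, 0xF0 },		/* level 0: local node only */
	{ 0xFF, 0xFF },		/* level 1: whole machine */
};

/* Assumed mapping: four CPUs per node. */
int model_cpu_to_node(int cpu)
{
	return cpu / 4;
}

/* Lowest set bit in @mask, or MODEL_NR_CPUS if the mask is empty. */
int model_first_cpu(uint64_t mask)
{
	for (int i = 0; i < MODEL_NR_CPUS; i++)
		if ((mask >> i) & 1ULL)
			return i;
	return MODEL_NR_CPUS;
}

/*
 * Walk the distance levels outwards from @cpu's node and return the first
 * CPU of @cpus found, or MODEL_NR_CPUS when nothing matches at any level.
 */
int model_find_closest(uint64_t cpus, int cpu)
{
	int node;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	node = model_cpu_to_node(cpu);

	for (int level = 0; level < MODEL_NR_LEVELS; level++) {
		int found = model_first_cpu(cpus & level_masks[level][node]);

		if (found < MODEL_NR_CPUS)
			return found;
	}
	return MODEL_NR_CPUS;
}
```

With housekeepers on CPUs 0 and 4 (mask 0x11), a lookup from CPU 6 resolves at
level 0 to the local housekeeper, CPU 4. If CPU 0 is the only housekeeper, the
same lookup misses at level 0 and resolves at level 1 to CPU 0. An empty mask
yields MODEL_NR_CPUS, mirroring the nr_cpu_ids return that makes
housekeeping_any_cpu() fall back to the online-mask search.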



* Re: [PATCH v4 1/2] sched/isolation: Prefer housekeeping cpu in local node
  2019-06-28  8:51 ` [PATCH v4 1/2] sched/isolation: Prefer housekeeping cpu in local node Wanpeng Li
@ 2019-07-08  4:05   ` Wanpeng Li
  2019-07-25 16:18   ` [tip:sched/core] sched/isolation: Prefer housekeeping CPU " tip-bot for Wanpeng Li
  1 sibling, 0 replies; 4+ messages in thread
From: Wanpeng Li @ 2019-07-08  4:05 UTC (permalink / raw)
  To: LKML
  Cc: Ingo Molnar, Peter Zijlstra, Ingo Molnar, Frederic Weisbecker,
	Thomas Gleixner, Srikar Dronamraju

Gentle ping for these two patches. :)

On Fri, 28 Jun 2019 at 16:51, Wanpeng Li <kernellwp@gmail.com> wrote:
>
> From: Wanpeng Li <wanpengli@tencent.com>
>
> In a real product setup there will be housekeeping CPUs in each node. It is
> preferable to do housekeeping from the local node, and to fall back to the
> global online cpumask only if no housekeeping CPU is found in the local node.
>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
> Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Cc: Frederic Weisbecker <frederic@kernel.org>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Cc: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
> Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
> ---
> v3 -> v4:
>  * have a static function for sched_numa_find_closest
>  * cleanup sched_numa_find_closest comments
> v2 -> v3:
>  * add sched_numa_find_closest comments
> v1 -> v2:
>  * introduce sched_numa_find_closest
>
>  kernel/sched/isolation.c | 12 ++++++++++--
>  kernel/sched/sched.h     |  8 +++++---
>  kernel/sched/topology.c  | 20 ++++++++++++++++++++
>  3 files changed, 35 insertions(+), 5 deletions(-)
>
> diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
> index 7b9e1e0..191f751 100644
> --- a/kernel/sched/isolation.c
> +++ b/kernel/sched/isolation.c
> @@ -16,9 +16,17 @@ static unsigned int housekeeping_flags;
>
>  int housekeeping_any_cpu(enum hk_flags flags)
>  {
> -       if (static_branch_unlikely(&housekeeping_overridden))
> -               if (housekeeping_flags & flags)
> +       int cpu;
> +
> +       if (static_branch_unlikely(&housekeeping_overridden)) {
> +               if (housekeeping_flags & flags) {
> +                       cpu = sched_numa_find_closest(housekeeping_mask, smp_processor_id());
> +                       if (cpu < nr_cpu_ids)
> +                               return cpu;
> +
>                         return cpumask_any_and(housekeeping_mask, cpu_online_mask);
> +               }
> +       }
>         return smp_processor_id();
>  }
>  EXPORT_SYMBOL_GPL(housekeeping_any_cpu);
> diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
> index 802b1f3..ec65d90 100644
> --- a/kernel/sched/sched.h
> +++ b/kernel/sched/sched.h
> @@ -1261,16 +1261,18 @@ enum numa_topology_type {
>  extern enum numa_topology_type sched_numa_topology_type;
>  extern int sched_max_numa_distance;
>  extern bool find_numa_distance(int distance);
> -#endif
> -
> -#ifdef CONFIG_NUMA
>  extern void sched_init_numa(void);
>  extern void sched_domains_numa_masks_set(unsigned int cpu);
>  extern void sched_domains_numa_masks_clear(unsigned int cpu);
> +extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
>  #else
>  static inline void sched_init_numa(void) { }
>  static inline void sched_domains_numa_masks_set(unsigned int cpu) { }
>  static inline void sched_domains_numa_masks_clear(unsigned int cpu) { }
> +static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
> +{
> +       return nr_cpu_ids;
> +}
>  #endif
>
>  #ifdef CONFIG_NUMA_BALANCING
> diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
> index f751ce0..4eea2c9 100644
> --- a/kernel/sched/topology.c
> +++ b/kernel/sched/topology.c
> @@ -1724,6 +1724,26 @@ void sched_domains_numa_masks_clear(unsigned int cpu)
>         }
>  }
>
> +/*
> + * sched_numa_find_closest() - given the NUMA topology, find the cpu
> + *                             closest to @cpu from @cpus.
> + * cpus: cpumask to find a cpu from
> + * cpu: cpu to be close to
> + *
> + * returns: cpu, or nr_cpu_ids when nothing found.
> + */
> +int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
> +{
> +       int i, j = cpu_to_node(cpu);
> +
> +       for (i = 0; i < sched_domains_numa_levels; i++) {
> +               cpu = cpumask_any_and(cpus, sched_domains_numa_masks[i][j]);
> +               if (cpu < nr_cpu_ids)
> +                       return cpu;
> +       }
> +       return nr_cpu_ids;
> +}
> +
>  #endif /* CONFIG_NUMA */
>
>  static int __sdt_alloc(const struct cpumask *cpu_map)
> --
> 2.7.4
>


* [tip:sched/core] sched/isolation: Prefer housekeeping CPU in local node
  2019-06-28  8:51 ` [PATCH v4 1/2] sched/isolation: Prefer housekeeping cpu in local node Wanpeng Li
  2019-07-08  4:05   ` Wanpeng Li
@ 2019-07-25 16:18   ` tip-bot for Wanpeng Li
  1 sibling, 0 replies; 4+ messages in thread
From: tip-bot for Wanpeng Li @ 2019-07-25 16:18 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: torvalds, srikar, frederic, peterz, mingo, hpa, tglx, wanpengli,
	linux-kernel

Commit-ID:  e0e8d4911ed2695b12c3a01c15634000ede9bc73
Gitweb:     https://git.kernel.org/tip/e0e8d4911ed2695b12c3a01c15634000ede9bc73
Author:     Wanpeng Li <wanpengli@tencent.com>
AuthorDate: Fri, 28 Jun 2019 16:51:41 +0800
Committer:  Ingo Molnar <mingo@kernel.org>
CommitDate: Thu, 25 Jul 2019 15:51:55 +0200

sched/isolation: Prefer housekeeping CPU in local node

In a real product setup there will be housekeeping CPUs in each node. It is
preferable to do housekeeping from the local node, and to fall back to the
global online cpumask only if no housekeeping CPU is found in the local node.

Signed-off-by: Wanpeng Li <wanpengli@tencent.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Srikar Dronamraju <srikar@linux.vnet.ibm.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1561711901-4755-2-git-send-email-wanpengli@tencent.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 kernel/sched/isolation.c | 12 ++++++++++--
 kernel/sched/sched.h     |  8 +++++---
 kernel/sched/topology.c  | 20 ++++++++++++++++++++
 3 files changed, 35 insertions(+), 5 deletions(-)

diff --git a/kernel/sched/isolation.c b/kernel/sched/isolation.c
index ccb28085b114..9fcb2a695a41 100644
--- a/kernel/sched/isolation.c
+++ b/kernel/sched/isolation.c
@@ -22,9 +22,17 @@ EXPORT_SYMBOL_GPL(housekeeping_enabled);
 
 int housekeeping_any_cpu(enum hk_flags flags)
 {
-	if (static_branch_unlikely(&housekeeping_overridden))
-		if (housekeeping_flags & flags)
+	int cpu;
+
+	if (static_branch_unlikely(&housekeeping_overridden)) {
+		if (housekeeping_flags & flags) {
+			cpu = sched_numa_find_closest(housekeeping_mask, smp_processor_id());
+			if (cpu < nr_cpu_ids)
+				return cpu;
+
 			return cpumask_any_and(housekeeping_mask, cpu_online_mask);
+		}
+	}
 	return smp_processor_id();
 }
 EXPORT_SYMBOL_GPL(housekeeping_any_cpu);
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index aaca0e743776..16126efd14ed 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -1262,16 +1262,18 @@ enum numa_topology_type {
 extern enum numa_topology_type sched_numa_topology_type;
 extern int sched_max_numa_distance;
 extern bool find_numa_distance(int distance);
-#endif
-
-#ifdef CONFIG_NUMA
 extern void sched_init_numa(void);
 extern void sched_domains_numa_masks_set(unsigned int cpu);
 extern void sched_domains_numa_masks_clear(unsigned int cpu);
+extern int sched_numa_find_closest(const struct cpumask *cpus, int cpu);
 #else
 static inline void sched_init_numa(void) { }
 static inline void sched_domains_numa_masks_set(unsigned int cpu) { }
 static inline void sched_domains_numa_masks_clear(unsigned int cpu) { }
+static inline int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
+{
+	return nr_cpu_ids;
+}
 #endif
 
 #ifdef CONFIG_NUMA_BALANCING
diff --git a/kernel/sched/topology.c b/kernel/sched/topology.c
index f751ce0b783e..4eea2c9bc732 100644
--- a/kernel/sched/topology.c
+++ b/kernel/sched/topology.c
@@ -1724,6 +1724,26 @@ void sched_domains_numa_masks_clear(unsigned int cpu)
 	}
 }
 
+/*
+ * sched_numa_find_closest() - given the NUMA topology, find the cpu
+ *                             closest to @cpu from @cpus.
+ * cpus: cpumask to find a cpu from
+ * cpu: cpu to be close to
+ *
+ * returns: cpu, or nr_cpu_ids when nothing found.
+ */
+int sched_numa_find_closest(const struct cpumask *cpus, int cpu)
+{
+	int i, j = cpu_to_node(cpu);
+
+	for (i = 0; i < sched_domains_numa_levels; i++) {
+		cpu = cpumask_any_and(cpus, sched_domains_numa_masks[i][j]);
+		if (cpu < nr_cpu_ids)
+			return cpu;
+	}
+	return nr_cpu_ids;
+}
+
 #endif /* CONFIG_NUMA */
 
 static int __sdt_alloc(const struct cpumask *cpu_map)
