public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2] lib/smp_processor_id: Don't use cpumask_equal()
@ 2019-10-03 20:36 Waiman Long
  2019-10-04  9:20 ` Sebastian Andrzej Siewior
                   ` (3 more replies)
  0 siblings, 4 replies; 6+ messages in thread
From: Waiman Long @ 2019-10-03 20:36 UTC (permalink / raw)
  To: Ingo Molnar, Peter Zijlstra (Intel), Thomas Gleixner
  Cc: linux-kernel, Masami Hiramatsu, Sebastian Andrzej Siewior,
	Juri Lelli, Waiman Long

The check_preemption_disabled() function uses cpumask_equal() to see
if the task is bounded to the current CPU only. cpumask_equal() calls
memcmp() to do the comparison. As x86 doesn't have __HAVE_ARCH_MEMCMP,
the slow memcmp() function in lib/string.c is used.

On a RT kernel that call check_preemption_disabled() very frequently,
below is the perf-record output of a certain microbenchmark:

  42.75%  2.45%  testpmd [kernel.kallsyms] [k] check_preemption_disabled
  40.01% 39.97%  testpmd [kernel.kallsyms] [k] memcmp

We should avoid calling memcmp() in performance critical path. So the
cpumask_equal() call is now replaced with an equivalent simpler check.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 lib/smp_processor_id.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c
index 60ba93fc42ce..bd9571653288 100644
--- a/lib/smp_processor_id.c
+++ b/lib/smp_processor_id.c
@@ -23,7 +23,7 @@ unsigned int check_preemption_disabled(const char *what1, const char *what2)
 	 * Kernel threads bound to a single CPU can safely use
 	 * smp_processor_id():
 	 */
-	if (cpumask_equal(current->cpus_ptr, cpumask_of(this_cpu)))
+	if (current->nr_cpus_allowed == 1)
 		goto out;
 
 	/*
-- 
2.18.1


^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] lib/smp_processor_id: Don't use cpumask_equal()
  2019-10-03 20:36 [PATCH v2] lib/smp_processor_id: Don't use cpumask_equal() Waiman Long
@ 2019-10-04  9:20 ` Sebastian Andrzej Siewior
  2019-10-04 12:30   ` Waiman Long
  2019-10-04 10:32 ` Juri Lelli
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 6+ messages in thread
From: Sebastian Andrzej Siewior @ 2019-10-04  9:20 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra (Intel), Thomas Gleixner,
	linux-kernel, Masami Hiramatsu, Juri Lelli

On 2019-10-03 16:36:08 [-0400], Waiman Long wrote:
> The check_preemption_disabled() function uses cpumask_equal() to see
> if the task is bounded to the current CPU only. cpumask_equal() calls
> memcmp() to do the comparison. As x86 doesn't have __HAVE_ARCH_MEMCMP,
> the slow memcmp() function in lib/string.c is used.
> 
> On a RT kernel that call check_preemption_disabled() very frequently,
> below is the perf-record output of a certain microbenchmark:
> 
>   42.75%  2.45%  testpmd [kernel.kallsyms] [k] check_preemption_disabled
>   40.01% 39.97%  testpmd [kernel.kallsyms] [k] memcmp
> 
> We should avoid calling memcmp() in performance critical path. So the
> cpumask_equal() call is now replaced with an equivalent simpler check.

using a simple integer comparison is still more efficient than what
__HAVE_ARCH_MEMCMP can offer.

> Signed-off-by: Waiman Long <longman@redhat.com>
Acked-by:  Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Sebastian

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] lib/smp_processor_id: Don't use cpumask_equal()
  2019-10-03 20:36 [PATCH v2] lib/smp_processor_id: Don't use cpumask_equal() Waiman Long
  2019-10-04  9:20 ` Sebastian Andrzej Siewior
@ 2019-10-04 10:32 ` Juri Lelli
  2019-10-07 12:39 ` Peter Zijlstra
  2019-10-09 12:59 ` [tip: locking/core] " tip-bot2 for Waiman Long
  3 siblings, 0 replies; 6+ messages in thread
From: Juri Lelli @ 2019-10-04 10:32 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Peter Zijlstra (Intel), Thomas Gleixner,
	linux-kernel, Masami Hiramatsu, Sebastian Andrzej Siewior,
	Juri Lelli

Hi,

On 03/10/19 16:36, Waiman Long wrote:
> The check_preemption_disabled() function uses cpumask_equal() to see
> if the task is bounded to the current CPU only. cpumask_equal() calls
> memcmp() to do the comparison. As x86 doesn't have __HAVE_ARCH_MEMCMP,
> the slow memcmp() function in lib/string.c is used.
> 
> On a RT kernel that call check_preemption_disabled() very frequently,
> below is the perf-record output of a certain microbenchmark:
> 
>   42.75%  2.45%  testpmd [kernel.kallsyms] [k] check_preemption_disabled
>   40.01% 39.97%  testpmd [kernel.kallsyms] [k] memcmp
> 
> We should avoid calling memcmp() in performance critical path. So the
> cpumask_equal() call is now replaced with an equivalent simpler check.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  lib/smp_processor_id.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c
> index 60ba93fc42ce..bd9571653288 100644
> --- a/lib/smp_processor_id.c
> +++ b/lib/smp_processor_id.c
> @@ -23,7 +23,7 @@ unsigned int check_preemption_disabled(const char *what1, const char *what2)
>  	 * Kernel threads bound to a single CPU can safely use
>  	 * smp_processor_id():
>  	 */
> -	if (cpumask_equal(current->cpus_ptr, cpumask_of(this_cpu)))
> +	if (current->nr_cpus_allowed == 1)
>  		goto out;

Makes sense to me.

Reviewed-by: Juri Lelli <juri.lelli@redhat.com>

Thanks,

Juri

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] lib/smp_processor_id: Don't use cpumask_equal()
  2019-10-04  9:20 ` Sebastian Andrzej Siewior
@ 2019-10-04 12:30   ` Waiman Long
  0 siblings, 0 replies; 6+ messages in thread
From: Waiman Long @ 2019-10-04 12:30 UTC (permalink / raw)
  To: Sebastian Andrzej Siewior
  Cc: Ingo Molnar, Peter Zijlstra (Intel), Thomas Gleixner,
	linux-kernel, Masami Hiramatsu, Juri Lelli

On 10/4/19 5:20 AM, Sebastian Andrzej Siewior wrote:
> On 2019-10-03 16:36:08 [-0400], Waiman Long wrote:
>> The check_preemption_disabled() function uses cpumask_equal() to see
>> if the task is bounded to the current CPU only. cpumask_equal() calls
>> memcmp() to do the comparison. As x86 doesn't have __HAVE_ARCH_MEMCMP,
>> the slow memcmp() function in lib/string.c is used.
>>
>> On a RT kernel that call check_preemption_disabled() very frequently,
>> below is the perf-record output of a certain microbenchmark:
>>
>>   42.75%  2.45%  testpmd [kernel.kallsyms] [k] check_preemption_disabled
>>   40.01% 39.97%  testpmd [kernel.kallsyms] [k] memcmp
>>
>> We should avoid calling memcmp() in performance critical path. So the
>> cpumask_equal() call is now replaced with an equivalent simpler check.
> using a simple integer comparison is still more efficient than what
> __HAVE_ARCH_MEMCMP can offer.
You are right. My main point is to try to avoid using cpumask_equal() in
performance critical path irrespective of this patch.
>> Signed-off-by: Waiman Long <longman@redhat.com>
> Acked-by:  Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>
> Sebastian

Cheers,
Longman


^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2] lib/smp_processor_id: Don't use cpumask_equal()
  2019-10-03 20:36 [PATCH v2] lib/smp_processor_id: Don't use cpumask_equal() Waiman Long
  2019-10-04  9:20 ` Sebastian Andrzej Siewior
  2019-10-04 10:32 ` Juri Lelli
@ 2019-10-07 12:39 ` Peter Zijlstra
  2019-10-09 12:59 ` [tip: locking/core] " tip-bot2 for Waiman Long
  3 siblings, 0 replies; 6+ messages in thread
From: Peter Zijlstra @ 2019-10-07 12:39 UTC (permalink / raw)
  To: Waiman Long
  Cc: Ingo Molnar, Thomas Gleixner, linux-kernel, Masami Hiramatsu,
	Sebastian Andrzej Siewior, Juri Lelli

On Thu, Oct 03, 2019 at 04:36:08PM -0400, Waiman Long wrote:
> The check_preemption_disabled() function uses cpumask_equal() to see
> if the task is bounded to the current CPU only. cpumask_equal() calls
> memcmp() to do the comparison. As x86 doesn't have __HAVE_ARCH_MEMCMP,
> the slow memcmp() function in lib/string.c is used.
> 
> On a RT kernel that call check_preemption_disabled() very frequently,
> below is the perf-record output of a certain microbenchmark:
> 
>   42.75%  2.45%  testpmd [kernel.kallsyms] [k] check_preemption_disabled
>   40.01% 39.97%  testpmd [kernel.kallsyms] [k] memcmp
> 
> We should avoid calling memcmp() in performance critical path. So the
> cpumask_equal() call is now replaced with an equivalent simpler check.
> 
> Signed-off-by: Waiman Long <longman@redhat.com>

Thanks!

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [tip: locking/core] lib/smp_processor_id: Don't use cpumask_equal()
  2019-10-03 20:36 [PATCH v2] lib/smp_processor_id: Don't use cpumask_equal() Waiman Long
                   ` (2 preceding siblings ...)
  2019-10-07 12:39 ` Peter Zijlstra
@ 2019-10-09 12:59 ` tip-bot2 for Waiman Long
  3 siblings, 0 replies; 6+ messages in thread
From: tip-bot2 for Waiman Long @ 2019-10-09 12:59 UTC (permalink / raw)
  To: linux-tip-commits
  Cc: Waiman Long, Peter Zijlstra (Intel), Juri Lelli,
	Sebastian Andrzej Siewior, Linus Torvalds, Masami Hiramatsu,
	Thomas Gleixner, Ingo Molnar, Borislav Petkov, linux-kernel

The following commit has been merged into the locking/core branch of tip:

Commit-ID:     e950cca3f3c40902a052a78a36b3fac1f8a62d19
Gitweb:        https://git.kernel.org/tip/e950cca3f3c40902a052a78a36b3fac1f8a62d19
Author:        Waiman Long <longman@redhat.com>
AuthorDate:    Thu, 03 Oct 2019 16:36:08 -04:00
Committer:     Ingo Molnar <mingo@kernel.org>
CommitterDate: Wed, 09 Oct 2019 12:46:10 +02:00

lib/smp_processor_id: Don't use cpumask_equal()

The check_preemption_disabled() function uses cpumask_equal() to see
if the task is bounded to the current CPU only. cpumask_equal() calls
memcmp() to do the comparison. As x86 doesn't have __HAVE_ARCH_MEMCMP,
the slow memcmp() function in lib/string.c is used.

On a RT kernel that call check_preemption_disabled() very frequently,
below is the perf-record output of a certain microbenchmark:

  42.75%  2.45%  testpmd [kernel.kallsyms] [k] check_preemption_disabled
  40.01% 39.97%  testpmd [kernel.kallsyms] [k] memcmp

We should avoid calling memcmp() in performance critical path. So the
cpumask_equal() call is now replaced with an equivalent simpler check.

Signed-off-by: Waiman Long <longman@redhat.com>
Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Reviewed-by: Juri Lelli <juri.lelli@redhat.com>
Acked-by:  Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/20191003203608.21881-1-longman@redhat.com
Signed-off-by: Ingo Molnar <mingo@kernel.org>
---
 lib/smp_processor_id.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/lib/smp_processor_id.c b/lib/smp_processor_id.c
index 60ba93f..bd95716 100644
--- a/lib/smp_processor_id.c
+++ b/lib/smp_processor_id.c
@@ -23,7 +23,7 @@ unsigned int check_preemption_disabled(const char *what1, const char *what2)
 	 * Kernel threads bound to a single CPU can safely use
 	 * smp_processor_id():
 	 */
-	if (cpumask_equal(current->cpus_ptr, cpumask_of(this_cpu)))
+	if (current->nr_cpus_allowed == 1)
 		goto out;
 
 	/*

^ permalink raw reply related	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2019-10-09 12:59 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2019-10-03 20:36 [PATCH v2] lib/smp_processor_id: Don't use cpumask_equal() Waiman Long
2019-10-04  9:20 ` Sebastian Andrzej Siewior
2019-10-04 12:30   ` Waiman Long
2019-10-04 10:32 ` Juri Lelli
2019-10-07 12:39 ` Peter Zijlstra
2019-10-09 12:59 ` [tip: locking/core] " tip-bot2 for Waiman Long

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox