From: Frederic Weisbecker <frederic@kernel.org>
To: Finn Thain <fthain@linux-m68k.org>
Cc: Ingo Molnar <mingo@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
Juri Lelli <juri.lelli@redhat.com>,
Vincent Guittot <vincent.guittot@linaro.org>,
Dietmar Eggemann <dietmar.eggemann@arm.com>,
Steven Rostedt <rostedt@goodmis.org>,
Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
Daniel Bristot de Oliveira <bristot@redhat.com>,
Valentin Schneider <vschneid@redhat.com>,
Thomas Gleixner <tglx@linutronix.de>,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH] sched: Optimize in_task() and in_interrupt() a bit
Date: Fri, 14 Jul 2023 14:02:53 +0200 [thread overview]
Message-ID: <ZLE5bSAy6857cq9B@lothringen> (raw)
In-Reply-To: <44ad7a7afa1b8b1383426971402d2901361db1c5.1689326311.git.fthain@linux-m68k.org>
On Fri, Jul 14, 2023 at 07:18:31PM +1000, Finn Thain wrote:
> Except on x86, preempt_count is always accessed with READ_ONCE.
> Repeated invocations in macros like irq_count() produce repeated loads.
> These redundant instructions appear in various fast paths. In the one
> shown below, for example, irq_count() is evaluated during kernel entry
> if !tick_nohz_full_cpu(smp_processor_id()).
>
> 0001ed0a <irq_enter_rcu>:
> 1ed0a: 4e56 0000 linkw %fp,#0
> 1ed0e: 200f movel %sp,%d0
> 1ed10: 0280 ffff e000 andil #-8192,%d0
> 1ed16: 2040 moveal %d0,%a0
> 1ed18: 2028 0008 movel %a0@(8),%d0
> 1ed1c: 0680 0001 0000 addil #65536,%d0
> 1ed22: 2140 0008 movel %d0,%a0@(8)
> 1ed26: 082a 0001 000f btst #1,%a2@(15)
> 1ed2c: 670c beqs 1ed3a <irq_enter_rcu+0x30>
> 1ed2e: 2028 0008 movel %a0@(8),%d0
> 1ed32: 2028 0008 movel %a0@(8),%d0
> 1ed36: 2028 0008 movel %a0@(8),%d0
> 1ed3a: 4e5e unlk %fp
> 1ed3c: 4e75 rts
>
> This patch doesn't prevent the pointless btst and beqs instructions
> above, but it does eliminate 2 of the 3 pointless move instructions
> here and elsewhere.
>
> On x86, preempt_count is per-cpu data and the problem does not arise
> perhaps because the compiler is free to perform similar optimizations.
>
> Cc: Thomas Gleixner <tglx@linutronix.de>
> Fixes: 15115830c887 ("preempt: Cleanup the macro maze a bit")
Does this optimization really deserves a "Fixes:" tag?
> Signed-off-by: Finn Thain <fthain@linux-m68k.org>
> ---
> This patch was tested on m68k and x86. I was expecting no changes
> to object code for x86 and mostly that's what I saw. However, there
> were a few places where code generation was perturbed for some reason.
> ---
> include/linux/preempt.h | 9 +++++++--
> 1 file changed, 7 insertions(+), 2 deletions(-)
>
> diff --git a/include/linux/preempt.h b/include/linux/preempt.h
> index 0df425bf9bd7..953358e40291 100644
> --- a/include/linux/preempt.h
> +++ b/include/linux/preempt.h
> @@ -102,10 +102,11 @@ static __always_inline unsigned char interrupt_context_level(void)
> #define hardirq_count() (preempt_count() & HARDIRQ_MASK)
> #ifdef CONFIG_PREEMPT_RT
> # define softirq_count() (current->softirq_disable_cnt & SOFTIRQ_MASK)
> +# define irq_count() ((preempt_count() & (NMI_MASK | HARDIRQ_MASK)) | softirq_count())
> #else
> # define softirq_count() (preempt_count() & SOFTIRQ_MASK)
> +# define irq_count() (preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_MASK))
> #endif
> -#define irq_count() (nmi_count() | hardirq_count() | softirq_count())
Perhaps add a comment as to why you're making these two versions (ie: because
that avoids three consecutive reads), otherwise people may be tempted to roll
that back again in the future to make the code shorter.
>
> /*
> * Macros to retrieve the current execution context:
> @@ -118,7 +119,11 @@ static __always_inline unsigned char interrupt_context_level(void)
> #define in_nmi() (nmi_count())
> #define in_hardirq() (hardirq_count())
> #define in_serving_softirq() (softirq_count() & SOFTIRQ_OFFSET)
> -#define in_task() (!(in_nmi() | in_hardirq() | in_serving_softirq()))
> +#ifdef CONFIG_PREEMPT_RT
> +# define in_task() (!((preempt_count() & (NMI_MASK | HARDIRQ_MASK)) | in_serving_softirq()))
> +#else
> +# define in_task() (!(preempt_count() & (NMI_MASK | HARDIRQ_MASK | SOFTIRQ_OFFSET)))
> +#endif
Same here, thanks!
>
> /*
> * The following macros are deprecated and should not be used in new code:
> --
> 2.39.3
>
next prev parent reply other threads:[~2023-07-14 12:03 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-07-14 9:18 [PATCH] sched: Optimize in_task() and in_interrupt() a bit Finn Thain
2023-07-14 12:02 ` Frederic Weisbecker [this message]
2023-07-14 23:48 ` Finn Thain
-- strict thread matches above, loose matches on Subject: below --
2023-09-15 5:47 Finn Thain
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZLE5bSAy6857cq9B@lothringen \
--to=frederic@kernel.org \
--cc=bristot@redhat.com \
--cc=bsegall@google.com \
--cc=dietmar.eggemann@arm.com \
--cc=fthain@linux-m68k.org \
--cc=juri.lelli@redhat.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mgorman@suse.de \
--cc=mingo@redhat.com \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
--cc=vincent.guittot@linaro.org \
--cc=vschneid@redhat.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.