From: Jisheng Zhang <jszhang@kernel.org>
To: Will Deacon <will@kernel.org>
Cc: Catalin Marinas <catalin.marinas@arm.com>,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH] arm64: save movk instructions in mov_q when the lower 16|32 bits are all zero
Date: Tue, 26 Jul 2022 21:44:40 +0800
Message-ID: <Yt/vyClCGr5XRPoO@xhacker>
In-Reply-To: <20220719181340.GC14526@willie-the-truck>

On Tue, Jul 19, 2022 at 07:13:41PM +0100, Will Deacon wrote:
> On Sat, Jul 09, 2022 at 04:48:30PM +0800, Jisheng Zhang wrote:
> > Currently mov_q is used to move a constant into a 64-bit register.
> > When the lower 16 or 32 bits of the constant are all zero, mov_q
> > emits one or two useless movk instructions. If the mov_q macro is used
> > in a hot code path, we want to save as many movk instructions as
> > possible. For example, when CONFIG_ARM64_MTE is 'Y' and
> > CONFIG_KASAN_HW_TAGS is 'N', the following code in the __cpu_setup()
> > routine is a potential optimization target:
> > 
> >         /* set the TCR_EL1 bits */
> >         mov_q   x10, TCR_MTE_FLAGS
> > 
> > Before the patch:
> > 	mov	x10, #0x10000000000000
> > 	movk	x10, #0x40, lsl #32
> > 	movk	x10, #0x0, lsl #16
> > 	movk	x10, #0x0
> > 
> > After the patch:
> > 	mov	x10, #0x10000000000000
> > 	movk	x10, #0x40, lsl #32
> > 
> > Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
> > ---
> >  arch/arm64/include/asm/assembler.h | 4 ++++
> >  1 file changed, 4 insertions(+)
> > 
> > diff --git a/arch/arm64/include/asm/assembler.h b/arch/arm64/include/asm/assembler.h
> > index 8c5a61aeaf8e..09f408424cae 100644
> > --- a/arch/arm64/include/asm/assembler.h
> > +++ b/arch/arm64/include/asm/assembler.h
> > @@ -568,9 +568,13 @@ alternative_endif
> >  	movz	\reg, :abs_g3:\val
> >  	movk	\reg, :abs_g2_nc:\val
> >  	.endif
> > +	.if ((((\val) >> 16) & 0xffff) != 0)
> >  	movk	\reg, :abs_g1_nc:\val
> >  	.endif
> > +	.endif
> > +	.if (((\val) & 0xffff) != 0)
> >  	movk	\reg, :abs_g0_nc:\val
> > +	.endif
> 
> Please provide some numbers showing that this is worthwhile.
> 

No, I have no performance numbers, but here is my opinion
about this patch: the two checks don't add maintenance effort and
remain readable, so if they can save two movk instructions,
it's worthwhile to add them.
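
For illustration only, the instruction counts can be sketched in C. This
is a rough model, not a measurement: the function names are hypothetical,
the first two .if conditions are not shown in the hunk above and are
assumed to match mainline mov_q, and the shifts are assumed to be logical
shifts on a 64-bit value. The constant is the one from the disassembly
above (bit 52 and bit 38 set).

```c
#include <stdint.h>

/* Instructions emitted by the unpatched macro (assumed structure). */
int mov_q_insns_before(uint64_t val)
{
	if ((val >> 31) == 0 || (val >> 31) == 0x1ffffffffULL)
		return 2;	/* movz g1_s + movk g0 */
	if ((val >> 47) == 0 || (val >> 47) == 0x1ffffULL)
		return 3;	/* movz g2_s + movk g1 + movk g0 */
	return 4;		/* movz g3 + movk g2 + movk g1 + movk g0 */
}

/* Same count with the two added .if checks applied. */
int mov_q_insns_after(uint64_t val)
{
	int n = mov_q_insns_before(val);

	/* The g1 movk only exists on the 3- and 4-instruction paths. */
	if (n > 2 && ((val >> 16) & 0xffff) == 0)
		n--;
	/* The g0 movk is emitted on every path. */
	if ((val & 0xffff) == 0)
		n--;
	return n;
}
```

Under these assumptions, mov_q_insns_before(0x0010004000000000ULL) gives 4
and mov_q_insns_after() gives 2 for the same value, matching the
before/after listings quoted above.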


Thanks
