From: Jisheng Zhang <jszhang@kernel.org>
To: Catalin Marinas <catalin.marinas@arm.com>, Will Deacon <will@kernel.org>
Cc: linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH] arm64/lib: copy_page: s/stnp/stp
Date: Thu, 13 Jun 2024 08:18:12 +0800 [thread overview]
Message-ID: <20240613001812.2141-1-jszhang@kernel.org> (raw)
stnp performs non-temporal store, give a hints to the memory system
that caching is not useful for this data. But the scenario where
copy_page() used may not have this implication, although I must admit
there's such case where stnp helps performance(good). In this good
case, we can rely on the HW write streaming mechanism in some
implementations such as cortex-a55 to detect the case and take actions.
testing with https://github.com/apinski-cavium/copy_page_benchmark
this patch can reduce the time by about 3% on cortex-a55 platforms.
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
---
arch/arm64/lib/copy_page.S | 32 ++++++++++++++++----------------
1 file changed, 16 insertions(+), 16 deletions(-)
diff --git a/arch/arm64/lib/copy_page.S b/arch/arm64/lib/copy_page.S
index 6a56d7cf309d..4c74fe2d8bd6 100644
--- a/arch/arm64/lib/copy_page.S
+++ b/arch/arm64/lib/copy_page.S
@@ -32,21 +32,21 @@ SYM_FUNC_START(__pi_copy_page)
1:
tst x0, #(PAGE_SIZE - 1)
- stnp x2, x3, [x0, #-256]
+ stp x2, x3, [x0, #-256]
ldp x2, x3, [x1]
- stnp x4, x5, [x0, #16 - 256]
+ stp x4, x5, [x0, #16 - 256]
ldp x4, x5, [x1, #16]
- stnp x6, x7, [x0, #32 - 256]
+ stp x6, x7, [x0, #32 - 256]
ldp x6, x7, [x1, #32]
- stnp x8, x9, [x0, #48 - 256]
+ stp x8, x9, [x0, #48 - 256]
ldp x8, x9, [x1, #48]
- stnp x10, x11, [x0, #64 - 256]
+ stp x10, x11, [x0, #64 - 256]
ldp x10, x11, [x1, #64]
- stnp x12, x13, [x0, #80 - 256]
+ stp x12, x13, [x0, #80 - 256]
ldp x12, x13, [x1, #80]
- stnp x14, x15, [x0, #96 - 256]
+ stp x14, x15, [x0, #96 - 256]
ldp x14, x15, [x1, #96]
- stnp x16, x17, [x0, #112 - 256]
+ stp x16, x17, [x0, #112 - 256]
ldp x16, x17, [x1, #112]
add x0, x0, #128
@@ -54,14 +54,14 @@ SYM_FUNC_START(__pi_copy_page)
b.ne 1b
- stnp x2, x3, [x0, #-256]
- stnp x4, x5, [x0, #16 - 256]
- stnp x6, x7, [x0, #32 - 256]
- stnp x8, x9, [x0, #48 - 256]
- stnp x10, x11, [x0, #64 - 256]
- stnp x12, x13, [x0, #80 - 256]
- stnp x14, x15, [x0, #96 - 256]
- stnp x16, x17, [x0, #112 - 256]
+ stp x2, x3, [x0, #-256]
+ stp x4, x5, [x0, #16 - 256]
+ stp x6, x7, [x0, #32 - 256]
+ stp x8, x9, [x0, #48 - 256]
+ stp x10, x11, [x0, #64 - 256]
+ stp x12, x13, [x0, #80 - 256]
+ stp x14, x15, [x0, #96 - 256]
+ stp x16, x17, [x0, #112 - 256]
ret
SYM_FUNC_END(__pi_copy_page)
--
2.43.0
next reply other threads:[~2024-06-13 0:32 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-06-13 0:18 Jisheng Zhang [this message]
2024-06-24 17:56 ` [PATCH] arm64/lib: copy_page: s/stnp/stp Catalin Marinas
2024-06-26 11:50 ` Jisheng Zhang
2024-06-26 18:08 ` Catalin Marinas
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20240613001812.2141-1-jszhang@kernel.org \
--to=jszhang@kernel.org \
--cc=catalin.marinas@arm.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-kernel@vger.kernel.org \
--cc=will@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.