* [PATCH v3 RESEND 0/2] riscv: enable lockless lockref implementation
@ 2024-03-25 11:10 Jisheng Zhang
2024-03-25 11:10 ` [PATCH v3 RESEND 1/2] riscv: select ARCH_USE_CMPXCHG_LOCKREF Jisheng Zhang
From: Jisheng Zhang @ 2024-03-25 11:10 UTC
To: Paul Walmsley, Palmer Dabbelt, Albert Ou; +Cc: linux-riscv, linux-kernel
This series selects ARCH_USE_CMPXCHG_LOCKREF to enable the
cmpxchg-based lockless lockref implementation for riscv, then
implements arch_cmpxchg64_{relaxed|acquire|release}.
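For readers unfamiliar with the technique, here is a minimal userspace sketch of what a cmpxchg-based lockless reference-count increment looks like. It is not the kernel's lib/lockref.c code: the type lockref_sketch, the helper names, the packing (low 32 bits as lock word with 0 meaning unlocked, high 32 bits as count) and the retry bound are all assumptions made for illustration, and C11 atomics stand in for the kernel's cmpxchg helpers.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical model: a lock word and a reference count packed into one
 * 64-bit value so both can be checked and updated by a single 64-bit
 * compare-and-exchange. Low 32 bits: lock word (assumed 0 == unlocked).
 * High 32 bits: reference count.
 */
typedef struct {
    _Atomic uint64_t lock_count;
} lockref_sketch;

static inline uint32_t sketch_lock(uint64_t v)  { return (uint32_t)v; }
static inline uint32_t sketch_count(uint64_t v) { return (uint32_t)(v >> 32); }

/*
 * Try to increment the count without taking the lock.
 * Returns true if the increment was published, false if the lock word
 * was non-zero or the (arbitrary) retry budget ran out; a real caller
 * would then fall back to taking the lock.
 */
static bool lockref_sketch_get_fast(lockref_sketch *ref)
{
    uint64_t old = atomic_load_explicit(&ref->lock_count,
                                        memory_order_relaxed);

    for (int retry = 0; retry < 100; retry++) {
        if (sketch_lock(old) != 0)
            return false;               /* someone holds the lock */

        uint64_t desired = old + ((uint64_t)1 << 32);

        /* Relaxed ordering: only atomicity of the update is needed. */
        if (atomic_compare_exchange_weak_explicit(&ref->lock_count,
                                                  &old, desired,
                                                  memory_order_relaxed,
                                                  memory_order_relaxed))
            return true;
        /* On failure, 'old' now holds the freshly observed value. */
    }
    return false;
}
```

The point of the packing is that "lock is free" and "count is N" are validated in the same atomic step as the update, so the common uncontended case never touches the lock.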
After patch1:
Using Linus' test case[1] on the TH1520 platform, I see an 11.2% improvement.
On the JH7110 platform, I see a 12.0% improvement.
After patch2:
On both the TH1520 and JH7110 platforms, I didn't see an obvious
performance improvement with Linus' test case [1]. IMHO, this may
be related to the fence and lr.d/sc.d hardware implementations. In theory,
lr/sc without a fence could perform better than lr/sc plus a
fence, so add the code here to leave room for performance improvement on
newer HW platforms.
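To make the ordering distinction concrete, the following userspace sketch expresses the same 64-bit compare-and-exchange with relaxed, acquire, release and fully ordered semantics using C11 atomics. The helper names are hypothetical and this is not the riscv kernel implementation; how a given toolchain lowers each ordering to lr.d/sc.d and fence instructions is not shown and should not be inferred from this code.

```c
#include <stdatomic.h>
#include <stdint.h>

/*
 * Each helper returns the value observed in *p, mirroring cmpxchg-style
 * semantics: the exchange succeeded if and only if the return value
 * equals 'old'. A failed exchange uses relaxed ordering in all variants.
 */
static inline uint64_t cas64_relaxed_sketch(_Atomic uint64_t *p,
                                            uint64_t old, uint64_t new_val)
{
    /* No ordering beyond atomicity of the update itself. */
    atomic_compare_exchange_strong_explicit(p, &old, new_val,
                                            memory_order_relaxed,
                                            memory_order_relaxed);
    return old;
}

static inline uint64_t cas64_acquire_sketch(_Atomic uint64_t *p,
                                            uint64_t old, uint64_t new_val)
{
    /* Later accesses cannot be reordered before a successful exchange. */
    atomic_compare_exchange_strong_explicit(p, &old, new_val,
                                            memory_order_acquire,
                                            memory_order_relaxed);
    return old;
}

static inline uint64_t cas64_release_sketch(_Atomic uint64_t *p,
                                            uint64_t old, uint64_t new_val)
{
    /* Earlier accesses cannot be reordered after a successful exchange. */
    atomic_compare_exchange_strong_explicit(p, &old, new_val,
                                            memory_order_release,
                                            memory_order_relaxed);
    return old;
}

static inline uint64_t cas64_full_sketch(_Atomic uint64_t *p,
                                         uint64_t old, uint64_t new_val)
{
    /* Sequentially consistent on success: the strongest C11 ordering. */
    atomic_compare_exchange_strong_explicit(p, &old, new_val,
                                            memory_order_seq_cst,
                                            memory_order_relaxed);
    return old;
}
```

A counter update that only needs atomicity, as in the lockref fast path, can use the relaxed form; whether that is measurably faster depends on the hardware, which matches the observation above.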
Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
Since v2:
- rebase on v6.8-rc1
- collect Reviewed-by tag
Since v1:
- only select ARCH_USE_CMPXCHG_LOCKREF if 64BIT
Jisheng Zhang (2):
riscv: select ARCH_USE_CMPXCHG_LOCKREF
riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}
arch/riscv/Kconfig | 1 +
arch/riscv/include/asm/cmpxchg.h | 18 ++++++++++++++++++
2 files changed, 19 insertions(+)
--
2.43.0
* [PATCH v3 RESEND 1/2] riscv: select ARCH_USE_CMPXCHG_LOCKREF
From: Jisheng Zhang @ 2024-03-25 11:10 UTC
To: Paul Walmsley, Palmer Dabbelt, Albert Ou; +Cc: linux-riscv, linux-kernel, Andrea Parri
Select ARCH_USE_CMPXCHG_LOCKREF to enable the cmpxchg-based lockless
lockref implementation for riscv.
Using Linus' test case[1] on the TH1520 platform, I see an 11.2% improvement.
On the JH7110 platform, I see a 12.0% improvement.
Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
---
 arch/riscv/Kconfig | 1 +
 1 file changed, 1 insertion(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index aba42b2bf660..7895c77545f1 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -58,6 +58,7 @@ config RISCV
 	select ARCH_SUPPORTS_PAGE_TABLE_CHECK if MMU
 	select ARCH_SUPPORTS_PER_VMA_LOCK if MMU
 	select ARCH_SUPPORTS_SHADOW_CALL_STACK if HAVE_SHADOW_CALL_STACK
+	select ARCH_USE_CMPXCHG_LOCKREF if 64BIT
 	select ARCH_USE_MEMTEST
 	select ARCH_USE_QUEUED_RWLOCKS
 	select ARCH_USES_CFI_TRAPS if CFI_CLANG
--
2.43.0
* [PATCH v3 RESEND 2/2] riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}
From: Jisheng Zhang @ 2024-03-25 11:10 UTC
To: Paul Walmsley, Palmer Dabbelt, Albert Ou; +Cc: linux-riscv, linux-kernel, Andrea Parri
After selecting ARCH_USE_CMPXCHG_LOCKREF, one straightforward further
optimization is to implement arch_cmpxchg64_relaxed(), because the
lockref code does not need the cmpxchg to have barrier semantics. At
the same time, implement arch_cmpxchg64_acquire and
arch_cmpxchg64_release as well.
However, on both the TH1520 and JH7110 platforms, I didn't see an obvious
performance improvement with Linus' test case [1]. IMHO, this may
be related to the fence and lr.d/sc.d hardware implementations. In theory,
lr/sc without a fence could perform better than lr/sc plus a
fence, so add the code here to leave room for performance improvement on
newer HW platforms.
Link: http://marc.info/?l=linux-fsdevel&m=137782380714721&w=4 [1]
Signed-off-by: Jisheng Zhang <jszhang@kernel.org>
Reviewed-by: Andrea Parri <parri.andrea@gmail.com>
---
 arch/riscv/include/asm/cmpxchg.h | 18 ++++++++++++++++++
 1 file changed, 18 insertions(+)
diff --git a/arch/riscv/include/asm/cmpxchg.h b/arch/riscv/include/asm/cmpxchg.h
index 2fee65cc8443..f1c8271c66f8 100644
--- a/arch/riscv/include/asm/cmpxchg.h
+++ b/arch/riscv/include/asm/cmpxchg.h
@@ -359,4 +359,22 @@
 	arch_cmpxchg_relaxed((ptr), (o), (n));				\
 })
 
+#define arch_cmpxchg64_relaxed(ptr, o, n)				\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	arch_cmpxchg_relaxed((ptr), (o), (n));				\
+})
+
+#define arch_cmpxchg64_acquire(ptr, o, n)				\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	arch_cmpxchg_acquire((ptr), (o), (n));				\
+})
+
+#define arch_cmpxchg64_release(ptr, o, n)				\
+({									\
+	BUILD_BUG_ON(sizeof(*(ptr)) != 8);				\
+	arch_cmpxchg_release((ptr), (o), (n));				\
+})
+
 #endif /* _ASM_RISCV_CMPXCHG_H */
--
2.43.0
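The shape of the macros in this patch (a compile-time size check followed by a call to the size-generic helper) can be tried out in userspace with the sketch below. It assumes GCC or Clang, since it relies on statement expressions, __typeof__ and the __atomic builtins. SKETCH_BUILD_BUG_ON and sketch_cmpxchg64_relaxed are hypothetical stand-ins, not kernel symbols, and the builtin replaces arch_cmpxchg_relaxed() purely for illustration.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's BUILD_BUG_ON(): fails the build
 * when 'cond' is true. */
#define SKETCH_BUILD_BUG_ON(cond) \
    _Static_assert(!(cond), "BUILD_BUG_ON failed: " #cond)

/*
 * Size-checked 64-bit compare-and-exchange with relaxed ordering.
 * Evaluates to the value that was in *ptr before the operation, so the
 * exchange succeeded if and only if the result equals 'o'.
 */
#define sketch_cmpxchg64_relaxed(ptr, o, n)                             \
({                                                                      \
    SKETCH_BUILD_BUG_ON(sizeof(*(ptr)) != 8);                           \
    __typeof__(*(ptr)) old_ = (o);                                      \
    /* Strong CAS; on failure old_ is overwritten with the current */   \
    /* value of *ptr, on success it still equals (o). */                \
    __atomic_compare_exchange_n((ptr), &old_, (n), 0,                   \
                                __ATOMIC_RELAXED, __ATOMIC_RELAXED);    \
    old_;                                                               \
})
```

Passing a pointer to a 4-byte object, for example a uint32_t, makes the static assertion fire at compile time, which is the same guard the BUILD_BUG_ON() lines in the patch provide.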
* Re: [PATCH v3 RESEND 0/2] riscv: enable lockless lockref implementation
From: patchwork-bot+linux-riscv @ 2024-04-28 22:00 UTC
To: Jisheng Zhang; +Cc: linux-riscv, paul.walmsley, palmer, aou, linux-kernel
Hello:
This series was applied to riscv/linux.git (for-next)
by Palmer Dabbelt <palmer@rivosinc.com>:
On Mon, 25 Mar 2024 19:10:36 +0800 you wrote:
> This series selects ARCH_USE_CMPXCHG_LOCKREF to enable the
> cmpxchg-based lockless lockref implementation for riscv. Then,
> implement arch_cmpxchg64_{relaxed|acquire|release}.
>
> After patch1:
> Using Linus' test case[1] on the TH1520 platform, I see an 11.2% improvement.
> On the JH7110 platform, I see a 12.0% improvement.
>
> [...]
Here is the summary with links:
  - [v3,RESEND,1/2] riscv: select ARCH_USE_CMPXCHG_LOCKREF
    https://git.kernel.org/riscv/c/eb1e50372946
  - [v3,RESEND,2/2] riscv: cmpxchg: implement arch_cmpxchg64_{relaxed|acquire|release}
    https://git.kernel.org/riscv/c/79d6e4eae966
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html