All of lore.kernel.org
 help / color / mirror / Atom feed
From: Leonardo Bras <leobras@redhat.com>
To: guoren@kernel.org
Cc: paul.walmsley@sifive.com, anup@brainfault.org,
	peterz@infradead.org, mingo@redhat.com, will@kernel.org,
	palmer@rivosinc.com, longman@redhat.com, boqun.feng@gmail.com,
	tglx@linutronix.de, paulmck@kernel.org, rostedt@goodmis.org,
	rdunlap@infradead.org, catalin.marinas@arm.com,
	conor.dooley@microchip.com, xiaoguang.xing@sophgo.com,
	bjorn@rivosinc.com, alexghiti@rivosinc.com,
	keescook@chromium.org, greentime.hu@sifive.com,
	ajones@ventanamicro.com, jszhang@kernel.org, wefu@redhat.com,
	wuwei2016@iscas.ac.cn, linux-arch@vger.kernel.org,
	linux-riscv@lists.infradead.org, linux-doc@vger.kernel.org,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	linux-csky@vger.kernel.org, Guo Ren <guoren@linux.alibaba.com>
Subject: Re: [PATCH V11 00/17] riscv: Add Native/Paravirt qspinlock support
Date: Mon, 6 Nov 2023 17:42:41 -0300	[thread overview]
Message-ID: <ZUlPwQVG4OTkighB@redhat.com> (raw)
In-Reply-To: <20230910082911.3378782-1-guoren@kernel.org>

On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> patch[1 - 10]: Native   qspinlock
> patch[11 -17]: Paravirt qspinlock
> 
> patch[4]: Add prefetchw in qspinlock's xchg_tail when cpus >= 16k
> 
> This series based on:
>  - [RFC PATCH v5 0/5] Rework & improve riscv cmpxchg.h and atomic.h
>    https://lore.kernel.org/linux-riscv/20230810040349.92279-2-leobras@redhat.com/
>  - [PATCH V3] asm-generic: ticket-lock: Optimize arch_spin_value_unlocked
>    https://lore.kernel.org/linux-riscv/20230908154339.3250567-1-guoren@kernel.org/ 
> 
> I merge them into sg2042-master branch, then you could directly try it on
> sg2042 hardware platform:
> 
> https://github.com/guoren83/linux/tree/sg2042-master-qspinlock-64ilp32_v5
> 
> Use sophgo_mango_ubuntu_defconfig for sg2042 64/128 cores hardware
> platform.
> 
> Native qspinlock
> ================
> 
> This time we've proved the qspinlock on th1520 [1] & sg2042 [2], which
> gives stability and performance improvement. All T-HEAD processors have
> a strong LR/SC forward progress guarantee than the requirements of the
> ISA, which could satisfy the xchg_tail of native_qspinlock. Now,
> qspinlock has been run with us for more than 1 year, and we have enough
> confidence to enable it for all the T-HEAD processors. Of causes, we
> found a livelock problem with the qspinlock lock torture test from the
> CPU store merge buffer delay mechanism, which caused the queued spinlock
> becomes a dead ring and RCU warning to come out. We introduce a custom
> WRITE_ONCE to solve this. Do we need explicit ISA instruction to signal
> it? Or let hardware handle this.
> 
> We've tested the patch on SOPHGO sg2042 & th1520 and passed the stress
> test on Fedora & Ubuntu & OpenEuler ... Here is the performance
> comparison between qspinlock and ticket_lock on sg2042 (64 cores):
> 
> sysbench test=threads threads=32 yields=100 lock=8 (+13.8%):
>   queued_spinlock 0.5109/0.00
>   ticket_spinlock 0.5814/0.00
> 
> perf futex/hash (+6.7%):
>   queued_spinlock 1444393 operations/sec (+- 0.09%)
>   ticket_spinlock 1353215 operations/sec (+- 0.15%)
> 
> perf futex/wake-parallel (+8.6%):
>   queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
>   ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)
> 
> perf futex/requeue (+4.2%):
>   queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
>   ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)
> 
> System Benchmarks (+6.4%)
>   queued_spinlock:
>     System Benchmarks Index Values               BASELINE       RESULT    INDEX
>     Dhrystone 2 using register variables         116700.0  628613745.4  53865.8
>     Double-Precision Whetstone                       55.0     182422.8  33167.8
>     Execl Throughput                                 43.0      13116.6   3050.4
>     File Copy 1024 bufsize 2000 maxblocks          3960.0    7762306.2  19601.8
>     File Copy 256 bufsize 500 maxblocks            1655.0    3417556.8  20649.9
>     File Copy 4096 bufsize 8000 maxblocks          5800.0    7427995.7  12806.9
>     Pipe Throughput                               12440.0   23058600.5  18535.9
>     Pipe-based Context Switching                   4000.0    2835617.7   7089.0
>     Process Creation                                126.0      12537.3    995.0
>     Shell Scripts (1 concurrent)                     42.4      57057.4  13456.9
>     Shell Scripts (8 concurrent)                      6.0       7367.1  12278.5
>     System Call Overhead                          15000.0   33308301.3  22205.5
>                                                                        ========
>     System Benchmarks Index Score                                       12426.1
> 
>   ticket_spinlock:
>     System Benchmarks Index Values               BASELINE       RESULT    INDEX
>     Dhrystone 2 using register variables         116700.0  626541701.9  53688.2
>     Double-Precision Whetstone                       55.0     181921.0  33076.5
>     Execl Throughput                                 43.0      12625.1   2936.1
>     File Copy 1024 bufsize 2000 maxblocks          3960.0    6553792.9  16550.0
>     File Copy 256 bufsize 500 maxblocks            1655.0    3189231.6  19270.3
>     File Copy 4096 bufsize 8000 maxblocks          5800.0    7221277.0  12450.5
>     Pipe Throughput                               12440.0   20594018.7  16554.7
>     Pipe-based Context Switching                   4000.0    2571117.7   6427.8
>     Process Creation                                126.0      10798.4    857.0
>     Shell Scripts (1 concurrent)                     42.4      57227.5  13497.1
>     Shell Scripts (8 concurrent)                      6.0       7329.2  12215.3
>     System Call Overhead                          15000.0   30766778.4  20511.2
>                                                                        ========
>     System Benchmarks Index Score                                       11670.7
> 
> The qspinlock has a significant improvement on SOPHGO SG2042 64
> cores platform than the ticket_lock.
> 
> Paravirt qspinlock
> ==================
> 
> We implemented kvm_kick_cpu/kvm_wait_cpu and add tracepoints to observe the
> behaviors. Also, introduce a new SBI extension SBI_EXT_PVLOCK (0xAB0401). If the
> name and number are approved, I will send a formal proposal to the SBI spec.
> 

Hello Guo Ren,

Any update on this series?

Thanks!
Leo


> Changlog:
> V11:
>  - Based on Leonardo Bras's cmpxchg_small patches v5.
>  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
>  - Remove abusing alternative framework and use jump_label instead.
>  - Introduce prefetch.w to improve T-HEAD processors' LR/SC forward progress
>    guarantee.
>  - Optimize qspinlock xchg_tail when NR_CPUS >= 16K.
> 
> V10:
> https://lore.kernel.org/linux-riscv/20230802164701.192791-1-guoren@kernel.org/
>  - Using an alternative framework instead of static_key_branch in the
>    asm/spinlock.h.
>  - Fixup store merge buffer problem, which causes qspinlock lock
>    torture test livelock.
>  - Add paravirt qspinlock support, include KVM backend
>  - Add Compact NUMA-awared qspinlock support 
> 
> V9:
> https://lore.kernel.org/linux-riscv/20220808071318.3335746-1-guoren@kernel.org/
>  - Cleanup generic ticket-lock code, (Using smp_mb__after_spinlock as
>    RCsc)
>  - Add qspinlock and combo-lock for riscv
>  - Add qspinlock to openrisc
>  - Use generic header in csky
>  - Optimize cmpxchg & atomic code
> 
> V8:
> https://lore.kernel.org/linux-riscv/20220724122517.1019187-1-guoren@kernel.org/
>  - Coding convention ticket fixup
>  - Move combo spinlock into riscv and simply asm-generic/spinlock.h
>  - Fixup xchg16 with wrong return value
>  - Add csky qspinlock
>  - Add combo & qspinlock & ticket-lock comparison
>  - Clean up unnecessary riscv acquire and release definitions
>  - Enable ARCH_INLINE_READ*/WRITE*/SPIN* for riscv & csky
> 
> V7:
> https://lore.kernel.org/linux-riscv/20220628081946.1999419-1-guoren@kernel.org/
>  - Add combo spinlock (ticket & queued) support
>  - Rename ticket_spinlock.h
>  - Remove unnecessary atomic_read in ticket_spin_value_unlocked  
> 
> V6:
> https://lore.kernel.org/linux-riscv/20220621144920.2945595-1-guoren@kernel.org/
>  - Fixup Clang compile problem Reported-by: kernel test robot
>  - Cleanup asm-generic/spinlock.h
>  - Remove changelog in patch main comment part, suggested by
>    Conor.Dooley
>  - Remove "default y if NUMA" in Kconfig
> 
> V5:
> https://lore.kernel.org/linux-riscv/20220620155404.1968739-1-guoren@kernel.org/
>  - Update comment with RISC-V forward guarantee feature.
>  - Back to V3 direction and optimize asm code.
> 
> V4:
> https://lore.kernel.org/linux-riscv/1616868399-82848-4-git-send-email-guoren@kernel.org/
>  - Remove custom sub-word xchg implementation
>  - Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in locking/qspinlock
> 
> V3:
> https://lore.kernel.org/linux-riscv/1616658937-82063-1-git-send-email-guoren@kernel.org/
>  - Coding convention by Peter Zijlstra's advices
> 
> V2:
> https://lore.kernel.org/linux-riscv/1606225437-22948-2-git-send-email-guoren@kernel.org/
>  - Coding convention in cmpxchg.h
>  - Re-implement short xchg
>  - Remove char & cmpxchg implementations
> 
> V1:
> https://lore.kernel.org/linux-riscv/20190211043829.30096-1-michaeljclark@mac.com/
>  - Using cmpxchg loop to implement sub-word atomic
> 
> 
> Guo Ren (17):
>   asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
>   asm-generic: ticket-lock: Move into ticket_spinlock.h
>   riscv: Use Zicbop in arch_xchg when available
>   locking/qspinlock: Improve xchg_tail for number of cpus >= 16k
>   riscv: qspinlock: Add basic queued_spinlock support
>   riscv: qspinlock: Introduce combo spinlock
>   riscv: qspinlock: Introduce qspinlock param for command line
>   riscv: qspinlock: Add virt_spin_lock() support for KVM guest
>   riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
>   riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
>   RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton
>   RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter
>   RISC-V: paravirt: pvqspinlock: Add SBI implementation
>   RISC-V: paravirt: pvqspinlock: Add kconfig entry
>   RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait
>   RISC-V: paravirt: pvqspinlock: KVM: Add paravirt qspinlock skeleton
>   RISC-V: paravirt: pvqspinlock: KVM: Implement
>     kvm_sbi_ext_pvlock_kick_cpu()
> 
>  .../admin-guide/kernel-parameters.txt         |   8 +-
>  arch/riscv/Kconfig                            |  50 ++++++++
>  arch/riscv/Kconfig.errata                     |  19 +++
>  arch/riscv/errata/thead/errata.c              |  29 +++++
>  arch/riscv/include/asm/Kbuild                 |   2 +-
>  arch/riscv/include/asm/cmpxchg.h              |   4 +-
>  arch/riscv/include/asm/errata_list.h          |  13 --
>  arch/riscv/include/asm/hwcap.h                |   1 +
>  arch/riscv/include/asm/insn-def.h             |   5 +
>  arch/riscv/include/asm/kvm_vcpu_sbi.h         |   1 +
>  arch/riscv/include/asm/processor.h            |  13 ++
>  arch/riscv/include/asm/qspinlock.h            |  35 ++++++
>  arch/riscv/include/asm/qspinlock_paravirt.h   |  29 +++++
>  arch/riscv/include/asm/rwonce.h               |  24 ++++
>  arch/riscv/include/asm/sbi.h                  |  14 +++
>  arch/riscv/include/asm/spinlock.h             | 113 ++++++++++++++++++
>  arch/riscv/include/asm/vendorid_list.h        |  14 +++
>  arch/riscv/include/uapi/asm/kvm.h             |   1 +
>  arch/riscv/kernel/Makefile                    |   1 +
>  arch/riscv/kernel/cpufeature.c                |   1 +
>  arch/riscv/kernel/qspinlock_paravirt.c        |  83 +++++++++++++
>  arch/riscv/kernel/sbi.c                       |   2 +-
>  arch/riscv/kernel/setup.c                     |  60 ++++++++++
>  .../kernel/trace_events_filter_paravirt.h     |  60 ++++++++++
>  arch/riscv/kvm/Makefile                       |   1 +
>  arch/riscv/kvm/vcpu_sbi.c                     |   4 +
>  arch/riscv/kvm/vcpu_sbi_pvlock.c              |  57 +++++++++
>  include/asm-generic/rwonce.h                  |   2 +
>  include/asm-generic/spinlock.h                |  87 +-------------
>  include/asm-generic/spinlock_types.h          |  12 +-
>  include/asm-generic/ticket_spinlock.h         | 103 ++++++++++++++++
>  kernel/locking/qspinlock.c                    |   5 +-
>  32 files changed, 739 insertions(+), 114 deletions(-)
>  create mode 100644 arch/riscv/include/asm/qspinlock.h
>  create mode 100644 arch/riscv/include/asm/qspinlock_paravirt.h
>  create mode 100644 arch/riscv/include/asm/rwonce.h
>  create mode 100644 arch/riscv/include/asm/spinlock.h
>  create mode 100644 arch/riscv/kernel/qspinlock_paravirt.c
>  create mode 100644 arch/riscv/kernel/trace_events_filter_paravirt.h
>  create mode 100644 arch/riscv/kvm/vcpu_sbi_pvlock.c
>  create mode 100644 include/asm-generic/ticket_spinlock.h
> 
> -- 
> 2.36.1
> 


WARNING: multiple messages have this Message-ID (diff)
From: Leonardo Bras <leobras@redhat.com>
To: guoren@kernel.org
Cc: paul.walmsley@sifive.com, anup@brainfault.org,
	peterz@infradead.org, mingo@redhat.com, will@kernel.org,
	palmer@rivosinc.com, longman@redhat.com, boqun.feng@gmail.com,
	tglx@linutronix.de, paulmck@kernel.org, rostedt@goodmis.org,
	rdunlap@infradead.org, catalin.marinas@arm.com,
	conor.dooley@microchip.com, xiaoguang.xing@sophgo.com,
	bjorn@rivosinc.com, alexghiti@rivosinc.com,
	keescook@chromium.org, greentime.hu@sifive.com,
	ajones@ventanamicro.com, jszhang@kernel.org, wefu@redhat.com,
	wuwei2016@iscas.ac.cn, linux-arch@vger.kernel.org,
	linux-riscv@lists.infradead.org, linux-doc@vger.kernel.org,
	kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
	linux-csky@vger.kernel.org, Guo Ren <guoren@linux.alibaba.com>
Subject: Re: [PATCH V11 00/17] riscv: Add Native/Paravirt qspinlock support
Date: Mon, 6 Nov 2023 17:42:41 -0300	[thread overview]
Message-ID: <ZUlPwQVG4OTkighB@redhat.com> (raw)
In-Reply-To: <20230910082911.3378782-1-guoren@kernel.org>

On Sun, Sep 10, 2023 at 04:28:54AM -0400, guoren@kernel.org wrote:
> From: Guo Ren <guoren@linux.alibaba.com>
> 
> patch[1 - 10]: Native   qspinlock
> patch[11 -17]: Paravirt qspinlock
> 
> patch[4]: Add prefetchw in qspinlock's xchg_tail when cpus >= 16k
> 
> This series based on:
>  - [RFC PATCH v5 0/5] Rework & improve riscv cmpxchg.h and atomic.h
>    https://lore.kernel.org/linux-riscv/20230810040349.92279-2-leobras@redhat.com/
>  - [PATCH V3] asm-generic: ticket-lock: Optimize arch_spin_value_unlocked
>    https://lore.kernel.org/linux-riscv/20230908154339.3250567-1-guoren@kernel.org/ 
> 
> I merge them into sg2042-master branch, then you could directly try it on
> sg2042 hardware platform:
> 
> https://github.com/guoren83/linux/tree/sg2042-master-qspinlock-64ilp32_v5
> 
> Use sophgo_mango_ubuntu_defconfig for sg2042 64/128 cores hardware
> platform.
> 
> Native qspinlock
> ================
> 
> This time we've proved the qspinlock on th1520 [1] & sg2042 [2], which
> gives stability and performance improvement. All T-HEAD processors have
> a strong LR/SC forward progress guarantee than the requirements of the
> ISA, which could satisfy the xchg_tail of native_qspinlock. Now,
> qspinlock has been run with us for more than 1 year, and we have enough
> confidence to enable it for all the T-HEAD processors. Of causes, we
> found a livelock problem with the qspinlock lock torture test from the
> CPU store merge buffer delay mechanism, which caused the queued spinlock
> becomes a dead ring and RCU warning to come out. We introduce a custom
> WRITE_ONCE to solve this. Do we need explicit ISA instruction to signal
> it? Or let hardware handle this.
> 
> We've tested the patch on SOPHGO sg2042 & th1520 and passed the stress
> test on Fedora & Ubuntu & OpenEuler ... Here is the performance
> comparison between qspinlock and ticket_lock on sg2042 (64 cores):
> 
> sysbench test=threads threads=32 yields=100 lock=8 (+13.8%):
>   queued_spinlock 0.5109/0.00
>   ticket_spinlock 0.5814/0.00
> 
> perf futex/hash (+6.7%):
>   queued_spinlock 1444393 operations/sec (+- 0.09%)
>   ticket_spinlock 1353215 operations/sec (+- 0.15%)
> 
> perf futex/wake-parallel (+8.6%):
>   queued_spinlock (waking 1/64 threads) in 0.0253 ms (+-2.90%)
>   ticket_spinlock (waking 1/64 threads) in 0.0275 ms (+-3.12%)
> 
> perf futex/requeue (+4.2%):
>   queued_spinlock Requeued 64 of 64 threads in 0.0785 ms (+-0.55%)
>   ticket_spinlock Requeued 64 of 64 threads in 0.0818 ms (+-4.12%)
> 
> System Benchmarks (+6.4%)
>   queued_spinlock:
>     System Benchmarks Index Values               BASELINE       RESULT    INDEX
>     Dhrystone 2 using register variables         116700.0  628613745.4  53865.8
>     Double-Precision Whetstone                       55.0     182422.8  33167.8
>     Execl Throughput                                 43.0      13116.6   3050.4
>     File Copy 1024 bufsize 2000 maxblocks          3960.0    7762306.2  19601.8
>     File Copy 256 bufsize 500 maxblocks            1655.0    3417556.8  20649.9
>     File Copy 4096 bufsize 8000 maxblocks          5800.0    7427995.7  12806.9
>     Pipe Throughput                               12440.0   23058600.5  18535.9
>     Pipe-based Context Switching                   4000.0    2835617.7   7089.0
>     Process Creation                                126.0      12537.3    995.0
>     Shell Scripts (1 concurrent)                     42.4      57057.4  13456.9
>     Shell Scripts (8 concurrent)                      6.0       7367.1  12278.5
>     System Call Overhead                          15000.0   33308301.3  22205.5
>                                                                        ========
>     System Benchmarks Index Score                                       12426.1
> 
>   ticket_spinlock:
>     System Benchmarks Index Values               BASELINE       RESULT    INDEX
>     Dhrystone 2 using register variables         116700.0  626541701.9  53688.2
>     Double-Precision Whetstone                       55.0     181921.0  33076.5
>     Execl Throughput                                 43.0      12625.1   2936.1
>     File Copy 1024 bufsize 2000 maxblocks          3960.0    6553792.9  16550.0
>     File Copy 256 bufsize 500 maxblocks            1655.0    3189231.6  19270.3
>     File Copy 4096 bufsize 8000 maxblocks          5800.0    7221277.0  12450.5
>     Pipe Throughput                               12440.0   20594018.7  16554.7
>     Pipe-based Context Switching                   4000.0    2571117.7   6427.8
>     Process Creation                                126.0      10798.4    857.0
>     Shell Scripts (1 concurrent)                     42.4      57227.5  13497.1
>     Shell Scripts (8 concurrent)                      6.0       7329.2  12215.3
>     System Call Overhead                          15000.0   30766778.4  20511.2
>                                                                        ========
>     System Benchmarks Index Score                                       11670.7
> 
> The qspinlock has a significant improvement on SOPHGO SG2042 64
> cores platform than the ticket_lock.
> 
> Paravirt qspinlock
> ==================
> 
> We implemented kvm_kick_cpu/kvm_wait_cpu and add tracepoints to observe the
> behaviors. Also, introduce a new SBI extension SBI_EXT_PVLOCK (0xAB0401). If the
> name and number are approved, I will send a formal proposal to the SBI spec.
> 

Hello Guo Ren,

Any update on this series?

Thanks!
Leo


> Changlog:
> V11:
>  - Based on Leonardo Bras's cmpxchg_small patches v5.
>  - Based on Guo Ren's Optimize arch_spin_value_unlocked patch v3.
>  - Remove abusing alternative framework and use jump_label instead.
>  - Introduce prefetch.w to improve T-HEAD processors' LR/SC forward progress
>    guarantee.
>  - Optimize qspinlock xchg_tail when NR_CPUS >= 16K.
> 
> V10:
> https://lore.kernel.org/linux-riscv/20230802164701.192791-1-guoren@kernel.org/
>  - Using an alternative framework instead of static_key_branch in the
>    asm/spinlock.h.
>  - Fixup store merge buffer problem, which causes qspinlock lock
>    torture test livelock.
>  - Add paravirt qspinlock support, include KVM backend
>  - Add Compact NUMA-awared qspinlock support 
> 
> V9:
> https://lore.kernel.org/linux-riscv/20220808071318.3335746-1-guoren@kernel.org/
>  - Cleanup generic ticket-lock code, (Using smp_mb__after_spinlock as
>    RCsc)
>  - Add qspinlock and combo-lock for riscv
>  - Add qspinlock to openrisc
>  - Use generic header in csky
>  - Optimize cmpxchg & atomic code
> 
> V8:
> https://lore.kernel.org/linux-riscv/20220724122517.1019187-1-guoren@kernel.org/
>  - Coding convention ticket fixup
>  - Move combo spinlock into riscv and simply asm-generic/spinlock.h
>  - Fixup xchg16 with wrong return value
>  - Add csky qspinlock
>  - Add combo & qspinlock & ticket-lock comparison
>  - Clean up unnecessary riscv acquire and release definitions
>  - Enable ARCH_INLINE_READ*/WRITE*/SPIN* for riscv & csky
> 
> V7:
> https://lore.kernel.org/linux-riscv/20220628081946.1999419-1-guoren@kernel.org/
>  - Add combo spinlock (ticket & queued) support
>  - Rename ticket_spinlock.h
>  - Remove unnecessary atomic_read in ticket_spin_value_unlocked  
> 
> V6:
> https://lore.kernel.org/linux-riscv/20220621144920.2945595-1-guoren@kernel.org/
>  - Fixup Clang compile problem Reported-by: kernel test robot
>  - Cleanup asm-generic/spinlock.h
>  - Remove changelog in patch main comment part, suggested by
>    Conor.Dooley
>  - Remove "default y if NUMA" in Kconfig
> 
> V5:
> https://lore.kernel.org/linux-riscv/20220620155404.1968739-1-guoren@kernel.org/
>  - Update comment with RISC-V forward guarantee feature.
>  - Back to V3 direction and optimize asm code.
> 
> V4:
> https://lore.kernel.org/linux-riscv/1616868399-82848-4-git-send-email-guoren@kernel.org/
>  - Remove custom sub-word xchg implementation
>  - Add ARCH_USE_QUEUED_SPINLOCKS_XCHG32 in locking/qspinlock
> 
> V3:
> https://lore.kernel.org/linux-riscv/1616658937-82063-1-git-send-email-guoren@kernel.org/
>  - Coding convention by Peter Zijlstra's advices
> 
> V2:
> https://lore.kernel.org/linux-riscv/1606225437-22948-2-git-send-email-guoren@kernel.org/
>  - Coding convention in cmpxchg.h
>  - Re-implement short xchg
>  - Remove char & cmpxchg implementations
> 
> V1:
> https://lore.kernel.org/linux-riscv/20190211043829.30096-1-michaeljclark@mac.com/
>  - Using cmpxchg loop to implement sub-word atomic
> 
> 
> Guo Ren (17):
>   asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock
>   asm-generic: ticket-lock: Move into ticket_spinlock.h
>   riscv: Use Zicbop in arch_xchg when available
>   locking/qspinlock: Improve xchg_tail for number of cpus >= 16k
>   riscv: qspinlock: Add basic queued_spinlock support
>   riscv: qspinlock: Introduce combo spinlock
>   riscv: qspinlock: Introduce qspinlock param for command line
>   riscv: qspinlock: Add virt_spin_lock() support for KVM guest
>   riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup
>   riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors
>   RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton
>   RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter
>   RISC-V: paravirt: pvqspinlock: Add SBI implementation
>   RISC-V: paravirt: pvqspinlock: Add kconfig entry
>   RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait
>   RISC-V: paravirt: pvqspinlock: KVM: Add paravirt qspinlock skeleton
>   RISC-V: paravirt: pvqspinlock: KVM: Implement
>     kvm_sbi_ext_pvlock_kick_cpu()
> 
>  .../admin-guide/kernel-parameters.txt         |   8 +-
>  arch/riscv/Kconfig                            |  50 ++++++++
>  arch/riscv/Kconfig.errata                     |  19 +++
>  arch/riscv/errata/thead/errata.c              |  29 +++++
>  arch/riscv/include/asm/Kbuild                 |   2 +-
>  arch/riscv/include/asm/cmpxchg.h              |   4 +-
>  arch/riscv/include/asm/errata_list.h          |  13 --
>  arch/riscv/include/asm/hwcap.h                |   1 +
>  arch/riscv/include/asm/insn-def.h             |   5 +
>  arch/riscv/include/asm/kvm_vcpu_sbi.h         |   1 +
>  arch/riscv/include/asm/processor.h            |  13 ++
>  arch/riscv/include/asm/qspinlock.h            |  35 ++++++
>  arch/riscv/include/asm/qspinlock_paravirt.h   |  29 +++++
>  arch/riscv/include/asm/rwonce.h               |  24 ++++
>  arch/riscv/include/asm/sbi.h                  |  14 +++
>  arch/riscv/include/asm/spinlock.h             | 113 ++++++++++++++++++
>  arch/riscv/include/asm/vendorid_list.h        |  14 +++
>  arch/riscv/include/uapi/asm/kvm.h             |   1 +
>  arch/riscv/kernel/Makefile                    |   1 +
>  arch/riscv/kernel/cpufeature.c                |   1 +
>  arch/riscv/kernel/qspinlock_paravirt.c        |  83 +++++++++++++
>  arch/riscv/kernel/sbi.c                       |   2 +-
>  arch/riscv/kernel/setup.c                     |  60 ++++++++++
>  .../kernel/trace_events_filter_paravirt.h     |  60 ++++++++++
>  arch/riscv/kvm/Makefile                       |   1 +
>  arch/riscv/kvm/vcpu_sbi.c                     |   4 +
>  arch/riscv/kvm/vcpu_sbi_pvlock.c              |  57 +++++++++
>  include/asm-generic/rwonce.h                  |   2 +
>  include/asm-generic/spinlock.h                |  87 +-------------
>  include/asm-generic/spinlock_types.h          |  12 +-
>  include/asm-generic/ticket_spinlock.h         | 103 ++++++++++++++++
>  kernel/locking/qspinlock.c                    |   5 +-
>  32 files changed, 739 insertions(+), 114 deletions(-)
>  create mode 100644 arch/riscv/include/asm/qspinlock.h
>  create mode 100644 arch/riscv/include/asm/qspinlock_paravirt.h
>  create mode 100644 arch/riscv/include/asm/rwonce.h
>  create mode 100644 arch/riscv/include/asm/spinlock.h
>  create mode 100644 arch/riscv/kernel/qspinlock_paravirt.c
>  create mode 100644 arch/riscv/kernel/trace_events_filter_paravirt.h
>  create mode 100644 arch/riscv/kvm/vcpu_sbi_pvlock.c
>  create mode 100644 include/asm-generic/ticket_spinlock.h
> 
> -- 
> 2.36.1
> 


_______________________________________________
linux-riscv mailing list
linux-riscv@lists.infradead.org
http://lists.infradead.org/mailman/listinfo/linux-riscv

  parent reply	other threads:[~2023-11-06 20:43 UTC|newest]

Thread overview: 215+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-10  8:28 [PATCH V11 00/17] riscv: Add Native/Paravirt qspinlock support guoren
2023-09-10  8:28 ` guoren
2023-09-10  8:28 ` [PATCH V11 01/17] asm-generic: ticket-lock: Reuse arch_spinlock_t of qspinlock guoren
2023-09-10  8:28   ` guoren
2023-09-11 19:05   ` Leonardo Brás
2023-09-11 19:05     ` Leonardo Brás
2023-09-13  1:55     ` Guo Ren
2023-09-13  1:55       ` Guo Ren
2023-09-13  7:59       ` Leonardo Bras
2023-09-13  7:59         ` Leonardo Bras
2023-09-10  8:28 ` [PATCH V11 02/17] asm-generic: ticket-lock: Move into ticket_spinlock.h guoren
2023-09-10  8:28   ` guoren
2023-09-13  8:15   ` Leonardo Bras
2023-09-13  8:15     ` Leonardo Bras
2023-09-10  8:28 ` [PATCH V11 03/17] riscv: Use Zicbop in arch_xchg when available guoren
2023-12-31  8:29   ` guoren
2023-09-10  8:28   ` guoren
2023-09-13  8:49   ` Leonardo Bras
2023-09-13  8:49     ` Leonardo Bras
2023-09-15 12:36     ` Guo Ren
2023-09-15 12:36       ` Guo Ren
2023-09-16  1:25       ` Leonardo Bras
2023-09-16  1:25         ` Leonardo Bras
2023-09-17 14:34         ` Guo Ren
2023-09-17 14:34           ` Guo Ren
2023-09-19  5:13           ` Leonardo Bras
2023-09-19  5:13             ` Leonardo Bras
2023-09-19  7:53             ` Guo Ren
2023-09-19  7:53               ` Guo Ren
2023-09-19 14:38               ` Leonardo Bras
2023-09-19 14:38                 ` Leonardo Bras
2023-09-14 13:47   ` Andrew Jones
2023-09-14 13:47     ` Andrew Jones
2023-09-15  8:22     ` Leonardo Bras
2023-09-15  8:22       ` Leonardo Bras
2023-09-15 11:07       ` Andrew Jones
2023-09-15 11:07         ` Andrew Jones
2023-09-15 11:26         ` Conor Dooley
2023-09-15 11:26           ` Conor Dooley
2023-09-15 12:22           ` Andrew Jones
2023-09-15 12:22             ` Andrew Jones
2023-09-15 12:42             ` Conor Dooley
2023-09-15 12:42               ` Conor Dooley
2023-09-16  0:05               ` Conor Dooley
2023-09-16  0:05                 ` Conor Dooley
2023-09-15 20:32         ` Leonardo Bras
2023-09-15 20:32           ` Leonardo Bras
2023-09-14 14:25   ` Andrew Jones
2023-09-14 14:25     ` Andrew Jones
2023-09-14 14:47     ` Andrew Jones
2023-09-14 14:47       ` Andrew Jones
2023-09-15 11:37       ` Conor Dooley
2023-09-15 11:37         ` Conor Dooley
2023-09-15 12:14         ` Andrew Jones
2023-09-15 12:14           ` Andrew Jones
2023-09-15 12:53           ` Conor Dooley
2023-09-15 12:53             ` Conor Dooley
2023-09-10  8:28 ` [PATCH V11 04/17] locking/qspinlock: Improve xchg_tail for number of cpus >= 16k guoren
2023-09-10  8:28   ` guoren
2023-09-11  2:35   ` Waiman Long
2023-09-11  2:35     ` Waiman Long
2023-09-11  2:35     ` Waiman Long
2023-09-11  3:09     ` Guo Ren
2023-09-11  3:09       ` Guo Ren
2023-09-11 13:03       ` Waiman Long
2023-09-11 13:03         ` Waiman Long
2023-09-11 13:03         ` Waiman Long
2023-09-12  1:10         ` Guo Ren
2023-09-12  1:10           ` Guo Ren
2023-09-13  8:55           ` Leonardo Bras
2023-09-13  8:55             ` Leonardo Bras
2023-09-13 12:52             ` Guo Ren
2023-09-13 12:52               ` Guo Ren
2023-09-13 13:06               ` Waiman Long
2023-09-13 13:06                 ` Waiman Long
2023-09-13 13:06                 ` Waiman Long
2023-09-14  3:45                 ` Guo Ren
2023-09-14  3:45                   ` Guo Ren
2023-09-10  8:28 ` [PATCH V11 05/17] riscv: qspinlock: Add basic queued_spinlock support guoren
2023-09-10  8:28   ` guoren
2023-09-13 20:28   ` Leonardo Bras
2023-09-13 20:28     ` Leonardo Bras
2023-09-14  4:46     ` Guo Ren
2023-09-14  4:46       ` Guo Ren
2023-09-14  9:43       ` Leonardo Bras
2023-09-14  9:43         ` Leonardo Bras
2023-09-15  2:10         ` Guo Ren
2023-09-15  2:10           ` Guo Ren
2023-09-15  9:08           ` Leonardo Bras
2023-09-15  9:08             ` Leonardo Bras
2023-09-17 15:02             ` Guo Ren
2023-09-17 15:02               ` Guo Ren
2023-09-19  5:20               ` Leonardo Bras
2023-09-19  5:20                 ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 06/17] riscv: qspinlock: Introduce combo spinlock guoren
2023-09-10  8:29   ` guoren
2023-09-10 11:06   ` Guo Ren
2023-09-10 11:06     ` Guo Ren
2023-09-13 20:37     ` Leonardo Bras
2023-09-13 20:37       ` Leonardo Bras
2023-09-13 20:49       ` Leonardo Bras
2023-09-13 20:49         ` Leonardo Bras
2023-09-14  4:49         ` Guo Ren
2023-09-14  4:49           ` Guo Ren
2023-09-14  7:17           ` Leonardo Bras
2023-09-14  7:17             ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 07/17] riscv: qspinlock: Introduce qspinlock param for command line guoren
2023-09-10  8:29   ` guoren
2023-09-11 15:22   ` Waiman Long
2023-09-11 15:22     ` Waiman Long
2023-09-11 15:22     ` Waiman Long
2023-09-12  1:06     ` Guo Ren
2023-09-12  1:06       ` Guo Ren
2023-09-11 15:34   ` Waiman Long
2023-09-11 15:34     ` Waiman Long
2023-09-11 15:34     ` Waiman Long
2023-09-12  1:08     ` Guo Ren
2023-09-12  1:08       ` Guo Ren
2023-09-14  7:32       ` Leonardo Bras
2023-09-14  7:32         ` Leonardo Bras
2023-09-14 17:23         ` Waiman Long
2023-09-14 17:23           ` Waiman Long
2023-09-14 17:23           ` Waiman Long
2023-09-10  8:29 ` [PATCH V11 08/17] riscv: qspinlock: Add virt_spin_lock() support for KVM guest guoren
2023-09-10  8:29   ` guoren
2023-09-14  8:02   ` Leonardo Bras
2023-09-14  8:02     ` Leonardo Bras
2023-09-17 15:12     ` Guo Ren
2023-09-17 15:12       ` Guo Ren
2023-09-19  5:30       ` Leonardo Bras
2023-09-19  5:30         ` Leonardo Bras
2023-09-19  8:04         ` Guo Ren
2023-09-19  8:04           ` Guo Ren
2023-09-19 14:40           ` Leonardo Bras
2023-09-19 14:40             ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 09/17] riscv: qspinlock: errata: Add ERRATA_THEAD_WRITE_ONCE fixup guoren
2023-09-10  8:29   ` guoren
2023-09-14  8:32   ` Leonardo Bras
2023-09-14  8:32     ` Leonardo Bras
2023-09-17 15:15     ` Guo Ren
2023-09-17 15:15       ` Guo Ren
2023-09-19  5:34       ` Leonardo Bras
2023-09-19  5:34         ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 10/17] riscv: qspinlock: errata: Enable qspinlock for T-HEAD processors guoren
2023-09-10  8:29   ` guoren
2023-09-14  9:36   ` Leonardo Bras
2023-09-14  9:36     ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 11/17] RISC-V: paravirt: pvqspinlock: Add paravirt qspinlock skeleton guoren
2023-09-10  8:29   ` guoren
2023-09-15  5:42   ` Leonardo Bras
2023-09-15  5:42     ` Leonardo Bras
2023-09-17 14:58     ` Guo Ren
2023-09-17 14:58       ` Guo Ren
2023-09-19  5:43       ` Leonardo Bras
2023-09-19  5:43         ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 12/17] RISC-V: paravirt: pvqspinlock: Add nopvspin kernel parameter guoren
2023-09-10  8:29   ` guoren
2023-09-15  6:05   ` Leonardo Bras
2023-09-15  6:05     ` Leonardo Bras
2023-09-17 15:03     ` Guo Ren
2023-09-17 15:03       ` Guo Ren
2023-09-19  5:44       ` Leonardo Bras
2023-09-19  5:44         ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 13/17] RISC-V: paravirt: pvqspinlock: Add SBI implementation guoren
2023-09-10  8:29   ` guoren
2023-09-15  6:23   ` Leonardo Bras
2023-09-15  6:23     ` Leonardo Bras
2023-09-17 15:06     ` Guo Ren
2023-09-17 15:06       ` Guo Ren
2023-09-19  5:45       ` Leonardo Bras
2023-09-19  5:45         ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 14/17] RISC-V: paravirt: pvqspinlock: Add kconfig entry guoren
2023-09-10  8:29   ` guoren
2023-09-15  6:25   ` Leonardo Bras
2023-09-15  6:25     ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 15/17] RISC-V: paravirt: pvqspinlock: Add trace point for pv_kick/wait guoren
2023-09-10  8:29   ` guoren
2023-09-15  6:33   ` Leonardo Bras
2023-09-15  6:33     ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 16/17] RISC-V: paravirt: pvqspinlock: KVM: Add paravirt qspinlock skeleton guoren
2023-09-10  8:29   ` guoren
2023-09-15  6:46   ` Leonardo Bras
2023-09-15  6:46     ` Leonardo Bras
2023-09-10  8:29 ` [PATCH V11 17/17] RISC-V: paravirt: pvqspinlock: KVM: Implement kvm_sbi_ext_pvlock_kick_cpu() guoren
2023-09-10  8:29   ` guoren
2023-09-15  6:52   ` Leonardo Bras
2023-09-15  6:52     ` Leonardo Bras
2023-09-10  8:58 ` [PATCH V11 00/17] riscv: Add Native/Paravirt qspinlock support Conor Dooley
2023-09-10  8:58   ` Conor Dooley
2023-09-10  9:16   ` Guo Ren
2023-09-10  9:16     ` Guo Ren
2023-09-10  9:20     ` Guo Ren
2023-09-10  9:20       ` Guo Ren
2023-09-10  9:31     ` Conor Dooley
2023-09-10  9:31       ` Conor Dooley
2023-09-10  9:49       ` Guo Ren
2023-09-10  9:49         ` Guo Ren
2023-09-10 19:45         ` Conor Dooley
2023-09-10 19:45           ` Conor Dooley
2023-09-11  3:36           ` Guo Ren
2023-09-11  3:36             ` Guo Ren
2023-09-11 12:52             ` Conor Dooley
2023-09-11 12:52               ` Conor Dooley
2023-09-12  1:33               ` Guo Ren
2023-09-12  1:33                 ` Guo Ren
2023-09-12  8:07                 ` Conor Dooley
2023-09-12  8:07                   ` Conor Dooley
2023-09-12 10:58                   ` Guo Ren
2023-09-12 10:58                     ` Guo Ren
2023-11-06 20:42 ` Leonardo Bras [this message]
2023-11-06 20:42   ` Leonardo Bras
2023-11-12  4:23   ` Guo Ren
2023-11-12  4:23     ` Guo Ren
2023-11-13 10:19     ` Leonardo Bras Soares Passos
2023-11-13 10:19       ` Leonardo Bras Soares Passos

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZUlPwQVG4OTkighB@redhat.com \
    --to=leobras@redhat.com \
    --cc=ajones@ventanamicro.com \
    --cc=alexghiti@rivosinc.com \
    --cc=anup@brainfault.org \
    --cc=bjorn@rivosinc.com \
    --cc=boqun.feng@gmail.com \
    --cc=catalin.marinas@arm.com \
    --cc=conor.dooley@microchip.com \
    --cc=greentime.hu@sifive.com \
    --cc=guoren@kernel.org \
    --cc=guoren@linux.alibaba.com \
    --cc=jszhang@kernel.org \
    --cc=keescook@chromium.org \
    --cc=kvm@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-csky@vger.kernel.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-riscv@lists.infradead.org \
    --cc=longman@redhat.com \
    --cc=mingo@redhat.com \
    --cc=palmer@rivosinc.com \
    --cc=paul.walmsley@sifive.com \
    --cc=paulmck@kernel.org \
    --cc=peterz@infradead.org \
    --cc=rdunlap@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=tglx@linutronix.de \
    --cc=virtualization@lists.linux-foundation.org \
    --cc=wefu@redhat.com \
    --cc=will@kernel.org \
    --cc=wuwei2016@iscas.ac.cn \
    --cc=xiaoguang.xing@sophgo.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.