linux-trace-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
@ 2023-10-26  7:39 wuqiang.matt
  2023-10-26  8:46 ` Arnd Bergmann
  0 siblings, 1 reply; 11+ messages in thread
From: wuqiang.matt @ 2023-10-26  7:39 UTC (permalink / raw)
  To: Jonas Bonn, Stefan Kristiansson, Stafford Horne, Ingo Molnar,
	Peter Zijlstra (Intel), Andi Shyti, Palmer Dabbelt, Arnd Bergmann,
	Andrzej Hajda
  Cc: linux-trace-kernel, mhiramat, mattwu, wuqiang.matt

arch_cmpxchg[64]_local() are not defined for openrisc. So implement
them with generci_cmpxchg[64]_local, advised by Masami Hiramatsu.

Closes: https://lore.kernel.org/linux-trace-kernel/169824660459.24340.14614817132696360531.stgit@devnote2
Closes: https://lore.kernel.org/oe-kbuild-all/202310241310.Ir5uukOG-lkp@intel.com

Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
---
 arch/openrisc/include/asm/cmpxchg.h | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/arch/openrisc/include/asm/cmpxchg.h b/arch/openrisc/include/asm/cmpxchg.h
index 8ee151c072e4..7d0555257389 100644
--- a/arch/openrisc/include/asm/cmpxchg.h
+++ b/arch/openrisc/include/asm/cmpxchg.h
@@ -168,4 +168,18 @@ __arch_xchg(volatile void *ptr, unsigned long with, int size)
 						 sizeof(*(ptr)));	\
 	})
 
+/*
+ * always make arch_cmpxchg[64]_local available. __generic_cmpxchg[64]_local
+ * are atomic with respect to current cpu.
+ */
+#include <asm-generic/cmpxchg-local.h>
+
+#define arch_cmpxchg_local(ptr, o, n) ({				\
+	(__typeof__(*ptr))__generic_cmpxchg_local((ptr),		\
+						(unsigned long)(o),	\
+						(unsigned long)(n),	\
+						sizeof(*(ptr)));	\
+})
+#define arch_cmpxchg64_local(ptr, o, n) __generic_cmpxchg64_local((ptr), (o), (n))
+
 #endif /* __ASM_OPENRISC_CMPXCHG_H */
-- 
2.40.1


^ permalink raw reply related	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
  2023-10-26  7:39 [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local wuqiang.matt
@ 2023-10-26  8:46 ` Arnd Bergmann
  2023-10-26 11:05   ` wuqiang.matt
  0 siblings, 1 reply; 11+ messages in thread
From: Arnd Bergmann @ 2023-10-26  8:46 UTC (permalink / raw)
  To: wuqiang.matt, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	Ingo Molnar, Peter Zijlstra, Andi Shyti, Palmer Dabbelt,
	Andrzej Hajda
  Cc: linux-trace-kernel, Masami Hiramatsu, mattwu

On Thu, Oct 26, 2023, at 09:39, wuqiang.matt wrote:
> arch_cmpxchg[64]_local() are not defined for openrisc. So implement
> them with generci_cmpxchg[64]_local, advised by Masami Hiramatsu.
>
> Closes: 
> https://lore.kernel.org/linux-trace-kernel/169824660459.24340.14614817132696360531.stgit@devnote2
> Closes: 
> https://lore.kernel.org/oe-kbuild-all/202310241310.Ir5uukOG-lkp@intel.com
>
> Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>

I think on architectures that have actual atomics, you
generally want to define this to be the same as arch_cmpxchg()
rather than the generic version.

It depends on the relative cost of doing one atomic compared
to an irq-disable/enable pair, but everyone else went with
the former if they could. The exceptions are armv4/armv5,
sparc32 and parisc, which don't have a generic cmpxchg()
or similar operation.

You could do the thing that sparc64 and xtensa do, which
use the native cmpxchg for supported word sizes but the
generic version for 1- and 2-byte swaps, but that has its
own set of problems if you end up doing operations on both
the entire word and a sub-unit of the same thing.

    Arnd

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
  2023-10-26  8:46 ` Arnd Bergmann
@ 2023-10-26 11:05   ` wuqiang.matt
  2023-10-28 12:49     ` Masami Hiramatsu
  0 siblings, 1 reply; 11+ messages in thread
From: wuqiang.matt @ 2023-10-26 11:05 UTC (permalink / raw)
  To: Arnd Bergmann, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	Ingo Molnar, Peter Zijlstra, Andi Shyti, Palmer Dabbelt,
	Andrzej Hajda
  Cc: linux-trace-kernel, Masami Hiramatsu, mattwu

On 2023/10/26 16:46, Arnd Bergmann wrote:
> On Thu, Oct 26, 2023, at 09:39, wuqiang.matt wrote:
>> arch_cmpxchg[64]_local() are not defined for openrisc. So implement
>> them with generci_cmpxchg[64]_local, advised by Masami Hiramatsu.
>>
>> Closes:
>> https://lore.kernel.org/linux-trace-kernel/169824660459.24340.14614817132696360531.stgit@devnote2
>> Closes:
>> https://lore.kernel.org/oe-kbuild-all/202310241310.Ir5uukOG-lkp@intel.com
>>
>> Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
> 
> I think on architectures that have actual atomics, you
> generally want to define this to be the same as arch_cmpxchg()
> rather than the generic version.
> 
> It depends on the relative cost of doing one atomic compared
> to an irq-disable/enable pair, but everyone else went with
> the former if they could. The exceptions are armv4/armv5,
> sparc32 and parisc, which don't have a generic cmpxchg()
> or similar operation.

Sure, better native than the generic. I'll try to collect more
insights before next move.

> You could do the thing that sparc64 and xtensa do, which
> use the native cmpxchg for supported word sizes but the
> generic version for 1- and 2-byte swaps, but that has its
> own set of problems if you end up doing operations on both
> the entire word and a sub-unit of the same thing.

Thank you for pointing out this. I'll do some research on these
implementations.

>      Arnd

Regards,
wuqiang


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
  2023-10-26 11:05   ` wuqiang.matt
@ 2023-10-28 12:49     ` Masami Hiramatsu
  2023-10-28 16:40       ` wuqiang.matt
  0 siblings, 1 reply; 11+ messages in thread
From: Masami Hiramatsu @ 2023-10-28 12:49 UTC (permalink / raw)
  To: wuqiang.matt
  Cc: Arnd Bergmann, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	Ingo Molnar, Peter Zijlstra, Andi Shyti, Palmer Dabbelt,
	Andrzej Hajda, linux-trace-kernel, Masami Hiramatsu, mattwu

Hi Wuqiang,

On Thu, 26 Oct 2023 19:05:51 +0800
"wuqiang.matt" <wuqiang.matt@bytedance.com> wrote:

> On 2023/10/26 16:46, Arnd Bergmann wrote:
> > On Thu, Oct 26, 2023, at 09:39, wuqiang.matt wrote:
> >> arch_cmpxchg[64]_local() are not defined for openrisc. So implement
> >> them with generci_cmpxchg[64]_local, advised by Masami Hiramatsu.
> >>
> >> Closes:
> >> https://lore.kernel.org/linux-trace-kernel/169824660459.24340.14614817132696360531.stgit@devnote2
> >> Closes:
> >> https://lore.kernel.org/oe-kbuild-all/202310241310.Ir5uukOG-lkp@intel.com
> >>
> >> Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
> > 
> > I think on architectures that have actual atomics, you
> > generally want to define this to be the same as arch_cmpxchg()
> > rather than the generic version.
> > 
> > It depends on the relative cost of doing one atomic compared
> > to an irq-disable/enable pair, but everyone else went with
> > the former if they could. The exceptions are armv4/armv5,
> > sparc32 and parisc, which don't have a generic cmpxchg()
> > or similar operation.
> 
> Sure, better native than the generic. I'll try to collect more
> insights before next move.

So I will temporally remove the last change (use arch_cmpxchg_local
in objpool) until these series are rewritten with arch native code,
so that the next release will not break the kernel build.

But this must be fixed because arch_cmpxchg_local() is required
for each arch anyway.

> 
> > You could do the thing that sparc64 and xtensa do, which
> > use the native cmpxchg for supported word sizes but the
> > generic version for 1- and 2-byte swaps, but that has its
> > own set of problems if you end up doing operations on both
> > the entire word and a sub-unit of the same thing.
> 
> Thank you for pointing out this. I'll do some research on these
> implementations.

arc also has the LL-SC instruction but depends on the core feature,
so I think we can use it.

Thank you,

> 
> >      Arnd
> 
> Regards,
> wuqiang
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
  2023-10-28 12:49     ` Masami Hiramatsu
@ 2023-10-28 16:40       ` wuqiang.matt
  2023-10-29  3:26         ` Masami Hiramatsu
  0 siblings, 1 reply; 11+ messages in thread
From: wuqiang.matt @ 2023-10-28 16:40 UTC (permalink / raw)
  To: Masami Hiramatsu (Google)
  Cc: Arnd Bergmann, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	Ingo Molnar, Peter Zijlstra, Andi Shyti, Palmer Dabbelt,
	Andrzej Hajda, linux-trace-kernel, mattwu, linux-snps-arc,
	Vineet Gupta

On 2023/10/28 20:49, Masami Hiramatsu (Google) wrote:
> Hi Wuqiang,
> 
> On Thu, 26 Oct 2023 19:05:51 +0800
> "wuqiang.matt" <wuqiang.matt@bytedance.com> wrote:
> 
>> On 2023/10/26 16:46, Arnd Bergmann wrote:
>>> On Thu, Oct 26, 2023, at 09:39, wuqiang.matt wrote:
>>>> arch_cmpxchg[64]_local() are not defined for openrisc. So implement
>>>> them with generci_cmpxchg[64]_local, advised by Masami Hiramatsu.
>>>>
>>>> Closes:
>>>> https://lore.kernel.org/linux-trace-kernel/169824660459.24340.14614817132696360531.stgit@devnote2
>>>> Closes:
>>>> https://lore.kernel.org/oe-kbuild-all/202310241310.Ir5uukOG-lkp@intel.com
>>>>
>>>> Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
>>>
>>> I think on architectures that have actual atomics, you
>>> generally want to define this to be the same as arch_cmpxchg()
>>> rather than the generic version.
>>>
>>> It depends on the relative cost of doing one atomic compared
>>> to an irq-disable/enable pair, but everyone else went with
>>> the former if they could. The exceptions are armv4/armv5,
>>> sparc32 and parisc, which don't have a generic cmpxchg()
>>> or similar operation.
>>
>> Sure, better native than the generic. I'll try to collect more
>> insights before next move.
> 
> So I will temporally remove the last change (use arch_cmpxchg_local
> in objpool) until these series are rewritten with arch native code,
> so that the next release will not break the kernel build.

Ok, it's fine to me. Thank you.


> But this must be fixed because arch_cmpxchg_local() is required
> for each arch anyway.

Yes. I'm working on the new update for arc/openrisc/hexagon. It would
be better resolve this issue first, then consider the objpool update
of using arch_cmpxchg_local.

>>
>>> You could do the thing that sparc64 and xtensa do, which
>>> use the native cmpxchg for supported word sizes but the
>>> generic version for 1- and 2-byte swaps, but that has its
>>> own set of problems if you end up doing operations on both
>>> the entire word and a sub-unit of the same thing.
>>
>> Thank you for pointing out this. I'll do some research on these
>> implementations.
> 
> arc also has the LL-SC instruction but depends on the core feature,
> so I think we can use it.

Right. The arc processor does have the CONFIG_ARC_HAS_LLSC option, but
I doubt the correctness of arch_cmpxchg_relaxed and arch_cmpxchg:

arch_cmpxchg_relaxed:
...
         switch(sizeof((_p_))) {
         case 4:
....

arch_cmpxchg:
...
	BUILD_BUG_ON(sizeof(_p_) != 4);		
...

_p is the address pointer, so I'm thinking it's a typo but I couldn't
yet confirm. There is not much about arc processors in the web :(


> Thank you,
> 
>>
>>>       Arnd
>>
>> Regards,
>> wuqiang
>>
> 
> 


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
  2023-10-28 16:40       ` wuqiang.matt
@ 2023-10-29  3:26         ` Masami Hiramatsu
  2023-10-30  2:22           ` Vineet Gupta
  0 siblings, 1 reply; 11+ messages in thread
From: Masami Hiramatsu @ 2023-10-29  3:26 UTC (permalink / raw)
  To: wuqiang.matt, Vineet Gupta
  Cc: Arnd Bergmann, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	Ingo Molnar, Peter Zijlstra, Andi Shyti, Palmer Dabbelt,
	Andrzej Hajda, linux-trace-kernel, mattwu, linux-snps-arc

On Sun, 29 Oct 2023 00:40:17 +0800
"wuqiang.matt" <wuqiang.matt@bytedance.com> wrote:

> On 2023/10/28 20:49, Masami Hiramatsu (Google) wrote:
> > Hi Wuqiang,
> > 
> > On Thu, 26 Oct 2023 19:05:51 +0800
> > "wuqiang.matt" <wuqiang.matt@bytedance.com> wrote:
> > 
> >> On 2023/10/26 16:46, Arnd Bergmann wrote:
> >>> On Thu, Oct 26, 2023, at 09:39, wuqiang.matt wrote:
> >>>> arch_cmpxchg[64]_local() are not defined for openrisc. So implement
> >>>> them with generci_cmpxchg[64]_local, advised by Masami Hiramatsu.
> >>>>
> >>>> Closes:
> >>>> https://lore.kernel.org/linux-trace-kernel/169824660459.24340.14614817132696360531.stgit@devnote2
> >>>> Closes:
> >>>> https://lore.kernel.org/oe-kbuild-all/202310241310.Ir5uukOG-lkp@intel.com
> >>>>
> >>>> Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
> >>>
> >>> I think on architectures that have actual atomics, you
> >>> generally want to define this to be the same as arch_cmpxchg()
> >>> rather than the generic version.
> >>>
> >>> It depends on the relative cost of doing one atomic compared
> >>> to an irq-disable/enable pair, but everyone else went with
> >>> the former if they could. The exceptions are armv4/armv5,
> >>> sparc32 and parisc, which don't have a generic cmpxchg()
> >>> or similar operation.
> >>
> >> Sure, better native than the generic. I'll try to collect more
> >> insights before next move.
> > 
> > So I will temporally remove the last change (use arch_cmpxchg_local
> > in objpool) until these series are rewritten with arch native code,
> > so that the next release will not break the kernel build.
> 
> Ok, it's fine to me. Thank you.
> 
> 
> > But this must be fixed because arch_cmpxchg_local() is required
> > for each arch anyway.
> 
> Yes. I'm working on the new update for arc/openrisc/hexagon. It would
> be better resolve this issue first, then consider the objpool update
> of using arch_cmpxchg_local.
> 
> >>
> >>> You could do the thing that sparc64 and xtensa do, which
> >>> use the native cmpxchg for supported word sizes but the
> >>> generic version for 1- and 2-byte swaps, but that has its
> >>> own set of problems if you end up doing operations on both
> >>> the entire word and a sub-unit of the same thing.
> >>
> >> Thank you for pointing out this. I'll do some research on these
> >> implementations.
> > 
> > arc also has the LL-SC instruction but depends on the core feature,
> > so I think we can use it.
> 
> Right. The arc processor does have the CONFIG_ARC_HAS_LLSC option, but
> I doubt the correctness of arch_cmpxchg_relaxed and arch_cmpxchg:
> 
> arch_cmpxchg_relaxed:
> ...
>          switch(sizeof((_p_))) {
>          case 4:
> ....
> 
> arch_cmpxchg:
> ...
> 	BUILD_BUG_ON(sizeof(_p_) != 4);		
> ...
> 
> _p is the address pointer, so I'm thinking it's a typo but I couldn't
> yet confirm. There is not much about arc processors in the web :(

Hmm, indeed. This seems like a bug but it depends on the 'llock  %0, [%1]'
can take a 32bit address or 32bit data register. Usually it should
check the size of data, but need to check with ISA manual.

Vineet, can you check this suspicious bug?

Thank you,

> 
> 
> > Thank you,
> > 
> >>
> >>>       Arnd
> >>
> >> Regards,
> >> wuqiang
> >>
> > 
> > 
> 


-- 
Masami Hiramatsu (Google) <mhiramat@kernel.org>

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
  2023-10-29  3:26         ` Masami Hiramatsu
@ 2023-10-30  2:22           ` Vineet Gupta
  2023-10-30  3:41             ` wuqiang.matt
  0 siblings, 1 reply; 11+ messages in thread
From: Vineet Gupta @ 2023-10-30  2:22 UTC (permalink / raw)
  To: Masami Hiramatsu (Google), wuqiang.matt, Vineet Gupta
  Cc: Arnd Bergmann, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	Ingo Molnar, Peter Zijlstra, Andi Shyti, Palmer Dabbelt,
	Andrzej Hajda, linux-trace-kernel, mattwu, linux-snps-arc



On 10/28/23 20:26, Masami Hiramatsu (Google) wrote:
> On Sun, 29 Oct 2023 00:40:17 +0800
> "wuqiang.matt" <wuqiang.matt@bytedance.com> wrote:
>
>> On 2023/10/28 20:49, Masami Hiramatsu (Google) wrote:
>>> Hi Wuqiang,
>>>
>>> On Thu, 26 Oct 2023 19:05:51 +0800
>>> "wuqiang.matt" <wuqiang.matt@bytedance.com> wrote:
>>>
>>>> On 2023/10/26 16:46, Arnd Bergmann wrote:
>>>>> On Thu, Oct 26, 2023, at 09:39, wuqiang.matt wrote:
>>>>>> arch_cmpxchg[64]_local() are not defined for openrisc. So implement
>>>>>> them with generci_cmpxchg[64]_local, advised by Masami Hiramatsu.
>>>>>>
>>>>>> Closes:
>>>>>> https://lore.kernel.org/linux-trace-kernel/169824660459.24340.14614817132696360531.stgit@devnote2
>>>>>> Closes:
>>>>>> https://lore.kernel.org/oe-kbuild-all/202310241310.Ir5uukOG-lkp@intel.com
>>>>>>
>>>>>> Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
>>>>> I think on architectures that have actual atomics, you
>>>>> generally want to define this to be the same as arch_cmpxchg()
>>>>> rather than the generic version.
>>>>>
>>>>> It depends on the relative cost of doing one atomic compared
>>>>> to an irq-disable/enable pair, but everyone else went with
>>>>> the former if they could. The exceptions are armv4/armv5,
>>>>> sparc32 and parisc, which don't have a generic cmpxchg()
>>>>> or similar operation.
>>>> Sure, better native than the generic. I'll try to collect more
>>>> insights before next move.
>>> So I will temporally remove the last change (use arch_cmpxchg_local
>>> in objpool) until these series are rewritten with arch native code,
>>> so that the next release will not break the kernel build.
>> Ok, it's fine to me. Thank you.
>>
>>
>>> But this must be fixed because arch_cmpxchg_local() is required
>>> for each arch anyway.
>> Yes. I'm working on the new update for arc/openrisc/hexagon. It would
>> be better resolve this issue first, then consider the objpool update
>> of using arch_cmpxchg_local.
>>
>>>>> You could do the thing that sparc64 and xtensa do, which
>>>>> use the native cmpxchg for supported word sizes but the
>>>>> generic version for 1- and 2-byte swaps, but that has its
>>>>> own set of problems if you end up doing operations on both
>>>>> the entire word and a sub-unit of the same thing.
>>>> Thank you for pointing out this. I'll do some research on these
>>>> implementations.
>>> arc also has the LL-SC instruction but depends on the core feature,
>>> so I think we can use it.
>> Right. The arc processor does have the CONFIG_ARC_HAS_LLSC option, but
>> I doubt the correctness of arch_cmpxchg_relaxed and arch_cmpxchg:
>>
>> arch_cmpxchg_relaxed:
>> ...
>>           switch(sizeof((_p_))) {
>>           case 4:
>> ....
>>
>> arch_cmpxchg:
>> ...
>> 	BUILD_BUG_ON(sizeof(_p_) != 4);		
>> ...
>>
>> _p is the address pointer, so I'm thinking it's a typo but I couldn't
>> yet confirm. There is not much about arc processors in the web :(
> Hmm, indeed. This seems like a bug but it depends on the 'llock  %0, [%1]'
> can take a 32bit address or 32bit data register. Usually it should
> check the size of data, but need to check with ISA manual.
>
> Vineet, can you check this suspicious bug?

ARCv2 is a 32-bit ISA and LLOCK/SCOND work on 32-bit data.
So the pointers will be 32-bit anyways. Is the issue that 
pointer/cmpxchg operation could be on a smaller data type ?

-Vineet

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
  2023-10-30  2:22           ` Vineet Gupta
@ 2023-10-30  3:41             ` wuqiang.matt
  2023-11-02  4:53               ` Vineet Gupta
  0 siblings, 1 reply; 11+ messages in thread
From: wuqiang.matt @ 2023-10-30  3:41 UTC (permalink / raw)
  To: Vineet Gupta, Masami Hiramatsu (Google)
  Cc: Arnd Bergmann, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	Ingo Molnar, Peter Zijlstra, Andi Shyti, Palmer Dabbelt,
	Andrzej Hajda, linux-trace-kernel, mattwu, linux-snps-arc

On 2023/10/30 10:22, Vineet Gupta wrote:
> 
> 
> On 10/28/23 20:26, Masami Hiramatsu (Google) wrote:
>> On Sun, 29 Oct 2023 00:40:17 +0800
>> "wuqiang.matt" <wuqiang.matt@bytedance.com> wrote:
>>
>>> On 2023/10/28 20:49, Masami Hiramatsu (Google) wrote:
>>>> Hi Wuqiang,
>>>>
>>>> On Thu, 26 Oct 2023 19:05:51 +0800
>>>> "wuqiang.matt" <wuqiang.matt@bytedance.com> wrote:
>>>>
>>>>> On 2023/10/26 16:46, Arnd Bergmann wrote:
>>>>>> On Thu, Oct 26, 2023, at 09:39, wuqiang.matt wrote:
>>>>>>> arch_cmpxchg[64]_local() are not defined for openrisc. So implement
>>>>>>> them with generci_cmpxchg[64]_local, advised by Masami Hiramatsu.
>>>>>>>
>>>>>>> Closes:
>>>>>>> https://lore.kernel.org/linux-trace-kernel/169824660459.24340.14614817132696360531.stgit@devnote2
>>>>>>> Closes:
>>>>>>> https://lore.kernel.org/oe-kbuild-all/202310241310.Ir5uukOG-lkp@intel.com
>>>>>>>
>>>>>>> Signed-off-by: wuqiang.matt <wuqiang.matt@bytedance.com>
>>>>>> I think on architectures that have actual atomics, you
>>>>>> generally want to define this to be the same as arch_cmpxchg()
>>>>>> rather than the generic version.
>>>>>>
>>>>>> It depends on the relative cost of doing one atomic compared
>>>>>> to an irq-disable/enable pair, but everyone else went with
>>>>>> the former if they could. The exceptions are armv4/armv5,
>>>>>> sparc32 and parisc, which don't have a generic cmpxchg()
>>>>>> or similar operation.
>>>>> Sure, better native than the generic. I'll try to collect more
>>>>> insights before next move.
>>>> So I will temporally remove the last change (use arch_cmpxchg_local
>>>> in objpool) until these series are rewritten with arch native code,
>>>> so that the next release will not break the kernel build.
>>> Ok, it's fine to me. Thank you.
>>>
>>>
>>>> But this must be fixed because arch_cmpxchg_local() is required
>>>> for each arch anyway.
>>> Yes. I'm working on the new update for arc/openrisc/hexagon. It would
>>> be better resolve this issue first, then consider the objpool update
>>> of using arch_cmpxchg_local.
>>>
>>>>>> You could do the thing that sparc64 and xtensa do, which
>>>>>> use the native cmpxchg for supported word sizes but the
>>>>>> generic version for 1- and 2-byte swaps, but that has its
>>>>>> own set of problems if you end up doing operations on both
>>>>>> the entire word and a sub-unit of the same thing.
>>>>> Thank you for pointing out this. I'll do some research on these
>>>>> implementations.
>>>> arc also has the LL-SC instruction but depends on the core feature,
>>>> so I think we can use it.
>>> Right. The arc processor does have the CONFIG_ARC_HAS_LLSC option, but
>>> I doubt the correctness of arch_cmpxchg_relaxed and arch_cmpxchg:
>>>
>>> arch_cmpxchg_relaxed:
>>> ...
>>>           switch(sizeof((_p_))) {
>>>           case 4:
>>> ....
>>>
>>> arch_cmpxchg:
>>> ...
>>>     BUILD_BUG_ON(sizeof(_p_) != 4);
>>> ...
>>>
>>> _p is the address pointer, so I'm thinking it's a typo but I couldn't
>>> yet confirm. There is not much about arc processors in the web :(
>> Hmm, indeed. This seems like a bug but it depends on the 'llock  %0, [%1]'
>> can take a 32bit address or 32bit data register. Usually it should
>> check the size of data, but need to check with ISA manual.
>>
>> Vineet, can you check this suspicious bug?
> 
> ARCv2 is a 32-bit ISA and LLOCK/SCOND work on 32-bit data.
> So the pointers will be 32-bit anyways. Is the issue that pointer/cmpxchg 
> operation could be on a smaller data type ?

For ARCv2 with CONFIG_ARC_HAS_LLSC, better add the data size checking and
only permit 32bit data size. Even for 32-bit system, data should can be
64bit 'long long'.

And In the case that CONFIG_ARC_HAS_LLSC is undefined, in arch_cmpxchg: the
pointer size checking is unnecessary, since it's using spinlock internally:

https://elixir.bootlin.com/linux/v6.6-rc7/source/arch/arc/include/asm/cmpxchg.h#L60: 

	BUILD_BUG_ON(sizeof(_p_) != 4);					\
									\
	/*								\
	 * spin lock/unlock provide the needed smp_mb() before/after	\
	 */								\
	atomic_ops_lock(__flags);					\
	_prev_ = *_p_;							\
	if (_prev_ == _o_)						\
		*_p_ = _n_;						\
	atomic_ops_unlock(__flags);

Another question about the naming: arch_cmpxchg_relaxed() implemented if
CONFIG_ARC_HAS_LLSC is configured and arch_cmpxchg() defined for the rest.
Are there any reasons for difference names ?

As I checked, Synopsys has released 64bit ARC processors (HS66/HS68), but
I don't know the status of Linux kernel support.

> -Vineet

Regards,
wuqiang

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
  2023-10-30  3:41             ` wuqiang.matt
@ 2023-11-02  4:53               ` Vineet Gupta
  2023-11-02  9:30                 ` wuqiang.matt
  0 siblings, 1 reply; 11+ messages in thread
From: Vineet Gupta @ 2023-11-02  4:53 UTC (permalink / raw)
  To: wuqiang.matt, Vineet Gupta, Masami Hiramatsu (Google)
  Cc: Arnd Bergmann, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	Ingo Molnar, Peter Zijlstra, Andi Shyti, Palmer Dabbelt,
	Andrzej Hajda, linux-trace-kernel, mattwu, linux-snps-arc



On 10/29/23 20:41, wuqiang.matt wrote:
>>>>
>>>> arch_cmpxchg_relaxed:
>>>> ...
>>>>           switch(sizeof((_p_))) {
>>>>           case 4:
>>>> ....
>>>>
>>>> arch_cmpxchg:
>>>> ...
>>>>     BUILD_BUG_ON(sizeof(_p_) != 4);
>>>> ...
>>>>
>>>> _p is the address pointer, so I'm thinking it's a typo but I couldn't
>>>> yet confirm. There is not much about arc processors in the web :(
>>> Hmm, indeed. This seems like a bug but it depends on the 'llock  %0, 
>>> [%1]'
>>> can take a 32bit address or 32bit data register. Usually it should
>>> check the size of data, but need to check with ISA manual.
>>>
>>> Vineet, can you check this suspicious bug?
>>
>> ARCv2 is a 32-bit ISA and LLOCK/SCOND work on 32-bit data.
>> So the pointers will be 32-bit anyways. Is the issue that 
>> pointer/cmpxchg operation could be on a smaller data type ?
>
> For ARCv2 with CONFIG_ARC_HAS_LLSC, better add the data size checking and
> only permit 32bit data size. Even for 32-bit system, data should can be
> 64bit 'long long'.

Right, makes sense. Can you send a patch ?

>
> And In the case that CONFIG_ARC_HAS_LLSC is undefined, in 
> arch_cmpxchg: the
> pointer size checking is unnecessary, since it's using spinlock 
> internally:

Correct, I had the same thought.

>
> https://elixir.bootlin.com/linux/v6.6-rc7/source/arch/arc/include/asm/cmpxchg.h#L60: 
>
>     BUILD_BUG_ON(sizeof(_p_) != 4);                    \
>                                     \
>     /*                                \
>      * spin lock/unlock provide the needed smp_mb() before/after    \
>      */                                \
>     atomic_ops_lock(__flags);                    \
>     _prev_ = *_p_;                            \
>     if (_prev_ == _o_)                        \
>         *_p_ = _n_;                        \
>     atomic_ops_unlock(__flags);

Can you do that fix in same patch as well ?

> Another question about the naming: arch_cmpxchg_relaxed() implemented if
> CONFIG_ARC_HAS_LLSC is configured and arch_cmpxchg() defined for the 
> rest.
> Are there any reasons for difference names ?

Yes the atomics API consists of _relaxed, _acquire, _release and unadorned.
_relaxed is the most efficient, with rest having some extra barriers.
If arch provides _relaxed, generic code can create the rest - assuming 
arch hardware can support the relaxed ones.
That is true for LLSC.

However for !LLSC, spinlock versions already have full barriers due to 
spinlock, so relaxed variant doesn't make sense.

>
> As I checked, Synopsys has released 64bit ARC processors (HS66/HS68), but
> I don't know the status of Linux kernel support.

Yes there's an internal ARC64 port running but it is not ready for 
upstreaming yet.

-Vineet

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
  2023-11-02  4:53               ` Vineet Gupta
@ 2023-11-02  9:30                 ` wuqiang.matt
  2023-11-02 10:05                   ` wuqiang.matt
  0 siblings, 1 reply; 11+ messages in thread
From: wuqiang.matt @ 2023-11-02  9:30 UTC (permalink / raw)
  To: Vineet Gupta, Masami Hiramatsu (Google)
  Cc: Arnd Bergmann, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	Ingo Molnar, Peter Zijlstra, Andi Shyti, Palmer Dabbelt,
	Andrzej Hajda, linux-trace-kernel, mattwu, linux-snps-arc

On 2023/11/2 12:53, Vineet Gupta wrote:
> 
> 
> On 10/29/23 20:41, wuqiang.matt wrote:
>>>>>
>>>>> arch_cmpxchg_relaxed:
>>>>> ...
>>>>>           switch(sizeof((_p_))) {
>>>>>           case 4:
>>>>> ....
>>>>>
>>>>> arch_cmpxchg:
>>>>> ...
>>>>>     BUILD_BUG_ON(sizeof(_p_) != 4);
>>>>> ...
>>>>>
>>>>> _p is the address pointer, so I'm thinking it's a typo but I couldn't
>>>>> yet confirm. There is not much about arc processors in the web :(
>>>> Hmm, indeed. This seems like a bug but it depends on the 'llock  %0, [%1]'
>>>> can take a 32bit address or 32bit data register. Usually it should
>>>> check the size of data, but need to check with ISA manual.
>>>>
>>>> Vineet, can you check this suspicious bug?
>>>
>>> ARCv2 is a 32-bit ISA and LLOCK/SCOND work on 32-bit data.
>>> So the pointers will be 32-bit anyways. Is the issue that pointer/cmpxchg 
>>> operation could be on a smaller data type ?
>>
>> For ARCv2 with CONFIG_ARC_HAS_LLSC, better add the data size checking and
>> only permit 32bit data size. Even for 32-bit system, data should can be
>> 64bit 'long long'.
> 
> Right, makes sense. Can you send a patch ?

Okay, I will.

>>
>> And In the case that CONFIG_ARC_HAS_LLSC is undefined, in arch_cmpxchg: the
>> pointer size checking is unnecessary, since it's using spinlock internally:
> 
> Correct, I had the same thought.
> 
>>
>> https://elixir.bootlin.com/linux/v6.6-rc7/source/arch/arc/include/asm/cmpxchg.h#L60:
>>     BUILD_BUG_ON(sizeof(_p_) != 4);                    \
>>                                     \
>>     /*                                \
>>      * spin lock/unlock provide the needed smp_mb() before/after    \
>>      */                                \
>>     atomic_ops_lock(__flags);                    \
>>     _prev_ = *_p_;                            \
>>     if (_prev_ == _o_)                        \
>>         *_p_ = _n_;                        \
>>     atomic_ops_unlock(__flags);
> 
> Can you do that fix in same patch as well ?

Sure.

>> Another question about the naming: arch_cmpxchg_relaxed() implemented if
>> CONFIG_ARC_HAS_LLSC is configured and arch_cmpxchg() defined for the rest.
>> Are there any reasons for difference names ?
> 
> Yes the atomics API consists of _relaxed, _acquire, _release and unadorned.
> _relaxed is the most efficient, with rest having some extra barriers.
> If arch provides _relaxed, generic code can create the rest - assuming arch 
> hardware can support the relaxed ones.

Definitely Yes! The generic cmpxchg will prefer arch_cmpxchg_relaxed rather
than arch_cmpxchg.

> That is true for LLSC.
> 
> However for !LLSC, spinlock versions already have full barriers due to 
> spinlock, so relaxed variant doesn't make sense.
> 
>>
>> As I checked, Synopsys has released 64bit ARC processors (HS66/HS68), but
>> I don't know the status of Linux kernel support.
> 
> Yes there's an internal ARC64 port running but it is not ready for upstreaming 
> yet.

Thanks for the update.

> -Vineet

Regards,
wuqiang

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local
  2023-11-02  9:30                 ` wuqiang.matt
@ 2023-11-02 10:05                   ` wuqiang.matt
  0 siblings, 0 replies; 11+ messages in thread
From: wuqiang.matt @ 2023-11-02 10:05 UTC (permalink / raw)
  To: Vineet Gupta, Masami Hiramatsu (Google)
  Cc: Arnd Bergmann, Jonas Bonn, Stefan Kristiansson, Stafford Horne,
	Ingo Molnar, Peter Zijlstra, Andi Shyti, Palmer Dabbelt,
	Andrzej Hajda, linux-trace-kernel, mattwu, linux-snps-arc

On 2023/11/2 17:30, wuqiang.matt wrote:
> On 2023/11/2 12:53, Vineet Gupta wrote:
>>
>>
>> On 10/29/23 20:41, wuqiang.matt wrote:
>>>>>>
>>>>>> arch_cmpxchg_relaxed:
>>>>>> ...
>>>>>>           switch(sizeof((_p_))) {
>>>>>>           case 4:
>>>>>> ....
>>>>>>
>>>>>> arch_cmpxchg:
>>>>>> ...
>>>>>>     BUILD_BUG_ON(sizeof(_p_) != 4);
>>>>>> ...
>>>>>>
>>>>>> _p is the address pointer, so I'm thinking it's a typo but I couldn't
>>>>>> yet confirm. There is not much about arc processors in the web :(
>>>>> Hmm, indeed. This seems like a bug but it depends on the 'llock  %0, [%1]'
>>>>> can take a 32bit address or 32bit data register. Usually it should
>>>>> check the size of data, but need to check with ISA manual.
>>>>>
>>>>> Vineet, can you check this suspicious bug?
>>>>
>>>> ARCv2 is a 32-bit ISA and LLOCK/SCOND work on 32-bit data.
>>>> So the pointers will be 32-bit anyways. Is the issue that pointer/cmpxchg 
>>>> operation could be on a smaller data type ?
>>>
>>> For ARCv2 with CONFIG_ARC_HAS_LLSC, better add the data size checking and
>>> only permit 32bit data size. Even for 32-bit system, data should can be
>>> 64bit 'long long'.
>>
>> Right, makes sense. Can you send a patch ?
> 
> Okay, I will.
> 
>>>
>>> And In the case that CONFIG_ARC_HAS_LLSC is undefined, in arch_cmpxchg: the
>>> pointer size checking is unnecessary, since it's using spinlock internally:
>>
>> Correct, I had the same thought.
>>
>>>
>>> https://elixir.bootlin.com/linux/v6.6-rc7/source/arch/arc/include/asm/cmpxchg.h#L60:
>>>     BUILD_BUG_ON(sizeof(_p_) != 4);                    \
>>>                                     \
>>>     /*                                \
>>>      * spin lock/unlock provide the needed smp_mb() before/after    \
>>>      */                                \
>>>     atomic_ops_lock(__flags);                    \
>>>     _prev_ = *_p_;                            \
>>>     if (_prev_ == _o_)                        \
>>>         *_p_ = _n_;                        \
>>>     atomic_ops_unlock(__flags);
>>
>> Can you do that fix in same patch as well ?
> 
> Sure.
> 
>>> Another question about the naming: arch_cmpxchg_relaxed() implemented if
>>> CONFIG_ARC_HAS_LLSC is configured and arch_cmpxchg() defined for the rest.
>>> Are there any reasons for difference names ?
>>
>> Yes the atomics API consists of _relaxed, _acquire, _release and unadorned.
>> _relaxed is the most efficient, with rest having some extra barriers.
>> If arch provides _relaxed, generic code can create the rest - assuming arch 
>> hardware can support the relaxed ones.
> 
> Definitely Yes! The generic cmpxchg will prefer arch_cmpxchg_relaxed rather
> than arch_cmpxchg.

It should be arch_cmpxchg (if defined) rather than arch_cmpxchg_relaxed. I
just double checked with v6.6:

	...
	#if defined(arch_cmpxchg)
	#define raw_cmpxchg arch_cmpxchg
	#elif defined(arch_cmpxchg_relaxed)
		#define raw_cmpxchg(...) \
		__atomic_op_fence(arch_cmpxchg, __VA_ARGS__)
	#else
	extern void raw_cmpxchg_not_implemented(void);
	#define raw_cmpxchg(...) raw_cmpxchg_not_implemented()
	#endif
	...
	#if defined(arch_cmpxchg_relaxed)
	#define raw_cmpxchg_relaxed arch_cmpxchg_relaxed
	#elif defined(arch_cmpxchg)
	#define raw_cmpxchg_relaxed arch_cmpxchg
	#else
	extern void raw_cmpxchg_relaxed_not_implemented(void);
	#define raw_cmpxchg_relaxed(...) raw_cmpxchg_relaxed_not_implemented()
	#endif
	...

I made the mistake since the definitions of v5.2 confused me. The followings
are for v5.2, and these conditions are removed from v5.14:

	...
	#if !defined(arch_cmpxchg_relaxed) || defined(arch_cmpxchg)
	#define cmpxchg(ptr, ...) \
	({ \
		typeof(ptr) __ai_ptr = (ptr); \
		instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
		arch_cmpxchg(__ai_ptr, __VA_ARGS__); \
	})
	#endif
	...
	#if defined(arch_cmpxchg_relaxed)
	#define cmpxchg_relaxed(ptr, ...) \
	({ \
		typeof(ptr) __ai_ptr = (ptr); \
		instrument_atomic_write(__ai_ptr, sizeof(*__ai_ptr)); \
		arch_cmpxchg_relaxed(__ai_ptr, __VA_ARGS__); \
	})
	#endif
	...

Sorry for the noise.

>> That is true for LLSC.
>>
>> However for !LLSC, spinlock versions already have full barriers due to 
>> spinlock, so relaxed variant doesn't make sense.
>>
>>>
>>> As I checked, Synopsys has released 64bit ARC processors (HS66/HS68), but
>>> I don't know the status of Linux kernel support.
>>
>> Yes there's an internal ARC64 port running but it is not ready for 
>> upstreaming yet.
> 
> Thanks for the update.
> 
>> -Vineet
> 
> Regards,
> wuqiang


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2023-11-02 10:05 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2023-10-26  7:39 [PATCH 3/3] locking/atomic: openrisc: use generic_cmpxchg[64]_local for arch_cmpxchg[64]_local wuqiang.matt
2023-10-26  8:46 ` Arnd Bergmann
2023-10-26 11:05   ` wuqiang.matt
2023-10-28 12:49     ` Masami Hiramatsu
2023-10-28 16:40       ` wuqiang.matt
2023-10-29  3:26         ` Masami Hiramatsu
2023-10-30  2:22           ` Vineet Gupta
2023-10-30  3:41             ` wuqiang.matt
2023-11-02  4:53               ` Vineet Gupta
2023-11-02  9:30                 ` wuqiang.matt
2023-11-02 10:05                   ` wuqiang.matt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).