* Re: [PATCH]atomic_inc_return() for i386/x86_64 (Re: RCU issue with SELinux)
       [not found] ` <2x2JC-3Uu-11@gated-at.bofh.it>
@ 2004-08-28 14:48   ` Andi Kleen
  2004-08-31  8:19     ` Kaigai Kohei
  0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2004-08-28 14:48 UTC (permalink / raw)
  To: Kaigai Kohei; +Cc: linux-kernel

"Kaigai Kohei" <kaigai@ak.jp.nec.com> writes:

> atomic_inc_return() is not defined for the arm, arm26, i386, x86_64 and um architectures.
> The attached patch adds atomic_inc_return() and atomic_dec_return() to arm, i386 and x86_64.
>
> They are implemented with the 'xaddl' instruction with a LOCK prefix on i386 and x86_64.
> But this instruction is only available on i486 and later processors.
> Another implementation may be necessary for the i386SX/DX processors.
> However, the 'xaddl' instruction is already used unconditionally in 'include/asm-i386/rwsem.h'.
> I think using the 'xaddl' instruction was agreed upon in the past.

We don't support SMP on 386 boxes. What you can do for 386 is to use
alternative() and just use a non-SMP-safe version for 386 and xadd
for 486+.

This will require adding a new pseudo CPUID bit in cpufeature.h that
can be tested for in alternative() (similar to the X86_FEATURE_P3/P4
flags that are already there).

-Andi

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH]atomic_inc_return() for i386/x86_64 (Re: RCU issue with SELinux)
  2004-08-28 14:48 ` [PATCH]atomic_inc_return() for i386/x86_64 (Re: RCU issue with SELinux) Andi Kleen
@ 2004-08-31  8:19   ` Kaigai Kohei
  2004-08-31  8:49     ` Andi Kleen
  0 siblings, 1 reply; 5+ messages in thread
From: Kaigai Kohei @ 2004-08-31 8:19 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1484 bytes --]

Hi Andi, thanks for your comment.
Sorry, I did not notice your mail in the flood of the linux-kernel ML.

> > atomic_inc_return() is not defined for the arm, arm26, i386, x86_64 and um architectures.
> > The attached patch adds atomic_inc_return() and atomic_dec_return() to arm, i386 and x86_64.
> >
> > They are implemented with the 'xaddl' instruction with a LOCK prefix on i386 and x86_64.
> > But this instruction is only available on i486 and later processors.
> > Another implementation may be necessary for the i386SX/DX processors.
> > However, the 'xaddl' instruction is already used unconditionally in 'include/asm-i386/rwsem.h'.
> > I think using the 'xaddl' instruction was agreed upon in the past.
>
> We don't support SMP on 386 boxes. What you can do for 386 is to use
> alternative() and just use a non-SMP-safe version for 386 and xadd
> for 486+.

Since SMP on 386 boxes isn't supported, we can avoid the problem with a
simpler solution: disable interrupts while updating the atomic_t variable.

The attached patch modifies include/asm-i386/atomic.h. If the target
processor is a 386, atomic_add_return() uses non-atomic operations between
local_irq_disable() and local_irq_enable(). Otherwise, atomic_add_return()
uses the 'xadd' instruction with a LOCK prefix.

By the way, do you know why the 'xadd' instruction is used unconditionally
in 'include/asm-i386/rwsem.h'?

Thanks.
Signed-off-by: KaiGai, Kohei <kaigai@ak.jp.nec.com>
Signed-off-by: Takayoshi Kochi <t-kochi@bq.jp.nec.com>
--------
Kai Gai <kaigai@ak.jp.nec.com>

[-- Attachment #2: atomic_inc_return-2.6.8.1.M386.patch --]
[-- Type: application/octet-stream, Size: 3067 bytes --]

--- linux-2.6.8.1/include/asm-arm/atomic.h	2004-08-14 19:54:50.000000000 +0900
+++ linux-2.6.8.1.rcu/include/asm-arm/atomic.h	2004-08-24 19:31:56.000000000 +0900
@@ -194,8 +194,10 @@
 #define atomic_dec(v)		atomic_sub(1, v)
 
 #define atomic_inc_and_test(v)	(atomic_add_return(1, v) == 0)
 #define atomic_dec_and_test(v)	(atomic_sub_return(1, v) == 0)
+#define atomic_inc_return(v)	(atomic_add_return(1, v))
+#define atomic_dec_return(v)	(atomic_sub_return(1, v))
 #define atomic_add_negative(i,v)	(atomic_add_return(i, v) < 0)
 
 /* Atomic operations are already serializing on ARM */
--- linux-2.6.8.1/include/asm-i386/atomic.h	2004-08-14 19:55:09.000000000 +0900
+++ linux-2.6.8.1.rcu/include/asm-i386/atomic.h	2004-08-31 13:10:48.000000000 +0900
@@ -1,8 +1,9 @@
 #ifndef __ARCH_I386_ATOMIC__
 #define __ARCH_I386_ATOMIC__
 
 #include <linux/config.h>
+#include <asm/system.h>
 
 /*
  * Atomic operations that C can't guarantee us.  Useful for
  * resource counting etc..
@@ -175,8 +176,41 @@
 	:"ir" (i), "m" (v->counter) : "memory");
 	return c;
 }
 
+/**
+ * atomic_add_return - add and return
+ * @v: pointer of type atomic_t
+ * @i: integer value to add
+ *
+ * Atomically adds @i to @v and returns @i + @v
+ */
+static __inline__ int atomic_add_return(int i, atomic_t *v)
+{
+	int __i;
+#ifdef CONFIG_M386
+	local_irq_disable();
+	__i = atomic_read(v);
+	atomic_set(v, i + __i);
+	local_irq_enable();
+#else
+	__i = i;
+	__asm__ __volatile__(
+		LOCK "xaddl %0, %1;"
+		:"=r"(i)
+		:"m"(v->counter), "0"(i));
+#endif
+	return i + __i;
+}
+
+static __inline__ int atomic_sub_return(int i, atomic_t *v)
+{
+	return atomic_add_return(-i,v);
+}
+
+#define atomic_inc_return(v)	(atomic_add_return(1,v))
+#define atomic_dec_return(v)	(atomic_sub_return(1,v))
+
 /* These are x86-specific, used by some header files */
 #define atomic_clear_mask(mask, addr) \
 __asm__ __volatile__(LOCK "andl %0,%1" \
 : : "r" (~(mask)),"m" (*addr) : "memory")
--- linux-2.6.8.1/include/asm-x86_64/atomic.h	2004-08-14 19:56:23.000000000 +0900
+++ linux-2.6.8.1.rcu/include/asm-x86_64/atomic.h	2004-08-25 11:57:36.000000000 +0900
@@ -177,8 +177,33 @@
 	:"ir" (i), "m" (v->counter) : "memory");
 	return c;
 }
 
+/**
+ * atomic_add_return - add and return
+ * @v: pointer of type atomic_t
+ * @i: integer value to add
+ *
+ * Atomically adds @i to @v and returns @i + @v
+ */
+static __inline__ int atomic_add_return(int i, atomic_t *v)
+{
+	int __i = i;
+	__asm__ __volatile__(
+		LOCK "xaddl %0, %1;"
+		:"=r"(i)
+		:"m"(v->counter), "0"(i));
+	return i + __i;
+}
+
+static __inline__ int atomic_sub_return(int i, atomic_t *v)
+{
+	return atomic_add_return(-i,v);
+}
+
+#define atomic_inc_return(v)	(atomic_add_return(1,v))
+#define atomic_dec_return(v)	(atomic_sub_return(1,v))
+
 /* These are x86-specific, used by some header files */
 #define atomic_clear_mask(mask, addr) \
 __asm__ __volatile__(LOCK "andl %0,%1" \
 : : "r" (~(mask)),"m" (*addr) : "memory")

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH]atomic_inc_return() for i386/x86_64 (Re: RCU issue with SELinux)
  2004-08-31  8:19 ` Kaigai Kohei
@ 2004-08-31  8:49   ` Andi Kleen
  2004-09-01  6:22     ` Kaigai Kohei
  0 siblings, 1 reply; 5+ messages in thread
From: Andi Kleen @ 2004-08-31 8:49 UTC (permalink / raw)
  To: Kaigai Kohei; +Cc: linux-kernel

On Tue, Aug 31, 2004 at 05:19:34PM +0900, Kaigai Kohei wrote:
> Hi Andi, thanks for your comment.
> Sorry, I did not notice your mail in the flood of the linux-kernel ML.
>
> > > atomic_inc_return() is not defined for the arm, arm26, i386, x86_64 and um architectures.
> > > The attached patch adds atomic_inc_return() and atomic_dec_return() to arm, i386 and x86_64.
> > >
> > > They are implemented with the 'xaddl' instruction with a LOCK prefix on i386 and x86_64.
> > > But this instruction is only available on i486 and later processors.
> > > Another implementation may be necessary for the i386SX/DX processors.
> > > However, the 'xaddl' instruction is already used unconditionally in 'include/asm-i386/rwsem.h'.
> > > I think using the 'xaddl' instruction was agreed upon in the past.
> >
> > We don't support SMP on 386 boxes. What you can do for 386 is to use
> > alternative() and just use a non-SMP-safe version for 386 and xadd
> > for 486+.
>
> Since SMP on 386 boxes isn't supported, we can avoid the problem with a
> simpler solution: disable interrupts while updating the atomic_t variable.

The patch is wrong. A CONFIG_M386 kernel can run on non-386 SMP boxes;
your patch would be racy there. The only thing that's not supported is
a real 386 with multiple CPUs.

You either have to check boot_cpu_data.x86 == 3 at runtime or use
alternative() as I suggested earlier.

> By the way, do you know why the 'xadd' instruction is used unconditionally
> in 'include/asm-i386/rwsem.h'?

386-compatible kernels use a different rwsem implementation that doesn't
use this include.

-Andi

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: [PATCH]atomic_inc_return() for i386/x86_64 (Re: RCU issue with SELinux)
  2004-08-31  8:49 ` Andi Kleen
@ 2004-09-01  6:22   ` Kaigai Kohei
  0 siblings, 0 replies; 5+ messages in thread
From: Kaigai Kohei @ 2004-09-01 6:22 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 945 bytes --]

Hi Andi, thanks for your comment.

> > Since SMP on 386 boxes isn't supported, we can avoid the problem with a
> > simpler solution: disable interrupts while updating the atomic_t variable.
>
> The patch is wrong. A CONFIG_M386 kernel can run on non-386 SMP boxes;
> your patch would be racy there. The only thing that's not supported is
> a real 386 with multiple CPUs.
>
> You either have to check boot_cpu_data.x86 == 3 at runtime or use
> alternative() as I suggested earlier.

Indeed. The attached patch is implemented according to your suggestion.
The code for the legacy 386 is used only when boot_cpu_data.x86 == 3,
and this run-time check is compiled in only when CONFIG_M386 is set.
Otherwise we use the 'xadd' instruction for atomic_add_return(), as in
the previous patch.

I did not adopt alternative(), because that macro does not fit the
situation where arguments to __asm__ are necessary.

Thanks.
--------
Kai Gai <kaigai@ak.jp.nec.com>

[-- Attachment #2: atomic_inc_return-2.6.8.1.M386.patch --]
[-- Type: application/octet-stream, Size: 3270 bytes --]

--- linux-2.6.8.1/include/asm-i386/atomic.h	2004-08-14 19:55:09.000000000 +0900
+++ linux-2.6.8.1.rcu/include/asm-i386/atomic.h	2004-09-01 14:40:40.000000000 +0900
@@ -1,8 +1,10 @@
 #ifndef __ARCH_I386_ATOMIC__
 #define __ARCH_I386_ATOMIC__
 
 #include <linux/config.h>
+#include <linux/compiler.h>
+#include <asm/processor.h>
 
 /*
  * Atomic operations that C can't guarantee us.  Useful for
  * resource counting etc..
@@ -175,8 +177,48 @@
 	:"ir" (i), "m" (v->counter) : "memory");
 	return c;
 }
 
+/**
+ * atomic_add_return - add and return
+ * @v: pointer of type atomic_t
+ * @i: integer value to add
+ *
+ * Atomically adds @i to @v and returns @i + @v
+ */
+static __inline__ int atomic_add_return(int i, atomic_t *v)
+{
+	int __i;
+#ifdef CONFIG_M386
+	if(unlikely(boot_cpu_data.x86==3))
+		goto no_xadd;
+#endif
+	/* Modern 486+ processor */
+	__i = i;
+	__asm__ __volatile__(
+		LOCK "xaddl %0, %1;"
+		:"=r"(i)
+		:"m"(v->counter), "0"(i));
+	return i + __i;
+
+#ifdef CONFIG_M386
+no_xadd:	/* Legacy 386 processor */
+	local_irq_disable();
+	__i = atomic_read(v);
+	atomic_set(v, i + __i);
+	local_irq_enable();
+	return i + __i;
+#endif
+}
+
+static __inline__ int atomic_sub_return(int i, atomic_t *v)
+{
+	return atomic_add_return(-i,v);
+}
+
+#define atomic_inc_return(v)	(atomic_add_return(1,v))
+#define atomic_dec_return(v)	(atomic_sub_return(1,v))
+
 /* These are x86-specific, used by some header files */
 #define atomic_clear_mask(mask, addr) \
 __asm__ __volatile__(LOCK "andl %0,%1" \
 : : "r" (~(mask)),"m" (*addr) : "memory")
--- linux-2.6.8.1/include/asm-x86_64/atomic.h	2004-08-14 19:56:23.000000000 +0900
+++ linux-2.6.8.1.rcu/include/asm-x86_64/atomic.h	2004-08-25 11:57:36.000000000 +0900
@@ -177,8 +177,33 @@
 	:"ir" (i), "m" (v->counter) : "memory");
 	return c;
 }
 
+/**
+ * atomic_add_return - add and return
+ * @v: pointer of type atomic_t
+ * @i: integer value to add
+ *
+ * Atomically adds @i to @v and returns @i + @v
+ */
+static __inline__ int atomic_add_return(int i, atomic_t *v)
+{
+	int __i = i;
+	__asm__ __volatile__(
+		LOCK "xaddl %0, %1;"
+		:"=r"(i)
+		:"m"(v->counter), "0"(i));
+	return i + __i;
+}
+
+static __inline__ int atomic_sub_return(int i, atomic_t *v)
+{
+	return atomic_add_return(-i,v);
+}
+
+#define atomic_inc_return(v)	(atomic_add_return(1,v))
+#define atomic_dec_return(v)	(atomic_sub_return(1,v))
+
 /* These are x86-specific, used by some header files */
 #define atomic_clear_mask(mask, addr) \
 __asm__ __volatile__(LOCK "andl %0,%1" \
 : : "r" (~(mask)),"m" (*addr) : "memory")
--- linux-2.6.8.1/include/asm-arm/atomic.h	2004-08-14 19:54:50.000000000 +0900
+++ linux-2.6.8.1.rcu/include/asm-arm/atomic.h	2004-08-24 19:31:56.000000000 +0900
@@ -194,8 +194,10 @@
 #define atomic_dec(v)		atomic_sub(1, v)
 
 #define atomic_inc_and_test(v)	(atomic_add_return(1, v) == 0)
 #define atomic_dec_and_test(v)	(atomic_sub_return(1, v) == 0)
+#define atomic_inc_return(v)	(atomic_add_return(1, v))
+#define atomic_dec_return(v)	(atomic_sub_return(1, v))
 #define atomic_add_negative(i,v)	(atomic_add_return(i, v) < 0)
 
 /* Atomic operations are already serializing on ARM */

^ permalink raw reply	[flat|nested] 5+ messages in thread
* Re: RCU issue with SELinux (Re: SELINUX performance issues)
@ 2004-08-24 13:24 James Morris
  2004-08-25  9:52 ` [PATCH]atomic_inc_return() for i386/x86_64 (Re: RCU issue with SELinux) Kaigai Kohei
  0 siblings, 1 reply; 5+ messages in thread
From: James Morris @ 2004-08-24 13:24 UTC (permalink / raw)
  To: Kaigai Kohei; +Cc: Stephen Smalley, SELinux-ML(Eng), Linux Kernel ML(Eng)

On Tue, 24 Aug 2004, Kaigai Kohei wrote:

> o UNIXbench
>
> * INDEX value comparison
>                                           2.6.8.1    2.6.8.1   2.6.8.1   2.6.8.1
>                                          (Disable)  (Enable)  (rwlock)   (RCU)
> Dhrystone 2 using register variables       268.9      268.8     269.2     269.0
> Double-Precision Whetstone                  94.2       94.2      94.2      94.2
> Execl Throughput                           388.3      379.0     377.8     377.9  +
> File Copy 1024 bufsize 2000 maxblocks      606.6      526.6     515.6     504.8  *
> File Copy 256 bufsize 500 maxblocks        508.9      417.0     410.4     395.2  *
> File Copy 4096 bufsize 8000 maxblocks      987.1      890.4     876.0     857.9  *
> Pipe Throughput                            525.1      406.4     404.5     408.8  +
> Process Creation                           321.2      317.8     315.9     316.3  +
> Shell Scripts (8 concurrent)              1312.8     1276.2    1278.8    1282.8  +
> System Call Overhead                       467.1      468.7     464.1     467.2  +
>                                         ========================================
> FINAL SCORE                                445.8      413.2     410.1     407.7

This benchmark somewhat characterizes 1P performance. The tests I've
marked with (*) get noticeably worse with the RCU patch compared to the
current locking scheme, and the tests marked (+) show no or insignificant
change.

It might be useful to compare with the lmbench macrobenchmark, to see if
it shows a similar pattern.

> o dbench [ 4 processes run in parallel on 4 CPUs / 10 trials ]
>                      ---- mean ----    - STD -
> 2.6.8.1(disable)     860.249 [MB/s]     44.683
> 2.6.8.1(enable)      714.254 [MB/s]     32.359
> 2.6.8.1(+rwlock)     767.904 [MB/s]     27.968
> 2.6.8.1(+RCU)        830.678 [MB/s]     16.352

Can you show the figures for 1 and 2 clients?

> On IA-32 or x86_64, can anybody implement atomic_inc_return()?
> If not, I'll try to make alternative macros or inline functions.

If you can get this done, it will be very useful, as I could also run
some benchmarks on my test systems.

- James
--
James Morris
<jmorris@redhat.com>

^ permalink raw reply	[flat|nested] 5+ messages in thread
* [PATCH]atomic_inc_return() for i386/x86_64 (Re: RCU issue with SELinux)
  2004-08-24 13:24 RCU issue with SELinux (Re: SELINUX performance issues) James Morris
@ 2004-08-25  9:52 ` Kaigai Kohei
  0 siblings, 0 replies; 5+ messages in thread
From: Kaigai Kohei @ 2004-08-25 9:52 UTC (permalink / raw)
  To: James Morris; +Cc: Stephen Smalley, SELinux-ML(Eng), Linux Kernel ML(Eng)

[-- Attachment #1: Type: text/plain, Size: 963 bytes --]

Hi James, thanks for your comment.

> > On IA-32 or x86_64, can anybody implement atomic_inc_return()?
> > If not, I'll try to make alternative macros or inline functions.
>
> If you can get this done, it will be very useful, as I could also run
> some benchmarks on my test systems.

atomic_inc_return() is not defined for the arm, arm26, i386, x86_64 and
um architectures. The attached patch adds atomic_inc_return() and
atomic_dec_return() to arm, i386 and x86_64.

They are implemented with the 'xaddl' instruction with a LOCK prefix on
i386 and x86_64. But this instruction is only available on i486 and
later processors, so another implementation may be necessary for the
i386SX/DX processors. However, the 'xaddl' instruction is already used
unconditionally in 'include/asm-i386/rwsem.h', and I think using it was
agreed upon in the past.

Is this acceptable? Any comments are welcome.

(*) The implementation for ARM is a simple wrapper around atomic_add_return().
--------
Kai Gai <kaigai@ak.jp.nec.com>

[-- Attachment #2: atomic_inc_return-2.6.8.1.patch --]
[-- Type: application/octet-stream, Size: 2739 bytes --]

--- linux-2.6.8.1/include/asm-arm/atomic.h	2004-08-14 19:54:50.000000000 +0900
+++ linux-2.6.8.1.rcu/include/asm-arm/atomic.h	2004-08-24 19:31:56.000000000 +0900
@@ -194,8 +194,10 @@
 #define atomic_dec(v)		atomic_sub(1, v)
 
 #define atomic_inc_and_test(v)	(atomic_add_return(1, v) == 0)
 #define atomic_dec_and_test(v)	(atomic_sub_return(1, v) == 0)
+#define atomic_inc_return(v)	(atomic_add_return(1, v))
+#define atomic_dec_return(v)	(atomic_sub_return(1, v))
 #define atomic_add_negative(i,v)	(atomic_add_return(i, v) < 0)
 
 /* Atomic operations are already serializing on ARM */
--- linux-2.6.8.1/include/asm-i386/atomic.h	2004-08-14 19:55:09.000000000 +0900
+++ linux-2.6.8.1.rcu/include/asm-i386/atomic.h	2004-08-25 11:57:08.000000000 +0900
@@ -175,8 +175,34 @@
 	:"ir" (i), "m" (v->counter) : "memory");
 	return c;
 }
 
+/**
+ * atomic_add_return - add and return
+ * @v: pointer of type atomic_t
+ * @i: integer value to add
+ *
+ * Atomically adds @i to @v and returns @i + @v
+ */
+static __inline__ int atomic_add_return(int i, atomic_t *v)
+{
+	/* NOTICE: This function can be used on i486+ */
+	int __i = i;
+	__asm__ __volatile__(
+		LOCK "xaddl %0, %1;"
+		:"=r"(i)
+		:"m"(v->counter), "0"(i));
+	return i + __i;
+}
+
+static __inline__ int atomic_sub_return(int i, atomic_t *v)
+{
+	return atomic_add_return(-i,v);
+}
+
+#define atomic_inc_return(v)	(atomic_add_return(1,v))
+#define atomic_dec_return(v)	(atomic_sub_return(1,v))
+
 /* These are x86-specific, used by some header files */
 #define atomic_clear_mask(mask, addr) \
 __asm__ __volatile__(LOCK "andl %0,%1" \
 : : "r" (~(mask)),"m" (*addr) : "memory")
--- linux-2.6.8.1/include/asm-x86_64/atomic.h	2004-08-14 19:56:23.000000000 +0900
+++ linux-2.6.8.1.rcu/include/asm-x86_64/atomic.h	2004-08-25 11:57:36.000000000 +0900
@@ -177,8 +177,33 @@
 	:"ir" (i), "m" (v->counter) : "memory");
 	return c;
 }
 
+/**
+ * atomic_add_return - add and return
+ * @v: pointer of type atomic_t
+ * @i: integer value to add
+ *
+ * Atomically adds @i to @v and returns @i + @v
+ */
+static __inline__ int atomic_add_return(int i, atomic_t *v)
+{
+	int __i = i;
+	__asm__ __volatile__(
+		LOCK "xaddl %0, %1;"
+		:"=r"(i)
+		:"m"(v->counter), "0"(i));
+	return i + __i;
+}
+
+static __inline__ int atomic_sub_return(int i, atomic_t *v)
+{
+	return atomic_add_return(-i,v);
+}
+
+#define atomic_inc_return(v)	(atomic_add_return(1,v))
+#define atomic_dec_return(v)	(atomic_sub_return(1,v))
+
 /* These are x86-specific, used by some header files */
 #define atomic_clear_mask(mask, addr) \
 __asm__ __volatile__(LOCK "andl %0,%1" \
 : : "r" (~(mask)),"m" (*addr) : "memory")

^ permalink raw reply	[flat|nested] 5+ messages in thread
end of thread, other threads:[~2004-09-01 6:24 UTC | newest]
Thread overview: 5+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
[not found] <2wJxj-7g2-23@gated-at.bofh.it>
[not found] ` <2x2JC-3Uu-11@gated-at.bofh.it>
2004-08-28 14:48 ` [PATCH]atomic_inc_return() for i386/x86_64 (Re: RCU issue with SELinux) Andi Kleen
2004-08-31 8:19 ` Kaigai Kohei
2004-08-31 8:49 ` Andi Kleen
2004-09-01 6:22 ` Kaigai Kohei
2004-08-24 13:24 RCU issue with SELinux (Re: SELINUX performance issues) James Morris
2004-08-25 9:52 ` [PATCH]atomic_inc_return() for i386/x86_64 (Re: RCU issue with SELinux) Kaigai Kohei