public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] add likely around access_ok for uaccess
@ 2003-09-14 11:17 Manfred Spraul
  2003-09-14 19:30 ` Andrew Morton
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Manfred Spraul @ 2003-09-14 11:17 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 384 bytes --]

Hi,

while trying to figure out why sysv msg is around 30% slower than pipes 
for data transfers I noticed that gcc's autodetection (3.2.2) guesses 
the "if(access_ok())" tests in uaccess.h wrong and puts the error memset 
into the direct path and the copy out of line.

The attached patch adds likely to the tests - any objections? What about 
the archs except i386?

--
    Manfred

[-- Attachment #2: patch-uaccess-likely --]
[-- Type: text/plain, Size: 1000 bytes --]

--- 2.6/include/asm-i386/uaccess.h	2003-09-12 21:53:58.000000000 +0200
+++ build-2.6/include/asm-i386/uaccess.h	2003-09-14 12:56:19.000000000 +0200
@@ -300,7 +300,7 @@
 	long __pu_err = -EFAULT;					\
 	__typeof__(*(ptr)) *__pu_addr = (ptr);				\
 	might_sleep();						\
-	if (access_ok(VERIFY_WRITE,__pu_addr,size))			\
+	if (likely(access_ok(VERIFY_WRITE,__pu_addr,size)))			\
 		__put_user_size((x),__pu_addr,(size),__pu_err,-EFAULT);	\
 	__pu_err;							\
 })							
@@ -510,7 +510,7 @@
 direct_copy_to_user(void __user *to, const void *from, unsigned long n)
 {
 	might_sleep();
-	if (access_ok(VERIFY_WRITE, to, n))
+	if (likely(access_ok(VERIFY_WRITE, to, n)))
 		n = __direct_copy_to_user(to, from, n);
 	return n;
 }
@@ -535,7 +535,7 @@
 direct_copy_from_user(void *to, const void __user *from, unsigned long n)
 {
 	might_sleep();
-	if (access_ok(VERIFY_READ, from, n))
+	if (likely(access_ok(VERIFY_READ, from, n)))
 		n = __direct_copy_from_user(to, from, n);
 	else
 		memset(to, 0, n);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] add likely around access_ok for uaccess
       [not found] <vD5l.6Wt.19@gated-at.bofh.it>
@ 2003-09-14 16:49 ` Andi Kleen
  0 siblings, 0 replies; 11+ messages in thread
From: Andi Kleen @ 2003-09-14 16:49 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel

Manfred Spraul <manfred@colorfullife.com> writes:

> The attached patch adds likely to the tests - any objections? What
> about the archs except i386?

I fixed x86-64 get/put_user, although it doesn't matter much there because
it compiles with -fno-reorder-blocks by default. 

(generates smaller code and makes the assembly output much more readable)

But:

>
> --
>     Manfred
> --- 2.6/include/asm-i386/uaccess.h	2003-09-12 21:53:58.000000000 +0200
> +++ build-2.6/include/asm-i386/uaccess.h	2003-09-14 12:56:19.000000000 +0200
> @@ -300,7 +300,7 @@
>  	long __pu_err = -EFAULT;					\
>  	__typeof__(*(ptr)) *__pu_addr = (ptr);				\
>  	might_sleep();						\
> -	if (access_ok(VERIFY_WRITE,__pu_addr,size))			\
> +	if (likely(access_ok(VERIFY_WRITE,__pu_addr,size)))			\
>  		__put_user_size((x),__pu_addr,(size),__pu_err,-EFAULT);	\
>  	__pu_err;							\
>  })							
> @@ -510,7 +510,7 @@
>  direct_copy_to_user(void __user *to, const void *from, unsigned long n)

My copy of 2.6.0test5 doesn't have a "direct_copy_to_user", so I'm wondering
at what tree you're looking.

And more importantly I think the i386 uaccess.h copy_{from/to}_user should 
be fixed to not do the access_ok()/memset check inline. It can be as well done
out of line. This would likely save some code size. Doing it inline
just doesn't make any sense.

[x86-64 does it this way]

-Andi

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] add likely around access_ok for uaccess
  2003-09-14 11:17 Manfred Spraul
@ 2003-09-14 19:30 ` Andrew Morton
  2003-09-14 19:45   ` Manfred Spraul
  2003-09-15 22:59 ` David S. Miller
  2003-09-16 19:49 ` Pavel Machek
  2 siblings, 1 reply; 11+ messages in thread
From: Andrew Morton @ 2003-09-14 19:30 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel

Manfred Spraul <manfred@colorfullife.com> wrote:
>
> I noticed that gcc's autodetection (3.2.2) guesses 
>  the "if(access_ok())" tests in uaccess.h wrong and puts the error memset 
>  into the direct path and the copy out of line.
> 
>  The attached patch adds likely to the tests

Fair enough.  There are quite a few other access_ok() callers.   How
about just:

--- 25/include/asm-i386/uaccess.h~access_ok-is-likely	2003-09-14 12:29:10.000000000 -0700
+++ 25-akpm/include/asm-i386/uaccess.h	2003-09-14 12:29:24.000000000 -0700
@@ -80,7 +80,7 @@ extern struct movsl_mask {
  * checks that the pointer is in the user space range - after calling
  * this function, memory access functions may still return -EFAULT.
  */
-#define access_ok(type,addr,size) (__range_ok(addr,size) == 0)
+#define access_ok(type,addr,size) (likely(__range_ok(addr,size) == 0))
 
 /**
  * verify_area: - Obsolete, use access_ok()

_


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] add likely around access_ok for uaccess
  2003-09-14 19:30 ` Andrew Morton
@ 2003-09-14 19:45   ` Manfred Spraul
  2003-09-14 19:48     ` Manfred Spraul
  2003-09-14 20:00     ` Andrew Morton
  0 siblings, 2 replies; 11+ messages in thread
From: Manfred Spraul @ 2003-09-14 19:45 UTC (permalink / raw)
  To: Andrew Morton; +Cc: linux-kernel

Andrew Morton wrote:

>+#define access_ok(type,addr,size) (likely(__range_ok(addr,size) == 0))
>  
>
I don't know. What happens if access_ok is used outside of an if statement?
One example is

int get_compat_timespec(struct timespec *ts, struct compat_timespec *cts)
{
        return (verify_area(VERIFY_READ, cts, sizeof(*cts)) ||
                        __get_user(ts->tv_sec, &cts->tv_sec) ||
                        __get_user(ts->tv_nsec, &cts->tv_nsec)) ? -EFAULT : 0;
}

And fs/reiserfs/file.c already contains an unlikely marker around access_ok.

Have you tried if the kernel still compiles with your patch?

--
	Manfred 




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] add likely around access_ok for uaccess
  2003-09-14 19:45   ` Manfred Spraul
@ 2003-09-14 19:48     ` Manfred Spraul
  2003-09-14 20:00     ` Andrew Morton
  1 sibling, 0 replies; 11+ messages in thread
From: Manfred Spraul @ 2003-09-14 19:48 UTC (permalink / raw)
  To: linux-kernel; +Cc: Andrew Morton

Manfred Spraul wrote:

> Andrew Morton wrote:
>
>> +#define access_ok(type,addr,size) (likely(__range_ok(addr,size) == 0))
>>  
>>
> I don't know. What happens if access_ok is used outside of an if 
> statement?
> One example is
>
> int get_compat_timespec(struct timespec *ts, struct compat_timespec *cts)
> {
>        return (verify_area(VERIFY_READ, cts, sizeof(*cts)) ||
>                        __get_user(ts->tv_sec, &cts->tv_sec) ||
>                        __get_user(ts->tv_nsec, &cts->tv_nsec)) ? 
> -EFAULT : 0;
> }

Duh. Reading is difficult.
The functions with access_ok are get_compat_itimerval and 
put_compat_itimerval.

--
    Manfred


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] add likely around access_ok for uaccess
  2003-09-14 19:45   ` Manfred Spraul
  2003-09-14 19:48     ` Manfred Spraul
@ 2003-09-14 20:00     ` Andrew Morton
  1 sibling, 0 replies; 11+ messages in thread
From: Andrew Morton @ 2003-09-14 20:00 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel

Manfred Spraul <manfred@colorfullife.com> wrote:
>
> What happens if access_ok is used outside of an if statement?

Seems OK.   Even this compiles:

        int x;

        x = likely(y == 0);

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] add likely around access_ok for uaccess
  2003-09-14 11:17 Manfred Spraul
  2003-09-14 19:30 ` Andrew Morton
@ 2003-09-15 22:59 ` David S. Miller
  2003-09-16 19:49 ` Pavel Machek
  2 siblings, 0 replies; 11+ messages in thread
From: David S. Miller @ 2003-09-15 22:59 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel

On Sun, 14 Sep 2003 13:17:10 +0200
Manfred Spraul <manfred@colorfullife.com> wrote:

> The attached patch adds likely to the tests - any objections? What about 
> the archs except i386?

On sparc64, access_ok() is a constant, so it doesn't matter
there.

Sparc32 probably should have it.

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] add likely around access_ok for uaccess
  2003-09-14 11:17 Manfred Spraul
  2003-09-14 19:30 ` Andrew Morton
  2003-09-15 22:59 ` David S. Miller
@ 2003-09-16 19:49 ` Pavel Machek
  2003-09-16 20:55   ` Jeff Garzik
  2 siblings, 1 reply; 11+ messages in thread
From: Pavel Machek @ 2003-09-16 19:49 UTC (permalink / raw)
  To: Manfred Spraul; +Cc: linux-kernel

Hi!

> while trying to figure out why sysv msg is around 30% slower than pipes 
> for data transfers I noticed that gcc's autodetection (3.2.2) guesses 
> the "if(access_ok())" tests in uaccess.h wrong and puts the error memset 
> into the direct path and the copy out of line.
> 
> The attached patch adds likely to the tests - any objections? What about 
> the archs except i386?

How much speedup did you gain?
								Pavel

-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] add likely around access_ok for uaccess
  2003-09-16 19:49 ` Pavel Machek
@ 2003-09-16 20:55   ` Jeff Garzik
  2003-09-16 20:59     ` Pavel Machek
  0 siblings, 1 reply; 11+ messages in thread
From: Jeff Garzik @ 2003-09-16 20:55 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Manfred Spraul, linux-kernel

On Tue, Sep 16, 2003 at 09:49:29PM +0200, Pavel Machek wrote:
> Hi!
> 
> > while trying to figure out why sysv msg is around 30% slower than pipes 
> > for data transfers I noticed that gcc's autodetection (3.2.2) guesses 
> > the "if(access_ok())" tests in uaccess.h wrong and puts the error memset 
> > into the direct path and the copy out of line.
> > 
> > The attached patch adds likely to the tests - any objections? What about 
> > the archs except i386?
> 
> How much speedup did you gain?

How much can it hurt?

	Jeff




^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] add likely around access_ok for uaccess
  2003-09-16 20:55   ` Jeff Garzik
@ 2003-09-16 20:59     ` Pavel Machek
  2003-09-17 16:57       ` Manfred Spraul
  0 siblings, 1 reply; 11+ messages in thread
From: Pavel Machek @ 2003-09-16 20:59 UTC (permalink / raw)
  To: Jeff Garzik; +Cc: Manfred Spraul, linux-kernel

Hi!

> > > while trying to figure out why sysv msg is around 30% slower than pipes 
> > > for data transfers I noticed that gcc's autodetection (3.2.2) guesses 
> > > the "if(access_ok())" tests in uaccess.h wrong and puts the error memset 
> > > into the direct path and the copy out of line.
> > > 
> > > The attached patch adds likely to the tests - any objections? What about 
> > > the archs except i386?
> > 
> > How much speedup did you gain?
> 
> How much can it hurt?

The change is obviously okay, I just wanted to know... If it gains 30%
on sysv messages.. that would be pretty big surprise.
								Pavel
-- 
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: [PATCH] add likely around access_ok for uaccess
  2003-09-16 20:59     ` Pavel Machek
@ 2003-09-17 16:57       ` Manfred Spraul
  0 siblings, 0 replies; 11+ messages in thread
From: Manfred Spraul @ 2003-09-17 16:57 UTC (permalink / raw)
  To: Pavel Machek; +Cc: Jeff Garzik, linux-kernel

Pavel Machek wrote:

>Hi!
>
>  
>
>>>>while trying to figure out why sysv msg is around 30% slower than pipes 
>>>>for data transfers I noticed that gcc's autodetection (3.2.2) guesses 
>>>>the "if(access_ok())" tests in uaccess.h wrong and puts the error memset 
>>>>into the direct path and the copy out of line.
>>>>
>>>>The attached patch adds likely to the tests - any objections? What about 
>>>>the archs except i386?
>>>>        
>>>>
>>>How much speedup did you gain?
>>>      
>>>
>>How much can it hurt?
>>    
>>
>
>The change is obviously okay, I just wanted to know... If it gains 30%
>on sysv messages.. that would be pretty big surprise.
>
No.
I didn't benchmark it at all. I'd expect one or two cycles.
The 30% difference are due to misaligned buffers: The sysvmsg ABI uses
struct msgbuf {
    unsigned long type;
    char data[];
};
msgbuf was aligned, thus data on an not 8-byte aligned address. Thus the 
"rep;movsl" microcode fastpath didn't kick in, and that caused the 
30-50% slowdown.
After manually misaligning the msgbuf structure, and properly aligning 
the kernel buffers, the performance of sysvmsg is now identical to 
pipes: Around 4 k cycles for a one-byte ping-pong, and around 10k cycles 
for 4 kB ping-pong, with a Celeron mobile 1.13 GHz.

I haven't decided yet if a patch to align the kernel buffers is a good 
thing or not - it only helps my benchmark, I'm not aware of a real-world 
app that uses sysvmsg for bulk data transfers.

--
    Manfred


^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2003-09-17 16:57 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <vD5l.6Wt.19@gated-at.bofh.it>
2003-09-14 16:49 ` [PATCH] add likely around access_ok for uaccess Andi Kleen
2003-09-14 11:17 Manfred Spraul
2003-09-14 19:30 ` Andrew Morton
2003-09-14 19:45   ` Manfred Spraul
2003-09-14 19:48     ` Manfred Spraul
2003-09-14 20:00     ` Andrew Morton
2003-09-15 22:59 ` David S. Miller
2003-09-16 19:49 ` Pavel Machek
2003-09-16 20:55   ` Jeff Garzik
2003-09-16 20:59     ` Pavel Machek
2003-09-17 16:57       ` Manfred Spraul

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox