public inbox for linux-ia64@vger.kernel.org
 help / color / mirror / Atom feed
* getcpu() returns EFAULT when called via the vdso
@ 2012-05-06 20:45 Mike Frysinger
  2012-05-07 17:58 ` Luck, Tony
                   ` (6 more replies)
  0 siblings, 7 replies; 8+ messages in thread
From: Mike Frysinger @ 2012-05-06 20:45 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: Text/Plain, Size: 3324 bytes --]

sched_getcpu() on my ia64 systems is failing with EFAULT.  simple test code:

$ cat test.c
#define _GNU_SOURCE
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/syscall.h>
main() {
	int i, e;
	puts("");
close(444);
	kill(0, 0);
close(333);
	syscall(__NR_getcpu, &i, 0, 0);
close(123);
	errno = 0;
	i = sched_getcpu();
	e = errno;
close(321);
	printf("getcpu() = %i: %s\n", i, strerror(e));
}
the puts() is to force initialization of some internal libc stuff so the strace 
output is easier to read later on.  the close() calls make it easy to pick out 
the different steps.

when i run this with newer glibc versions (like 2.13+), i get:
	getcpu() = -1: Bad address

running it through strace, we see:
close(444)                              = -1 EBADF (Bad file descriptor)
kill(0, SIG_0)                          = 0
close(333)                              = -1 EBADF (Bad file descriptor)
getcpu([1], NULL, 0)                    = 0
close(123)                              = -1 EBADF (Bad file descriptor)
close(321)                              = -1 EBADF (Bad file descriptor)
write(1, "\ngetcpu() = -1: Bad address\n", 28) = 28

you can see the syscall() working, but the sched_getcpu() doesn't seem to make 
it into supervisor mode.  glibc internally uses the vdso for doing most 
syscalls (while the syscall() func sticks to the old break method).  since we 
know kill() and sched_getcpu() use the vdso, we know strace can handle both 
styles fine since kill() gets decoded.  so that leaves something funky.

the sched_getcpu() code is somewhat simple:
int sched_getcpu (void) {
	unsigned int cpu;
	int r = INLINE_SYSCALL (getcpu, 3, &cpu, NULL, NULL);
	return r == -1 ? r : cpu;
}
so it passes in a pointer to an int on the stack, and 2 null pointers ...

the disassembly of the sched_getcpu() code at runtime looks like:
   0x2000000000203940 <+0>:     [MMI]       alloc r32=ar.pfs,9,1,0
   0x2000000000203941 <+1>:                 adds r14=8,r13
   0x2000000000203942 <+2>:                 mov r33=r12;;
   0x2000000000203950 <+16>:    [MMI]       ld8 r14=[r14]
   0x2000000000203951 <+17>:                nop.m 0x0
   0x2000000000203952 <+18>:                mov r15=1304
   0x2000000000203960 <+32>:    [MII]       mov r35=r0
   0x2000000000203961 <+33>:                mov r34=r0;;
   0x2000000000203962 <+34>:                mov b7=r14;;
   0x2000000000203970 <+48>:    [MIB]       nop.m 0x0
   0x2000000000203971 <+49>:                nop.i 0x0
   0x2000000000203972 <+50>:                br.call.sptk.many b6=b7;;

so we see:
 - b7 gets loaded with a pointer to the __kernel_syscall_via_epc entry point
 - r15 gets the right syscall number (1304 for __NR_getcpu)
 - r33 is a pointer to the stack (gdb shows it is in the $sp region)
 - r34 and r35 get zeroed out

yet once i step over the call to __kernel_syscall_via_epc, i see r8 is set to 
14 (EFAULT).  i can't see that value being setup in kernel/gate.S, but my 
knowledge of ia64 assembly isn't that great, nor the kernel paths, so i'm 
hoping someone can point out the obvious to me here.

i've tested linux 3.0.6 and 3.1.6, glibc 2.13 and 2.15/2.16, and gcc 4.5.3 
(just what i have access to).  they all behave the same.
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* RE: getcpu() returns EFAULT when called via the vdso
  2012-05-06 20:45 getcpu() returns EFAULT when called via the vdso Mike Frysinger
@ 2012-05-07 17:58 ` Luck, Tony
  2012-05-07 23:12 ` Mike Frysinger
                   ` (5 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Luck, Tony @ 2012-05-07 17:58 UTC (permalink / raw)
  To: linux-ia64


the disassembly of the sched_getcpu() code at runtime looks like:
   0x2000000000203940 <+0>:     [MMI]       alloc r32=ar.pfs,9,1,0
   0x2000000000203941 <+1>:                 adds r14=8,r13
   0x2000000000203942 <+2>:                 mov r33=r12;;
   0x2000000000203950 <+16>:    [MMI]       ld8 r14=[r14]
   0x2000000000203951 <+17>:                nop.m 0x0
   0x2000000000203952 <+18>:                mov r15\x1304
   0x2000000000203960 <+32>:    [MII]       mov r35=r0  <<<<<<<<<<<<<<<<<<<
   0x2000000000203961 <+33>:                mov r34=r0;;<<<<<<<<<<<<<<<<<<<
   0x2000000000203962 <+34>:                mov b7=r14;;
   0x2000000000203970 <+48>:    [MIB]       nop.m 0x0
   0x2000000000203971 <+49>:                nop.i 0x0
   0x2000000000203972 <+50>:                br.call.sptk.many b6·;;

When the "br.call" is executed, we flip to the new register
frame and r34/r35 in the sched_getcpu() frame become r32/r33
in the new frame.

So you get -EFAULT because the VDSO tries to dereference a NULL
pointer for each of the *cpu and *node arguments.

-Tony

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getcpu() returns EFAULT when called via the vdso
  2012-05-06 20:45 getcpu() returns EFAULT when called via the vdso Mike Frysinger
  2012-05-07 17:58 ` Luck, Tony
@ 2012-05-07 23:12 ` Mike Frysinger
  2012-05-07 23:53 ` Mike Frysinger
                   ` (4 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Mike Frysinger @ 2012-05-07 23:12 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: Text/Plain, Size: 1797 bytes --]

On Monday 07 May 2012 13:58:53 Luck, Tony wrote:
> the disassembly of the sched_getcpu() code at runtime looks like:
>    0x2000000000203940 <+0>:     [MMI]       alloc r32=ar.pfs,9,1,0
>    0x2000000000203941 <+1>:                 adds r14=8,r13
>    0x2000000000203942 <+2>:                 mov r33=r12;;
>    0x2000000000203950 <+16>:    [MMI]       ld8 r14=[r14]
>    0x2000000000203951 <+17>:                nop.m 0x0
>    0x2000000000203952 <+18>:                mov r15=1304
>    0x2000000000203960 <+32>:    [MII]       mov r35=r0  <<<<<<<<<<<<<<<<<<<
>    0x2000000000203961 <+33>:                mov r34=r0;;<<<<<<<<<<<<<<<<<<<
>    0x2000000000203962 <+34>:                mov b7=r14;;
>    0x2000000000203970 <+48>:    [MIB]       nop.m 0x0
>    0x2000000000203971 <+49>:                nop.i 0x0
>    0x2000000000203972 <+50>:                br.call.sptk.many b6=b7;;
> 
> When the "br.call" is executed, we flip to the new register
> frame and r34/r35 in the sched_getcpu() frame become r32/r33
> in the new frame.

i think only one register is rotated because we called alloc with locals=1.  
stepping through with gdb shows that -- on the first insn in 
__kernel_syscall_via_epc, r32 is now the local variable on the stack, r33 and 
r34 are 0.

> So you get -EFAULT because the VDSO tries to dereference a NULL
> pointer for each of the *cpu and *node arguments.

the kernel doesn't care if cpu/node are NULL:
SYSCALL_DEFINE3(getcpu, unsigned __user *, cpup, unsigned __user *, nodep,
        struct getcpu_cache __user *, unused)
{
    int err = 0;
    int cpu = raw_smp_processor_id();
    if (cpup)
        err |= put_user(cpu, cpup);
    if (nodep)
        err |= put_user(cpu_to_node(cpu), nodep);
    return err ? -EFAULT : 0;
}
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getcpu() returns EFAULT when called via the vdso
  2012-05-06 20:45 getcpu() returns EFAULT when called via the vdso Mike Frysinger
  2012-05-07 17:58 ` Luck, Tony
  2012-05-07 23:12 ` Mike Frysinger
@ 2012-05-07 23:53 ` Mike Frysinger
  2012-05-08 21:13 ` Tony Luck
                   ` (3 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Mike Frysinger @ 2012-05-07 23:53 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: Text/Plain, Size: 1794 bytes --]

On Monday 07 May 2012 13:58:53 Luck, Tony wrote:
> the disassembly of the sched_getcpu() code at runtime looks like:
>    0x2000000000203940 <+0>:     [MMI]       alloc r32=ar.pfs,9,1,0
>    0x2000000000203941 <+1>:                 adds r14=8,r13
>    0x2000000000203942 <+2>:                 mov r33=r12;;
>    0x2000000000203950 <+16>:    [MMI]       ld8 r14=[r14]
>    0x2000000000203951 <+17>:                nop.m 0x0
>    0x2000000000203952 <+18>:                mov r15=1304
>    0x2000000000203960 <+32>:    [MII]       mov r35=r0  <<<<<<<<<<<<<<<<<<<
>    0x2000000000203961 <+33>:                mov r34=r0;;<<<<<<<<<<<<<<<<<<<
>    0x2000000000203962 <+34>:                mov b7=r14;;
>    0x2000000000203970 <+48>:    [MIB]       nop.m 0x0
>    0x2000000000203971 <+49>:                nop.i 0x0
>    0x2000000000203972 <+50>:                br.call.sptk.many b6=b7;;
> 
> When the "br.call" is executed, we flip to the new register
> frame and r34/r35 in the sched_getcpu() frame become r32/r33
> in the new frame.
> 
> So you get -EFAULT because the VDSO tries to dereference a NULL
> pointer for each of the *cpu and *node arguments.

oh, i think i see.  the funcs implemented via the ia64 vdso are not the normal 
kernel funcs.  instead they're hand coded assembly.  in this case, fsys_getcpu 
in arch/ia64/kernel/fsys.S is lacking handling for NULL cpu/node arguments and 
if either is NULL, it incorrectly fails.

EX(.fail_efault, probe.w.fault r32, 3)     // M This takes 5 cycles
EX(.fail_efault, probe.w.fault r33, 3)     // M This takes 5 cycles

i guess that needs to load some p reg with a NULL pointer test and then do the 
loads/stores based on that.  calling getcpu(NULL, NULL, NULL) from userspace 
shouldn't trigger EFAULT.
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getcpu() returns EFAULT when called via the vdso
  2012-05-06 20:45 getcpu() returns EFAULT when called via the vdso Mike Frysinger
                   ` (2 preceding siblings ...)
  2012-05-07 23:53 ` Mike Frysinger
@ 2012-05-08 21:13 ` Tony Luck
  2012-05-08 22:31 ` Mike Frysinger
                   ` (2 subsequent siblings)
  6 siblings, 0 replies; 8+ messages in thread
From: Tony Luck @ 2012-05-08 21:13 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: text/plain, Size: 466 bytes --]

On Mon, May 7, 2012 at 4:53 PM, Mike Frysinger <vapier@gentoo.org> wrote:
> i guess that needs to load some p reg with a NULL pointer test and then do the
> loads/stores based on that.  calling getcpu(NULL, NULL, NULL) from userspace
> shouldn't trigger EFAULT.

We could do that (see attached, untested, patch). But it wouldn't help
the sched_getcpu() code much ... it would stop getting -EFAULT, but
it still wouldn't have the right return value.

-Tony

[-- Attachment #2: getcpu.patch --]
[-- Type: application/octet-stream, Size: 1611 bytes --]

diff --git a/arch/ia64/kernel/fsys.S b/arch/ia64/kernel/fsys.S
index cc26eda..e2dfae2 100644
--- a/arch/ia64/kernel/fsys.S
+++ b/arch/ia64/kernel/fsys.S
@@ -559,11 +559,15 @@ ENTRY(fsys_getcpu)
 	;;
 	tnat.nz p7,p0 = r33			// I guard against NaT argument
 (p7)    br.cond.spnt.few .fail_einval		// B
+	;;
+	cmp.ne p6,p0=r32,r0
+	cmp.ne p7,p0=r33,r0
+	;;
 #ifdef CONFIG_NUMA
 	movl r17=cpu_to_node_map
 	;;
-EX(.fail_efault, probe.w.fault r32, 3)		// M This takes 5 cycles
-EX(.fail_efault, probe.w.fault r33, 3)		// M This takes 5 cycles
+EX(.fail_efault, (p6) probe.w.fault r32, 3)		// M This takes 5 cycles
+EX(.fail_efault, (p7) probe.w.fault r33, 3)		// M This takes 5 cycles
 	shladd r18=r3,1,r17
 	;;
 	ld2 r20=[r18]				// r20 = cpu_to_node_map[cpu]
@@ -573,20 +577,20 @@ EX(.fail_efault, probe.w.fault r33, 3)		// M This takes 5 cycles
 (p8)	br.spnt.many fsys_fallback_syscall
 	;;
 	;;
-EX(.fail_efault, st4 [r32] = r3)
-EX(.fail_efault, st2 [r33] = r20)
+EX(.fail_efault, (p6) st4 [r32] = r3)
+EX(.fail_efault, (p7) st2 [r33] = r20)
 	mov r8=0
 	;;
 #else
-EX(.fail_efault, probe.w.fault r32, 3)		// M This takes 5 cycles
-EX(.fail_efault, probe.w.fault r33, 3)		// M This takes 5 cycles
+EX(.fail_efault, (p6) probe.w.fault r32, 3)		// M This takes 5 cycles
+EX(.fail_efault, (p7) probe.w.fault r33, 3)		// M This takes 5 cycles
 	and r2 = TIF_ALLWORK_MASK,r2
 	;;
 	cmp.ne p8,p0=0,r2
 (p8)	br.spnt.many fsys_fallback_syscall
 	;;
-EX(.fail_efault, st4 [r32] = r3)
-EX(.fail_efault, st2 [r33] = r0)
+EX(.fail_efault, (p6) st4 [r32] = r3)
+EX(.fail_efault, (p7) st2 [r33] = r0)
 	mov r8=0
 	;;
 #endif

^ permalink raw reply related	[flat|nested] 8+ messages in thread

* Re: getcpu() returns EFAULT when called via the vdso
  2012-05-06 20:45 getcpu() returns EFAULT when called via the vdso Mike Frysinger
                   ` (3 preceding siblings ...)
  2012-05-08 21:13 ` Tony Luck
@ 2012-05-08 22:31 ` Mike Frysinger
  2012-05-15  5:41 ` Mike Frysinger
  2012-11-29 23:34 ` Mike Frysinger
  6 siblings, 0 replies; 8+ messages in thread
From: Mike Frysinger @ 2012-05-08 22:31 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: Text/Plain, Size: 907 bytes --]

On Tuesday 08 May 2012 17:13:28 Tony Luck wrote:
> On Mon, May 7, 2012 at 4:53 PM, Mike Frysinger wrote:
> > i guess that needs to load some p reg with a NULL pointer test and then
> > do the loads/stores based on that.  calling getcpu(NULL, NULL, NULL)
> > from userspace shouldn't trigger EFAULT.
> 
> We could do that (see attached, untested, patch). But it wouldn't help
> the sched_getcpu() code much ... it would stop getting -EFAULT, but
> it still wouldn't have the right return value.

wouldn't it though ?  the return value with getcpu is just "0" when things 
worked.  if i do:
	int i;
	syscall(__NR_getcpu, &i, NULL, NULL);
this should return 0 and fill in "i" with the cpu value.  i think your proposed 
patch would make things work as no fault would be triggered, and the return 
value (r8) would be set to 0.

i'll have to rebuild the kernel to double check though.
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getcpu() returns EFAULT when called via the vdso
  2012-05-06 20:45 getcpu() returns EFAULT when called via the vdso Mike Frysinger
                   ` (4 preceding siblings ...)
  2012-05-08 22:31 ` Mike Frysinger
@ 2012-05-15  5:41 ` Mike Frysinger
  2012-11-29 23:34 ` Mike Frysinger
  6 siblings, 0 replies; 8+ messages in thread
From: Mike Frysinger @ 2012-05-15  5:41 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: Text/Plain, Size: 745 bytes --]

On Tuesday 08 May 2012 17:13:28 Tony Luck wrote:
> On Mon, May 7, 2012 at 4:53 PM, Mike Frysinger <vapier@gentoo.org> wrote:
> > i guess that needs to load some p reg with a NULL pointer test and then
> > do the loads/stores based on that.  calling getcpu(NULL, NULL, NULL)
> > from userspace shouldn't trigger EFAULT.
> 
> We could do that (see attached, untested, patch). But it wouldn't help
> the sched_getcpu() code much ... it would stop getting -EFAULT, but
> it still wouldn't have the right return value.

seems to work for me
Tested-by: Mike Frysinger <vapier@gentoo.org>

also, strace shows the successful syscall via the vdso, but not the failing 
ones.  not sure where things are going wrong with that though.
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

* Re: getcpu() returns EFAULT when called via the vdso
  2012-05-06 20:45 getcpu() returns EFAULT when called via the vdso Mike Frysinger
                   ` (5 preceding siblings ...)
  2012-05-15  5:41 ` Mike Frysinger
@ 2012-11-29 23:34 ` Mike Frysinger
  6 siblings, 0 replies; 8+ messages in thread
From: Mike Frysinger @ 2012-11-29 23:34 UTC (permalink / raw)
  To: linux-ia64

[-- Attachment #1: Type: Text/Plain, Size: 734 bytes --]

On Tuesday 15 May 2012 01:41:11 Mike Frysinger wrote:
> On Tuesday 08 May 2012 17:13:28 Tony Luck wrote:
> > On Mon, May 7, 2012 at 4:53 PM, Mike Frysinger <vapier@gentoo.org> wrote:
> > > i guess that needs to load some p reg with a NULL pointer test and then
> > > do the loads/stores based on that.  calling getcpu(NULL, NULL, NULL)
> > > from userspace shouldn't trigger EFAULT.
> > 
> > We could do that (see attached, untested, patch). But it wouldn't help
> > the sched_getcpu() code much ... it would stop getting -EFAULT, but
> > it still wouldn't have the right return value.
> 
> seems to work for me
> Tested-by: Mike Frysinger <vapier@gentoo.org>

was this patch going to get merged at some point ?
-mike

[-- Attachment #2: This is a digitally signed message part. --]
[-- Type: application/pgp-signature, Size: 836 bytes --]

^ permalink raw reply	[flat|nested] 8+ messages in thread

end of thread, other threads:[~2012-11-29 23:34 UTC | newest]

Thread overview: 8+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-05-06 20:45 getcpu() returns EFAULT when called via the vdso Mike Frysinger
2012-05-07 17:58 ` Luck, Tony
2012-05-07 23:12 ` Mike Frysinger
2012-05-07 23:53 ` Mike Frysinger
2012-05-08 21:13 ` Tony Luck
2012-05-08 22:31 ` Mike Frysinger
2012-05-15  5:41 ` Mike Frysinger
2012-11-29 23:34 ` Mike Frysinger

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox