* SMP Sparc64 : bug in clone?
@ 2002-03-23  2:15 Erik de Castro Lopo
  2002-03-23  3:33 ` Erik de Castro Lopo
                   ` (9 more replies)
  0 siblings, 10 replies; 11+ messages in thread
From: Erik de Castro Lopo @ 2002-03-23  2:15 UTC (permalink / raw)
  To: ultralinux

Hi all,

The clone system call on Sparc64 Linux seems to be producing drastically
different results from those obtained on x86 Linux. On Sparc64 it seems as if
the child process never runs (tried on uni-processor 2.4.18 and SMP
2.4.19-pre4 with the same results).

My test program can be found here:

	https://mega-nerd.net/clone_test.c

On Sparc64 I get:

    root@razor > gcc -Wall clone_test.c -o clone_test
    root@razor > ./clone_test 
    parent running
    parent about to exit
    root@razor > sparc64-linux-gcc -Wall clone_test.c -o clone_test
    root@razor > ./clone_test 
    parent running
    parent about to exit
    root@razor >

On x86 linux I get this:

    erikd@coltrane > ./clone_test 
    parent running
    child running
    child about to exit
    parent about to exit
    erikd@coltrane > 

The assembler output of sparc64-linux-gcc seems to be OK:

    main:
            !#PROLOGUE# 0
            save    %sp, -112, %sp
            !#PROLOGUE# 1
            st      %i0, [%fp+68]
            st      %i1, [%fp+72]
            sethi   %hi(.LLC2), %o0
            or      %o0, %lo(.LLC2), %o0
            call    printf, 0
             nop
            sethi   %hi(child), %o0
            or      %o0, %lo(child), %o0
            sethi   %hi(stack+32768), %o1
            or      %o1, %lo(stack+32768), %o1
            mov     20, %o2
            mov     0, %o3
            call    clone, 0
             nop

Strace output looks like this:

    munmap(0x7001c000, 9341)                = 0
    fstat64(1, {st_mode=S_IFCHR|0600, st_rdev=makedev(4, 64), ...}) = 0
    ioctl(1, 0x40245408, {B9600 opost isig icanon echo ...}) = 0
    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7001a000
    write(1, "parent running\n", 15parent running
    )        = 15
    clone(child_stack=0x29cc8, flags=0x14)  = 166
    --- SIGCHLD (Child exited) ---
    wait4(166, [WIFSIGNALED(s) && WTERMSIG(s) = SIGSEGV], WUNTRACED, NULL) = 166
    write(1, "parent about to exit\n", 21parent about to exit
    )  = 21
    munmap(0x7001a000, 8192)                = 0
    exit(0)                                 = ?

which seems to imply that the child exits immediately. If I do strace -f I get
this:

    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7001a000
    write(1, "parent running\n", 15parent running
    )        = 15
    clone(child_stack=0x61cc8, flags=0x14)  = 179
    [pid   179] --- SIGSEGV (Segmentation fault) ---
    --- SIGCHLD (Child exited) ---
    wait4(179,  <unfinished ...>

and then strace hangs. I think this is actually a problem with strace rather than 
a result of what my program is trying to do.

For the moment I will believe the first strace output, which states that the child
process segfaults immediately after clone. Now I gotta figure out why.
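
The status that wait4() reports in the first trace can also be decoded inside
the test program itself, which avoids depending on strace. A minimal sketch,
assuming a hypothetical helper named describe_status() that is not part of
clone_test.c:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical helper: write a readable form of a waitpid() status
 * into buf and return buf. */
static const char *describe_status(int status, char *buf, size_t len)
{
	assert(buf != NULL && len > 0);

	if (WIFEXITED(status))
		snprintf(buf, len, "exited with code %d", WEXITSTATUS(status));
	else if (WIFSIGNALED(status))
		snprintf(buf, len, "killed by signal %d", WTERMSIG(status));
	else if (WIFSTOPPED(status))
		snprintf(buf, len, "stopped by signal %d", WSTOPSIG(status));
	else
		snprintf(buf, len, "unrecognised status 0x%x", (unsigned) status);

	return buf;
}
```

Passing the status filled in by waitpid() to this helper and printing the
result should show a signal-terminated child on Sparc64 and a normal exit on
x86, if the traces above are accurate.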

Anybody got any clues on how to progress this?

Cheers,
Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
When aiming for the common denominator, be prepared for the 
occasional division by zero.


* Re: SMP Sparc64 : bug in clone?
  2002-03-23  2:15 SMP Sparc64 : bug in clone? Erik de Castro Lopo
@ 2002-03-23  3:33 ` Erik de Castro Lopo
  2002-03-25  7:53 ` Erik de Castro Lopo
                   ` (8 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Erik de Castro Lopo @ 2002-03-23  3:33 UTC (permalink / raw)
  To: ultralinux

On Sat, 23 Mar 2002 13:15:45 +1100
Erik de Castro Lopo <nospam@mega-nerd.com> wrote:

> and then strace hangs. I think this is actually a problem with strace rather than 
> a result of what my program is trying to do.

I have compiled a new sparc64 version of strace (with the right kernel headers)
and strace no longer bombs. New output from "strace -f ./clone_test":

    mmap(NULL, 8192, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x7001a000
    write(1, "parent running\n", 15parent running
    )        = 15
    clone(PTRACE_ATTACH: Operation not permitted
    Too late?
    child_stack=0x61cc8, flags=0x14)  = 5659
    --- SIGCHLD (Child exited) ---
    wait4(5659, [WIFSIGNALED(s) && WTERMSIG(s) = SIGSEGV], WUNTRACED, NULL) = 5659
    write(1, "parent about to exit\n", 21parent about to exit
    )  = 21
    munmap(0x7001a000, 8192)                = 0
    exit(0)                                 = ?

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
"One of the great things about books is sometimes there are 
some fantastic pictures" - George W. Bush


* Re: SMP Sparc64 : bug in clone?
  2002-03-23  2:15 SMP Sparc64 : bug in clone? Erik de Castro Lopo
  2002-03-23  3:33 ` Erik de Castro Lopo
@ 2002-03-25  7:53 ` Erik de Castro Lopo
  2002-03-25  8:12 ` David S. Miller
                   ` (7 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Erik de Castro Lopo @ 2002-03-25  7:53 UTC (permalink / raw)
  To: ultralinux

On Sat, 23 Mar 2002 13:15:45 +1100
Erik de Castro Lopo <nospam@mega-nerd.com> wrote:

> Hi all,
> 
> The clone system call on Sparc64 Linux seems to be producing drastically 
> different results from those obtained on x86 Linux. On Sparc64 it seems as if
> the child process never runs (tried on uni-processor 2.4.18 and SMP
> 2.4.19-pre4 with the same results).

A colleague of mine suggested that it may be failing because the stack was
not executable. I therefore wrote a new test where memory for the stack
was allocated using an anonymous mmap().

Like the first test, this program runs on x86 but not on Sparc64.

New test program here:

 	https://mega-nerd.net/clone_mmap_test.c

A similar program using fork() works as expected.

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
"The reasonable man adapts himself to the world; the unreasonable one 
persists to adapt the world to himself. Therefore all progress depends
on the unreasonable man." -- George Bernard Shaw (1856-1950)


* Re: SMP Sparc64 : bug in clone?
  2002-03-23  2:15 SMP Sparc64 : bug in clone? Erik de Castro Lopo
  2002-03-23  3:33 ` Erik de Castro Lopo
  2002-03-25  7:53 ` Erik de Castro Lopo
@ 2002-03-25  8:12 ` David S. Miller
  2002-03-25  8:35 ` Erik de Castro Lopo
                   ` (6 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: David S. Miller @ 2002-03-25  8:12 UTC (permalink / raw)
  To: ultralinux


sparc64-linux-gcc is only for kernel building, it is not
going to work for userland applications.


* Re: SMP Sparc64 : bug in clone?
  2002-03-23  2:15 SMP Sparc64 : bug in clone? Erik de Castro Lopo
                   ` (2 preceding siblings ...)
  2002-03-25  8:12 ` David S. Miller
@ 2002-03-25  8:35 ` Erik de Castro Lopo
  2002-03-25  8:36 ` David S. Miller
                   ` (5 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Erik de Castro Lopo @ 2002-03-25  8:35 UTC (permalink / raw)
  To: ultralinux

On Mon, 25 Mar 2002 00:12:25 -0800 (PST)
"David S. Miller" <davem@redhat.com> wrote:

> 
> sparc64-linux-gcc is only for kernel building, it is not
> going to work for userland applications.

I have actually tried both the standard gcc and sparc64-linux-gcc.
The clone() system call does not work correctly in either case.

I have, however, compiled other small test applications with
sparc64-linux-gcc and they seem to work: sizeof (void*) is 8 and they
behave perfectly.

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
Failure is not an option. It comes bundled with your Microsoft product.



* Re: SMP Sparc64 : bug in clone?
  2002-03-23  2:15 SMP Sparc64 : bug in clone? Erik de Castro Lopo
                   ` (3 preceding siblings ...)
  2002-03-25  8:35 ` Erik de Castro Lopo
@ 2002-03-25  8:36 ` David S. Miller
  2002-04-29 20:14 ` David S. Miller
                   ` (4 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: David S. Miller @ 2002-03-25  8:36 UTC (permalink / raw)
  To: ultralinux

   From: Erik de Castro Lopo <nospam@mega-nerd.com>
   Date: Mon, 25 Mar 2002 19:35:51 +1100

   On Mon, 25 Mar 2002 00:12:25 -0800 (PST)
   "David S. Miller" <davem@redhat.com> wrote:
   
   > sparc64-linux-gcc is only for kernel building, it is not
   > going to work for userland applications.
   
   I have actually tried both the standard gcc and sparc64-linux-gcc.
   The clone() system call does not work correctly in either case.

Perhaps you should look at how the glibc sources use and invoke
clone() in the linuxthreads sparc-specific code.  That does work.

I haven't looked at your sources because I cannot simply grab it
with wget (which doesn't understand https).  You're probably just
doing something silly.  There are all sorts of platform specific
nuances to do with the stack pointer you pass into clone(). On Sparc,
for example, the stack you pass in must be at least 8-byte aligned.
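
The alignment requirement can be met by rounding the top-of-stack address
down to a suitable boundary before passing it to clone(). A minimal sketch,
with hypothetical helper names (align_down, stack_top) and the boundary left
as a parameter:

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: round an address down to a power-of-two boundary. */
static void *align_down(void *addr, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (void *) ((uintptr_t) addr & ~((uintptr_t) align - 1));
}

/* Address just past the end of [base, base + size), rounded down to
 * `align`. A downward-growing stack starts from here. */
static void *stack_top(void *base, size_t size, size_t align)
{
	return align_down((char *) base + size, align);
}
```

With this, stack_top(buffer, sizeof buffer, 8) gives an 8-byte aligned value
regardless of how the buffer itself was declared.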


* Re: SMP Sparc64 : bug in clone?
  2002-03-23  2:15 SMP Sparc64 : bug in clone? Erik de Castro Lopo
                   ` (4 preceding siblings ...)
  2002-03-25  8:36 ` David S. Miller
@ 2002-04-29 20:14 ` David S. Miller
  2002-04-29 20:17 ` Erik de Castro Lopo
                   ` (3 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: David S. Miller @ 2002-04-29 20:14 UTC (permalink / raw)
  To: ultralinux


You need to subtract 2047 (the Sparc64 "stack bias") from the stack
pointer you pass to clone.


* SMP Sparc64 : bug in clone?
  2002-03-23  2:15 SMP Sparc64 : bug in clone? Erik de Castro Lopo
                   ` (5 preceding siblings ...)
  2002-04-29 20:14 ` David S. Miller
@ 2002-04-29 20:17 ` Erik de Castro Lopo
  2002-05-04 23:08 ` Erik de Castro Lopo
                   ` (2 subsequent siblings)
  9 siblings, 0 replies; 11+ messages in thread
From: Erik de Castro Lopo @ 2002-04-29 20:17 UTC (permalink / raw)
  To: ultralinux

Hi all,

I now have a piece of code (at the end of this email) which uses clone()
and seems to work. However, the behaviour between Sparc64 and x86 Linux
is different.

On x86 both :

   pid = clone (child, stack, SIGCHLD, NULL);

and 

   pid = clone (child, stack, CLONE_VM | SIGCHLD, NULL);

work while on Sparc64, the former results in a stillborn child process.
Using strace I have found that the stillborn child received SIGSEGV as
soon as it was spawned. 

My suspicion is that in the case where CLONE_VM is not part of the flags,
the child process is not granted r/w access to the stack supplied to it
and hence segfaults as soon as it tries to access the stack.
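
What CLONE_VM changes can be checked independently of the stack question.
With CLONE_VM the child runs in the parent's address space, so its writes are
visible to the parent; without it the child works on a private copy, as after
fork(). The sketch below assumes Linux with glibc and uses hypothetical names
(shared_value, set_value, value_seen_by_parent). It does not explain the
Sparc64 segfault; it only shows what the flag controls.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <signal.h>
#include <stdlib.h>
#include <sys/types.h>
#include <sys/wait.h>

static volatile int shared_value;

/* Child entry point: write to a global, then return (the child exits). */
static int set_value(void *arg)
{
	(void) arg;
	shared_value = 42;
	return 0;
}

/* Run set_value() in a clone() child created with `extra_flags` and
 * return the value the parent sees afterwards, or -1 on error. */
static int value_seen_by_parent(int extra_flags)
{
	const size_t size = 64 * 1024;
	char *stack = malloc(size);
	int status;
	pid_t pid;

	assert((extra_flags & 0xff) == 0);	/* low byte is the exit signal */
	if (stack == NULL)
		return -1;

	shared_value = 0;
	pid = clone(set_value, stack + size, extra_flags | SIGCHLD, NULL);
	if (pid < 0 || waitpid(pid, &status, 0) < 0) {
		free(stack);
		return -1;
	}

	free(stack);
	return shared_value;
}
```

On x86 Linux, value_seen_by_parent(CLONE_VM) is expected to return 42 and
value_seen_by_parent(0) to return 0, because only the first child shares
memory with the parent.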

I've had a look at the glibc sources (where clone is implemented in
sysdeps/unix/sysv/linux/sparc/sparc64/clone.S) but being somewhat unfamiliar
with Sparc assembler I can't really figure out if it is right or wrong.

Anybody have any insight?

Cheers,
Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
The word "Windows" is a word out of an old dialect of the 
Apaches. It means: "White man staring through glass-screen 
onto an hourglass..."


------------------------------------------------------------------------------
#define _GNU_SOURCE	/* current glibc only declares clone() in <sched.h> with this */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>
#include <signal.h>
#include <sys/wait.h>

#define STACK_SIZE	(1<<10)

/* On sparc64, stack needs to be 8 byte aligned. */
static double stack_bytes [STACK_SIZE+1];

int
child(void *arg)
{	printf ("child running\n") ;
	sleep(2);
	printf ("child about to exit\n") ;
	exit (0);
} /* child */

int
main(int argc, char **argv)
{	int pid, status ;
	void *stack ;

	stack = &stack_bytes[STACK_SIZE];
	printf ("stack = %p\n", stack);

	printf("parent running\n");

	/* On x86, CLONE_VM flag is not required. Why?? */
	if((pid = clone (child, stack, CLONE_VM | SIGCHLD, NULL)) < 0)
	{	perror("clone");
		exit(1);
		}

	printf ("pid : %d\n", pid);

	if((pid = waitpid(pid, &status, WUNTRACED)) < 0)
	{	perror("Waiting for stop");
		exit(1);
		}

	printf ("parent about to exit\n") ;
	return 0 ;
} /* main */
------------------------------------------------------------------------------



* Re: SMP Sparc64 : bug in clone?
  2002-03-23  2:15 SMP Sparc64 : bug in clone? Erik de Castro Lopo
                   ` (6 preceding siblings ...)
  2002-04-29 20:17 ` Erik de Castro Lopo
@ 2002-05-04 23:08 ` Erik de Castro Lopo
  2002-05-06 14:25 ` Noah Beck
  2002-05-06 21:36 ` Erik de Castro Lopo
  9 siblings, 0 replies; 11+ messages in thread
From: Erik de Castro Lopo @ 2002-05-04 23:08 UTC (permalink / raw)
  To: ultralinux

On Mon, 29 Apr 2002 13:14:31 -0700 (PDT)
"David S. Miller" <davem@redhat.com> wrote:

> 
> You need to subtract 2047 (the Sparc64 "stack bias") from the stack
> pointer you pass to clone.

I've just got around to playing with this.

Should I subtract 2047 bytes (unlikely) or 2047 * sizeof (void*) bytes?

The stack is defined as:

#define  STACK_SIZE (1<<15)

static void* stack [STACK_SIZE] ;

And I've tried passing the following pointers to clone ():

	&stack [STACK_SIZE]
	&stack [STACK_SIZE-2047]
	&stack [STACK_SIZE-2048]
	&stack [STACK_SIZE/2]
	((char*) (&stack [STACK_SIZE])) - 2047
	((char*) (&stack [STACK_SIZE])) - 2048

None of the above works without the CLONE_VM flag. The first four do work
with it.

Any further clues?

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
Everyone seems to assume that the current system in America is capitalism. 
I beg to differ. True capitalism does not involve false advertising, 
distribution cartels, or political lobbying for special advantages in the 
market. How can you call Microsoft or the RIAA capitalist, when their main 
business is interfering with a free market? Some of us would like to see a 
*return* to capitalism in this country. - Jim Flynn on Linuxtoday.com


* Re: SMP Sparc64 : bug in clone?
  2002-03-23  2:15 SMP Sparc64 : bug in clone? Erik de Castro Lopo
                   ` (7 preceding siblings ...)
  2002-05-04 23:08 ` Erik de Castro Lopo
@ 2002-05-06 14:25 ` Noah Beck
  2002-05-06 21:36 ` Erik de Castro Lopo
  9 siblings, 0 replies; 11+ messages in thread
From: Noah Beck @ 2002-05-06 14:25 UTC (permalink / raw)
  To: ultralinux

On Sun, 5 May 2002, Erik de Castro Lopo wrote:

> On Mon, 29 Apr 2002 13:14:31 -0700 (PDT)
> "David S. Miller" <davem@redhat.com> wrote:
> 
> > 
> > You need to subtract 2047 (the Sparc64 "stack bias") from the stack
> > pointer you pass to clone.
> 
> I've just got around to playing with this.
> 
> Should I subtract 2047 bytes (unlikely) or 2047 * sizeof (void*) bytes?

The stack bias is in bytes, not words.
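
The difference shows up directly in the pointer arithmetic: indexing a void*
array moves in units of sizeof (void*), while arithmetic on a char* moves in
single bytes. A small sketch, with hypothetical helper names and the 2047
value taken from this thread:

```c
#include <assert.h>
#include <stddef.h>

#define STACK_BIAS 2047	/* value quoted earlier in this thread */

/* Step down by n array elements, i.e. n * sizeof (void*) bytes. */
static void *minus_words(void **top, size_t n)
{
	return top - n;
}

/* Step down by exactly n bytes, whatever the element type. */
static void *minus_bytes(void **top, size_t n)
{
	return (char *) top - n;
}

/* Distance in bytes from p up to top. */
static ptrdiff_t bytes_below(void **top, void *p)
{
	assert((char *) p <= (char *) top);
	return (char *) top - (char *) p;
}
```

So &stack [STACK_SIZE-2047] lands 2047 * sizeof (void*) bytes below the top
of the array, and only the two (char*) expressions in the list below move by
exactly 2047 bytes.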

> The stack is defined as:
> 
> #define  STACK_SIZE (1<<15)
> 
> static void* stack [STACK_SIZE] ;
> 
> And I've tried passing the following pointers to clone ():
> 
> 	&stack [STACK_SIZE]
> 	&stack [STACK_SIZE-2047]
> 	&stack [STACK_SIZE-2048]
> 	&stack [STACK_SIZE/2]
> 	((char*) (&stack [STACK_SIZE])) - 2047
> 	((char*) (&stack [STACK_SIZE])) - 2048
> 
> None of the above works without the CLONE_VM flag. The first four do work
> with it.
> 
> Any further clues?

Does the minimum space for a stack frame (176 bytes) need to be subtracted
as well?  Also, is CANRESTORE=0 in the new thread?

Noah




* Re: SMP Sparc64 : bug in clone?
  2002-03-23  2:15 SMP Sparc64 : bug in clone? Erik de Castro Lopo
                   ` (8 preceding siblings ...)
  2002-05-06 14:25 ` Noah Beck
@ 2002-05-06 21:36 ` Erik de Castro Lopo
  9 siblings, 0 replies; 11+ messages in thread
From: Erik de Castro Lopo @ 2002-05-06 21:36 UTC (permalink / raw)
  To: ultralinux

On Mon, 6 May 2002 10:25:14 -0400 (EDT)
Noah Beck <noah@noahsark.dyndns.org> wrote:

> On Sun, 5 May 2002, Erik de Castro Lopo wrote:
> 
> > Should I subtract 2047 bytes (unlikely) or 2047 * sizeof (void*) bytes?
> 
> The stack bias is in bytes, not words.

OK, I tried this:

    #define  STACK_SIZE (1<<15)

    static void* stack [STACK_SIZE] ;

    /* start in the middle of the array, then apply the bias in bytes */
    void *child_stack = &stack [STACK_SIZE/2] ;
    child_stack = ((char*) child_stack) - 2047 ;

and still no go (although it did work with the CLONE_VM flag).

> Does the minimum space for a stack frame (176 bytes) need to be subtracted
> as well?  

Since I'm starting in the middle of a large array, fiddle factors like this
shouldn't be an issue.

> Also, is CANRESTORE=0 in the new thread?

Now that's a possibility. Isn't this supposed to be set up somewhere inside
the clone() system call?

Erik
-- 
+-----------------------------------------------------------+
  Erik de Castro Lopo  nospam@mega-nerd.com (Yes it's valid)
+-----------------------------------------------------------+
"Don't be fooled by NT/Exchange propaganda. M$ Exchange is 
just plain broken and NT cannot handle the sustained load 
of a high-volume remote mail server"  
-- Eric S. Raymond in the Fetchmail FAQ

