* [patch] clone_startup(), 2.5.31-A0
@ 2002-08-13 15:09 Ingo Molnar
2002-08-13 15:30 ` Linus Torvalds
2002-08-13 15:44 ` Christoph Hellwig
0 siblings, 2 replies; 32+ messages in thread
From: Ingo Molnar @ 2002-08-13 15:09 UTC (permalink / raw)
To: Linus Torvalds; +Cc: linux-kernel
the attached patch implements a new syscall on x86, clone_startup().
The parameters are:
clone_startup(fn, child_stack, flags, tls_desc, pid_addr)
with the help of this syscall glibc's next-generation pthreads code is
able to perform single-syscall thread creation: clone_startup() sets up
the child TLS and writes the child PID back into userspace. (which address
points to the thread control block.).
the child PID has to be written back because otherwise the parent and the
child would have to do expensive and unrobust synchronization. The first
instruction the child executes might as well be a signal handler, and
glibc code needs the TLS and the PID of the thread. There are a number of
workarounds in userspace that can solve this problem without
clone_startup(), but each of them has disadvantages:
- the parent might disable all signals across clone() calls. This adds 3
extra syscalls: 2 in the parent, 1 in the child. The old LinuxThreads
code did something like this.
- glibc could check whether the PID is available and call getpid() if
not, but this introduces runtime overhead.
- glibc might define a wrapper function around signal handlers - this is
both unrobust and might cause compatibility problems.
- glibc could use a mutex with the parent to let the parent fill in the
PID. This causes two extra context-switches if the child is executed
first after clone().
and the kernel can indeed provide a pretty good solution - why not do it
this way.
the TLS setup avoids an extra set_thread_area() syscalls. [Standalone
glibc applications still use set_thread_area(), so this syscall does not
obsolete set_thread_area().]
Implementational issues: i've introduced a new kernel-internal clone flag,
CLONE_STARTUP. In theory we could use the existing clone() syscall and let
applications fill in CLONE_STARTUP - i felt uneasy about this solution
because it introduces a versioned sys_clone() parametering, makings things
messy. But it would undoubtedly work safely and reliably, and it's even a
bit slower than the separated syscalls solution this patch adds.
for performance reasons the kernel does not recognize the -1 TLS
descriptor number, but this is a non-issue, child threads inherit the TLS
index of the parent anyway. Also, the kernel does not allow an 'empty' TLS
to be defined.
Comments?
Ingo
--- linux/arch/i386/kernel/process.c.orig Tue Aug 13 16:24:59 2002
+++ linux/arch/i386/kernel/process.c Tue Aug 13 16:51:51 2002
@@ -559,6 +559,7 @@
unsigned long unused,
struct task_struct * p, struct pt_regs * regs)
{
+ struct thread_struct *t = &p->thread;
struct pt_regs * childregs;
struct task_struct *tsk;
@@ -567,17 +568,42 @@
childregs->eax = 0;
childregs->esp = esp;
- p->thread.esp = (unsigned long) childregs;
- p->thread.esp0 = (unsigned long) (childregs+1);
+ t->esp = (unsigned long) childregs;
+ t->esp0 = (unsigned long) (childregs+1);
+ t->eip = (unsigned long) ret_from_fork;
- p->thread.eip = (unsigned long) ret_from_fork;
-
- savesegment(fs,p->thread.fs);
- savesegment(gs,p->thread.gs);
+ savesegment(fs, t->fs);
+ savesegment(gs, t->gs);
tsk = current;
unlazy_fpu(tsk);
- struct_cpy(&p->thread.i387, &tsk->thread.i387);
+ struct_cpy(&t->i387, &tsk->thread.i387);
+
+ /*
+ * In the clone_startup() case we set up the child's
+ * TLS, and we also put its TID into userspace.
+ */
+ if (clone_flags & CLONE_STARTUP) {
+ struct desc_struct *desc;
+ struct user_desc info;
+ int idx;
+
+ if (copy_from_user(&info, (void *)childregs->esi, sizeof(info)))
+ return -EFAULT;
+ if (LDT_empty(&info))
+ return -EINVAL;
+
+ idx = info.entry_number;
+ if (idx < GDT_ENTRY_TLS_MIN || idx > GDT_ENTRY_TLS_MAX)
+ return -EINVAL;
+
+ desc = t->tls_array + idx - GDT_ENTRY_TLS_MIN;
+ desc->a = LDT_entry_a(&info);
+ desc->b = LDT_entry_b(&info);
+
+ if (put_user(p->pid, (pid_t *)childregs->edx))
+ return -EFAULT;
+ }
if (unlikely(NULL != tsk->thread.ts_io_bitmap)) {
p->thread.ts_io_bitmap = kmalloc(IO_BITMAP_BYTES, GFP_KERNEL);
@@ -744,18 +770,27 @@
return IS_ERR(p) ? PTR_ERR(p) : p->pid;
}
-asmlinkage int sys_clone(struct pt_regs regs)
+static inline int __clone(struct pt_regs *regs, unsigned long clone_flags)
{
struct task_struct *p;
- unsigned long clone_flags;
unsigned long newsp;
- clone_flags = regs.ebx;
- newsp = regs.ecx;
+ clone_flags = regs->ebx;
+ newsp = regs->ecx;
if (!newsp)
- newsp = regs.esp;
- p = do_fork(clone_flags & ~CLONE_IDLETASK, newsp, ®s, 0);
+ newsp = regs->esp;
+ p = do_fork(clone_flags & ~CLONE_IDLETASK, newsp, regs, 0);
return IS_ERR(p) ? PTR_ERR(p) : p->pid;
+}
+
+asmlinkage int sys_clone(struct pt_regs regs)
+{
+ return __clone(®s, regs.ebx & ~CLONE_STARTUP);
+}
+
+asmlinkage int sys_clone_startup(struct pt_regs regs)
+{
+ return __clone(®s, regs.ebx | CLONE_STARTUP);
}
/*
--- linux/arch/i386/kernel/entry.S.orig Tue Aug 13 16:50:01 2002
+++ linux/arch/i386/kernel/entry.S Tue Aug 13 16:49:48 2002
@@ -754,6 +754,7 @@
.long sys_sched_getaffinity
.long sys_set_thread_area
.long sys_get_thread_area
+ .long sys_clone_startup /* 245 */
.rept NR_syscalls-(.-sys_call_table)/4
.long sys_ni_syscall
--- linux/include/linux/sched.h.orig Tue Aug 13 16:48:27 2002
+++ linux/include/linux/sched.h Tue Aug 13 16:49:09 2002
@@ -45,6 +45,7 @@
#define CLONE_THREAD 0x00010000 /* Same thread group? */
#define CLONE_NEWNS 0x00020000 /* New namespace group? */
#define CLONE_SYSVSEM 0x00040000 /* share system V SEM_UNDO semantics */
+#define CLONE_STARTUP 0x00080000 /* create child state */
#define CLONE_SIGNAL (CLONE_SIGHAND | CLONE_THREAD)
--- linux/include/asm-i386/unistd.h.orig Tue Aug 13 16:50:11 2002
+++ linux/include/asm-i386/unistd.h Tue Aug 13 16:50:28 2002
@@ -249,6 +249,7 @@
#define __NR_sched_getaffinity 242
#define __NR_set_thread_area 243
#define __NR_get_thread_area 244
+#define __NR_clone_startup 245
/* user-visible error numbers are in the range -1 - -124: see <asm-i386/errno.h> */
^ permalink raw reply [flat|nested] 32+ messages in thread* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 15:09 [patch] clone_startup(), 2.5.31-A0 Ingo Molnar @ 2002-08-13 15:30 ` Linus Torvalds 2002-08-13 15:44 ` Christoph Hellwig 1 sibling, 0 replies; 32+ messages in thread From: Linus Torvalds @ 2002-08-13 15:30 UTC (permalink / raw) To: Ingo Molnar; +Cc: linux-kernel On Tue, 13 Aug 2002, Ingo Molnar wrote: > > the attached patch implements a new syscall on x86, clone_startup(). Hmm. Please don't do that CLONE_STARTUP flag thing. Just do the code inside the sys_clone_startup(), instead of dynamically adding a new flag. And that code looks like it returns EFAULT without freeing the stuff it has allocated. Also, "caching" &tsk->thread only adds code - even if it makes the sources slightly shorter. It adds a register that is just a constant offset from another register. Other than that the thing seems to make sense. Linus ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 15:09 [patch] clone_startup(), 2.5.31-A0 Ingo Molnar 2002-08-13 15:30 ` Linus Torvalds @ 2002-08-13 15:44 ` Christoph Hellwig 2002-08-13 16:01 ` Linus Torvalds ` (2 more replies) 1 sibling, 3 replies; 32+ messages in thread From: Christoph Hellwig @ 2002-08-13 15:44 UTC (permalink / raw) To: Ingo Molnar; +Cc: Linus Torvalds, linux-kernel On Tue, Aug 13, 2002 at 05:09:03PM +0200, Ingo Molnar wrote: > > the attached patch implements a new syscall on x86, clone_startup(). > The parameters are: > > clone_startup(fn, child_stack, flags, tls_desc, pid_addr) First the name souns horrible. What about spawn_thread or create_thread instead? it's not our good old clone and not a lookalike, it's some pthreadish monster.. > with the help of this syscall glibc's next-generation pthreads code have you discussed this code with IBM's pthread group? I think we don't want to add a new syscall that is only useful to glibc.. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 15:44 ` Christoph Hellwig @ 2002-08-13 16:01 ` Linus Torvalds 2002-08-13 16:05 ` Christoph Hellwig 2002-08-13 16:09 ` [patch] clone_startup(), 2.5.31-A0 Erik Andersen 2002-08-13 17:27 ` Ingo Molnar 2 siblings, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2002-08-13 16:01 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Ingo Molnar, linux-kernel On Tue, 13 Aug 2002, Christoph Hellwig wrote: > > First the name souns horrible. What about spawn_thread or create_thread > instead? it's not our good old clone and not a lookalike, it's some > pthreadish monster.. This one definitely isn't a pthread-specific problem. The old UNIX fork() semantics for <pid> returning really are fairly broken, since the notion of returning the pid in a local register is inherently racy for _anything_ that wants to maintain a list of its children and needs to access the list in the SIGCHLD handler. (Simple explanation: imagine a child that exits so quickly that the parent hasn't even had time to do the "store pid into the array" before the parent is already signalled with SIGCHLD. Yes, this happens, and yes, it's a real problem. It wasn't that long ago that bash would _crash_ on this). With a system call like this, the parent can - allocate the new child information block first - do the fork with the pid pointer pointing into the child information block. - when SIGCHLD for that child comes in, all state will have been set up with no races. Note how this isn't even thread-specific: it very much would work with a regular fork-like approach and a standard shell. > > with the help of this syscall glibc's next-generation pthreads code > > have you discussed this code with IBM's pthread group? I think we don't > want to add a new syscall that is only useful to glibc.. I agree that the name is a bit ugly, but this is a system call that I actually think is fundamentally useful (ie I can see how it would be totally usable quite outside any specific library implementation issues). Talking it over with the IBM threading guys is still worthwhile, though. There may be _other_ information that the IBM guys have problems with, and it could easily be that the interface we really want is even more generic. Linus ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 16:01 ` Linus Torvalds @ 2002-08-13 16:05 ` Christoph Hellwig 2002-08-13 16:20 ` Linus Torvalds 0 siblings, 1 reply; 32+ messages in thread From: Christoph Hellwig @ 2002-08-13 16:05 UTC (permalink / raw) To: Linus Torvalds; +Cc: Ingo Molnar, linux-kernel On Tue, Aug 13, 2002 at 09:01:56AM -0700, Linus Torvalds wrote: > > First the name souns horrible. What about spawn_thread or create_thread > > instead? it's not our good old clone and not a lookalike, it's some > > pthreadish monster.. > > This one definitely isn't a pthread-specific problem. The old UNIX fork() > semantics for <pid> returning really are fairly broken, since the notion > of returning the pid in a local register is inherently racy for _anything_ > that wants to maintain a list of its children and needs to access the list > in the SIGCHLD handler. The TLS setting makes it pretty pthread-specific, though (or at least thread-specific). Also the fn parameter makes it very different from both clone and fork. What about spawn() if you dislike a thread in the name? ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 16:05 ` Christoph Hellwig @ 2002-08-13 16:20 ` Linus Torvalds 2002-08-13 18:32 ` [patch] CLONE_SETTLS, CLONE_SETTID, 2.5.31-BK Ingo Molnar 0 siblings, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2002-08-13 16:20 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Ingo Molnar, linux-kernel On Tue, 13 Aug 2002, Christoph Hellwig wrote: > > This one definitely isn't a pthread-specific problem. The old UNIX fork() > > semantics for <pid> returning really are fairly broken, since the notion > > of returning the pid in a local register is inherently racy for _anything_ > > that wants to maintain a list of its children and needs to access the list > > in the SIGCHLD handler. > > The TLS setting makes it pretty pthread-specific, though (or at least > thread-specific). That's certainly true, and potentially worth a clone() flag of its own, quite independently of the startup thing ("CLONE_SETTLS"). Ingo, how about breaking it down that way? > Also the fn parameter makes it very different from > both clone and fork. The fn thing is a purely user-mode effect, it's not there in the system call. Which is true of a regular clone() too. > What about spawn() if you dislike a thread in the name? spawn() to me implies doing the equivalent of "vfork()+execve()", the way most non-UNIX OS's do new process creation. I don't dislike the "thread" name too much, but I want this to be generic, and seen as such. The same way the original clone() was a proper superset of fork(), this needs to be a proper superset, not just in name, but in _usage_. If it's useful for only one thing, that's not good. Linus ^ permalink raw reply [flat|nested] 32+ messages in thread
* [patch] CLONE_SETTLS, CLONE_SETTID, 2.5.31-BK 2002-08-13 16:20 ` Linus Torvalds @ 2002-08-13 18:32 ` Ingo Molnar 2002-08-13 18:41 ` Linus Torvalds 2002-08-15 22:38 ` [patch] complain about unknown CLONE_* flags Jamie Lokier 0 siblings, 2 replies; 32+ messages in thread From: Ingo Molnar @ 2002-08-13 18:32 UTC (permalink / raw) To: Linus Torvalds; +Cc: Christoph Hellwig, linux-kernel okay, the attached patch gets rid of clone_startup() and adds two new clone() flags instead: CLONE_SETTLS => if present then the third clone() syscall parameter is the new TLS. CLONE_SETTID => if present then the child TID is written to the address specified by the fourth clone() parameter. the new parameters are handled in a safe way, clone() returns -EFAULT or -EINVAL if there's some problem with them. No current code is affected by these new flags. Patch was testbooted on 2.5.31-BK-current. Ingo --- linux/arch/i386/kernel/process.c.orig Tue Aug 13 20:10:25 2002 +++ linux/arch/i386/kernel/process.c Tue Aug 13 20:30:11 2002 @@ -559,6 +559,7 @@ unsigned long unused, struct task_struct * p, struct pt_regs * regs) { + struct thread_struct *t = &p->thread; struct pt_regs * childregs; struct task_struct *tsk; @@ -567,17 +568,45 @@ childregs->eax = 0; childregs->esp = esp; - p->thread.esp = (unsigned long) childregs; - p->thread.esp0 = (unsigned long) (childregs+1); + t->esp = (unsigned long) childregs; + t->esp0 = (unsigned long) (childregs+1); + t->eip = (unsigned long) ret_from_fork; - p->thread.eip = (unsigned long) ret_from_fork; - - savesegment(fs,p->thread.fs); - savesegment(gs,p->thread.gs); + savesegment(fs, t->fs); + savesegment(gs, t->gs); tsk = current; unlazy_fpu(tsk); - struct_cpy(&p->thread.i387, &tsk->thread.i387); + struct_cpy(&t->i387, &tsk->thread.i387); + + /* + * Set a new TLS for the child thread? + */ + if (clone_flags & CLONE_SETTLS) { + struct desc_struct *desc; + struct user_desc info; + int idx; + + if (copy_from_user(&info, (void *)childregs->esi, sizeof(info))) + return -EFAULT; + if (LDT_empty(&info)) + return -EINVAL; + + idx = info.entry_number; + if (idx < GDT_ENTRY_TLS_MIN || idx > GDT_ENTRY_TLS_MAX) + return -EINVAL; + + desc = t->tls_array + idx - GDT_ENTRY_TLS_MIN; + desc->a = LDT_entry_a(&info); + desc->b = LDT_entry_b(&info); + } + + /* + * Notify the child of the TID? + */ + if (clone_flags & CLONE_SETTLS) + if (put_user(p->pid, (pid_t *)childregs->edx)) + return -EFAULT; if (unlikely(NULL != tsk->thread.ts_io_bitmap)) { p->thread.ts_io_bitmap = kmalloc(IO_BITMAP_BYTES, GFP_KERNEL); @@ -747,8 +776,7 @@ asmlinkage int sys_clone(struct pt_regs regs) { struct task_struct *p; - unsigned long clone_flags; - unsigned long newsp; + unsigned long clone_flags, newsp; clone_flags = regs.ebx; newsp = regs.ecx; --- linux/include/linux/sched.h.orig Tue Aug 13 19:55:06 2002 +++ linux/include/linux/sched.h Tue Aug 13 20:27:23 2002 @@ -45,6 +45,8 @@ #define CLONE_THREAD 0x00010000 /* Same thread group? */ #define CLONE_NEWNS 0x00020000 /* New namespace group? */ #define CLONE_SYSVSEM 0x00040000 /* share system V SEM_UNDO semantics */ +#define CLONE_SETTLS 0x00080000 /* create a new TLS for the child */ +#define CLONE_SETTID 0x00100000 /* write the TID back to userspace */ #define CLONE_SIGNAL (CLONE_SIGHAND | CLONE_THREAD) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] CLONE_SETTLS, CLONE_SETTID, 2.5.31-BK 2002-08-13 18:32 ` [patch] CLONE_SETTLS, CLONE_SETTID, 2.5.31-BK Ingo Molnar @ 2002-08-13 18:41 ` Linus Torvalds 2002-08-13 19:09 ` Ingo Molnar 2002-08-13 19:35 ` Ingo Molnar 2002-08-15 22:38 ` [patch] complain about unknown CLONE_* flags Jamie Lokier 1 sibling, 2 replies; 32+ messages in thread From: Linus Torvalds @ 2002-08-13 18:41 UTC (permalink / raw) To: Ingo Molnar; +Cc: Christoph Hellwig, linux-kernel On Tue, 13 Aug 2002, Ingo Molnar wrote: > > CLONE_SETTLS => if present then the third clone() syscall parameter > is the new TLS. > > CLONE_SETTID => if present then the child TID is written to the > address specified by the fourth clone() parameter. Except you actually test the CLONE_SETTLS bit.. This looks basically ok, although that "struct thread_struct *t" still serves no useful purpose.. Linus ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] CLONE_SETTLS, CLONE_SETTID, 2.5.31-BK 2002-08-13 18:41 ` Linus Torvalds @ 2002-08-13 19:09 ` Ingo Molnar 2002-08-13 19:41 ` Linus Torvalds 2002-08-13 19:35 ` Ingo Molnar 1 sibling, 1 reply; 32+ messages in thread From: Ingo Molnar @ 2002-08-13 19:09 UTC (permalink / raw) To: Linus Torvalds; +Cc: Christoph Hellwig, linux-kernel On Tue, 13 Aug 2002, Linus Torvalds wrote: > > CLONE_SETTLS => if present then the third clone() syscall parameter > > is the new TLS. > > > > CLONE_SETTID => if present then the child TID is written to the > > address specified by the fourth clone() parameter. > > Except you actually test the CLONE_SETTLS bit.. We've tested clone_startup() with real threads on a 2.4-backported version of yesterday's final TLS API quite extensively, and it works as expected. (as we've tested earlier incarnations of the TLS API and code as well.) This portion of the code is not changed by CLONE_SETTLS - so it should work equally well, unless i've done something stupid ... I'll backport this and will fit glibc to the new API later today. (Ulrich's on LW) > This looks basically ok, although that "struct thread_struct *t" still > serves no useful purpose.. this was just me shortening some code a bit. I've attached a new patch that gets rid of those unrelated changes and does the CLONE_SETTLS / CLONE_SETTID bits only. Ingo --- linux/arch/i386/kernel/process.c.orig Tue Aug 13 21:08:01 2002 +++ linux/arch/i386/kernel/process.c Tue Aug 13 21:10:36 2002 @@ -579,6 +579,35 @@ unlazy_fpu(tsk); struct_cpy(&p->thread.i387, &tsk->thread.i387); + /* + * Set a new TLS for the child thread? + */ + if (clone_flags & CLONE_SETTLS) { + struct desc_struct *desc; + struct user_desc info; + int idx; + + if (copy_from_user(&info, (void *)childregs->esi, sizeof(info))) + return -EFAULT; + if (LDT_empty(&info)) + return -EINVAL; + + idx = info.entry_number; + if (idx < GDT_ENTRY_TLS_MIN || idx > GDT_ENTRY_TLS_MAX) + return -EINVAL; + + desc = p->thread.tls_array + idx - GDT_ENTRY_TLS_MIN; + desc->a = LDT_entry_a(&info); + desc->b = LDT_entry_b(&info); + } + + /* + * Notify the child of the TID? + */ + if (clone_flags & CLONE_SETTLS) + if (put_user(p->pid, (pid_t *)childregs->edx)) + return -EFAULT; + if (unlikely(NULL != tsk->thread.ts_io_bitmap)) { p->thread.ts_io_bitmap = kmalloc(IO_BITMAP_BYTES, GFP_KERNEL); if (!p->thread.ts_io_bitmap) --- linux/include/linux/sched.h.orig Tue Aug 13 21:01:51 2002 +++ linux/include/linux/sched.h Tue Aug 13 21:08:56 2002 @@ -45,6 +45,8 @@ #define CLONE_THREAD 0x00010000 /* Same thread group? */ #define CLONE_NEWNS 0x00020000 /* New namespace group? */ #define CLONE_SYSVSEM 0x00040000 /* share system V SEM_UNDO semantics */ +#define CLONE_SETTLS 0x00080000 /* create a new TLS for the child */ +#define CLONE_SETTID 0x00100000 /* write the TID back to userspace */ #define CLONE_SIGNAL (CLONE_SIGHAND | CLONE_THREAD) ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] CLONE_SETTLS, CLONE_SETTID, 2.5.31-BK 2002-08-13 19:09 ` Ingo Molnar @ 2002-08-13 19:41 ` Linus Torvalds 2002-08-13 19:41 ` Ingo Molnar 0 siblings, 1 reply; 32+ messages in thread From: Linus Torvalds @ 2002-08-13 19:41 UTC (permalink / raw) To: Ingo Molnar; +Cc: Christoph Hellwig, linux-kernel On Tue, 13 Aug 2002, Ingo Molnar wrote: > > > > Except you actually test the CLONE_SETTLS bit.. > > We've tested clone_startup() with real threads on a 2.4-backported version > of yesterday's final TLS API quite extensively, and it works as expected. > (as we've tested earlier incarnations of the TLS API and code as well.) It's still buggy, and you didn't read what I wrote. + /* + * Notify the child of the TID? + */ + if (clone_flags & CLONE_SETTLS) + if (put_user(p->pid, (pid_t *)childregs->edx)) + return -EFAULT; Find the bug. Find the sentence I wrote that pointed it out last time. Notice how your testing did not find it, since you always just set both flags. Linus ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] CLONE_SETTLS, CLONE_SETTID, 2.5.31-BK 2002-08-13 19:41 ` Linus Torvalds @ 2002-08-13 19:41 ` Ingo Molnar 0 siblings, 0 replies; 32+ messages in thread From: Ingo Molnar @ 2002-08-13 19:41 UTC (permalink / raw) To: Linus Torvalds; +Cc: Christoph Hellwig, linux-kernel On Tue, 13 Aug 2002, Linus Torvalds wrote: > It's still buggy, and you didn't read what I wrote. yeah - paperbag time. > Notice how your testing did not find it, since you always just set both > flags. yes, the test used the old single CLONE_STARTUP flag. Ingo ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] CLONE_SETTLS, CLONE_SETTID, 2.5.31-BK 2002-08-13 18:41 ` Linus Torvalds 2002-08-13 19:09 ` Ingo Molnar @ 2002-08-13 19:35 ` Ingo Molnar 1 sibling, 0 replies; 32+ messages in thread From: Ingo Molnar @ 2002-08-13 19:35 UTC (permalink / raw) To: Linus Torvalds; +Cc: Christoph Hellwig, linux-kernel On Tue, 13 Aug 2002, Linus Torvalds wrote: > > CLONE_SETTID => if present then the child TID is written to the > > address specified by the fourth clone() parameter. > > Except you actually test the CLONE_SETTLS bit.. doh - Joe Perches finally told me how to parse this sentence. Correct patch is: --- linux/arch/i386/kernel/process.c.orig Tue Aug 13 21:08:01 2002 +++ linux/arch/i386/kernel/process.c Tue Aug 13 21:10:36 2002 @@ -579,6 +579,35 @@ unlazy_fpu(tsk); struct_cpy(&p->thread.i387, &tsk->thread.i387); + /* + * Set a new TLS for the child thread? + */ + if (clone_flags & CLONE_SETTLS) { + struct desc_struct *desc; + struct user_desc info; + int idx; + + if (copy_from_user(&info, (void *)childregs->esi, sizeof(info))) + return -EFAULT; + if (LDT_empty(&info)) + return -EINVAL; + + idx = info.entry_number; + if (idx < GDT_ENTRY_TLS_MIN || idx > GDT_ENTRY_TLS_MAX) + return -EINVAL; + + desc = p->thread.tls_array + idx - GDT_ENTRY_TLS_MIN; + desc->a = LDT_entry_a(&info); + desc->b = LDT_entry_b(&info); + } + + /* + * Notify the child of the TID? + */ + if (clone_flags & CLONE_SETTID) + if (put_user(p->pid, (pid_t *)childregs->edx)) + return -EFAULT; + if (unlikely(NULL != tsk->thread.ts_io_bitmap)) { p->thread.ts_io_bitmap = kmalloc(IO_BITMAP_BYTES, GFP_KERNEL); if (!p->thread.ts_io_bitmap) --- linux/include/linux/sched.h.orig Tue Aug 13 21:01:51 2002 +++ linux/include/linux/sched.h Tue Aug 13 21:08:56 2002 @@ -45,6 +45,8 @@ #define CLONE_THREAD 0x00010000 /* Same thread group? */ #define CLONE_NEWNS 0x00020000 /* New namespace group? */ #define CLONE_SYSVSEM 0x00040000 /* share system V SEM_UNDO semantics */ +#define CLONE_SETTLS 0x00080000 /* create a new TLS for the child */ +#define CLONE_SETTID 0x00100000 /* write the TID back to userspace */ #define CLONE_SIGNAL (CLONE_SIGHAND | CLONE_THREAD) ^ permalink raw reply [flat|nested] 32+ messages in thread
* [patch] complain about unknown CLONE_* flags 2002-08-13 18:32 ` [patch] CLONE_SETTLS, CLONE_SETTID, 2.5.31-BK Ingo Molnar 2002-08-13 18:41 ` Linus Torvalds @ 2002-08-15 22:38 ` Jamie Lokier 2002-08-16 10:17 ` Ingo Molnar 1 sibling, 1 reply; 32+ messages in thread From: Jamie Lokier @ 2002-08-15 22:38 UTC (permalink / raw) To: Linus Torvalds; +Cc: Ingo Molnar, linux-kernel About new clone() flags... One of the obvious things for a thread library is to: (a) try clone() with lots of snazzy flags (b) if that returns -EINVAL, we must be running on an older kernel; try with fewer flags and more workarounds However, I don't see any code in sys_clone() that rejects a call that specifies unknown flags. So, code that uses e.g. CLONE_SETTID will appear to run perfectly well on an old kernel... except that it will behave incorrectly. That leads to having to write some silly test for each feature prior to using it, instead of trying it and falling back. E.g. I'd need to do the silly signal-blocking workaround when creating the second thread in a program, just to find out whether CLONE_SETTID actually worked. Either that, or check the kernel version. Ingo, how do you handle this sort of backward compatibility in your latest pthreads library, or don't you do backward compatibility? For future-proofing, here's a patch: diff -u linux-2.5/kernel/fork.c.orig linux-2.5/kernel/fork.c --- linux-2.5/kernel/fork.c +++ linux-2.5/kernel/fork.c Thu Aug 15 23:35:00 2002 @@ -619,6 +619,11 @@ struct task_struct *p = NULL; struct completion vfork; + if ((clone_flags & ~(0UL|CSIGNAL|CLONE_VM|CLONE_FS|CLONE_FILES + |CLONE_SIGHAND|CLONE_PID|CLONE_PTRACE|CLONE_VFORK + |CLONE_PARENT|CLONE_THREAD))) + return ERR_PTR (-EINVAL); + if ((clone_flags & (CLONE_NEWNS|CLONE_FS)) == (CLONE_NEWNS|CLONE_FS)) return ERR_PTR(-EINVAL); ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] complain about unknown CLONE_* flags 2002-08-15 22:38 ` [patch] complain about unknown CLONE_* flags Jamie Lokier @ 2002-08-16 10:17 ` Ingo Molnar 2001-11-02 5:55 ` Pavel Machek 0 siblings, 1 reply; 32+ messages in thread From: Ingo Molnar @ 2002-08-16 10:17 UTC (permalink / raw) To: Jamie Lokier; +Cc: Linus Torvalds, linux-kernel On Thu, 15 Aug 2002, Jamie Lokier wrote: > Ingo, how do you handle this sort of backward compatibility in your > latest pthreads library, or don't you do backward compatibility? [btw., it's not me doing it but Ulrich Drepper. I'm mostly doing the 'lets find out how the kernel could help' side of things.] the proper way of doing this is a way of getting fundamental kernel capabilities, not the 'probing' of the kernel in various ways. Glibc starts looking like old ISA drivers trying to do nonintrusive autodetection: 'lets try this port carefully without disturbing state, perhaps this feature is there'. one way to handle this cleanly would be to add a kernel capabilities bitmask to sysconf(), and backport this to all mainstream Linux kernels where current glibc is supposed to run. Support for something like this would be added to glibc the day it's in the main kernel. Eg. glibc has to symlink in the old pthreads library upon bootup, if the feature-set does not enable the more integrated threading library. (nevertheless your patch makes good sense, from an API-cleanliness POV.) Ingo ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] complain about unknown CLONE_* flags 2002-08-16 10:17 ` Ingo Molnar @ 2001-11-02 5:55 ` Pavel Machek 0 siblings, 0 replies; 32+ messages in thread From: Pavel Machek @ 2001-11-02 5:55 UTC (permalink / raw) To: Ingo Molnar; +Cc: Jamie Lokier, Linus Torvalds, linux-kernel Hi! > > Ingo, how do you handle this sort of backward compatibility in your > > latest pthreads library, or don't you do backward compatibility? > > [btw., it's not me doing it but Ulrich Drepper. I'm mostly doing the 'lets > find out how the kernel could help' side of things.] > > the proper way of doing this is a way of getting fundamental kernel > capabilities, not the 'probing' of the kernel in various ways. Glibc > starts looking like old ISA drivers trying to do nonintrusive > autodetection: 'lets try this port carefully without disturbing state, > perhaps this feature is there'. > > one way to handle this cleanly would be to add a kernel capabilities > bitmask to sysconf(), and backport this to all mainstream Linux kernels I'm afraid that bitmask will get out-of-date. Doing EINVAL seems like a good way to do this. [Hmm, perhaps we need CLONE_DONT which only checks capabilities and returns? It still seems better than sysconf...] Pavel -- Philips Velo 1: 1"x4"x8", 300gram, 60, 12MB, 40bogomips, linux, mutt, details at http://atrey.karlin.mff.cuni.cz/~pavel/velo/index.html. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 15:44 ` Christoph Hellwig 2002-08-13 16:01 ` Linus Torvalds @ 2002-08-13 16:09 ` Erik Andersen 2002-08-13 16:11 ` Christoph Hellwig 2002-08-13 17:19 ` Ingo Molnar 2002-08-13 17:27 ` Ingo Molnar 2 siblings, 2 replies; 32+ messages in thread From: Erik Andersen @ 2002-08-13 16:09 UTC (permalink / raw) To: Christoph Hellwig, Ingo Molnar, Linus Torvalds, linux-kernel On Tue Aug 13, 2002 at 04:44:15PM +0100, Christoph Hellwig wrote: > On Tue, Aug 13, 2002 at 05:09:03PM +0200, Ingo Molnar wrote: > > > > the attached patch implements a new syscall on x86, clone_startup(). > > The parameters are: > > > > clone_startup(fn, child_stack, flags, tls_desc, pid_addr) > > First the name souns horrible. What about spawn_thread or create_thread > instead? it's not our good old clone and not a lookalike, it's some > pthreadish monster.. How about "clone2"? -Erik -- Erik B. Andersen http://codepoet-consulting.com/ --This message was written using 73% post-consumer electrons-- ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 16:09 ` [patch] clone_startup(), 2.5.31-A0 Erik Andersen @ 2002-08-13 16:11 ` Christoph Hellwig 2002-08-13 16:32 ` David Mosberger ` (2 more replies) 2002-08-13 17:19 ` Ingo Molnar 1 sibling, 3 replies; 32+ messages in thread From: Christoph Hellwig @ 2002-08-13 16:11 UTC (permalink / raw) To: Erik Andersen, Ingo Molnar, Linus Torvalds, linux-kernel On Tue, Aug 13, 2002 at 10:09:24AM -0600, Erik Andersen wrote: > > First the name souns horrible. What about spawn_thread or create_thread > > instead? it's not our good old clone and not a lookalike, it's some > > pthreadish monster.. > > How about "clone2"? Already used by ia64 for a hybrid between the good old clone and the new monster :) And as I said again, it doesn't really clone - it starts something different, namely the fn argument. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 16:11 ` Christoph Hellwig @ 2002-08-13 16:32 ` David Mosberger 2002-08-13 16:44 ` John Levon 2002-08-13 19:52 ` H. Peter Anvin 2002-08-13 16:56 ` Ruth Ivimey-Cook 2002-08-13 17:42 ` Richard B. Johnson 2 siblings, 2 replies; 32+ messages in thread From: David Mosberger @ 2002-08-13 16:32 UTC (permalink / raw) To: Christoph Hellwig Cc: Erik Andersen, Ingo Molnar, Linus Torvalds, linux-kernel >>>>> On Tue, 13 Aug 2002 17:11:38 +0100, Christoph Hellwig <hch@infradead.org> said: Chris> On Tue, Aug 13, 2002 at 10:09:24AM -0600, Erik Andersen Chris> wrote: >> > First the name souns horrible. What about spawn_thread or >> create_thread > instead? it's not our good old clone and not a >> lookalike, it's some > pthreadish monster.. >> >> How about "clone2"? Chris> Already used by ia64 for a hybrid between the good old clone Chris> and the new monster :) The original clone() system call was misdesigned. Even if you chose to ignore ia64, clone() cannot be used by portable applications to specify a stack (think "stack-growth direction"). clone() should have specified a stack memory *area* from the beginning. (UNIX got this right from the beginning, see, e.g., sigaltstack()). --david ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 16:32 ` David Mosberger @ 2002-08-13 16:44 ` John Levon 2002-08-13 19:52 ` H. Peter Anvin 1 sibling, 0 replies; 32+ messages in thread From: John Levon @ 2002-08-13 16:44 UTC (permalink / raw) To: linux-kernel On Tue, Aug 13, 2002 at 09:32:50AM -0700, David Mosberger wrote: > clone() should have specified a stack memory *area* from the > beginning. (UNIX got this right from the beginning, see, e.g., > sigaltstack()). doesn't sigstack() predate that :) regards john -- "It is unbecoming for young men to utter maxims." - Aristotle ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 16:32 ` David Mosberger 2002-08-13 16:44 ` John Levon @ 2002-08-13 19:52 ` H. Peter Anvin 2002-08-13 20:19 ` David Mosberger 1 sibling, 1 reply; 32+ messages in thread From: H. Peter Anvin @ 2002-08-13 19:52 UTC (permalink / raw) To: linux-kernel Followup to: <15705.13490.713278.815154@napali.hpl.hp.com> By author: David Mosberger <davidm@napali.hpl.hp.com> In newsgroup: linux.dev.kernel > > The original clone() system call was misdesigned. Even if you chose > to ignore ia64, clone() cannot be used by portable applications to > specify a stack (think "stack-growth direction"). > This is something that can be handled in userspace on most architectures. The rest (ia64) can pass all the information on to kernel space. The clone() system call cannot be used by portable applications *AT ALL*, since it inherently needs a user-space assembly wrapper. It's just a matter of how you define the interface to the assembly wrapper. -hpa -- <hpa@transmeta.com> at work, <hpa@zytor.com> in private! "Unix gives you enough rope to shoot yourself in the foot." http://www.zytor.com/~hpa/puzzle.txt <amsp@zytor.com> ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 19:52 ` H. Peter Anvin @ 2002-08-13 20:19 ` David Mosberger 2002-08-13 20:27 ` H. Peter Anvin 0 siblings, 1 reply; 32+ messages in thread From: David Mosberger @ 2002-08-13 20:19 UTC (permalink / raw) To: H. Peter Anvin; +Cc: linux-kernel >>>>> On 13 Aug 2002 12:52:11 -0700, "H. Peter Anvin" <hpa@zytor.com> said: >> The clone() system call cannot be used by portable applications >> *AT ALL*, since it inherently needs a user-space assembly >> wrapper. It's just a matter of how you define the interface to >> the assembly wrapper. The function call issue is a separate question though. If you want to argue that the libc clone() wrapper is broken, fine by me. (BTW: wasn't that you who complained about platform-specific Linux syscalls recently? ;-) --david ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 20:19 ` David Mosberger @ 2002-08-13 20:27 ` H. Peter Anvin 2002-08-13 20:52 ` David Mosberger 0 siblings, 1 reply; 32+ messages in thread From: H. Peter Anvin @ 2002-08-13 20:27 UTC (permalink / raw) To: davidm; +Cc: linux-kernel David Mosberger wrote: >>>>>>On 13 Aug 2002 12:52:11 -0700, "H. Peter Anvin" <hpa@zytor.com> said: >>>>> > > >> The clone() system call cannot be used by portable applications > >> *AT ALL*, since it inherently needs a user-space assembly > >> wrapper. It's just a matter of how you define the interface to > >> the assembly wrapper. > > The function call issue is a separate question though. If you want to > argue that the libc clone() wrapper is broken, fine by me. (BTW: > wasn't that you who complained about platform-specific Linux syscalls > recently? ;-) > I was, however, the flaws that you complained of had nothing to do with the syscall -- it's all in the syscall wrapper (which is required for clone(), like it or not.) -hpa ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 20:27 ` H. Peter Anvin @ 2002-08-13 20:52 ` David Mosberger 2002-08-13 20:57 ` H. Peter Anvin 0 siblings, 1 reply; 32+ messages in thread From: David Mosberger @ 2002-08-13 20:52 UTC (permalink / raw) To: H. Peter Anvin; +Cc: davidm, linux-kernel >>>>> On Tue, 13 Aug 2002 13:27:02 -0700, "H. Peter Anvin" <hpa@zytor.com> said: >> I was, however, the flaws that you complained of had nothing to >> do with the syscall -- it's all in the syscall wrapper (which is >> required for clone(), like it or not.) The issue is not whether a wrapper is needed or not. My point is that it is cleaner to always describe stack areas as memory areas (e.g., as a base/size pair). Note that this is effectively what's happening in the platform-independent part of the kernel today. --david ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 20:52 ` David Mosberger @ 2002-08-13 20:57 ` H. Peter Anvin 0 siblings, 0 replies; 32+ messages in thread From: H. Peter Anvin @ 2002-08-13 20:57 UTC (permalink / raw) To: davidm; +Cc: linux-kernel David Mosberger wrote: >>>>>>On Tue, 13 Aug 2002 13:27:02 -0700, "H. Peter Anvin" <hpa@zytor.com> said: >>>>> > > >> I was, however, the flaws that you complained of had nothing to > >> do with the syscall -- it's all in the syscall wrapper (which is > >> required for clone(), like it or not.) > > The issue is not whether a wrapper is needed or not. > > My point is that it is cleaner to always describe stack areas as > memory areas (e.g., as a base/size pair). Note that this is > effectively what's happening in the platform-independent part of the > kernel today. > Right, but all the stack handling is a matter of the wrapper. sys_clone doesn't affect the stack at all. -hpa ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 16:11 ` Christoph Hellwig 2002-08-13 16:32 ` David Mosberger @ 2002-08-13 16:56 ` Ruth Ivimey-Cook 2002-08-13 17:42 ` Richard B. Johnson 2 siblings, 0 replies; 32+ messages in thread From: Ruth Ivimey-Cook @ 2002-08-13 16:56 UTC (permalink / raw) To: Christoph Hellwig Cc: Erik Andersen, Ingo Molnar, Linus Torvalds, linux-kernel On Tue, 13 Aug 2002, Christoph Hellwig wrote: >On Tue, Aug 13, 2002 at 10:09:24AM -0600, Erik Andersen wrote: > >And as I said again, it doesn't really clone - it starts something >different, namely the fn argument. How about: sys_launch or sys_run_process Ruth -- Ruth Ivimey-Cook Software engineer and technical writer. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 16:11 ` Christoph Hellwig 2002-08-13 16:32 ` David Mosberger 2002-08-13 16:56 ` Ruth Ivimey-Cook @ 2002-08-13 17:42 ` Richard B. Johnson 2 siblings, 0 replies; 32+ messages in thread From: Richard B. Johnson @ 2002-08-13 17:42 UTC (permalink / raw) To: Christoph Hellwig Cc: Erik Andersen, Ingo Molnar, Linus Torvalds, linux-kernel On Tue, 13 Aug 2002, Christoph Hellwig wrote: > On Tue, Aug 13, 2002 at 10:09:24AM -0600, Erik Andersen wrote: > > > First the name souns horrible. What about spawn_thread or create_thread > > > instead? it's not our good old clone and not a lookalike, it's some > > > pthreadish monster.. > > > > How about "clone2"? > > Already used by ia64 for a hybrid between the good old clone and the new > monster :) > > And as I said again, it doesn't really clone - it starts something > different, namely the fn argument. > fn_startup() Cheers, Dick Johnson Penguin : Linux version 2.4.18 on an i686 machine (797.90 BogoMips). The US military has given us many words, FUBAR, SNAFU, now ENRON. Yes, top management were graduates of West Point and Annapolis. ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 16:09 ` [patch] clone_startup(), 2.5.31-A0 Erik Andersen 2002-08-13 16:11 ` Christoph Hellwig @ 2002-08-13 17:19 ` Ingo Molnar 2002-08-13 20:35 ` Kai Henningsen 1 sibling, 1 reply; 32+ messages in thread From: Ingo Molnar @ 2002-08-13 17:19 UTC (permalink / raw) To: Erik Andersen; +Cc: Christoph Hellwig, Linus Torvalds, linux-kernel On Tue, 13 Aug 2002, Erik Andersen wrote: > > First the name souns horrible. What about spawn_thread or create_thread > > instead? it's not our good old clone and not a lookalike, it's some > > pthreadish monster.. > > How about "clone2"? ni fact it was sys_clone2() first time around, but Ulrich Drepper requested another name for it because in glibc it collided with ia64 where clone2() is something different. So whatever name there is going to be, it should not be sys_clone2(). Ingo ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 17:19 ` Ingo Molnar @ 2002-08-13 20:35 ` Kai Henningsen 2002-08-15 21:24 ` Thunder from the hill 0 siblings, 1 reply; 32+ messages in thread From: Kai Henningsen @ 2002-08-13 20:35 UTC (permalink / raw) To: linux-kernel mingo@elte.hu (Ingo Molnar) wrote on 13.08.02 in <Pine.LNX.4.44.0208131918360.4369-100000@localhost.localdomain>: > On Tue, 13 Aug 2002, Erik Andersen wrote: > > > > First the name souns horrible. What about spawn_thread or create_thread > > > instead? it's not our good old clone and not a lookalike, it's some > > > pthreadish monster.. > > > > How about "clone2"? > > ni fact it was sys_clone2() first time around, but Ulrich Drepper > requested another name for it because in glibc it collided with ia64 where > clone2() is something different. So whatever name there is going to be, it > should not be sys_clone2(). clone_and_start() or clone_and_go() or something along those lines, perhaps? MfG Kai ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 20:35 ` Kai Henningsen @ 2002-08-15 21:24 ` Thunder from the hill 0 siblings, 0 replies; 32+ messages in thread From: Thunder from the hill @ 2002-08-15 21:24 UTC (permalink / raw) To: Kai Henningsen; +Cc: linux-kernel Hi, On 13 Aug 2002, Kai Henningsen wrote: > clone_and_start() or clone_and_go() or something along those lines, > perhaps? clone_n_kick() Thunder -- --./../...-/. -.--/---/..-/.-./..././.-../..-. .---/..-/.../- .- --/../-./..-/-/./--..-- ../.----./.-../.-.. --./../...-/. -.--/---/..- .- -/---/--/---/.-./.-./---/.--/.-.-.- --./.-/-.../.-./.././.-../.-.-.- ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 15:44 ` Christoph Hellwig 2002-08-13 16:01 ` Linus Torvalds 2002-08-13 16:09 ` [patch] clone_startup(), 2.5.31-A0 Erik Andersen @ 2002-08-13 17:27 ` Ingo Molnar 2002-08-14 13:48 ` Rusty Russell 2 siblings, 1 reply; 32+ messages in thread From: Ingo Molnar @ 2002-08-13 17:27 UTC (permalink / raw) To: Christoph Hellwig; +Cc: Linus Torvalds, linux-kernel On Tue, 13 Aug 2002, Christoph Hellwig wrote: > > with the help of this syscall glibc's next-generation pthreads code > > have you discussed this code with IBM's pthread group? I think we don't > want to add a new syscall that is only useful to glibc.. what i did was i thought about the problem in the generic context of threading, so these syscalls help both glibc and NGPT. (i'd have loved to test these concepts with NGPT as well, but i couldnt get it to run the attached simple pthreads application (perf.c - measures a few kinds of threaded workloads), with the latest NGPT version from the NGPT website. So if anyone could send me a working static binary of said application linked against NGPT, that would be great.) btw., with the help of these syscalls and futexes, the layer between Linux and pthreads became very thin - almost nonexistant in the majority of cases. Ingo #define _GNU_SOURCE 1 #include <argp.h> #include <error.h> #include <errno.h> #include <fcntl.h> #include <inttypes.h> #include <limits.h> #include <pthread.h> #include <signal.h> #include <stdbool.h> #include <stdlib.h> #include <string.h> #include <time.h> #include <unistd.h> #include <sys/types.h> #ifndef MAX_THREADS # define MAX_THREADS 100000 #endif #ifndef DEFAULT_THREADS # define DEFAULT_THREADS 50 #endif #define OPT_TO_THREAD 300 #define OPT_TO_PROCESS 301 static const struct argp_option options[] = { { NULL, 0, NULL, 0, "\ This is a test for threads so we allow ther user to selection the number of \ threads which are used at any one time. Independently the total number of \ rounds can be selected. This is the total number of threads which will have \ run when the process terminates:" }, { "threads", 't', "NUMBER", 0, "Number of threads used at once" }, { "starts", 's', "NUMBER", 0, "Total number of working threads" }, { NULL, 0, NULL, 0, "\ Each thread can do one of two things: sleep or do work. The latter is 100% \ CPU bound. The work load is the probability a thread does work. All values \ from zero to 100 (inclusive) are valid. How often each thread repeats this \ can be determined by the number of rounds. The work cost determines how long \ each work session (not sleeping) takes. If it is zero a thread would \ effectively nothing. By setting the number of rounds to zero the thread \ does no work at all and pure thread creation times can be measured." }, { "workload", 'w', "PERCENT", 0, "Percentage of time spent working" }, { "workcost", 'c', "NUMBER", 0, "Factor in the cost of each round of working" }, { "rounds", 'r', "NUMBER", 0, "Number of rounds each thread runs" }, { NULL, 0, NULL, 0, "\ One parameter for each threads execution is the size of the stack. If this \ parameter is not used the system's default stack size is used. If many \ threads are used the stack size should be chosen quite small." }, { "stacksize", 'S', "BYTES", 0, "Size of threads stack" }, { "guardsize", 'g', "BYTES", 0, "Size of stack guard area; must fit into the stack" }, { NULL, 0, NULL, 0, "Signal options:" }, { "to-thread", OPT_TO_THREAD, NULL, 0, "Send signal to main thread" }, { "to-process", OPT_TO_PROCESS, NULL, 0, "Send signal to process (default)" }, { NULL, 0, NULL, 0, "Administrative options:" }, { "progress", 'p', NULL, 0, "Show signs of progress" }, { "timing", 'T', NULL, 0, "Measure time from startup to the last thread finishing" }, { NULL, 0, NULL, 0, NULL } }; /* Prototype for option handler. */ static error_t parse_opt (int key, char *arg, struct argp_state *state); /* Data structure to communicate with argp functions. */ static struct argp argp = { options, parse_opt }; static unsigned long int threads = DEFAULT_THREADS; static unsigned long int workload = 50; static unsigned long int workcost = 20; static unsigned long int rounds = 100; static long int starts = 5000; static unsigned long int stacksize; static long int guardsize = -1; static bool progress; static bool timing; static bool to_thread; static long int running; static pthread_mutex_t running_mutex = PTHREAD_MUTEX_INITIALIZER; static pid_t pid; static pthread_t tmain; static clockid_t cl; static struct timespec start_time; static pthread_mutex_t sum_mutex = PTHREAD_MUTEX_INITIALIZER; unsigned int sum; /* We use 64bit values for the times. */ typedef unsigned long long int hp_timing_t; static void * thread_function (void *arg) { unsigned long int i; unsigned int state = (unsigned long int) arg; for (i = 0; i < rounds; ++i) { /* Determine what to do. */ unsigned int rnum; /* Equal distribution. */ do rnum = rand_r (&state); while (rnum >= UINT_MAX - (UINT_MAX % 100)); rnum %= 100; if (rnum < workload) { int j; int a[4] = { i, rnum, i + rnum, rnum - i }; if (progress) write (STDERR_FILENO, "c", 1); for (j = 0; j < workcost * 10000; ++j) { a[0] += a[3] >> 12; a[1] += a[2] >> 20; a[2] += a[1] ^ 0x3423423; a[3] += a[0] - a[1]; } pthread_mutex_lock (&sum_mutex); sum += a[0] + a[1] + a[2] + a[3]; pthread_mutex_unlock (&sum_mutex); } else { /* Just sleep. */ struct timespec tv; tv.tv_sec = 0; tv.tv_nsec = 40000000; if (progress) write (STDERR_FILENO, "w", 1); nanosleep (&tv, NULL); } } pthread_mutex_lock (&running_mutex); if (--running <= 0 && starts <= 0) { /* We are done. */ if (progress) write (STDERR_FILENO, "\n", 1); if (timing) { struct timespec end_time; if (clock_gettime (cl, &end_time) == 0) { end_time.tv_sec -= start_time.tv_sec; end_time.tv_nsec -= start_time.tv_nsec; if (end_time.tv_nsec < 0) { end_time.tv_nsec += 1000000000; --end_time.tv_sec; } printf ("\nRuntime: %lu.%09lu seconds\n", (unsigned long int) end_time.tv_sec, (unsigned long int) end_time.tv_nsec); } } printf ("Result: %08x\n", sum); exit (0); } pthread_mutex_unlock (&running_mutex); if (to_thread) /* This code sends a signal to the main thread. */ pthread_kill (tmain, SIGUSR1); else /* Use this code to test sending a signal to the process. */ kill (pid, SIGUSR1); if (progress) write (STDERR_FILENO, "f", 1); return NULL; } int main (int argc, char *argv[]) { int remaining; sigset_t ss; pthread_attr_t attr; /* Parse and process arguments. */ argp_parse (&argp, argc, argv, 0, &remaining, NULL); if (timing) { if (clock_getcpuclockid (0, &cl) != 0 || clock_gettime (cl, &start_time) != 0) timing = false; } /* We need this later. */ pid = getpid (); tmain = pthread_self (); /* We use signal SIGUSR1 for communication between the threads and the main thread. We only want sychronous notification. */ sigemptyset (&ss); sigaddset (&ss, SIGUSR1); if (sigprocmask (SIG_BLOCK, &ss, NULL) != 0) error (EXIT_FAILURE, errno, "cannot set signal mask"); /* Create the thread attribute. */ pthread_attr_init (&attr); /* If the user provided a stack size use it. */ if (stacksize != 0 && pthread_attr_setstacksize (&attr, stacksize) != 0) puts ("could not set stack size; will use default"); /* And stack guard size. */ if (guardsize != -1 && pthread_attr_setguardsize (&attr, guardsize) != 0) puts ("invalid stack guard size; will use default"); /* All threads are created detached. */ pthread_attr_setdetachstate (&attr, PTHREAD_CREATE_DETACHED); while (1) { pthread_t th; int err; bool cont = true; bool do_wait = false;; pthread_mutex_lock (&running_mutex); if (starts-- < 0) cont = false; else do_wait = ++running >= threads && starts > 0; pthread_mutex_unlock (&running_mutex); if (! cont) break; if (progress) write (STDERR_FILENO, "t", 1); err = pthread_create (&th, &attr, thread_function, (void *) starts); if (err != 0) error (EXIT_FAILURE, err, "cannot start thread %lu", starts); if (do_wait) sigwaitinfo (&ss, NULL); } /* Do nothing anymore. On of the threads will terminate the program. */ sigfillset (&ss); sigdelset (&ss, SIGINT); while (1) sigsuspend (&ss); /* NOTREACHED */ return 0; } /* Handle program arguments. */ static error_t parse_opt (int key, char *arg, struct argp_state *state) { unsigned long int num; long int snum; switch (key) { case 't': num = strtoul (arg, NULL, 0); if (num < MAX_THREADS) threads = num; else printf ("\ number of threads limited to %u; recompile with a higher limit if necessary", MAX_THREADS); break; case 'w': num = strtoul (arg, NULL, 0); if (num <= 100) workload = num; else puts ("workload must be between 0 and 100 percent"); break; case 'c': workcost = strtoul (arg, NULL, 0); break; case 'r': rounds = strtoul (arg, NULL, 0); break; case 's': starts = strtoul (arg, NULL, 0); break; case 'S': num = strtoul (arg, NULL, 0); if (num >= PTHREAD_STACK_MIN) stacksize = num; else printf ("minimum stack size is %d\n", PTHREAD_STACK_MIN); break; case 'g': snum = strtol (arg, NULL, 0); if (snum < 0) printf ("invalid guard size %s\n", arg); else guardsize = snum; break; case 'p': progress = true; break; case 'T': timing = true; break; case OPT_TO_THREAD: to_thread = true; break; case OPT_TO_PROCESS: to_thread = false; break; default: return ARGP_ERR_UNKNOWN; } return 0; } static hp_timing_t get_clockfreq (void) { /* We read the information from the /proc filesystem. It contains at least one line like cpu MHz : 497.840237 or also cpu MHz : 497.841 We search for this line and convert the number in an integer. */ static hp_timing_t result; int fd; /* If this function was called before, we know the result. */ if (result != 0) return result; fd = open ("/proc/cpuinfo", O_RDONLY); if (__builtin_expect (fd != -1, 1)) { /* XXX AFAIK the /proc filesystem can generate "files" only up to a size of 4096 bytes. */ char buf[4096]; ssize_t n; n = read (fd, buf, sizeof buf); if (__builtin_expect (n, 1) > 0) { char *mhz = memmem (buf, n, "cpu MHz", 7); if (__builtin_expect (mhz != NULL, 1)) { char *endp = buf + n; int seen_decpoint = 0; int ndigits = 0; /* Search for the beginning of the string. */ while (mhz < endp && (*mhz < '0' || *mhz > '9') && *mhz != '\n') ++mhz; while (mhz < endp && *mhz != '\n') { if (*mhz >= '0' && *mhz <= '9') { result *= 10; result += *mhz - '0'; if (seen_decpoint) ++ndigits; } else if (*mhz == '.') seen_decpoint = 1; ++mhz; } /* Compensate for missing digits at the end. */ while (ndigits++ < 6) result *= 10; } } close (fd); } return result; } int clock_getcpuclockid (pid_t pid, clockid_t *clock_id) { /* We don't allow any process ID but our own. */ if (pid != 0 && pid != getpid ()) return EPERM; #ifdef CLOCK_PROCESS_CPUTIME_ID /* Store the number. */ *clock_id = CLOCK_PROCESS_CPUTIME_ID; return 0; #else /* We don't have a timer for that. */ return ENOENT; #endif } #define HP_TIMING_NOW(Var) __asm__ __volatile__ ("rdtsc" : "=A" (Var)) /* Get current value of CLOCK and store it in TP. */ int clock_gettime (clockid_t clock_id, struct timespec *tp) { int retval = -1; switch (clock_id) { case CLOCK_PROCESS_CPUTIME_ID: { static hp_timing_t freq; hp_timing_t tsc; /* Get the current counter. */ HP_TIMING_NOW (tsc); if (freq == 0) { freq = get_clockfreq (); if (freq == 0) return EINVAL; } /* Compute the seconds. */ tp->tv_sec = tsc / freq; /* And the nanoseconds. This computation should be stable until we get machines with about 16GHz frequency. */ tp->tv_nsec = ((tsc % freq) * UINT64_C (1000000000)) / freq; retval = 0; } break; default: errno = EINVAL; break; } return retval; } ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-13 17:27 ` Ingo Molnar @ 2002-08-14 13:48 ` Rusty Russell 2002-08-14 18:47 ` Jamie Lokier 0 siblings, 1 reply; 32+ messages in thread From: Rusty Russell @ 2002-08-14 13:48 UTC (permalink / raw) To: Ingo Molnar; +Cc: hch, torvalds, linux-kernel On Tue, 13 Aug 2002 19:27:35 +0200 (CEST) Ingo Molnar <mingo@elte.hu> wrote: > btw., with the help of these syscalls and futexes, the layer between Linux > and pthreads became very thin - almost nonexistant in the majority of > cases. Woohoo! Please send futex patch BTW, I wanna see 8) Rusty. -- there are those who do and those who hang on and you don't see too many doers quoting their contemporaries. -- Larry McVoy ^ permalink raw reply [flat|nested] 32+ messages in thread
* Re: [patch] clone_startup(), 2.5.31-A0 2002-08-14 13:48 ` Rusty Russell @ 2002-08-14 18:47 ` Jamie Lokier 0 siblings, 0 replies; 32+ messages in thread From: Jamie Lokier @ 2002-08-14 18:47 UTC (permalink / raw) To: Rusty Russell; +Cc: Ingo Molnar, hch, torvalds, linux-kernel Rusty Russell wrote: > > btw., with the help of these syscalls and futexes, the layer between Linux > > and pthreads became very thin - almost nonexistant in the majority of > > cases. > > Woohoo! Yes, futexes are wonderful, if quite difficult to figure out at first. (I still haven't understood why the futex example reader-writer locks are they way are.) Even things like waitpid() are potentially faster with futexes -- no pid to lookup! -- Jamie ^ permalink raw reply [flat|nested] 32+ messages in thread
end of thread, other threads:[~2002-08-23 19:48 UTC | newest] Thread overview: 32+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2002-08-13 15:09 [patch] clone_startup(), 2.5.31-A0 Ingo Molnar 2002-08-13 15:30 ` Linus Torvalds 2002-08-13 15:44 ` Christoph Hellwig 2002-08-13 16:01 ` Linus Torvalds 2002-08-13 16:05 ` Christoph Hellwig 2002-08-13 16:20 ` Linus Torvalds 2002-08-13 18:32 ` [patch] CLONE_SETTLS, CLONE_SETTID, 2.5.31-BK Ingo Molnar 2002-08-13 18:41 ` Linus Torvalds 2002-08-13 19:09 ` Ingo Molnar 2002-08-13 19:41 ` Linus Torvalds 2002-08-13 19:41 ` Ingo Molnar 2002-08-13 19:35 ` Ingo Molnar 2002-08-15 22:38 ` [patch] complain about unknown CLONE_* flags Jamie Lokier 2002-08-16 10:17 ` Ingo Molnar 2001-11-02 5:55 ` Pavel Machek 2002-08-13 16:09 ` [patch] clone_startup(), 2.5.31-A0 Erik Andersen 2002-08-13 16:11 ` Christoph Hellwig 2002-08-13 16:32 ` David Mosberger 2002-08-13 16:44 ` John Levon 2002-08-13 19:52 ` H. Peter Anvin 2002-08-13 20:19 ` David Mosberger 2002-08-13 20:27 ` H. Peter Anvin 2002-08-13 20:52 ` David Mosberger 2002-08-13 20:57 ` H. Peter Anvin 2002-08-13 16:56 ` Ruth Ivimey-Cook 2002-08-13 17:42 ` Richard B. Johnson 2002-08-13 17:19 ` Ingo Molnar 2002-08-13 20:35 ` Kai Henningsen 2002-08-15 21:24 ` Thunder from the hill 2002-08-13 17:27 ` Ingo Molnar 2002-08-14 13:48 ` Rusty Russell 2002-08-14 18:47 ` Jamie Lokier
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.