public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* clone() <-> getpid() bug in 2.6?
@ 2004-06-05 15:28 Russell Leighton
  2004-06-05 20:45 ` Linus Torvalds
  0 siblings, 1 reply; 54+ messages in thread
From: Russell Leighton @ 2004-06-05 15:28 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 716 bytes --]


I have a test program (see attached) that shows what looks like a bug in 
2.6.5-1.358 (FedoraCore2)...and breaks my program :(

In summary, I am doing:

 clone(run_thread, stack + sizeof(stack) -1,
            CLONE_FS|CLONE_FILES|CLONE_VM|SIGCHLD, NULL))

According to the man page the child process should have its own pid as 
returned by getpid()...much like fork().

In 2.6 the child receives the parent's pid from getpid(), while 2.4 
works as documented:

In 2.4 the test program does:
 parent pid: 26647
 clone returned pid: 26648
 thread reported pid: 26648

In 2.6 the test program does:
 parent pid: 16665
 thread reported pid: 16665
 clone returned pid: 16666

Is this fixed in later kernels?

Thx

Russ

[-- Attachment #2: clone-pid-test.c --]
[-- Type: text/plain, Size: 733 bytes --]


#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sched.h>
#include <signal.h>
#include <sys/wait.h>

/* to compile:
  gcc -Wall clone-pid-test.c -o clone-pid-test
 */

static int run_thread(void *arg)
{
  printf("thread reported pid: %d\n", getpid()); 

  return 0;
}

static char stack[4096];

static int create_thread()
{
  int
    pid;

  /* create thread */
  if ( (pid = clone(run_thread, stack + sizeof(stack) -1,
		    CLONE_FS|CLONE_FILES|CLONE_VM|SIGCHLD, NULL)) == -1 ) {
    perror("clone:");
    exit(-1);
  }/* end if */

  return pid ;
}

int main(int argc, char *argv[])
{
  printf("parent pid: %d\n", getpid());
  printf("clone returned pid: %d\n",  create_thread());
  wait(NULL);
  return 0;
}

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 15:28 clone() <-> getpid() bug in 2.6? Russell Leighton
@ 2004-06-05 20:45 ` Linus Torvalds
  2004-06-05 20:55   ` Arjan van de Ven
  2004-06-06 14:29   ` Russell Leighton
  0 siblings, 2 replies; 54+ messages in thread
From: Linus Torvalds @ 2004-06-05 20:45 UTC (permalink / raw)
  To: Russell Leighton, Arjan van de Ven; +Cc: Kernel Mailing List



On Sat, 5 Jun 2004, Russell Leighton wrote:
> 
> I have a test program (see attached) that shows what looks like a bug in 
> 2.6.5-1.358 (FedoraCore2)...and breaks my program :(
> 
> In summary, I am doing:
> 
>  clone(run_thread, stack + sizeof(stack) -1,
>             CLONE_FS|CLONE_FILES|CLONE_VM|SIGCHLD, NULL))
> 
> According to the man page the child process should have its own pid as 
> returned by getpid()...much like fork().
> 
> In 2.6 the child receives the parent's pid from getpid(), while 2.4 
> works as documented:
> 
> In 2.4 the test program does:
>  parent pid: 26647
>  clone returned pid: 26648
>  thread reported pid: 26648
> 
> In 2.6 the test program does:
>  parent pid: 16665
>  thread reported pid: 16665
>  clone returned pid: 16666

Hmm.. The above is the correct behaviour if you use CLONE_THREAD 
("getpid()" will then return the _thread_ ID), but it shouldn't happen 
without that. And clearly you don't have it set.

And indeed, it doesn't happen for me on my system:

	parent pid: 13552
	thread reported pid: 13553
	clone returned pid: 13553

so I wonder if either the Fedora libc always adds that CLONE_THREAD thing
to the clone() calls, or whether the FC2 kernel is buggy.

Arjan?

		Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 20:45 ` Linus Torvalds
@ 2004-06-05 20:55   ` Arjan van de Ven
  2004-06-05 21:13     ` Linus Torvalds
  2004-06-05 21:53     ` Chris Wedgwood
  2004-06-06 14:29   ` Russell Leighton
  1 sibling, 2 replies; 54+ messages in thread
From: Arjan van de Ven @ 2004-06-05 20:55 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Russell Leighton, Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 358 bytes --]

On Sat, Jun 05, 2004 at 01:45:33PM -0700, Linus Torvalds wrote:
> so I wonder if either the Fedora libc always adds that CLONE_THREAD thing
> to the clone() calls, or whether the FC2 kernel is buggy.

... or glibc internally caches the getpid() result and doesn't notice the
app calls clone() internally... strace seems to show 1 getpid() call total
not 2.


[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 20:55   ` Arjan van de Ven
@ 2004-06-05 21:13     ` Linus Torvalds
  2004-06-05 21:48       ` Robert Love
  2004-06-05 21:53     ` Chris Wedgwood
  1 sibling, 1 reply; 54+ messages in thread
From: Linus Torvalds @ 2004-06-05 21:13 UTC (permalink / raw)
  To: Arjan van de Ven, Ulrich Drepper; +Cc: Russell Leighton, Kernel Mailing List



On Sat, 5 Jun 2004, Arjan van de Ven wrote:
> 
> ... or glibc internally caches the getpid() result and doesn't notice the
> app calls clone() internally... strace seems to show 1 getpid() call total
> not 2.

Why the hell does glibc do that totally useless optimization?

It's a classic "optimize for benchmarks" thing - evil. I don't see any
real app caring, and clearly apps _do_ fail when you get it wrong, as
shown by Russell.

And it's really easy to get it wrong. You can't just invalidate the cache
when you see a "clone()" - you have to make sure that you never ever cache
it ever after either.

Uli, if Arjan is right, then please fix this. It's a buggy and pointless 
optimization. Anybody who optimizes purely for benchmarks should be 
ashamed of themselves.

		Linus



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 21:13     ` Linus Torvalds
@ 2004-06-05 21:48       ` Robert Love
  2004-06-05 22:44         ` Linus Torvalds
  0 siblings, 1 reply; 54+ messages in thread
From: Robert Love @ 2004-06-05 21:48 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Arjan van de Ven, Ulrich Drepper, Russell Leighton,
	Kernel Mailing List

On Sat, 2004-06-05 at 14:13 -0700, Linus Torvalds wrote:

> Uli, if Arjan is right, then please fix this. It's a buggy and pointless 
> optimization. Anybody who optimizes purely for benchmarks should be 
> ashamed of themselves.

Eh, it definitely does, in nptl/sysdeps/unix/sysv/linux/getpid.c:

	pid_t result = THREAD_GETMEM (THREAD_SELF, pid);
	if (__builtin_expect (result <= 0, 0))
	  result = really_getpid (result);

A few places, including the fork code, fix it:

	/* Adjust the PID field for the new process.  */
	THREAD_SETMEM (self, pid, THREAD_GETMEM (self, tid));

But not direct calls to clone(2).

	Robert Love



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 20:55   ` Arjan van de Ven
  2004-06-05 21:13     ` Linus Torvalds
@ 2004-06-05 21:53     ` Chris Wedgwood
  2004-06-05 22:47       ` Robert Love
  1 sibling, 1 reply; 54+ messages in thread
From: Chris Wedgwood @ 2004-06-05 21:53 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Linus Torvalds, Russell Leighton, Kernel Mailing List

On Sat, Jun 05, 2004 at 10:55:47PM +0200, Arjan van de Ven wrote:

> ... or glibc internally caches the getpid() result and doesn't
> notice the app calls clone() internally... strace seems to show 1
> getpid() call total not 2.

glibc caches getpid() ?!?

it's not like it's a slow syscall or used often


  --cw


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 21:48       ` Robert Love
@ 2004-06-05 22:44         ` Linus Torvalds
  0 siblings, 0 replies; 54+ messages in thread
From: Linus Torvalds @ 2004-06-05 22:44 UTC (permalink / raw)
  To: Robert Love
  Cc: Arjan van de Ven, Ulrich Drepper, Russell Leighton,
	Kernel Mailing List



On Sat, 5 Jun 2004, Robert Love wrote:
> 
> Eh, it definitely does, in nptl/sysdeps/unix/sysv/linux/getpid.c:

What a piece of shit.

> A few places, including the fork code, fix it:
> 
> 	/* Adjust the PID field for the new process.  */
> 	THREAD_SETMEM (self, pid, THREAD_GETMEM (self, tid));
> 
> But not direct calls to clone(2).

As mentioned, you can't even "adjust" it in a clone(). It's just _wrong_
to cache _ever_ after a clone(), whether it was cached before or not. You 
can't just re-set the pid information like in a fork().

The whole notion of caching "pid" is sick and idiotic. It has no redeeming 
values.

		Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 21:53     ` Chris Wedgwood
@ 2004-06-05 22:47       ` Robert Love
  2004-06-05 22:57         ` David S. Miller
                           ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Robert Love @ 2004-06-05 22:47 UTC (permalink / raw)
  To: Chris Wedgwood
  Cc: Arjan van de Ven, Linus Torvalds, Russell Leighton,
	Kernel Mailing List

On Sat, 2004-06-05 at 14:53 -0700, Chris Wedgwood wrote:

> glibc caches getpid() ?!?
> 
> it's not like it's a slow syscall or used often

It is almost certainly done to improve the speed of some stupid
microbenchmark - say, one that just calls getpid() repeatedly (simple
because it is NOT slow) to measure system call overhead.

Or maybe libc uses the PID a lot internally.  I don't know.

But it sure seems wrong.

	Robert Love



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 22:47       ` Robert Love
@ 2004-06-05 22:57         ` David S. Miller
  2004-06-05 23:01         ` Linus Torvalds
  2004-06-05 23:07         ` Davide Libenzi
  2 siblings, 0 replies; 54+ messages in thread
From: David S. Miller @ 2004-06-05 22:57 UTC (permalink / raw)
  To: Robert Love; +Cc: cw, arjanv, torvalds, russ, linux-kernel

On Sat, 05 Jun 2004 18:47:43 -0400
Robert Love <rml@ximian.com> wrote:

> It is almost certainly done to improve the speed of some stupid
> microbenchmark - say, one that just calls getpid() repeatedly (simple
> because it is NOT slow) to measure system call overhead.

Luckily lmbench and friends use getppid() specifically because it
is impossible to cache that (as far as I can tell).

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 22:47       ` Robert Love
  2004-06-05 22:57         ` David S. Miller
@ 2004-06-05 23:01         ` Linus Torvalds
  2004-06-05 23:07         ` Davide Libenzi
  2 siblings, 0 replies; 54+ messages in thread
From: Linus Torvalds @ 2004-06-05 23:01 UTC (permalink / raw)
  To: Robert Love
  Cc: Chris Wedgwood, Arjan van de Ven, Russell Leighton,
	Kernel Mailing List



On Sat, 5 Jun 2004, Robert Love wrote:
> 
> It is almost certainly done to improve the speed of some stupid
> microbenchmark - say, one that just calls getpid() repeatedly (simple
> because it is NOT slow) to measure system call overhead.

getpid() is only used in bad benchmarks these days, exactly because 
caching of "pid" has been done for a long time. Every single microkernel 
OS ever built did it, I think.

So yes, you'll find some broken benchmarks using getpid(), but nobody sane 
would take those benchmarks seriously anyway. If you want _real_ system 
call performance measurements, use something like lmbench, which does:

 - getppid() - same cost as getpid(), except you can't just cache the
   results on any POSIX OS (well, you could, but it's more complex:
   you end up having to map the uarea-equvalent into user space or have 
   some notifier for the parent dying).
 - read/write: a hell of a lot more important than the simple system calls 
   anyway.
 - open/close/stat/fstat: very common system calls with interesting 
   performance issues.

That gives you some _real_ data.

> Or maybe libc uses the PID a lot internally.  I don't know.

I think some threading libraries historically used hashes of the
thread-specific pid to look up the thread-specific data. But since glibc
shows the same pid for all regular threads, and gets the pid wrong for
other threads, that's clearly _not_ a valid reason for caching it either.

		Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 22:47       ` Robert Love
  2004-06-05 22:57         ` David S. Miller
  2004-06-05 23:01         ` Linus Torvalds
@ 2004-06-05 23:07         ` Davide Libenzi
  2004-06-05 23:18           ` Linus Torvalds
  2004-06-05 23:19           ` Robert Love
  2 siblings, 2 replies; 54+ messages in thread
From: Davide Libenzi @ 2004-06-05 23:07 UTC (permalink / raw)
  To: Robert Love
  Cc: Chris Wedgwood, Arjan van de Ven, Linus Torvalds,
	Russell Leighton, Kernel Mailing List

On Sat, 5 Jun 2004, Robert Love wrote:

> On Sat, 2004-06-05 at 14:53 -0700, Chris Wedgwood wrote:
> 
> > glibc caches getpid() ?!?
> > 
> > it's not like it's a slow syscall or used often
> 
> It is almost certainly done to improve the speed of some stupid
> microbenchmark - say, one that just calls getpid() repeatedly (simple
> because it is NOT slow) to measure system call overhead.
> 
> Or maybe libc uses the PID a lot internally.  I don't know.
> 
> But it sure seems wrong.

It is likely used by pthread_self(), that is pretty much performance 
sensitive. I'd agree with Ulrich here.



- Davide


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 23:07         ` Davide Libenzi
@ 2004-06-05 23:18           ` Linus Torvalds
  2004-06-05 23:26             ` Davide Libenzi
  2004-06-06  5:08             ` Kalin KOZHUHAROV
  2004-06-05 23:19           ` Robert Love
  1 sibling, 2 replies; 54+ messages in thread
From: Linus Torvalds @ 2004-06-05 23:18 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Robert Love, Chris Wedgwood, Arjan van de Ven, Russell Leighton,
	Kernel Mailing List



On Sat, 5 Jun 2004, Davide Libenzi wrote:
> 
> It is likely used by pthread_self(), that is pretty much performance 
> sensitive. I'd agree with Ulrich here.

It _can't_ be used for pthread_self(), since the pid is the _same_ across 
all threads in a pthread environment.

See?

I re-iterate:

 - getpid() was used in some historical threading packages (maybe even 
   LinuxThreads) to get the thread ID thing.

   But this particular usage requires the old Linux behaviour of returning 
   separate threads ID's for separate threads. CLONE_THREAD does not do 
   that, and in particular the glibc caching _breaks_ any such attempt 
   even when you don't use CLONE_THREAD.

   pthreads() in modern libc sets CLONE_THREAD and then uses the 
   per-thread segment thing to identify threads. No getpid() anywhere.

   Ergo: getpid() caching may result in faster execution, but in this case 
   it's actively _wrong_, and breaks the app.  It's easy to make things go 
   fast if you don't actually care about whether the end result is correct
   or not.

 - getpid() is used in some silly benchmarks.

   But in this case caching getpid() just lies to the benchmark, and is
   pointless.

   Ergo: getpid() caching results in higher scores, but the scores are now 
   meaningless, since they have nothing to do with system call speed any
   more.

So in both of these cases, the caching literally results in WRONG
behaviour.

Can somebody actually find an application that doesn't fall into either of
the above two categories, and calls getpid() more than a handful of times?

		Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 23:07         ` Davide Libenzi
  2004-06-05 23:18           ` Linus Torvalds
@ 2004-06-05 23:19           ` Robert Love
  1 sibling, 0 replies; 54+ messages in thread
From: Robert Love @ 2004-06-05 23:19 UTC (permalink / raw)
  To: Davide Libenzi
  Cc: Chris Wedgwood, Arjan van de Ven, Linus Torvalds,
	Russell Leighton, Kernel Mailing List

On Sat, 2004-06-05 at 16:07 -0700, Davide Libenzi wrote:

> It is likely used by pthread_self(), that is pretty much performance 
> sensitive. I'd agree with Ulrich here.

I think it would want gettid(), not getpid(), for that.  CLONE_THREAD
behavior has getpid() return the same PID for all threads..

	Robert Love



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 23:18           ` Linus Torvalds
@ 2004-06-05 23:26             ` Davide Libenzi
  2004-06-06  5:08             ` Kalin KOZHUHAROV
  1 sibling, 0 replies; 54+ messages in thread
From: Davide Libenzi @ 2004-06-05 23:26 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Robert Love, Chris Wedgwood, Arjan van de Ven, Russell Leighton,
	Kernel Mailing List

On Sat, 5 Jun 2004, Linus Torvalds wrote:

> 
> 
> On Sat, 5 Jun 2004, Davide Libenzi wrote:
> > 
> > It is likely used by pthread_self(), that is pretty much performance 
> > sensitive. I'd agree with Ulrich here.
> 
> It _can't_ be used for pthread_self(), since the pid is the _same_ across 
> all threads in a pthread environment.

Yeah, I just checked. It is not for that. And like Robert was saying, it'd 
have been a gettid().



- Davide


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 23:18           ` Linus Torvalds
  2004-06-05 23:26             ` Davide Libenzi
@ 2004-06-06  5:08             ` Kalin KOZHUHAROV
  2004-06-06  5:13               ` Chris Wedgwood
  2004-06-06  6:07               ` Linus Torvalds
  1 sibling, 2 replies; 54+ messages in thread
From: Kalin KOZHUHAROV @ 2004-06-06  5:08 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Davide Libenzi, Robert Love, Chris Wedgwood, Arjan van de Ven,
	Russell Leighton, Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 5604 bytes --]

Linus Torvalds wrote:
> 
> On Sat, 5 Jun 2004, Davide Libenzi wrote:
> 
>> It is likely used by pthread_self(), that is pretty much
>> performance sensitive. I'd agree with Ulrich here.
> 
> 
> It _can't_ be used for pthread_self(), since the pid is the _same_
> across all threads in a pthread environment.
> 
> See?
> 
> I re-iterate:
> 
> - getpid() was used in some historical threading packages (maybe even
>  LinuxThreads) to get the thread ID thing.
> 
> But this particular usage requires the old Linux behaviour of
> returning separate threads ID's for separate threads. CLONE_THREAD
> does not do that, and in particular the glibc caching _breaks_ any
> such attempt even when you don't use CLONE_THREAD.
> 
> pthreads() in modern libc sets CLONE_THREAD and then uses the 
> per-thread segment thing to identify threads. No getpid() anywhere.
> 
> Ergo: getpid() caching may result in faster execution, but in this
> case it's actively _wrong_, and breaks the app.  It's easy to make
> things go fast if you don't actually care about whether the end
> result is correct or not.
> 
> - getpid() is used in some silly benchmarks.
> 
> But in this case caching getpid() just lies to the benchmark, and is 
> pointless.
> 
> Ergo: getpid() caching results in higher scores, but the scores are
> now meaningless, since they have nothing to do with system call speed
> any more.
> 
> So in both of these cases, the caching literally results in WRONG 
> behaviour.
> 
> Can somebody actually find an application that doesn't fall into
> either of the above two categories, and calls getpid() more than a
> handful of times?
> 
Well, not exactly sure about my reply, but let me try.

The other day I was debugging some config problems with my qmail instalation and I ended up doing:
# strace -p 4563 -f -F
...
[pid 13097] read(3, "\347\374\375TBH~\342\233\337\220\302l\220\317\237\37\25"..., 32) = 32
[pid 13097] close(3)                    = 0
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getuid32()                  = 89
[pid 13097] getpid()                    = 13097
[pid 13097] time(NULL)                  = 1086497450
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] time(NULL)                  = 1086497450
[pid 13097] getpid()                    = 13097
[pid 13097] getpid()                    = 13097
[pid 13097] write(0, "\26\3\1\0J\2\0\0F\3\1@\302\242\2527j\266m4{\323>6\232\243"..., 1127) = 1127
[pid 13097] read(0, "\26\3\1\1\6", 5)   = 5
...
where 4563 is a tcpserver process and 13097 is its child. The process was started as:
tcpserver -s -n ssmtp.crt -p -v -R -x /etc/tcprules.d/tcp.smtp.cdb -c 40 -u 89 -g 89 0.0.0.0 465 /var/qmail/bin/qmail-smtpd mail.tar.bz /var/vpopmail/bin/vchkpw /bin/true

No idea why are all those getpid() call there in the child... Looked at the code, but couldn't get it. Attaching tcpserver.c for reference.

Again, sorry if unrelated.

Kalin.

-- 
||///_ o  *****************************
||//'_/>     WWW: http://ThinRope.net/
|||\/<" 
|||\\ ' 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

[-- Attachment #2: tcpserver.c --]
[-- Type: text/plain, Size: 15039 bytes --]

#include <sys/types.h>
#include <sys/param.h>
#include <netdb.h>
#include <openssl/ssl.h>
#include "uint16.h"
#include "str.h"
#include "byte.h"
#include "fmt.h"
#include "scan.h"
#include "ip4.h"
#include "fd.h"
#include "exit.h"
#include "env.h"
#include "prot.h"
#include "open.h"
#include "wait.h"
#include "readwrite.h"
#include "stralloc.h"
#include "alloc.h"
#include "buffer.h"
#include "error.h"
#include "strerr.h"
#include "sgetopt.h"
#include "pathexec.h"
#include "socket.h"
#include "ndelay.h"
#include "remoteinfo.h"
#include "rules.h"
#include "sig.h"
#include "dns.h"

int verbosity = 1;
int flagkillopts = 1;
int flagdelay = 1;
char *banner = "";
int flagremoteinfo = 1;
int flagremotehost = 1;
int flagparanoid = 0;
unsigned long timeout = 26;
#ifdef WITH_SSL
int flagssl = 0;
struct stralloc certfile = {0};
#define CERTFILE "./cert.pem"

void translate(SSL*, int, int, unsigned int);
#endif

static stralloc tcpremoteinfo;

uint16 localport;
char localportstr[FMT_ULONG];
char localip[4];
char localipstr[IP4_FMT];
static stralloc localhostsa;
char *localhost = 0;

uint16 remoteport;
char remoteportstr[FMT_ULONG];
char remoteip[4];
char remoteipstr[IP4_FMT];
static stralloc remotehostsa;
char *remotehost = 0;

char strnum[FMT_ULONG];
char strnum2[FMT_ULONG];

static stralloc tmp;
static stralloc fqdn;
static stralloc addresses;

char bspace[16];
buffer b;



/* ---------------------------- child */

#define DROP "tcpserver: warning: dropping connection, "

int flagdeny = 0;
int flagallownorules = 0;
char *fnrules = 0;

void drop_nomem(void)
{
  strerr_die2sys(111,DROP,"out of memory");
}
void cats(char *s)
{
  if (!stralloc_cats(&tmp,s)) drop_nomem();
}
void append(char *ch)
{
  if (!stralloc_append(&tmp,ch)) drop_nomem();
}
void safecats(char *s)
{
  char ch;
  int i;

  for (i = 0;i < 100;++i) {
    ch = s[i];
    if (!ch) return;
    if (ch < 33) ch = '?';
    if (ch > 126) ch = '?';
    if (ch == '%') ch = '?'; /* logger stupidity */
    if (ch == ':') ch = '?';
    append(&ch);
  }
  cats("...");
}
void env(char *s,char *t)
{
  if (!pathexec_env(s,t)) drop_nomem();
}
void drop_rules(void)
{
  strerr_die4sys(111,DROP,"unable to read ",fnrules,": ");
}

void found(char *data,unsigned int datalen)
{
  unsigned int next0;
  unsigned int split;

  while ((next0 = byte_chr(data,datalen,0)) < datalen) {
    switch(data[0]) {
      case 'D':
	flagdeny = 1;
	break;
      case '+':
	split = str_chr(data + 1,'=');
	if (data[1 + split] == '=') {
	  data[1 + split] = 0;
	  env(data + 1,data + 1 + split + 1);
	}
	break;
    }
    ++next0;
    data += next0; datalen -= next0;
  }
}

void doit(int t)
{
  int j;

  remoteipstr[ip4_fmt(remoteipstr,remoteip)] = 0;

  if (verbosity >= 2) {
    strnum[fmt_ulong(strnum,getpid())] = 0;
    strerr_warn4("tcpserver: pid ",strnum," from ",remoteipstr,0);
  }

  if (flagkillopts)
    socket_ipoptionskill(t);
  if (!flagdelay)
    socket_tcpnodelay(t);

  if (*banner) {
    buffer_init(&b,write,t,bspace,sizeof bspace);
    if (buffer_putsflush(&b,banner) == -1)
      strerr_die2sys(111,DROP,"unable to print banner: ");
  }

  if (socket_local4(t,localip,&localport) == -1)
    strerr_die2sys(111,DROP,"unable to get local address: ");

  localipstr[ip4_fmt(localipstr,localip)] = 0;
  remoteportstr[fmt_ulong(remoteportstr,remoteport)] = 0;

  if (!localhost)
    if (dns_name4(&localhostsa,localip) == 0)
      if (localhostsa.len) {
	if (!stralloc_0(&localhostsa)) drop_nomem();
	localhost = localhostsa.s;
      }
  env("PROTO","TCP");
  env("TCPLOCALIP",localipstr);
  env("TCPLOCALPORT",localportstr);
  env("TCPLOCALHOST",localhost);

  if (flagremotehost)
    if (dns_name4(&remotehostsa,remoteip) == 0)
      if (remotehostsa.len) {
	if (flagparanoid)
	  if (dns_ip4(&tmp,&remotehostsa) == 0)
	    for (j = 0;j + 4 <= tmp.len;j += 4)
	      if (byte_equal(remoteip,4,tmp.s + j)) {
		flagparanoid = 0;
		break;
	      }
	if (!flagparanoid) {
	  if (!stralloc_0(&remotehostsa)) drop_nomem();
	  remotehost = remotehostsa.s;
	}
      }
  env("TCPREMOTEIP",remoteipstr);
  env("TCPREMOTEPORT",remoteportstr);
  env("TCPREMOTEHOST",remotehost);

  if (flagremoteinfo) {
    if (remoteinfo(&tcpremoteinfo,remoteip,remoteport,localip,localport,timeout) == -1)
      flagremoteinfo = 0;
    if (!stralloc_0(&tcpremoteinfo)) drop_nomem();
  }
  env("TCPREMOTEINFO",flagremoteinfo ? tcpremoteinfo.s : 0);

  if (fnrules) {
    int fdrules;
    fdrules = open_read(fnrules);
    if (fdrules == -1) {
      if (errno != error_noent) drop_rules();
      if (!flagallownorules) drop_rules();
    }
    else {
      if (rules(found,fdrules,remoteipstr,remotehost,flagremoteinfo ? tcpremoteinfo.s : 0) == -1) drop_rules();
      close(fdrules);
    }
  }

  if (verbosity >= 2) {
    strnum[fmt_ulong(strnum,getpid())] = 0;
    if (!stralloc_copys(&tmp,"tcpserver: ")) drop_nomem();
    safecats(flagdeny ? "deny" : "ok");
    cats(" "); safecats(strnum);
    cats(" "); if (localhost) safecats(localhost);
    cats(":"); safecats(localipstr);
    cats(":"); safecats(localportstr);
    cats(" "); if (remotehost) safecats(remotehost);
    cats(":"); safecats(remoteipstr);
    cats(":"); if (flagremoteinfo) safecats(tcpremoteinfo.s);
    cats(":"); safecats(remoteportstr);
    cats("\n");
    buffer_putflush(buffer_2,tmp.s,tmp.len);
  }

  if (flagdeny) _exit(100);
}



/* ---------------------------- parent */

#define FATAL "tcpserver: fatal: "

void usage(void)
{
#ifndef WITH_SSL
  strerr_warn1("\
tcpserver: usage: tcpserver \
[ -1UXpPhHrRoOdDqQv ] \
[ -c limit ] \
[ -x rules.cdb ] \
[ -B banner ] \
[ -g gid ] \
[ -u uid ] \
[ -b backlog ] \
[ -l localname ] \
[ -t timeout ] \
host port program",0);
#else
  strerr_warn1("\
tcpserver: usage: tcpserver \
[ -1UXpPhHrRoOdDqQsSv ] \
[ -c limit ] \
[ -x rules.cdb ] \
[ -B banner ] \
[ -g gid ] \
[ -u uid ] \
[ -b backlog ] \
[ -l localname ] \
[ -t timeout ] \
[ -n certfile ] \
host port program",0);
#endif
  _exit(100);
}

unsigned long limit = 40;
unsigned long numchildren = 0;

int flag1 = 0;
unsigned long backlog = 20;
unsigned long uid = 0;
unsigned long gid = 0;

void printstatus(void)
{
  if (verbosity < 2) return;
  strnum[fmt_ulong(strnum,numchildren)] = 0;
  strnum2[fmt_ulong(strnum2,limit)] = 0;
  strerr_warn4("tcpserver: status: ",strnum,"/",strnum2,0);
}

void sigterm()
{
  _exit(0);
}

void sigchld()
{
  int wstat;
  int pid;
 
  while ((pid = wait_nohang(&wstat)) > 0) {
    if (verbosity >= 2) {
      strnum[fmt_ulong(strnum,pid)] = 0;
      strnum2[fmt_ulong(strnum2,wstat)] = 0;
      strerr_warn4("tcpserver: end ",strnum," status ",strnum2,0);
    }
    if (numchildren) --numchildren; printstatus();
  }
}

main(int argc,char **argv)
{
  char *hostname;
  char *portname;
  int opt;
  struct servent *se;
  char *x;
  unsigned long u;
  int s;
  int t;
#ifdef WITH_SSL
  BIO *sbio;
  SSL *ssl;
  SSL_CTX *ctx;
  int pi2c[2], pi4c[2];
  
  ctx = NULL;

  if (!stralloc_copys(&certfile, CERTFILE) || !stralloc_0(&certfile) )
    strerr_die2x(111,FATAL,"out of memory");
  while ((opt = getopt(argc,argv,"dDvqQhHrRsS1UXx:t:u:g:l:b:B:c:n:pPoO")) != opteof)
#else
  while ((opt = getopt(argc,argv,"dDvqQhHrR1UXx:t:u:g:l:b:B:c:pPoO")) != opteof)
#endif
    switch(opt) {
      case 'b': scan_ulong(optarg,&backlog); break;
      case 'c': scan_ulong(optarg,&limit); break;
      case 'X': flagallownorules = 1; break;
      case 'x': fnrules = optarg; break;
      case 'B': banner = optarg; break;
      case 'd': flagdelay = 1; break;
      case 'D': flagdelay = 0; break;
      case 'v': verbosity = 2; break;
      case 'q': verbosity = 0; break;
      case 'Q': verbosity = 1; break;
      case 'P': flagparanoid = 0; break;
      case 'p': flagparanoid = 1; break;
      case 'O': flagkillopts = 1; break;
      case 'o': flagkillopts = 0; break;
      case 'H': flagremotehost = 0; break;
      case 'h': flagremotehost = 1; break;
      case 'R': flagremoteinfo = 0; break;
      case 'r': flagremoteinfo = 1; break;
      case 't': scan_ulong(optarg,&timeout); break;
      case 'U': x = env_get("UID"); if (x) scan_ulong(x,&uid);
		x = env_get("GID"); if (x) scan_ulong(x,&gid); break;
      case 'u': scan_ulong(optarg,&uid); break;
      case 'g': scan_ulong(optarg,&gid); break;
      case '1': flag1 = 1; break;
      case 'l': localhost = optarg; break;
#ifdef WITH_SSL
      case 's': flagssl = 1; break;
      case 'S': flagssl = 0; break;
      case 'n': if (!stralloc_copys(&certfile, optarg) ||
		    !stralloc_0(&certfile) )
		  strerr_die2x(111,FATAL,"out of memory");
		break;
#endif
      default: usage();
    }
  argc -= optind;
  argv += optind;

  if (!verbosity)
    buffer_2->fd = -1;
 
  hostname = *argv++;
  if (!hostname) usage();
  if (str_equal(hostname,"")) hostname = "0.0.0.0";
  if (str_equal(hostname,"0")) hostname = "0.0.0.0";

  x = *argv++;
  if (!x) usage();
  if (!x[scan_ulong(x,&u)])
    localport = u;
  else {
    se = getservbyname(x,"tcp");
    if (!se)
      strerr_die3x(111,FATAL,"unable to figure out port number for ",x);
    localport = ntohs(se->s_port);
  }

  if (!*argv) usage();
 
  sig_block(sig_child);
  sig_catch(sig_child,sigchld);
  sig_catch(sig_term,sigterm);
  sig_ignore(sig_pipe);
 
  if (!stralloc_copys(&tmp,hostname))
    strerr_die2x(111,FATAL,"out of memory");
  if (dns_ip4_qualify(&addresses,&fqdn,&tmp) == -1)
    strerr_die4sys(111,FATAL,"temporarily unable to figure out IP address for ",hostname,": ");
  if (addresses.len < 4)
    strerr_die3x(111,FATAL,"no IP address for ",hostname);
  byte_copy(localip,4,addresses.s);

#ifdef WITH_SSL
  if (flagssl == 1) {
    /* setup SSL context (load key and cert into ctx) */
    SSL_library_init();
    ctx=SSL_CTX_new(SSLv23_server_method());
    if (!ctx) strerr_die2x(111,FATAL,"unable to create SSL context");

    if(SSL_CTX_use_RSAPrivateKey_file(ctx, certfile.s, SSL_FILETYPE_PEM) != 1)
      strerr_die2x(111,FATAL,"unable to load RSA private key");
    if(SSL_CTX_use_certificate_file(ctx, certfile.s, SSL_FILETYPE_PEM) != 1)
      strerr_die2x(111,FATAL,"unable to load certificate");
  }
#endif
  
  s = socket_tcp();
  if (s == -1)
    strerr_die2sys(111,FATAL,"unable to create socket: ");
  if (socket_bind4_reuse(s,localip,localport) == -1)
    strerr_die2sys(111,FATAL,"unable to bind: ");
  if (socket_local4(s,localip,&localport) == -1)
    strerr_die2sys(111,FATAL,"unable to get local address: ");
  if (socket_listen(s,backlog) == -1)
    strerr_die2sys(111,FATAL,"unable to listen: ");
  ndelay_off(s);

  if (gid) if (prot_gid(gid) == -1)
    strerr_die2sys(111,FATAL,"unable to set gid: ");
  if (uid) if (prot_uid(uid) == -1)
    strerr_die2sys(111,FATAL,"unable to set uid: ");

 
  localportstr[fmt_ulong(localportstr,localport)] = 0;
  if (flag1) {
    buffer_init(&b,write,1,bspace,sizeof bspace);
    buffer_puts(&b,localportstr);
    buffer_puts(&b,"\n");
    buffer_flush(&b);
  }
 
  close(0);
  close(1);
  printstatus();
 
  for (;;) {
    while (numchildren >= limit) sig_pause();

    sig_unblock(sig_child);
    t = socket_accept4(s,remoteip,&remoteport);
    sig_block(sig_child);

    if (t == -1) continue;
    ++numchildren; printstatus();
 
    switch(fork()) {
      case 0:
        close(s);
        doit(t);
        if ((fd_move(0,t) == -1) || (fd_copy(1,0) == -1))
	  strerr_die2sys(111,DROP,"unable to set up descriptors: ");
        sig_uncatch(sig_child);
        sig_unblock(sig_child);
        sig_uncatch(sig_term);
        sig_uncatch(sig_pipe);
#ifdef WITH_SSL
	if (flagssl == 1) {
	  if (pipe(pi2c) != 0)
	    strerr_die2sys(111,DROP,"unable to create pipe: ");
	  if (pipe(pi4c) != 0)
	    strerr_die2sys(111,DROP,"unable to create pipe: ");
	  switch(fork()) {
	    case 0:
	      close(0); close(1);
	      close(pi2c[1]);
	      close(pi4c[0]);
	      if ((fd_move(0,pi2c[0]) == -1) || (fd_move(1,pi4c[1]) == -1))
		strerr_die2sys(111,DROP,"unable to set up descriptors: ");
	      /* signals are allready set in the parent */
	      pathexec(argv);
	      strerr_die4sys(111,DROP,"unable to run ",*argv,": ");
	    case -1:
	      strerr_die2sys(111,DROP,"unable to fork: ");
	    default:
	      ssl = SSL_new(ctx);
	      if (!ssl)
		strerr_die2x(111,DROP,"unable to set up SSL session");
	      sbio = BIO_new_socket(0,BIO_NOCLOSE);
	      if (!sbio)
		strerr_die2x(111,DROP,"unable to set up BIO socket");
	      SSL_set_bio(ssl,sbio,sbio);
	      close(pi2c[0]);
	      close(pi4c[1]);
	      translate(ssl, pi2c[1], pi4c[0], 3600);
	      _exit(0);
	  }
	}
#endif
        pathexec(argv);
	strerr_die4sys(111,DROP,"unable to run ",*argv,": ");
      case -1:
        strerr_warn2(DROP,"unable to fork: ",&strerr_sys);
        --numchildren; printstatus();
    }
    close(t);
  }
}

#ifdef WITH_SSL
static int allwrite(int fd, char *buf, int len)
{
  int w;

  while (len) {
    w = write(fd,buf,len);
    if (w == -1) {
      if (errno == error_intr) continue;
      return -1; /* note that some data may have been written */
    }
    if (w == 0) ; /* luser's fault */
    buf += w;
    len -= w;
  }
  return 0;
}

static int allwritessl(SSL* ssl, char *buf, int len)
{
  int w;

  while (len) {
    w = SSL_write(ssl,buf,len);
    if (w == -1) {
      if (errno == error_intr) continue;
      return -1; /* note that some data may have been written */
    }
    if (w == 0) ; /* luser's fault */
    buf += w;
    len -= w;
  }
  return 0;
}

char tbuf[2048];

void translate(SSL* ssl, int clearout, int clearin, unsigned int iotimeout)
{
  struct taia now;
  struct taia deadline;
  iopause_fd iop[2];
  int flagexitasap;
  int iopl;
  int sslout, sslin;
  int n, r;

  sslin = SSL_get_fd(ssl);
  sslout = SSL_get_fd(ssl);
  if (sslin == -1 || sslout == -1)
    strerr_die2x(111,DROP,"unable to set up SSL connection");
  
  flagexitasap = 0;

  if (SSL_accept(ssl)<=0)
    strerr_die2x(111,DROP,"unable to accept SSL connection");

  while (!flagexitasap) {
    taia_now(&now);
    taia_uint(&deadline,iotimeout);
    taia_add(&deadline,&now,&deadline);

    /* fill iopause struct */
    iopl = 2;
    iop[0].fd = sslin;
    iop[0].events = IOPAUSE_READ;
    iop[1].fd = clearin;
    iop[1].events = IOPAUSE_READ;

    /* do iopause read */
    iopause(iop,iopl,&deadline,&now);
    if (iop[0].revents) {
      /* data on sslin */
      n = SSL_read(ssl, tbuf, sizeof(tbuf));
      if ( n < 0 )
	strerr_die2sys(111,DROP,"unable to read form network: ");
      if ( n == 0 )
	flagexitasap = 1;
      r = allwrite(clearout, tbuf, n);
      if ( r < 0 )
	strerr_die2sys(111,DROP,"unable to write to client: ");
    }
    if (iop[1].revents) {
      /* data on clearin */
      n = read(clearin, tbuf, sizeof(tbuf));
      if ( n < 0 )
	strerr_die2sys(111,DROP,"unable to read form client: ");
      if ( n == 0 )
	flagexitasap = 1;
      r = allwritessl(ssl, tbuf, n);
      if ( r < 0 )
	strerr_die2sys(111,DROP,"unable to write to network: ");
    }
    if (!iop[0].revents && !iop[1].revents)
      strerr_die2x(0, DROP,"timeout reached without input");
  }
}
#endif

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06  5:08             ` Kalin KOZHUHAROV
@ 2004-06-06  5:13               ` Chris Wedgwood
  2004-06-06  5:34                 ` Kalin KOZHUHAROV
  2004-06-06  6:07               ` Linus Torvalds
  1 sibling, 1 reply; 54+ messages in thread
From: Chris Wedgwood @ 2004-06-06  5:13 UTC (permalink / raw)
  To: Kalin KOZHUHAROV
  Cc: Linus Torvalds, Davide Libenzi, Robert Love, Arjan van de Ven,
	Russell Leighton, Kernel Mailing List

On Sun, Jun 06, 2004 at 02:08:52PM +0900, Kalin KOZHUHAROV wrote:

> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097

is this repeartable?  if you strace -tt how often is this?

what's more, i wonder why this is going on?  i'd almost be tempted to
attach to it with gdb and take a bt from getpid and see wtf is going
on


  --cw

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06  5:13               ` Chris Wedgwood
@ 2004-06-06  5:34                 ` Kalin KOZHUHAROV
  0 siblings, 0 replies; 54+ messages in thread
From: Kalin KOZHUHAROV @ 2004-06-06  5:34 UTC (permalink / raw)
  To: Chris Wedgwood
  Cc: Linus Torvalds, Davide Libenzi, Robert Love, Arjan van de Ven,
	Russell Leighton, Kernel Mailing List

Chris Wedgwood wrote:
> On Sun, Jun 06, 2004 at 02:08:52PM +0900, Kalin KOZHUHAROV wrote:
> 
> 
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
> 
> 
> is this repeartable?  if you strace -tt how often is this?
Yes, definately on this system. Here we go:

poi root # strace -tt -f -F -p 3550
Process 3550 attached - interrupt to quit
14:24:31.589227 accept(3, {sa_family=AF_INET, sin_port=htons(40575), sin_addr=inet_addr("192.168.1.20")}, [16]) = 0
14:25:04.068649 rt_sigprocmask(SIG_BLOCK, [CHLD], NULL, 8) = 0
14:25:04.068898 write(2, "tcpserver: status: 1/40\n", 24) = 24
14:25:04.069229 fork(Process 14320 attached
)                  = 14320
[pid  3550] 14:25:04.069663 close(0 <unfinished ...>
[pid 14320] 14:25:04.069771 close(3 <unfinished ...>
[pid  3550] 14:25:04.069874 <... close resumed> ) = 0
[pid  3550] 14:25:04.069977 rt_sigprocmask(SIG_UNBLOCK, [CHLD], NULL, 8) = 0
[pid  3550] 14:25:04.070169 accept(3,  <unfinished ...>
[pid 14320] 14:25:04.070278 <... close resumed> ) = 0
[pid 14320] 14:25:04.070406 getpid()    = 14320
[pid 14320] 14:25:04.070614 write(2, "tcpserver: pid 14320 from 192.16"..., 39) = 39
...
[pid 14320] 14:25:04.139116 close(3)    = 0
[pid 14320] 14:25:04.139242 close(6)    = 0
[pid 14320] 14:25:04.139350 time(NULL)  = 1086499504
[pid 14320] 14:25:04.139516 getpid()    = 14320
[pid 14320] 14:25:04.139742 read(0, "\26\3\1\0s\1\0\0o\3\1", 11) = 11
[pid 14320] 14:25:04.139910 time(NULL)  = 1086499504
[pid 14320] 14:25:04.140017 getpid()    = 14320
[pid 14320] 14:25:04.140125 read(0, "\0?\230A\323\323\304\226\v\37NI\310\3634i\375\346\300p"..., 109) = 109
[pid 14320] 14:25:04.140278 time(NULL)  = 1086499504
[pid 14320] 14:25:04.140383 getpid()    = 14320
[pid 14320] 14:25:04.140475 getpid()    = 14320
[pid 14320] 14:25:04.140584 getpid()    = 14320
[pid 14320] 14:25:04.140681 open("/dev/urandom", O_RDONLY|O_NONBLOCK|O_NOCTTY) = 3
[pid 14320] 14:25:04.140811 select(4, [3], NULL, NULL, {0, 10000}) = 1 (in [3], left {0, 10000})
[pid 14320] 14:25:04.140972 read(3, "g\t\270X\304\260\17P\'\367\t\"\253\375\24\350\272\240\327"..., 32) = 32
[pid 14320] 14:25:04.141130 close(3)    = 0
[pid 14320] 14:25:04.141224 getpid()    = 14320
[pid 14320] 14:25:04.141326 getpid()    = 14320
[pid 14320] 14:25:04.141435 getuid32()  = 89
[pid 14320] 14:25:04.141531 getpid()    = 14320
[pid 14320] 14:25:04.141650 time(NULL)  = 1086499504
[pid 14320] 14:25:04.141747 getpid()    = 14320
[pid 14320] 14:25:04.141846 getpid()    = 14320
[pid 14320] 14:25:04.141949 getpid()    = 14320
[pid 14320] 14:25:04.142051 getpid()    = 14320
[pid 14320] 14:25:04.142153 getpid()    = 14320
[pid 14320] 14:25:04.142257 getpid()    = 14320
[pid 14320] 14:25:04.142359 getpid()    = 14320
[pid 14320] 14:25:04.142461 getpid()    = 14320
[pid 14320] 14:25:04.142580 getpid()    = 14320
[pid 14320] 14:25:04.142683 getpid()    = 14320
[pid 14320] 14:25:04.142784 getpid()    = 14320
[pid 14320] 14:25:04.142886 getpid()    = 14320
[pid 14320] 14:25:04.142989 getpid()    = 14320
[pid 14320] 14:25:04.143090 getpid()    = 14320
[pid 14320] 14:25:04.143192 getpid()    = 14320
[pid 14320] 14:25:04.143294 getpid()    = 14320
[pid 14320] 14:25:04.143396 getpid()    = 14320
[pid 14320] 14:25:04.143499 getpid()    = 14320
[pid 14320] 14:25:04.143619 getpid()    = 14320
[pid 14320] 14:25:04.143722 getpid()    = 14320
[pid 14320] 14:25:04.143824 getpid()    = 14320
[pid 14320] 14:25:04.143927 getpid()    = 14320
[pid 14320] 14:25:04.144029 getpid()    = 14320
[pid 14320] 14:25:04.144131 getpid()    = 14320
[pid 14320] 14:25:04.144233 getpid()    = 14320
[pid 14320] 14:25:04.144336 getpid()    = 14320
[pid 14320] 14:25:04.144438 getpid()    = 14320
[pid 14320] 14:25:04.144540 getpid()    = 14320
[pid 14320] 14:25:04.144701 getpid()    = 14320
[pid 14320] 14:25:04.144804 getpid()    = 14320
[pid 14320] 14:25:04.144906 getpid()    = 14320
[pid 14320] 14:25:04.145009 getpid()    = 14320
[pid 14320] 14:25:04.145111 getpid()    = 14320
[pid 14320] 14:25:04.145214 getpid()    = 14320
[pid 14320] 14:25:04.145316 getpid()    = 14320
[pid 14320] 14:25:04.145419 getpid()    = 14320
[pid 14320] 14:25:04.145520 getpid()    = 14320
[pid 14320] 14:25:04.145640 getpid()    = 14320
[pid 14320] 14:25:04.145742 getpid()    = 14320
[pid 14320] 14:25:04.145845 getpid()    = 14320
[pid 14320] 14:25:04.145948 getpid()    = 14320
[pid 14320] 14:25:04.146050 getpid()    = 14320
[pid 14320] 14:25:04.146152 getpid()    = 14320
[pid 14320] 14:25:04.146254 getpid()    = 14320
[pid 14320] 14:25:04.146356 getpid()    = 14320
[pid 14320] 14:25:04.146458 getpid()    = 14320
[pid 14320] 14:25:04.146575 getpid()    = 14320
[pid 14320] 14:25:04.146678 getpid()    = 14320
[pid 14320] 14:25:04.146780 getpid()    = 14320
[pid 14320] 14:25:04.146882 getpid()    = 14320
[pid 14320] 14:25:04.146985 getpid()    = 14320
[pid 14320] 14:25:04.147087 getpid()    = 14320
[pid 14320] 14:25:04.147189 getpid()    = 14320
[pid 14320] 14:25:04.147345 time(NULL)  = 1086499504
[pid 14320] 14:25:04.147442 getpid()    = 14320
[pid 14320] 14:25:04.147531 getpid()    = 14320
[pid 14320] 14:25:04.147773 write(0, "\26\3\1\0J\2\0\0F\3\1@\302\252\260\204\203~\367\r\305G"..., 1127) = 1127
[pid 14320] 14:25:04.147937 read(0, "\26\3\1\1\6", 5) = 5
[pid 14320] 14:25:04.151416 read(0, "\20\0\1\2\1\0S\303\2\335E\264\330\250\261oj\307a~en\262"..., 262) = 262
[pid 14320] 14:25:04.151575 getpid()    = 14320
[pid 14320] 14:25:04.151685 time([1086499504]) = 1086499504
[pid 14320] 14:25:04.151790 getpid()    = 14320
[pid 14320] 14:25:04.151883 getpid()    = 14320
[pid 14320] 14:25:04.155052 getpid()    = 14320
[pid 14320] 14:25:04.155150 getpid()    = 14320
[pid 14320] 14:25:04.178154 read(0, "\24\3\1\0\1", 5) = 5
[pid 14320] 14:25:04.178346 read(0, "\1", 1) = 1
[pid 14320] 14:25:04.178646 read(0, "\26\3\1\0000", 5) = 5
[pid 14320] 14:25:04.178810 read(0, "\10\215\371\334e+\324\356\224|\216\371\204\351\330f+\203"..., 48) = 48
[pid 14320] 14:25:04.179055 write(0, "\24\3\1\0\1\1\26\3\1\0000\26\2678\202\31\16\251\346\224"..., 59) = 59
[pid 14320] 14:25:04.179302 gettimeofday({1086499504, 179360}, NULL) = 0
...

The ppid changed because I restarted /service while testing.
I sent a short mail via SMTP/SSL to trigger the above trace.

> what's more, i wonder why this is going on?  i'd almost be tempted to
> attach to it with gdb and take a bt from getpid and see wtf is going
> on
Well, there was no gdb on that machine, compiling it now.
AFAIR, tcpserver is stripped (it is a "production" machine), so I wonder how inforamative it will be.

I havent used gbd a lot, educate me what to do.
For example how do I set a bp to trigger when getpid() is called ?
I know how to set one in a function, but I guess function names are stripped.

Kalin.

-- 
||///_ o  *****************************
||//'_/>     WWW: http://ThinRope.net/
|||\/<" 
|||\\ ' 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06  5:08             ` Kalin KOZHUHAROV
  2004-06-06  5:13               ` Chris Wedgwood
@ 2004-06-06  6:07               ` Linus Torvalds
  2004-06-06  6:43                 ` Kalin KOZHUHAROV
                                   ` (5 more replies)
  1 sibling, 6 replies; 54+ messages in thread
From: Linus Torvalds @ 2004-06-06  6:07 UTC (permalink / raw)
  To: Kalin KOZHUHAROV
  Cc: Davide Libenzi, Robert Love, Chris Wedgwood, Arjan van de Ven,
	Russell Leighton, Kernel Mailing List



On Sun, 6 Jun 2004, Kalin KOZHUHAROV wrote:
>
> Well, not exactly sure about my reply, but let me try.
> 
> The other day I was debugging some config problems with my qmail instalation and I ended up doing:
> # strace -p 4563 -f -F
> ...
> [pid 13097] read(3, "\347\374\375TBH~\342\233\337\220\302l\220\317\237\37\25"..., 32) = 32
> [pid 13097] close(3)                    = 0
> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097
> [pid 13097] getuid32()                  = 89
> [pid 13097] getpid()                    = 13097
> [pid 13097] time(NULL)                  = 1086497450
> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097
> [pid 13097] getpid()                    = 13097

qmail is a piece of crap. The source code is completely unreadable, and it 
seems to think that "getpid()" is a good source of random data. Don't ask 
me why.

It literally does things like

	random = now() + (getpid() << 16);

and since there isn't a single comment in the whole source tree, it's
pointless to wonder why. (In case you wonder, "now()" just does a
"time(NULL)" call - whee.).

I don't understand why people bother with it. It's not like Dan Bernstein
is so charming that it makes up for the deficiencies of his programs.

But no, even despite the strange usage, this isn't a performance issue.  
qmail will call "getpid()" a few tens of times per connection because of
the wonderful quality of randomness it provides, or something.

This is another gem you find when grepping for "getpid()" in qmail, and 
apparently the source of most of them:

	if (now() - when < ((60 + (getpid() & 31)) << 6))

Don't you love it how timeouts etc seem to be based on random values that 
are calculated off the lower 5 bits of the process ID? And don't you find 
the above (totally uncommented) line just a thing of beauty and clarity?

Yeah. 

Anyway, you did find something that used more than a handful of getpid() 
calls, but no, it doesn't qualify as performance-critical, and even 
despite it's peyote-induced (or hey, some people are just crazy on their 
own) getpid() usage, it's not a reason to have a buggy glibc.

		Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06  6:07               ` Linus Torvalds
@ 2004-06-06  6:43                 ` Kalin KOZHUHAROV
  2004-06-06  7:57                 ` Erik Andersen
                                   ` (4 subsequent siblings)
  5 siblings, 0 replies; 54+ messages in thread
From: Kalin KOZHUHAROV @ 2004-06-06  6:43 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Davide Libenzi, Robert Love, Chris Wedgwood, Arjan van de Ven,
	Russell Leighton, Kernel Mailing List

Linus Torvalds wrote:
> 
> On Sun, 6 Jun 2004, Kalin KOZHUHAROV wrote:
> 
>>Well, not exactly sure about my reply, but let me try.
>>
>>The other day I was debugging some config problems with my qmail instalation and I ended up doing:
>># strace -p 4563 -f -F
>>...
>>[pid 13097] read(3, "\347\374\375TBH~\342\233\337\220\302l\220\317\237\37\25"..., 32) = 32
>>[pid 13097] close(3)                    = 0
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getuid32()                  = 89
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] time(NULL)                  = 1086497450
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
>>[pid 13097] getpid()                    = 13097
> 
> 
> qmail is a piece of crap. The source code is completely unreadable, and it 
> seems to think that "getpid()" is a good source of random data. Don't ask 
> me why.
> 
> It literally does things like
> 
> 	random = now() + (getpid() << 16);
> 
> and since there isn't a single comment in the whole source tree, it's
> pointless to wonder why. (In case you wonder, "now()" just does a
> "time(NULL)" call - whee.).
> 
> I don't understand why people bother with it. It's not like Dan Bernstein
> is so charming that it makes up for the deficiencies of his programs.
Well, it just works once you set it up right. And many people use it and you can get community support to a certain extent.

> But no, even despite the strange usage, this isn't a performance issue.  
> qmail will call "getpid()" a few tens of times per connection because of
> the wonderful quality of randomness it provides, or something.
> 
> This is another gem you find when grepping for "getpid()" in qmail, and 
> apparently the source of most of them:
> 
> 	if (now() - when < ((60 + (getpid() & 31)) << 6))
> 
> Don't you love it how timeouts etc seem to be based on random values that 
> are calculated off the lower 5 bits of the process ID? And don't you find 
> the above (totally uncommented) line just a thing of beauty and clarity?
:-) DJB is (in)famous for its "code clarity".

> Yeah. 
> 
> Anyway, you did find something that used more than a handful of getpid() 
> calls, but no, it doesn't qualify as performance-critical, and even 
> despite it's peyote-induced (or hey, some people are just crazy on their 
> own) getpid() usage, it's not a reason to have a buggy glibc.

I definately agree that getpid() should not be cached as it gives inconsistent results.
That is why I just reported I case of "more than a handful of getpid() calls" that struck me recently.

Ok, I think I/we have no more to say about the above getpid() usage.

Is there anybody insisting on getpid() caching?
And can/will anydoby fix that in glibc?

Kalin.


-- 
||///_ o  *****************************
||//'_/>     WWW: http://ThinRope.net/
|||\/<" 
|||\\ ' 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06  6:07               ` Linus Torvalds
  2004-06-06  6:43                 ` Kalin KOZHUHAROV
@ 2004-06-06  7:57                 ` Erik Andersen
  2004-06-06 16:57                   ` Linus Torvalds
  2004-06-06  9:52                 ` Bernd Eckenfels
                                   ` (3 subsequent siblings)
  5 siblings, 1 reply; 54+ messages in thread
From: Erik Andersen @ 2004-06-06  7:57 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kalin KOZHUHAROV, Davide Libenzi, Robert Love, Chris Wedgwood,
	Arjan van de Ven, Russell Leighton, Kernel Mailing List

On Sat Jun 05, 2004 at 11:07:25PM -0700, Linus Torvalds wrote:
> qmail is a piece of crap. The source code is completely unreadable, and it 
> seems to think that "getpid()" is a good source of random data. Don't ask 
> me why.
> 
> It literally does things like
> 
> 	random = now() + (getpid() << 16);
[-----------snip-----------]

http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/libc/sysdeps/posix/tempname.c?rev=1.36&content-type=text/plain&cvsroot=glibc

    /* Get some more or less random data.  */
    random_time_bits = ((uint64_t) tv.tv_usec << 16) ^ tv.tv_sec;
    value += random_time_bits ^ __getpid ();

etc....

 -Erik

--
Erik B. Andersen             http://codepoet-consulting.com/
--This message was written using 73% post-consumer electrons--

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06  6:07               ` Linus Torvalds
  2004-06-06  6:43                 ` Kalin KOZHUHAROV
  2004-06-06  7:57                 ` Erik Andersen
@ 2004-06-06  9:52                 ` Bernd Eckenfels
  2004-06-06 13:07                   ` Paul Rolland
  2004-06-06 16:33                 ` chris
                                   ` (2 subsequent siblings)
  5 siblings, 1 reply; 54+ messages in thread
From: Bernd Eckenfels @ 2004-06-06  9:52 UTC (permalink / raw)
  To: linux-kernel

In article <Pine.LNX.4.58.0406052244290.7010@ppc970.osdl.org> you wrote:
> It literally does things like
> 
>        random = now() + (getpid() << 16);

It does that  for the unique  filenames and id stamps (maildir format and
message ids). But it should be easy to replace this with a cached getpid
result, if this is realy a performance problem. On a traditional unix system
pid and timestamp should be locally unique for non threaded applications.

> Anyway, you did find something that used more than a handful of getpid() 
> calls, but no, it doesn't qualify as performance-critical, and even 
> despite it's peyote-induced (or hey, some people are just crazy on their 
> own) getpid() usage, it's not a reason to have a buggy glibc.

I wonder if it easyly would be possible to cache the getpid() result in some
thread local segment. Is there any, which is present for all clone flags?
Not tha I care much about this unneeded glibc optimizsation, but more out of
curiousity about the new threadind functionality.

Greetings
Bernd
-- 
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06  9:52                 ` Bernd Eckenfels
@ 2004-06-06 13:07                   ` Paul Rolland
  2004-06-06 17:20                     ` Patrick J. LoPresti
  0 siblings, 1 reply; 54+ messages in thread
From: Paul Rolland @ 2004-06-06 13:07 UTC (permalink / raw)
  To: 'Bernd Eckenfels', linux-kernel

Hello,

> It does that  for the unique  filenames and id stamps 
> (maildir format and

I though this was a job for mktemp and co.

Regards,
Paul



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-05 20:45 ` Linus Torvalds
  2004-06-05 20:55   ` Arjan van de Ven
@ 2004-06-06 14:29   ` Russell Leighton
  2004-06-06 15:38     ` Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?] Russell Leighton
  1 sibling, 1 reply; 54+ messages in thread
From: Russell Leighton @ 2004-06-06 14:29 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Arjan van de Ven, Kernel Mailing List


I have read the discussion on this issue and I wanted to get 
confirmation of my understanding...

Issue:
    It seems glibc is caching getpid() which is wrong and breaks 
programs like mine.

    You are using an older version of  glibc than I and that is why you 
could run the test program.

Given the above, that means that:
    Upgrading my kernel on the FedoraCore2 system won't help because the 
bug is in glibc

    Assuming that this is a "new" feature of glibc, any others upgrading 
would then starting seeing this bug

Linus Torvalds wrote:

>On Sat, 5 Jun 2004, Russell Leighton wrote:
>  
>
>>I have a test program (see attached) that shows what looks like a bug in 
>>2.6.5-1.358 (FedoraCore2)...and breaks my program :(
>>
>>In summary, I am doing:
>>
>> clone(run_thread, stack + sizeof(stack) -1,
>>            CLONE_FS|CLONE_FILES|CLONE_VM|SIGCHLD, NULL))
>>
>>According to the man page the child process should have its own pid as 
>>returned by getpid()...much like fork().
>>
>>In 2.6 the child receives the parent's pid from getpid(), while 2.4 
>>works as documented:
>>
>>In 2.4 the test program does:
>> parent pid: 26647
>> clone returned pid: 26648
>> thread reported pid: 26648
>>
>>In 2.6 the test program does:
>> parent pid: 16665
>> thread reported pid: 16665
>> clone returned pid: 16666
>>    
>>
>
>Hmm.. The above is the correct behaviour if you use CLONE_THREAD 
>("getpid()" will then return the _thread_ ID), but it shouldn't happen 
>without that. And clearly you don't have it set.
>
>And indeed, it doesn't happen for me on my system:
>
>	parent pid: 13552
>	thread reported pid: 13553
>	clone returned pid: 13553
>
>so I wonder if either the Fedora libc always adds that CLONE_THREAD thing
>to the clone() calls, or whether the FC2 kernel is buggy.
>
>Arjan?
>
>		Linus
>
>
>  
>


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-06 14:29   ` Russell Leighton
@ 2004-06-06 15:38     ` Russell Leighton
  2004-06-06 15:44       ` Robert Love
                         ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Russell Leighton @ 2004-06-06 15:38 UTC (permalink / raw)
  To: Kernel Mailing List

Linus said he could not imagine a program using getpid() more than a 
handful of times...
well, (I am ashamed to admit it) I found this getpid() bug with just 
such a program.

Could someone can suggest an alternative to using getpid() for me? 
Here's the problem...

I have a library that creates 2 threads using clone().
[NOTE: I can't use pthreads for a variety of reasons, mostly due
to the wacky signal handling rules...it turns out that using clone() is 
cleaner for me anyway.]

Some of the code can ONLY be executed in one of the threads
called the 'handler' thread. The reason for that restriction is that 
there are
performance, locking and synchronization issues that demand this
constraint.

Entry points to this code are exported for programmers to use.
I currently have a check in these functions similar to:

   if ( !INHANDLERCONTEXT(mon) ) barf();

where:

#define INHANDLERCONTEXT(mon) ( (mon)->handler_q.thread->pid == getpid() )

So the questions are:

  Given a code library with some exported functions which CAN be 
executed outside a particular thread and others that MUST be executed in 
a particular thread, how can I efficiently prevent or trap using of 
these functions outside the proper execution context?

 Would gettid() be any better?

Thx

Russ

Russell Leighton wrote:

>
> I have read the discussion on this issue and I wanted to get 
> confirmation of my understanding...
>
> Issue:
>    It seems glibc is caching getpid() which is wrong and breaks 
> programs like mine.
>
>    You are using an older version of  glibc than I and that is why you 
> could run the test program.
>
> Given the above, that means that:
>    Upgrading my kernel on the FedoraCore2 system won't help because 
> the bug is in glibc
>
>    Assuming that this is a "new" feature of glibc, any others 
> upgrading would then starting seeing this bug
>
> Linus Torvalds wrote:
>
>> On Sat, 5 Jun 2004, Russell Leighton wrote:
>>  
>>
>>> I have a test program (see attached) that shows what looks like a 
>>> bug in 2.6.5-1.358 (FedoraCore2)...and breaks my program :(
>>>
>>> In summary, I am doing:
>>>
>>> clone(run_thread, stack + sizeof(stack) -1,
>>>            CLONE_FS|CLONE_FILES|CLONE_VM|SIGCHLD, NULL))
>>>
>>> According to the man page the child process should have its own pid 
>>> as returned by getpid()...much like fork().
>>>
>>> In 2.6 the child receives the parent's pid from getpid(), while 2.4 
>>> works as documented:
>>>
>>> In 2.4 the test program does:
>>> parent pid: 26647
>>> clone returned pid: 26648
>>> thread reported pid: 26648
>>>
>>> In 2.6 the test program does:
>>> parent pid: 16665
>>> thread reported pid: 16665
>>> clone returned pid: 16666
>>>   
>>
>>
>> Hmm.. The above is the correct behaviour if you use CLONE_THREAD 
>> ("getpid()" will then return the _thread_ ID), but it shouldn't 
>> happen without that. And clearly you don't have it set.
>>
>> And indeed, it doesn't happen for me on my system:
>>
>>     parent pid: 13552
>>     thread reported pid: 13553
>>     clone returned pid: 13553
>>
>> so I wonder if either the Fedora libc always adds that CLONE_THREAD 
>> thing
>> to the clone() calls, or whether the FC2 kernel is buggy.
>>
>> Arjan?
>>
>>         Linus
>>
>>
>>  
>>
>
> -
> To unsubscribe from this list: send the line "unsubscribe 
> linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
>
>


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-06 15:38     ` Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?] Russell Leighton
@ 2004-06-06 15:44       ` Robert Love
  2004-06-07  0:20         ` Russell Leighton
  2004-06-06 15:58       ` Arjan van de Ven
  2004-06-06 17:19       ` Linus Torvalds
  2 siblings, 1 reply; 54+ messages in thread
From: Robert Love @ 2004-06-06 15:44 UTC (permalink / raw)
  To: Russell Leighton; +Cc: Kernel Mailing List

On Sun, 2004-06-06 at 11:38 -0400, Russell Leighton wrote:

>   Given a code library with some exported functions which CAN be 
> executed outside a particular thread and others that MUST be executed in 
> a particular thread, how can I efficiently prevent or trap using of 
> these functions outside the proper execution context?

Are you sure you cannot use pthreads?  The new implementation (NPTL) has
a lot of improvements, including saner signal handling behavior.

If not, you probably are out of luck.  Best I can think of is somehow
using thread-specific storage, but you would obviously need to index
into it using something OTHER than the PID.  Which basically leaves you
with the stack.  So, unless you could cache the PID in a local
variable..

>  Would gettid() be any better?

If you aren't using CLONE_THREAD, gettid() will just return the PID.
And if you were using CLONE_THREAD, then getpid() would be worthless for
you and you would have to use gettid().  Either way, though, they
basically do the same thing (return current->tid vs. current->pid).

	Robert Love



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-06 15:38     ` Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?] Russell Leighton
  2004-06-06 15:44       ` Robert Love
@ 2004-06-06 15:58       ` Arjan van de Ven
  2004-06-06 23:49         ` Russell Leighton
  2004-06-07  0:09         ` Russell Leighton
  2004-06-06 17:19       ` Linus Torvalds
  2 siblings, 2 replies; 54+ messages in thread
From: Arjan van de Ven @ 2004-06-06 15:58 UTC (permalink / raw)
  To: Russell Leighton; +Cc: Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 505 bytes --]

On Sun, 2004-06-06 at 17:38, Russell Leighton wrote:
> I have a library that creates 2 threads using clone().
> [NOTE: I can't use pthreads for a variety of reasons, mostly due
> to the wacky signal handling rules...it turns out that using clone() is 
> cleaner for me anyway.]

a library using clone sounds suspect to me, I can't imagine an app using
pthreads being able to just use your library as a result. Note also that
clone() is not a portable interface even within linux architectures.


[-- Attachment #2: This is a digitally signed message part --]
[-- Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06  6:07               ` Linus Torvalds
                                   ` (2 preceding siblings ...)
  2004-06-06  9:52                 ` Bernd Eckenfels
@ 2004-06-06 16:33                 ` chris
       [not found]                 ` <200406062022.54320.vda@port.imtp.ilyichevsk.odessa.ua>
  2004-06-07 18:20                 ` Bruce Guenter
  5 siblings, 0 replies; 54+ messages in thread
From: chris @ 2004-06-06 16:33 UTC (permalink / raw)
  To: Linus Torvalds
  Cc: Kalin KOZHUHAROV, Davide Libenzi, Robert Love, Chris Wedgwood,
	Arjan van de Ven, Russell Leighton, Kernel Mailing List

Hi Linus,

On Sat, 5 Jun 2004, Linus Torvalds wrote:

> But no, even despite the strange usage, this isn't a performance issue.
> qmail will call "getpid()" a few tens of times per connection because of
> the wonderful quality of randomness it provides, or something.

The openssl library seems to have a strange love for getpid() too. At
least, my old-ish version "openssl-0.9.6b-35.7" does.
My strace logs show over 50 consecutive getpid() calls during the
processing of the SSL_accept() routine. (I'm adding SSL support to vsftpd;
SSL_accept() is called every client connect and also to accept passive
data connections).

Cheers
Chris

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06  7:57                 ` Erik Andersen
@ 2004-06-06 16:57                   ` Linus Torvalds
  2004-06-06 18:53                     ` Simon Kirby
  0 siblings, 1 reply; 54+ messages in thread
From: Linus Torvalds @ 2004-06-06 16:57 UTC (permalink / raw)
  To: Erik Andersen
  Cc: Kalin KOZHUHAROV, Davide Libenzi, Robert Love, Chris Wedgwood,
	Arjan van de Ven, Russell Leighton, Kernel Mailing List



On Sun, 6 Jun 2004, Erik Andersen wrote:
> 
> http://sources.redhat.com/cgi-bin/cvsweb.cgi/~checkout~/libc/sysdeps/posix/tempname.c?rev=1.36&content-type=text/plain&cvsroot=glibc
> 
>     /* Get some more or less random data.  */
>     random_time_bits = ((uint64_t) tv.tv_usec << 16) ^ tv.tv_sec;
>     value += random_time_bits ^ __getpid ();

This one is historically at least a bit understandable: it's quite common 
for various programs to create their lockfiles with the "unique" 
identifier that is the pid of the locker. So the algorithm used to be 
roughly (used by various things, ranging from tty subsystem locking to 
whatever):

	int fd, len;
	int pid = getpid();

	len = sprintf(name, "lockfile.%d", pid);
	fd = open(name, O_EXCL | O_CREAT | O_WRONLY, 0644);
	if (fd < 0) {
		if (errno == EEXIST) {
			fprintf(stderr, "Stale lockfile %s. Remove me\n", name);
			exit(1); /* Or just "unlink + repeat" */
		}
		perror("lockfile");
		exit(1);
	}

	/* Write the pid into the lockfile, fsync it */
	write(fd, name + 9, len - 9);
	fsync(fd);

	/* Try to move it atomically */
	while (rename("lockfile", name) < 0) {
		if (errno != EEXIST) {
			perror("lockfile");
			exit(1);
		}
		/* Try again later */
		sleep(1);
	}

	... We now have an exclusive lock ..

and here "pid" is in fact a good system-wide unique identifier. It's not
random, and it works badly with networked filesystems (which is why you'll
find versions of this algorithm that use the hostname in addition to the
pid too), but it's "obvious".

I think the original "mktemp()" also used to start off with "pid" as a
"unique" identifier. Because while it's not random, it _is_ system-wide
unique, so getpid() actually makes _sense_ in that respect.

In other words: getpid() makes little sense as a random value, but it does
make some sense as a unique value. Maybe less now than it did 20 years ago
due to the prevalence of networking, but history is powerful.

			Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-06 15:38     ` Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?] Russell Leighton
  2004-06-06 15:44       ` Robert Love
  2004-06-06 15:58       ` Arjan van de Ven
@ 2004-06-06 17:19       ` Linus Torvalds
  2004-06-12  9:15         ` Dominik Straßer
  2 siblings, 1 reply; 54+ messages in thread
From: Linus Torvalds @ 2004-06-06 17:19 UTC (permalink / raw)
  To: Russell Leighton; +Cc: Kernel Mailing List



On Sun, 6 Jun 2004, Russell Leighton wrote:
>
> Linus said he could not imagine a program using getpid() more than a 
> handful of times...

No, I said that I could not imagine using it more than a handful of times 
_except_ for the cases of:

 - thread identification without a native thread area
 - benchmarking.

(and in both of these cases it is _wrong_ to cache the pid value)

> well, (I am ashamed to admit it) I found this getpid() bug with just 
> such a program.
> 
> Could someone can suggest an alternative to using getpid() for me? 
> Here's the problem...
> 
> I have a library that creates 2 threads using clone().

Your problem falls under the thread ID thing. It's fine and understandable 
to use getpid() for that, although you could probably do it faster if you 
are willing to use the support that modern kernels give you and that glibc 
uses: the "TLS" aka Thread Local Storage support.

Thread-local storage involved having a user-mode register that points some 
way to a special part of the address space. On x86, where the general 
register set is very limited and stealing a general reg would thus be bad, 
it uses a segment and loads the TLS pointer into a segment register 
(segment registers are registers too - and since nobody sane uses them for 
anything else these days, both %gs and %fs are freely usable).

>  Would gettid() be any better?

You'd avoid this particular glibc bug with gettid.

		Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06 13:07                   ` Paul Rolland
@ 2004-06-06 17:20                     ` Patrick J. LoPresti
  2004-06-06 17:31                       ` Paul Rolland
                                         ` (2 more replies)
  0 siblings, 3 replies; 54+ messages in thread
From: Patrick J. LoPresti @ 2004-06-06 17:20 UTC (permalink / raw)
  To: Paul Rolland; +Cc: linux-kernel

"Paul Rolland" <rol@as2917.net> writes:

> Hello,
> 
> > It does that  for the unique  filenames and id stamps 
> > (maildir format and
> 
> I though this was a job for mktemp and co.

Sure, if you want to create a unique filename on a local unshared
disk.

But suppose you want a unique filename on a partition which might be
NFS-mounted with multiple concurrent writers.  Oh, and the year is
1997.  Still want to use mktemp?

Bernstein decided to name each file:

    <time>.<PID>.<fully-qualified-domain-name>

This approach makes three assumptions.  First, that the time does not
repeat.  Second, that the FQDN is unique.  Third, that the time
changes (i.e., one second elapses) before a PID is reused on the local
system.  Subject to these assumptions, this is a perfectly reliable,
very efficient, lock-free way to create unique file names.

Can you do better?

Sure, Bernstein's code is an unmaintainable nightmare.  But his
designs and implementations are usually pretty sound.  The last
release of qmail was in 1998, and we still use it where I work.  Six
years with zero crashes, zero dropped messages, and zero security
holes is not a trivial track record for an MTA.

Just don't ever try to edit the sources.  :-)

 - Pat

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06 17:20                     ` Patrick J. LoPresti
@ 2004-06-06 17:31                       ` Paul Rolland
  2004-06-06 17:43                       ` Davide Libenzi
  2004-06-06 18:17                       ` Rik van Riel
  2 siblings, 0 replies; 54+ messages in thread
From: Paul Rolland @ 2004-06-06 17:31 UTC (permalink / raw)
  To: 'Patrick J. LoPresti'; +Cc: linux-kernel

> > I though this was a job for mktemp and co.
> 
> Sure, if you want to create a unique filename on a local unshared
> disk.

Right.
 
> But suppose you want a unique filename on a partition which might be
> NFS-mounted with multiple concurrent writers.  Oh, and the year is
> 1997.  Still want to use mktemp?

Probably not...

> Bernstein decided to name each file:
> 
>     <time>.<PID>.<fully-qualified-domain-name>
> 
> This approach makes three assumptions.  First, that the time does not
> repeat.  Second, that the FQDN is unique.  Third, that the time
> changes (i.e., one second elapses) before a PID is reused on the local
> system.  Subject to these assumptions, this is a perfectly reliable,
> very efficient, lock-free way to create unique file names.
> 
> Can you do better?
Not sure it's the best place to run a contest...
However, I could have used something like :
<FQDN>.<time-up-to-usec>.<random-number>

But this is biased because I know I must find an answer with <PID> :-)

> Sure, Bernstein's code is an unmaintainable nightmare.  But his
> designs and implementations are usually pretty sound.  The last
> release of qmail was in 1998, and we still use it where I work.  Six
> years with zero crashes, zero dropped messages, and zero security
> holes is not a trivial track record for an MTA.

One point for him.

Paul



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06 17:20                     ` Patrick J. LoPresti
  2004-06-06 17:31                       ` Paul Rolland
@ 2004-06-06 17:43                       ` Davide Libenzi
  2004-06-06 18:17                       ` Rik van Riel
  2 siblings, 0 replies; 54+ messages in thread
From: Davide Libenzi @ 2004-06-06 17:43 UTC (permalink / raw)
  To: Patrick J. LoPresti; +Cc: Paul Rolland, Linux Kernel Mailing List

On Sun, 6 Jun 2004, Patrick J. LoPresti wrote:

> "Paul Rolland" <rol@as2917.net> writes:
> 
> > Hello,
> > 
> > > It does that  for the unique  filenames and id stamps 
> > > (maildir format and
> > 
> > I though this was a job for mktemp and co.
> 
> Sure, if you want to create a unique filename on a local unshared
> disk.
> 
> But suppose you want a unique filename on a partition which might be
> NFS-mounted with multiple concurrent writers.  Oh, and the year is
> 1997.  Still want to use mktemp?
> 
> Bernstein decided to name each file:
> 
>     <time>.<PID>.<fully-qualified-domain-name>
> 
> This approach makes three assumptions.  First, that the time does not
> repeat.  Second, that the FQDN is unique.  Third, that the time
> changes (i.e., one second elapses) before a PID is reused on the local
> system.  Subject to these assumptions, this is a perfectly reliable,
> very efficient, lock-free way to create unique file names.

As long as 1) your program is not MT 2) your process creates exactly *one* 
unique file. Then yes, it is fine. But, if your program is MT (with 
threads sharing the PID) or your single thread process creates two (or 
more) unique files within 1 sec, you've got a problem. Using gettid() 
would solve the MT issue (there's still the tid recycling period), but not 
the time issue.



- Davide


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
       [not found]                 ` <200406062022.54320.vda@port.imtp.ilyichevsk.odessa.ua>
@ 2004-06-06 17:55                   ` Linus Torvalds
  0 siblings, 0 replies; 54+ messages in thread
From: Linus Torvalds @ 2004-06-06 17:55 UTC (permalink / raw)
  To: Denis Vlasenko
  Cc: Kalin KOZHUHAROV, Davide Libenzi, Robert Love, Chris Wedgwood,
	Arjan van de Ven, Russell Leighton, Kernel Mailing List



On Sun, 6 Jun 2004, Denis Vlasenko wrote:
> >
> > 	if (now() - when < ((60 + (getpid() & 31)) << 6))
> 
> timeout randomization across several qmail-send processes, to prevent
> stampede effect?
> 
> > And don't you find
> > the above (totally uncommented) line just a thing of beauty and clarity?
> 
> Yes, a comment would be in place. :(

The problem is not so much what it does, but how it's coded.

For example, if it really wanted to do this, then how about having a nice 
variable that is initialized at run-time (preferably just once at 
startup), and has a nice name and a comment on what it's all about? Or 
even a macro, to make it more readable? Or even _just_ the comment?

You could write the above incomprehensible one-liner as

	/* Timeout in seconds: 64 - 97 minutes, depending on pid */
	#define ERROR_FAILURE_TIMEOUT ((60 + (getpid() & 31)) << 6)

	...

	/*
	 * If this IP address had a SMTP connection timeout last
	 * time, don't let it try to connect again immediately
	 */
	if (now() - when < ERROR_FAILURE_TIMEOUT)

and it would at least have made some sense.

As it is, you CANNOT read "tcpto.c" and make any sense of it. It's not 
sensible code. You have to know what the hell the whole thing is all 
about, and you have to actually try to figure out what the expression is 
to even understand the code.

Trust me. Try sometime. The code is horrible.

It wouldn't irritate me so much, but then the person who wrote that 
abomination has the galls to complain about sendmail, NFS, System V etc.

And it's not like qmail is so big and has such a long history that it 
would be a huge deal to comment it and write it more readably. But since 
the license doesn't allow for anything but patches, and patches would be 
totally unmaintainable if they tried to fix these kinds of things, it will 
never be done.

Which is a pity, since I actually _agree_ with you that qmail has some
good sides: it's small, it's efficient, and it's pretty modular. It has 
the _potential_ to be great code.

			Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06 17:20                     ` Patrick J. LoPresti
  2004-06-06 17:31                       ` Paul Rolland
  2004-06-06 17:43                       ` Davide Libenzi
@ 2004-06-06 18:17                       ` Rik van Riel
  2004-06-06 18:37                         ` Patrick J. LoPresti
  2 siblings, 1 reply; 54+ messages in thread
From: Rik van Riel @ 2004-06-06 18:17 UTC (permalink / raw)
  To: Patrick J. LoPresti; +Cc: Paul Rolland, linux-kernel

On 6 Jun 2004, Patrick J. LoPresti wrote:

> Sure, Bernstein's code is an unmaintainable nightmare.  But his
> designs and implementations are usually pretty sound.  The last
> release of qmail was in 1998, and we still use it where I work.

Ahhh, so YOU'RE the guy who's flooding the whole world
with bounces from spam and virusses to unknown users,
with the virusses still attached ? ;)))

-- 
"Debugging is twice as hard as writing the code in the first place.
Therefore, if you write the code as cleverly as possible, you are,
by definition, not smart enough to debug it." - Brian W. Kernighan


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06 18:17                       ` Rik van Riel
@ 2004-06-06 18:37                         ` Patrick J. LoPresti
  0 siblings, 0 replies; 54+ messages in thread
From: Patrick J. LoPresti @ 2004-06-06 18:37 UTC (permalink / raw)
  To: Rik van Riel; +Cc: Paul Rolland, linux-kernel

Rik van Riel <riel@redhat.com> writes:

> On 6 Jun 2004, Patrick J. LoPresti wrote:
> 
> > Sure, Bernstein's code is an unmaintainable nightmare.  But his
> > designs and implementations are usually pretty sound.  The last
> > release of qmail was in 1998, and we still use it where I work.
> 
> Ahhh, so YOU'RE the guy who's flooding the whole world
> with bounces from spam and virusses to unknown users,
> with the virusses still attached ? ;)))

OK, I admit we no longer use it as an external SMTP server.  The world
has changed a bit since 1998.  :-)

My main point still stands, though, which is that djb is crazy, not
stupid (i.e., a mathematician).

 - Pat

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06 16:57                   ` Linus Torvalds
@ 2004-06-06 18:53                     ` Simon Kirby
  2004-06-06 19:00                       ` Linus Torvalds
  0 siblings, 1 reply; 54+ messages in thread
From: Simon Kirby @ 2004-06-06 18:53 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Kernel Mailing List

On Sun, Jun 06, 2004 at 09:57:20AM -0700, Linus Torvalds wrote:

> 	/* Write the pid into the lockfile, fsync it */
> 	write(fd, name + 9, len - 9);
> 	fsync(fd);

Unrelated to this discussion -- and there is a close() missing -- but is
there any reason for fsync() to be there?  I've seen this often before,
but I've never understood why it would be necessary to force the data to
disk, especially when it will likely be removed later before it would
have otherwise been written to disk.  Shouldn't the lock file behave
properly without fsync(), even across NFS, and even across all OSes?

Simon-

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06 18:53                     ` Simon Kirby
@ 2004-06-06 19:00                       ` Linus Torvalds
  0 siblings, 0 replies; 54+ messages in thread
From: Linus Torvalds @ 2004-06-06 19:00 UTC (permalink / raw)
  To: Simon Kirby; +Cc: Kernel Mailing List



On Sun, 6 Jun 2004, Simon Kirby wrote:
> 
> Unrelated to this discussion -- and there is a close() missing -- but is
> there any reason for fsync() to be there?

It only matters if the lock is meaningful over an involuntary reboot (aka
crash). Usually it isn't. So yes, the fsync() is usually not necessary.

Sometimes the lock file may contain real data that is meaningful for 
recovery, though. The pid generally isn't, but it could point to a log 
that describes what the thing was about to do. Or it could just be helpful 
on next reboot when you have a stale lockfile, and you want to match that 
lockfile to any syslog messages or similar.

		Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-06 15:58       ` Arjan van de Ven
@ 2004-06-06 23:49         ` Russell Leighton
  2004-06-07 12:13           ` Arjan van de Ven
  2004-06-07  0:09         ` Russell Leighton
  1 sibling, 1 reply; 54+ messages in thread
From: Russell Leighton @ 2004-06-06 23:49 UTC (permalink / raw)
  To: arjanv; +Cc: Kernel Mailing List

Arjan van de Ven wrote:

> Note also that
>
>clone() is not a portable interface even within linux architectures.
>
>  
>
Really???!!! How so?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-06 15:58       ` Arjan van de Ven
  2004-06-06 23:49         ` Russell Leighton
@ 2004-06-07  0:09         ` Russell Leighton
  2004-06-07 12:20           ` Arjan van de Ven
  1 sibling, 1 reply; 54+ messages in thread
From: Russell Leighton @ 2004-06-07  0:09 UTC (permalink / raw)
  To: arjanv; +Cc: Kernel Mailing List

Arjan van de Ven wrote:

>On Sun, 2004-06-06 at 17:38, Russell Leighton wrote:
>  
>
>>I have a library that creates 2 threads using clone().
>>[NOTE: I can't use pthreads for a variety of reasons, mostly due
>>to the wacky signal handling rules...it turns out that using clone() is 
>>cleaner for me anyway.]
>>    
>>
>
>a library using clone sounds suspect to me, I can't imagine an app using
>pthreads being able to just use your library as a result.
>
Why? In  what way would a program that uses pthreads interfere with 
threads created using clone()?

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-06 15:44       ` Robert Love
@ 2004-06-07  0:20         ` Russell Leighton
  0 siblings, 0 replies; 54+ messages in thread
From: Russell Leighton @ 2004-06-07  0:20 UTC (permalink / raw)
  To: Robert Love; +Cc: Kernel Mailing List


Ah, the stack. Yes, I think that will work. You see, the functions that 
need this test
always take a pointer to a struct that holds all the thread info called 
'mon'. That info is the pid of the
thread , the ptr to the stack mmap'd and the size of the stack, I think 
I can change the test to:

{
   int32_t
     stackvar;

    /* if address of the stackvar is OUTSIDE the stack of handler 
thread, then
         you are not running this function from the handler thread */
   if ( &stackvar < mon->handler_q.thread->stack ||
         &stackvar > mon->handler_q.thread->stack + 
mon->handler_q.thread->stacksize - 1) {

        fprintf(stderr, "bad, bad, bad!\n");
        exit(-1);

    }

}

Would that work? If so, that is nice because no syscall.

Robert Love wrote:

>On Sun, 2004-06-06 at 11:38 -0400, Russell Leighton wrote:
>
>  
>
>>  Given a code library with some exported functions which CAN be 
>>executed outside a particular thread and others that MUST be executed in 
>>a particular thread, how can I efficiently prevent or trap using of 
>>these functions outside the proper execution context?
>>    
>>
>
>Are you sure you cannot use pthreads?  The new implementation (NPTL) has
>a lot of improvements, including saner signal handling behavior.
>
>If not, you probably are out of luck.  Best I can think of is somehow
>using thread-specific storage, but you would obviously need to index
>into it using something OTHER than the PID.  Which basically leaves you
>with the stack.  So, unless you could cache the PID in a local
>variable..
>
>  
>
>> Would gettid() be any better?
>>    
>>
>
>If you aren't using CLONE_THREAD, gettid() will just return the PID.
>And if you were using CLONE_THREAD, then getpid() would be worthless for
>you and you would have to use gettid().  Either way, though, they
>basically do the same thing (return current->tid vs. current->pid).
>
>	Robert Love
>
>
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to majordomo@vger.kernel.org
>More majordomo info at  http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at  http://www.tux.org/lkml/
>
>
>  
>


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-06 23:49         ` Russell Leighton
@ 2004-06-07 12:13           ` Arjan van de Ven
  2004-06-07 13:48             ` Sean Neakums
  0 siblings, 1 reply; 54+ messages in thread
From: Arjan van de Ven @ 2004-06-07 12:13 UTC (permalink / raw)
  To: Russell Leighton; +Cc: Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 261 bytes --]


On Sun, Jun 06, 2004 at 07:49:50PM -0400, Russell Leighton wrote:
> Arjan van de Ven wrote:
> 
> >Note also that
> >
> >clone() is not a portable interface even within linux architectures.
> >
> > 
> >
> Really???!!! How so?

for example ia64 doesn't have it.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-07  0:09         ` Russell Leighton
@ 2004-06-07 12:20           ` Arjan van de Ven
  0 siblings, 0 replies; 54+ messages in thread
From: Arjan van de Ven @ 2004-06-07 12:20 UTC (permalink / raw)
  To: Russell Leighton; +Cc: Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 456 bytes --]

On Sun, Jun 06, 2004 at 08:09:17PM -0400, Russell Leighton wrote:
> >a library using clone sounds suspect to me, I can't imagine an app using
> >pthreads being able to just use your library as a result.
> >
> Why? In  what way would a program that uses pthreads interfere with 
> threads created using clone()?

you do all kinds of tricks behind the threading library's back, and share
some stuff while not others. I'd be suprised of that would NOT break.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-07 12:13           ` Arjan van de Ven
@ 2004-06-07 13:48             ` Sean Neakums
  2004-06-07 14:00               ` Christoph Hellwig
  0 siblings, 1 reply; 54+ messages in thread
From: Sean Neakums @ 2004-06-07 13:48 UTC (permalink / raw)
  To: Arjan van de Ven; +Cc: Russell Leighton, Kernel Mailing List

Arjan van de Ven <arjanv@redhat.com> writes:

> On Sun, Jun 06, 2004 at 07:49:50PM -0400, Russell Leighton wrote:
>> Arjan van de Ven wrote:
>> 
>> >Note also that
>> >
>> >clone() is not a portable interface even within linux architectures.
>> >
>> > 
>> >
>> Really???!!! How so?
>
> for example ia64 doesn't have it.

Then what is the sys_clone2 implementation in arch/is64/kernel/entry.S for?


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-07 13:48             ` Sean Neakums
@ 2004-06-07 14:00               ` Christoph Hellwig
  2004-06-07 14:10                 ` Sean Neakums
  2004-06-07 18:42                 ` David Mosberger
  0 siblings, 2 replies; 54+ messages in thread
From: Christoph Hellwig @ 2004-06-07 14:00 UTC (permalink / raw)
  To: Arjan van de Ven, Russell Leighton, Kernel Mailing List

On Mon, Jun 07, 2004 at 02:48:31PM +0100, Sean Neakums wrote:
> > for example ia64 doesn't have it.
> 
> Then what is the sys_clone2 implementation in arch/is64/kernel/entry.S for?

It's clone with a slightly different calling convention.


^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-07 14:00               ` Christoph Hellwig
@ 2004-06-07 14:10                 ` Sean Neakums
  2004-06-07 18:42                 ` David Mosberger
  1 sibling, 0 replies; 54+ messages in thread
From: Sean Neakums @ 2004-06-07 14:10 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Arjan van de Ven, Russell Leighton, Kernel Mailing List

Christoph Hellwig <hch@infradead.org> writes:

> On Mon, Jun 07, 2004 at 02:48:31PM +0100, Sean Neakums wrote:
>> > for example ia64 doesn't have it.
>> 
>> Then what is the sys_clone2 implementation in arch/is64/kernel/entry.S for?
>
> It's clone with a slightly different calling convention.

Ah.  I misintereted Arjan as saying that ia64 didn't have a clone at
all, which would have been pretty wacky.  Sorry for the noise.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-06  6:07               ` Linus Torvalds
                                   ` (4 preceding siblings ...)
       [not found]                 ` <200406062022.54320.vda@port.imtp.ilyichevsk.odessa.ua>
@ 2004-06-07 18:20                 ` Bruce Guenter
  2004-06-08 11:06                   ` Kalin KOZHUHAROV
  5 siblings, 1 reply; 54+ messages in thread
From: Bruce Guenter @ 2004-06-07 18:20 UTC (permalink / raw)
  To: linux-kernel

[-- Attachment #1: Type: text/plain, Size: 1801 bytes --]

On Mon, Jun 07, 2004 at 11:35:23PM +0000, Linus Torvalds wrote:
> On Sun, 6 Jun 2004, Kalin KOZHUHAROV wrote:
> >
> > Well, not exactly sure about my reply, but let me try.
> > 
> > The other day I was debugging some config problems with my qmail instalation and I ended up doing:
> > # strace -p 4563 -f -F
> > ...
> > [pid 13097] read(3, "\347\374\375TBH~\342\233\337\220\302l\220\317\237\37\25"..., 32) = 32
> > [pid 13097] close(3)                    = 0
> > [pid 13097] getpid()                    = 13097
> > [pid 13097] getpid()                    = 13097
> > [pid 13097] getuid32()                  = 89
> > [pid 13097] getpid()                    = 13097
> > [pid 13097] time(NULL)                  = 1086497450
> > [pid 13097] getpid()                    = 13097
> > [pid 13097] getpid()                    = 13097
> > [pid 13097] getpid()                    = 13097
> 
> qmail is a piece of crap. The source code is completely unreadable, and it 
> seems to think that "getpid()" is a good source of random data. Don't ask 
> me why.

In this case, however, it has nothing directly to do with qmail.  This
is tcpserver, and tcpserver only uses getpid for two things: printing
out status lines with the PID in them (which seems perfectly valid to
me), and once when adding to random initializer for DNS requests.

This strace pattern seemed rather odd to me, so for comparison I straced
my own tcpserver setups, and could not get them to produce more than two
getpid calls per connection.  Something is wrong with this trace,
possibly some weirdness in a patch, like whatever the SSL library is
doing.
-- 
Bruce Guenter <bruceg@em.ca> http://em.ca/~bruceg/ http://untroubled.org/
OpenPGP key: 699980E8 / D0B7 C8DD 365D A395 29DA  2E2A E96F B2DC 6999 80E8

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-07 14:00               ` Christoph Hellwig
  2004-06-07 14:10                 ` Sean Neakums
@ 2004-06-07 18:42                 ` David Mosberger
  2004-06-07 23:02                   ` Russell Leighton
  1 sibling, 1 reply; 54+ messages in thread
From: David Mosberger @ 2004-06-07 18:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: Arjan van de Ven, Russell Leighton, Kernel Mailing List

>>>>> On Mon, 7 Jun 2004 15:00:09 +0100, Christoph Hellwig <hch@infradead.org> said:

  Christoph> On Mon, Jun 07, 2004 at 02:48:31PM +0100, Sean Neakums
  Christoph> wrote:
  >> > for example ia64 doesn't have it.

  >> Then what is the sys_clone2 implementation in
  >> arch/is64/kernel/entry.S for?

  Christoph> It's clone with a slightly different calling convention.

Note that the only difference is that the stack-area is expressed as a
range (starting-address + size), rather than a direct stack-pointer
value.  IMHO, it was a mistake to not do it that way right from the
beginning (consider that different arches grow stacks in different
directions, for example).

	--david

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-07 18:42                 ` David Mosberger
@ 2004-06-07 23:02                   ` Russell Leighton
  2004-06-07 23:27                     ` David Mosberger
  2004-06-08  6:01                     ` Arjan van de Ven
  0 siblings, 2 replies; 54+ messages in thread
From: Russell Leighton @ 2004-06-07 23:02 UTC (permalink / raw)
  To: davidm; +Cc: Christoph Hellwig, Arjan van de Ven, Kernel Mailing List

David Mosberger wrote:

>>>>>>On Mon, 7 Jun 2004 15:00:09 +0100, Christoph Hellwig <hch@infradead.org> said:
>>>>>>            
>>>>>>
>
>  Christoph> On Mon, Jun 07, 2004 at 02:48:31PM +0100, Sean Neakums
>  Christoph> wrote:
>  >> > for example ia64 doesn't have it.
>
>  >> Then what is the sys_clone2 implementation in
>  >> arch/is64/kernel/entry.S for?
>
>  Christoph> It's clone with a slightly different calling convention.
>
>Note that the only difference is that the stack-area is expressed as a
>range (starting-address + size), rather than a direct stack-pointer
>value.  IMHO, it was a mistake to not do it that way right from the
>beginning (consider that different arches grow stacks in different
>directions, for example).
>
>	
>
So Ia64 does have it..that's good. Does glibc wrap it?

I agree with the above...could glibc's clone() should have a size added? 
Then the arch specific stack issues
could be hidden.

BTW, does gcc have a built-in #define like __STACK_GROWSUP__ that would 
allow one to deal with the missing size parameter
when you called clone() by adjusting what you passed with and #ifdef?.

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-07 23:02                   ` Russell Leighton
@ 2004-06-07 23:27                     ` David Mosberger
  2004-06-08  6:01                     ` Arjan van de Ven
  1 sibling, 0 replies; 54+ messages in thread
From: David Mosberger @ 2004-06-07 23:27 UTC (permalink / raw)
  To: Russell Leighton
  Cc: davidm, Christoph Hellwig, Arjan van de Ven, Kernel Mailing List

>>>>> On Mon, 07 Jun 2004 19:02:34 -0400, Russell Leighton <russ@elegant-software.com> said:

  Russell> So Ia64 does have it..that's good. Does glibc wrap it?

Here is how it works at the user-level:

 - There _is_ a clone(), with the original interface.  However,
   this version only works when you create a new address-space.

 - There is clone2(), which adds the extra "size" argument.  This
   one works for all cases.

  Russell> I agree with the above...could glibc's clone() should have
  Russell> a size added?  Then the arch specific stack issues could be
  Russell> hidden.

In my opinion, it would make sense for all platforms to support
clone2(), since it's more in line with the normal UNIX-convention of
specifying stacks as a memory-range (e.g., see stack_t).  So far, the
interest in doing this has been lack-luster (and, IIRC, Linus was
against it in the past, so I haven't spent a lot of effort on it).

	--david

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-07 23:02                   ` Russell Leighton
  2004-06-07 23:27                     ` David Mosberger
@ 2004-06-08  6:01                     ` Arjan van de Ven
  2004-06-08  9:48                       ` Eric W. Biederman
  1 sibling, 1 reply; 54+ messages in thread
From: Arjan van de Ven @ 2004-06-08  6:01 UTC (permalink / raw)
  To: Russell Leighton; +Cc: davidm, Christoph Hellwig, Kernel Mailing List

[-- Attachment #1: Type: text/plain, Size: 433 bytes --]

On Mon, Jun 07, 2004 at 07:02:34PM -0400, Russell Leighton wrote:
> >
> So Ia64 does have it..that's good. Does glibc wrap it?
> 
> I agree with the above...could glibc's clone() should have a size added? 
> Then the arch specific stack issues
> could be hidden.

glibc doesn't provide clone other than a raw syscall wrapper, under the
assumption that when you want threads, you'll use it's thread creation call.
Not too unfair imo.

[-- Attachment #2: Type: application/pgp-signature, Size: 189 bytes --]

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-08  6:01                     ` Arjan van de Ven
@ 2004-06-08  9:48                       ` Eric W. Biederman
  0 siblings, 0 replies; 54+ messages in thread
From: Eric W. Biederman @ 2004-06-08  9:48 UTC (permalink / raw)
  To: Arjan van de Ven
  Cc: Russell Leighton, davidm, Christoph Hellwig, Kernel Mailing List

Arjan van de Ven <arjanv@redhat.com> writes:

> On Mon, Jun 07, 2004 at 07:02:34PM -0400, Russell Leighton wrote:
> > >
> > So Ia64 does have it..that's good. Does glibc wrap it?
> > 
> > I agree with the above...could glibc's clone() should have a size added? 
> > Then the arch specific stack issues
> > could be hidden.
> 
> glibc doesn't provide clone other than a raw syscall wrapper, under the
> assumption that when you want threads, you'll use it's thread creation call.
> Not too unfair imo.

That fn parameter certainly more than a raw wrapper.  I do agree that
what is needed is a fairly raw wrapper though. 

I don't see how creating a clone2 wrapper that drops the extra argument
on platforms that don't use it is any different than what glibc already
does though.

Eric



^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: clone() <-> getpid() bug in 2.6?
  2004-06-07 18:20                 ` Bruce Guenter
@ 2004-06-08 11:06                   ` Kalin KOZHUHAROV
  0 siblings, 0 replies; 54+ messages in thread
From: Kalin KOZHUHAROV @ 2004-06-08 11:06 UTC (permalink / raw)
  To: Bruce Guenter; +Cc: linux-kernel

Bruce Guenter wrote:
> On Mon, Jun 07, 2004 at 11:35:23PM +0000, Linus Torvalds wrote:
> 
>>On Sun, 6 Jun 2004, Kalin KOZHUHAROV wrote:
>>
>>>Well, not exactly sure about my reply, but let me try.
>>>
>>>The other day I was debugging some config problems with my qmail instalation and I ended up doing:
>>># strace -p 4563 -f -F
>>>...
>>>[pid 13097] read(3, "\347\374\375TBH~\342\233\337\220\302l\220\317\237\37\25"..., 32) = 32
>>>[pid 13097] close(3)                    = 0
>>>[pid 13097] getpid()                    = 13097
>>>[pid 13097] getpid()                    = 13097
>>>[pid 13097] getuid32()                  = 89
>>>[pid 13097] getpid()                    = 13097
>>>[pid 13097] time(NULL)                  = 1086497450
>>>[pid 13097] getpid()                    = 13097
>>>[pid 13097] getpid()                    = 13097
>>>[pid 13097] getpid()                    = 13097
>>
>>qmail is a piece of crap. The source code is completely unreadable, and it 
>>seems to think that "getpid()" is a good source of random data. Don't ask 
>>me why.
> 
> 
> In this case, however, it has nothing directly to do with qmail.  This
> is tcpserver, and tcpserver only uses getpid for two things: printing
> out status lines with the PID in them (which seems perfectly valid to
> me), and once when adding to random initializer for DNS requests.
> 
> This strace pattern seemed rather odd to me, so for comparison I straced
> my own tcpserver setups, and could not get them to produce more than two
> getpid calls per connection.  Something is wrong with this trace,
> possibly some weirdness in a patch, like whatever the SSL library is
> doing.

Yes, it is strange. I'll have a look at the applied patches. Without SSL there is no such getpidding :-)
Will post you (not LKML) if I find the culprit.

Kalin.

-- 
||///_ o  *****************************
||//'_/>     WWW: http://ThinRope.net/
|||\/<" 
|||\\ ' 
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-06 17:19       ` Linus Torvalds
@ 2004-06-12  9:15         ` Dominik Straßer
  2004-06-12 13:47           ` Linus Torvalds
  0 siblings, 1 reply; 54+ messages in thread
From: Dominik Straßer @ 2004-06-12  9:15 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Russell Leighton, Kernel Mailing List

Linus Torvalds wrote:

>On Sun, 6 Jun 2004, Russell Leighton wrote:
>  
>
>>Linus said he could not imagine a program using getpid() more than a 
>>handful of times...
>>    
>>
>
>No, I said that I could not imagine using it more than a handful of times 
>_except_ for the cases of:
>
> - thread identification without a native thread area
> - benchmarking.
>
>(and in both of these cases it is _wrong_ to cache the pid value)
>
>  
>
>>well, (I am ashamed to admit it) I found this getpid() bug with just 
>>such a program.
>>
>>Could someone can suggest an alternative to using getpid() for me? 
>>Here's the problem...
>>
>>I have a library that creates 2 threads using clone().
>>    
>>
>
>Your problem falls under the thread ID thing. It's fine and understandable 
>to use getpid() for that, although you could probably do it faster if you 
>are willing to use the support that modern kernels give you and that glibc 
>uses: the "TLS" aka Thread Local Storage support.
>
>Thread-local storage involved having a user-mode register that points some 
>way to a special part of the address space. On x86, where the general 
>register set is very limited and stealing a general reg would thus be bad, 
>it uses a segment and loads the TLS pointer into a segment register 
>(segment registers are registers too - and since nobody sane uses them for 
>anything else these days, both %gs and %fs are freely usable).
>
>  
>
>> Would gettid() be any better?
>>    
>>
>
>You'd avoid this particular glibc bug with gettid.
>  
>
I am facing the following problem:
I want to sum up the time spent in the main thread + all threads that 
ever existed.
getrusage dosn't work (and didn't do so in pre-NPTL-times) as the time 
spent in threads is not taken into account.
To work around this problem I created a map pid->time used which used 
getpid in the pre-NPTL-time and looked up the time in /proc/<pid>. As 
this doesn't work with NPTL, changed it to use the gettid syscall as I 
didn't find a saner way.

Regards

Dominik

^ permalink raw reply	[flat|nested] 54+ messages in thread

* Re: Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?]
  2004-06-12  9:15         ` Dominik Straßer
@ 2004-06-12 13:47           ` Linus Torvalds
  0 siblings, 0 replies; 54+ messages in thread
From: Linus Torvalds @ 2004-06-12 13:47 UTC (permalink / raw)
  To: Dominik Straßer; +Cc: Russell Leighton, Kernel Mailing List, Ingo Molnar



On Sat, 12 Jun 2004, Dominik Straßer wrote:
>
> I am facing the following problem:
> I want to sum up the time spent in the main thread + all threads that 
> ever existed.
> getrusage dosn't work (and didn't do so in pre-NPTL-times) as the time 
> spent in threads is not taken into account.

Hmm.. That's likely a bug. It definitely should work, but I guess the 
self-reaping ends up meaning that the time never gets percolated to the 
parent any more.

Ingo, any comments/ideas?

> To work around this problem I created a map pid->time used which used 
> getpid in the pre-NPTL-time and looked up the time in /proc/<pid>. As 
> this doesn't work with NPTL, changed it to use the gettid syscall as I 
> didn't find a saner way.

Changing it to gettid sounds like the right thing to do, but I also think 
that you shouldn't _need_ to do things like this.

		Linus

^ permalink raw reply	[flat|nested] 54+ messages in thread

end of thread, other threads:[~2004-06-12 13:48 UTC | newest]

Thread overview: 54+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-06-05 15:28 clone() <-> getpid() bug in 2.6? Russell Leighton
2004-06-05 20:45 ` Linus Torvalds
2004-06-05 20:55   ` Arjan van de Ven
2004-06-05 21:13     ` Linus Torvalds
2004-06-05 21:48       ` Robert Love
2004-06-05 22:44         ` Linus Torvalds
2004-06-05 21:53     ` Chris Wedgwood
2004-06-05 22:47       ` Robert Love
2004-06-05 22:57         ` David S. Miller
2004-06-05 23:01         ` Linus Torvalds
2004-06-05 23:07         ` Davide Libenzi
2004-06-05 23:18           ` Linus Torvalds
2004-06-05 23:26             ` Davide Libenzi
2004-06-06  5:08             ` Kalin KOZHUHAROV
2004-06-06  5:13               ` Chris Wedgwood
2004-06-06  5:34                 ` Kalin KOZHUHAROV
2004-06-06  6:07               ` Linus Torvalds
2004-06-06  6:43                 ` Kalin KOZHUHAROV
2004-06-06  7:57                 ` Erik Andersen
2004-06-06 16:57                   ` Linus Torvalds
2004-06-06 18:53                     ` Simon Kirby
2004-06-06 19:00                       ` Linus Torvalds
2004-06-06  9:52                 ` Bernd Eckenfels
2004-06-06 13:07                   ` Paul Rolland
2004-06-06 17:20                     ` Patrick J. LoPresti
2004-06-06 17:31                       ` Paul Rolland
2004-06-06 17:43                       ` Davide Libenzi
2004-06-06 18:17                       ` Rik van Riel
2004-06-06 18:37                         ` Patrick J. LoPresti
2004-06-06 16:33                 ` chris
     [not found]                 ` <200406062022.54320.vda@port.imtp.ilyichevsk.odessa.ua>
2004-06-06 17:55                   ` Linus Torvalds
2004-06-07 18:20                 ` Bruce Guenter
2004-06-08 11:06                   ` Kalin KOZHUHAROV
2004-06-05 23:19           ` Robert Love
2004-06-06 14:29   ` Russell Leighton
2004-06-06 15:38     ` Using getpid() often, another way? [was Re: clone() <-> getpid() bug in 2.6?] Russell Leighton
2004-06-06 15:44       ` Robert Love
2004-06-07  0:20         ` Russell Leighton
2004-06-06 15:58       ` Arjan van de Ven
2004-06-06 23:49         ` Russell Leighton
2004-06-07 12:13           ` Arjan van de Ven
2004-06-07 13:48             ` Sean Neakums
2004-06-07 14:00               ` Christoph Hellwig
2004-06-07 14:10                 ` Sean Neakums
2004-06-07 18:42                 ` David Mosberger
2004-06-07 23:02                   ` Russell Leighton
2004-06-07 23:27                     ` David Mosberger
2004-06-08  6:01                     ` Arjan van de Ven
2004-06-08  9:48                       ` Eric W. Biederman
2004-06-07  0:09         ` Russell Leighton
2004-06-07 12:20           ` Arjan van de Ven
2004-06-06 17:19       ` Linus Torvalds
2004-06-12  9:15         ` Dominik Straßer
2004-06-12 13:47           ` Linus Torvalds

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox