* Invisible threads in 2.6
From: lm240504 @ 2004-05-25  2:21 UTC
  To: linux-kernel

I've been experimenting with process/thread accounting in 2.6.x,
and found this strange situation: if the leader thread of a multi-threaded
process terminates, the other threads become undetectable.  After the
main thread becomes a zombie, /proc/<tgid>/task returns ENOENT on
open.  If you happen to know the TID, you can access /proc/<tid>/* directly,
but otherwise, there is no way to observe the remaining threads, as far as
I can see.  Consider this program, for example:

#include <pthread.h>

void *run(void *arg)
{
        for(;;);
}

int main()
{
        pthread_t t;
        int i;
        for (i = 0; i < 10; ++i)
                pthread_create(&t, NULL, run, NULL);
        pthread_exit(NULL);
}

When I run it, the system (predictably) goes to ~100% CPU utilization,
but there seems to be no way to find out who is hogging the CPU with
top(1), ps(1), or anything else.  All they can show is the main thread in
zombie state, consuming 0% CPU.

I'm not sure how to fix this (the pid_alive() test seems to be there for a
reason), but it doesn't seem right.  Any thoughts?
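
The symptom can also be checked from a program rather than with ls(1).
What follows is only a minimal sketch, assuming a 2.6-style /proc that
exposes a task/ subdirectory per thread group; list_tasks is a hypothetical
helper name, not an existing API.

```c
#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <sys/types.h>

/* Print every TID listed under /proc/<tgid>/task.
 * Returns the number of entries printed, or -errno if the
 * directory cannot be opened (e.g. -ENOENT). */
int list_tasks(pid_t tgid)
{
        char path[64];
        int n = snprintf(path, sizeof path, "/proc/%ld/task", (long)tgid);
        assert(n > 0 && (size_t)n < sizeof path);

        DIR *d = opendir(path);
        if (!d)
                return -errno;

        int count = 0;
        struct dirent *e;
        while ((e = readdir(d)) != NULL) {
                if (e->d_name[0] == '.')        /* skip "." and ".." */
                        continue;
                puts(e->d_name);
                ++count;
        }
        closedir(d);
        return count;
}
```

Going by the report above, calling list_tasks() with the TGID of the test
program after its leader has exited would be expected to return -ENOENT on
the affected kernels, even though the other threads are still running.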


* Re: Invisible threads in 2.6
From: Martin Zwickel @ 2004-05-25  7:03 UTC
  To: lm240504; +Cc: linux-kernel

On Tue, 25 May 2004 02:21:19 +0000
lm240504@comcast.net bubbled:

> I've been experimenting with process/thread accounting in 2.6.x,
> and found this strange situation: if the leader thread of a multi-threaded
> process terminates, the other threads become undetectable.  After the
> main thread becomes a zombie, /proc/<tgid>/task returns ENOENT on
> open.  If you happen to know the TID, you can access /proc/<tid>/* directly,
> but otherwise, there is no way to observe the remaining threads, as far as
> I can see.  Consider this program, for example:
> 
> #include <pthread.h>
> 
> void *run(void *arg)
> {
>         for(;;);
> }
> 
> int main()
> {
>         pthread_t t;
>         int i;
>         for (i = 0; i < 10; ++i)
>                 pthread_create(&t, NULL, run, NULL);
>         pthread_exit(NULL);
> }
> 
> When I run it, the system (predictably) goes to ~100% CPU utilization,
> but there seems to be no way to find out who is hogging the CPU with
> top(1), ps(1), or anything else.  All they can show is the main thread in
> zombie state, consuming 0% CPU.
> 
> I'm not sure how to fix this (the pid_alive() test seems to be there for a
> reason), but it doesn't seem right.  Any thoughts?

my kernel:
# cat /proc/version 
Linux version 2.6.6-rc3-mm2 (root@phoebee) (gcc version 3.3.2 20031218 (Gentoo
Linux 3.3.2-r5, propolice-3.3-7)) #6 Fri May 7 10:56:06 CEST 2004

I just compiled your example and ran it:
# ./thread_test 

# ps axw
...
12069 pts/175  S+     0:00 ./thread_test
12070 pts/175  S+     0:00 ./thread_test
12071 pts/175  R+     0:06 ./thread_test
12072 pts/175  R+     0:06 ./thread_test
12073 pts/175  R+     0:06 ./thread_test
12074 pts/175  R+     0:06 ./thread_test
12075 pts/175  R+     0:06 ./thread_test
12076 pts/175  R+     0:06 ./thread_test
12077 pts/175  R+     0:06 ./thread_test
12078 pts/175  R+     0:06 ./thread_test
12079 pts/175  R+     0:06 ./thread_test
12080 pts/175  R+     0:06 ./thread_test
...

# top
  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND           
12072 root      25   0 83736  420 1380 R 12.1  0.1   0:16.94 thread_test       
12075 root      25   0 83736  420 1380 R 12.1  0.1   0:16.93 thread_test       
12073 root      25   0 83736  420 1380 R 11.0  0.1   0:16.92 thread_test       
12074 root      25   0 83736  420 1380 R 11.0  0.1   0:16.92 thread_test       
12076 root      25   0 83736  420 1380 R 11.0  0.1   0:16.82 thread_test       
12077 root      25   0 83736  420 1380 R 11.0  0.1   0:16.87 thread_test       
12078 root      25   0 83736  420 1380 R 11.0  0.1   0:16.84 thread_test       
12071 root      25   0 83736  420 1380 R  9.9  0.1   0:16.95 thread_test       
12079 root      25   0 83736  420 1380 R  7.7  0.1   0:16.80 thread_test       
...

On my -mm patched kernel I can see them.

Regards,
Martin
-- 
MyExcuse:
piezo-electric interference

Martin Zwickel <martin.zwickel@technotrend.de>
Research & Development

TechnoTrend AG <http://www.technotrend.de>


* Re: Invisible threads in 2.6
From: lm240504 @ 2004-05-25 18:51 UTC
  To: Martin Zwickel; +Cc: linux-kernel


> my kernel:
> # cat /proc/version 
> Linux version 2.6.6-rc3-mm2 (root@phoebee) (gcc version 3.3.2 20031218 (Gentoo
> Linux 3.3.2-r5, propolice-3.3-7)) #6 Fri May 7 10:56:06 CEST 2004
> 
> I just compiled your example and ran it:
> # ./thread_test 
> 
<snip>
> On my -mm patched kernel I can see them.

I tried 2.6.6-rc3-mm2, and didn't see any difference:

# cat /proc/version
Linux version 2.6.6-rc3-mm2 (lmakhlis@levlinux) (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)) #3 SMP Tue May 25 14:04:28 EDT 2004
                                                                                
# ./thread_test &
[749]
                                                                                
# ls /proc/749/task
ls: /proc/749/task: No such file or directory
                                                                                
# ps axw
...
  749 tty1     Z      0:00 [thread_test <defunct>]
...

I have now tested it on Fedora Core 2 (2.6.5), SLES 9 Beta (2.6.5), and RHL 9
with 2.6.6-rc3-mm2, with identical results.  Could it have anything to do with
which thread library the program is using?  Here's mine:

# ldd ./thread_test
        libpthread.so.0 => /lib/tls/libpthread.so.0 (0x40028000)
        libc.so.6 => /lib/tls/libc.so.6 (0x42000000)
        /lib/ld-linux.so.2 => /lib/ld-linux.so.2 (0x40000000)
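
Besides ldd, the thread library can be queried at run time, which makes it
easier to compare two machines.  A small sketch, assuming a glibc that
provides the _CS_GNU_LIBPTHREAD_VERSION confstr() name; thread_library is a
hypothetical helper name, and on other C libraries it simply reports failure.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Copy the pthread implementation's version string into buf.
 * Returns 0 on success, -1 if it cannot be determined or does not fit. */
int thread_library(char *buf, size_t len)
{
        assert(buf != NULL && len > 0);
#ifdef _CS_GNU_LIBPTHREAD_VERSION
        /* confstr() returns the size needed, including the trailing NUL,
         * or 0 if the name is not supported. */
        size_t n = confstr(_CS_GNU_LIBPTHREAD_VERSION, buf, len);
        if (n == 0 || n > len)
                return -1;
        return 0;
#else
        /* Assumption: without this glibc extension there is no portable query. */
        return -1;
#endif
}
```

On glibc the resulting string names the implementation (NPTL or
LinuxThreads) along with a version number, so printing it on both systems
would show whether the two test runs used the same thread library.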

# strace ./thread_test
<see attachment>

Lev




[-- Attachment #2: strace.out --]
[-- Type: application/octet-stream, Size: 6910 bytes --]

execve("./thread_test", ["./thread_test"], [/* 11 vars */]) = 0
uname({sys="Linux", node="levlinux", ...}) = 0
brk(0)                                  = 0x804a000
open("/etc/ld.so.preload", O_RDONLY)    = -1 ENOENT (No such file or directory)
open("/etc/ld.so.cache", O_RDONLY)      = 3
fstat64(3, {st_mode=S_IFREG|0644, st_size=69902, ...}) = 0
old_mmap(NULL, 69902, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40016000
close(3)                                = 0
open("/lib/tls/libpthread.so.0", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0p?\0\000"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=80592, ...}) = 0
old_mmap(NULL, 54612, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x40028000
old_mmap(0x40033000, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0xa000) = 0x40033000
old_mmap(0x40034000, 5460, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x40034000
close(3)                                = 0
open("/lib/tls/libc.so.6", O_RDONLY)    = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0\360W\1"..., 512) = 512
fstat64(3, {st_mode=S_IFREG|0755, st_size=1539996, ...}) = 0
old_mmap(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40036000
old_mmap(0x42000000, 1267276, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x42000000
old_mmap(0x42130000, 12288, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x130000) = 0x42130000
old_mmap(0x42133000, 9804, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x42133000
close(3)                                = 0
set_thread_area({entry_number:-1 -> 6, base_addr:0x400368a0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 0
munmap(0x40016000, 69902)               = 0
set_tid_address(0x400368e8)             = 729
rt_sigaction(SIGRTMIN, {0x4002bed0, [], SA_RESTORER, 0x400318f8}, NULL, 8) = 0
rt_sigprocmask(SIG_UNBLOCK, [RTMIN], NULL, 8) = 0
rt_sigprocmask(SIG_BLOCK, [33], NULL, 8) = 0
getrlimit(0x3, 0xbffffd14)              = 0
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40037000
brk(0)                                  = 0x804a000
brk(0x804b000)                          = 0x804b000
brk(0)                                  = 0x804b000
mprotect(0x40037000, 4096, PROT_NONE)   = 0
clone(child_stack=0x40837a90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [730], {entry_number:6, base_addr:0x40837b30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 730
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x40838000
mprotect(0x40838000, 4096, PROT_NONE)   = 0
clone(child_stack=0x41038a90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [731], {entry_number:6, base_addr:0x41038b30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 731
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x41039000
mprotect(0x41039000, 4096, PROT_NONE)   = 0
clone(child_stack=0x41839a90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [732], {entry_number:6, base_addr:0x41839b30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 732
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x42136000
mprotect(0x42136000, 4096, PROT_NONE)   = 0
clone(child_stack=0x42936a90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [733], {entry_number:6, base_addr:0x42936b30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 733
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x42937000
mprotect(0x42937000, 4096, PROT_NONE)   = 0
clone(child_stack=0x43137a90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [734], {entry_number:6, base_addr:0x43137b30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 734
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x43138000
mprotect(0x43138000, 4096, PROT_NONE)   = 0
clone(child_stack=0x43938a90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [735], {entry_number:6, base_addr:0x43938b30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 735
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x43939000
mprotect(0x43939000, 4096, PROT_NONE)   = 0
clone(child_stack=0x44139a90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [736], {entry_number:6, base_addr:0x44139b30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 736
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4413a000
mprotect(0x4413a000, 4096, PROT_NONE)   = 0
clone(child_stack=0x4493aa90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [737], {entry_number:6, base_addr:0x4493ab30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 737
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4493b000
mprotect(0x4493b000, 4096, PROT_NONE)   = 0
clone(child_stack=0x4513ba90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [738], {entry_number:6, base_addr:0x4513bb30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 738
mmap2(NULL, 8392704, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4513c000
mprotect(0x4513c000, 4096, PROT_NONE)   = 0
clone(child_stack=0x4593ca90, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, [739], {entry_number:6, base_addr:0x4593cb30, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}) = 739
_exit(0)                                = ?
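
The clone() calls in the trace return TIDs 730 through 739, right after the
leader's 729.  Since /proc/<tid>/ is reported to stay reachable when the TID
is known, one possible workaround is to probe a range of candidate TIDs and
keep those whose status file names the leader's thread group.  This is a
rough sketch only, assuming /proc/<tid>/status carries a "Tgid:" line;
tgid_of and find_threads are hypothetical helper names.

```c
#include <assert.h>
#include <stdio.h>
#include <sys/types.h>

/* Return the Tgid recorded in /proc/<tid>/status, or -1 if the task
 * does not exist or no Tgid line is found. */
pid_t tgid_of(pid_t tid)
{
        char path[64], line[256];
        long tgid = -1;
        int n = snprintf(path, sizeof path, "/proc/%ld/status", (long)tid);
        assert(n > 0 && (size_t)n < sizeof path);

        FILE *f = fopen(path, "r");
        if (!f)
                return -1;
        while (fgets(line, sizeof line, f)) {
                if (sscanf(line, "Tgid: %ld", &tgid) == 1)
                        break;
        }
        fclose(f);
        return (pid_t)tgid;
}

/* Probe every TID in [lo, hi] and print those belonging to thread
 * group tgid.  Returns the number of matching tasks. */
int find_threads(pid_t tgid, pid_t lo, pid_t hi)
{
        int found = 0;
        for (pid_t tid = lo; tid <= hi; ++tid) {
                if (tgid_of(tid) == tgid) {
                        printf("%ld\n", (long)tid);
                        ++found;
                }
        }
        return found;
}
```

For the run traced above, find_threads(729, 730, 739) would be the natural
call.  Probing a wide TID range is slow and racy, so this is a diagnostic
aid, not a substitute for a working /proc/<tgid>/task.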

