* trinity seems not to reap all childs
From: Toralf Förster @ 2014-08-09 16:35 UTC
To: trinity
For the last few days I have observed that, in a 32-bit Gentoo UML guest, one trinity job sometimes survives even though all of its parents are already gone.
The console output on the host system is:
[main] Bailing main loop because Completed maximum number of operations..
[watchdog] [2604] Watchdog exiting because Completed maximum number of operations..
[init] Ran 100001 syscalls. Successes: 21199 Failures: 78802
ps shows that one job is still running in the guest:
$ ssh tfoerste@trinity "ps fx -eo pid,start_time,command | grep -e trinity -e sleep | grep -v grep"
2723 17:55 trinity -C 2 -N 100000 -x mremap -q -V /mnt/ramdisk/victims/v1/v2
Logging into the UML guest and looking at the trinity files gives:
tfoerste@trinity ~/t3 $ ls -l
total 480
-rw-r--r-- 1 tfoerste users 253943 Aug 9 18:15 trinity-child0.log
-rw-r--r-- 1 tfoerste users 206229 Aug 9 18:15 trinity-child1.log
-rw-r--r-- 1 tfoerste users 19077 Aug 9 18:15 trinity.log
-rw-r-S-wt 1 tfoerste users 11 Aug 9 18:15 trinity-testfile1
-rwSr--rwT 1 tfoerste users 0 Aug 9 17:49 trinity-testfile2
-rwx--Srwx 1 tfoerste users 1110 Aug 9 18:15 trinity-testfile3
------xrwt 1 tfoerste users 16384 Aug 9 18:15 trinity-testfile4
tfoerste@trinity ~/t3 $ tail -vf *log
==> trinity-child0.log <==
[child0:3003] [2727] unlinkat(dfd=175, pathname="/mnt/ramdisk/victims/v1/v2/f92", flag=0x1f3ff000) = -1 (Invalid argument)
[child0:3003] [2728] eventfd2(count=19, flags=0x1) = 336
[child0:3003] [2729] getpid() = 3003
[child0:3003] [2730] fcntl(fd=175, cmd=0xc, arg=0xf6abffff) = -1 (Bad file descriptor)
[child0:3003] [2731] swapon(path="/mnt/ramdisk/victims/v1/v2/d52", swap_flags=0x18000) = -1 (Operation not permitted)
[child0:3003] [2732] olduname(name=0xc0100220) = -1 (Bad address)
[child0:3003] [2733] getpriority(which=0x0, who=3012) = 1
[child0:3003] [2734] mlock(addr=0x40499000, len=0x100000) = -1 (Cannot allocate memory)
[child0:3003] [2735] time(tloc=0x0) = 0x53e6493b
[child0:3003] [2736] epoll_pwait(epfd=20, events=0x85c0340, maxevents=0xffff0000, timeout=0xff0067aa) = -1 (Invalid argument)
==> trinity-child1.log <==
[child1:3012] [2225] flock(fd=18, cmd=0xf41fffff) = 0
[child1:3012] [2226] getpgid(pid=1) = 1
[child1:3012] [2227] mq_notify(mqdes=175, u_notification=0x4) = -1 (Bad address)
[child1:3012] [2228] msync(start=0x40299000, len=0x100000, flags=0x4) = 0
[child1:3012] [2229] mkdirat(dfd=175, pathname="/mnt/ramdisk/victims/v1/v2/f58", mode=0) = -1 (Permission denied)
[child1:3012] [2230] write(fd=236, buf=0xffffff9b, count=1) = -1 (Bad file descriptor)
[child1:3012] [2231] fsync(fd=210) = 0
[child1:3012] [2232] rt_sigpending(set=0x80e4000[page_rand], sigsetsize=0xff6094ab) = -1 (Invalid argument)
[child1:3012] [2233] tee(fdin=7, fdout=4, len=0x6800, flags=0x3) = -1 (Bad file descriptor)
[child1:3012] [2234] utimes(filename="/mnt/ramdisk/victims/v1/v2/f12", utimes=0x1) = -1 (Bad address)
==> trinity.log <==
[watchdog] 30087 iterations. [F:23623 S:6463 HI:9706]
[watchdog] 40138 iterations. [F:31443 S:8694 HI:9706]
[watchdog] 50215 iterations. [F:39370 S:10844 HI:9706]
[watchdog] 60221 iterations. [F:47228 S:12992 HI:9706]
[watchdog] 70225 iterations. [F:55100 S:15124 HI:9706]
[watchdog] 80278 iterations. [F:63007 S:17270 HI:9706]
[watchdog] 90287 iterations. [F:71013 S:19273 HI:9706]
[main] Bailing main loop because Completed maximum number of operations..
[watchdog] [2604] Watchdog exiting because Completed maximum number of operations..
[init] Ran 100001 syscalls. Successes: 21199 Failures: 78802
^C
tfoerste@trinity ~/t3 $ sudo tail -vf *test*
==> trinity-testfile1 <==
==> trinity-testfile2 <==
==> trinity-testfile3 <==
EF ���P~zeoP ����2 ▒2.6.56#2 Mon Aug 4 21:47:01 CEST 2014i686(none)
==> trinity-testfile4 <==
^C
Fortunately, killing the job helped:
$ ssh tfoerste@trinity kill 2723
--
Toralf
* Re: trinity seems not to reap all childs
From: Dave Jones @ 2014-08-13 15:02 UTC
To: Toralf Förster; +Cc: trinity
On Sat, Aug 09, 2014 at 06:35:45PM +0200, Toralf Förster wrote:
> For the last few days I have observed that, in a 32-bit Gentoo UML guest, one trinity job sometimes survives even though all of its parents are already gone.
>
>
> The console output on the host system is:
>
> [main] Bailing main loop because Completed maximum number of operations..
> [watchdog] [2604] Watchdog exiting because Completed maximum number of operations..
> [init] Ran 100001 syscalls. Successes: 21199 Failures: 78802
>
>
> ps shows that one job is still running in the guest:
>
> $ ssh tfoerste@trinity "ps fx -eo pid,start_time,command | grep -e trinity -e sleep | grep -v grep"
> 2723 17:55 trinity -C 2 -N 100000 -x mremap -q -V /mnt/ramdisk/victims/v1/v2
If it happens again, grab the output of /proc/2723/stack.
(You might need something that enables CONFIG_STACKTRACE in your kernel,
or apply the patch below if nothing does -- I still need to get that
upstream.)
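A minimal sketch of collecting that, assuming the leftover pid is still 2723 and the kernel exposes /proc/<pid>/stack (reading it usually needs root). The helper name dump_stack is made up for illustration and is not part of trinity.

```shell
#!/bin/sh
# dump_stack is a hypothetical helper name, not something trinity ships.
# Print the state, parent pid and kernel stack of the given pid.
dump_stack() {
    pid="$1"
    if [ ! -d "/proc/$pid" ]; then
        echo "no such process: $pid" >&2
        return 1
    fi
    # State (R/S/D/Z) and PPid hint at whether the task is blocked or reparented.
    grep -E '^(State|PPid):' "/proc/$pid/status"
    # /proc/<pid>/stack needs CONFIG_STACKTRACE and is normally readable by root only.
    cat "/proc/$pid/stack" 2>/dev/null ||
        echo "/proc/$pid/stack not readable (need root or CONFIG_STACKTRACE)" >&2
}
```

Run as root inside the guest, for example: dump_stack 2723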
> [watchdog] 30087 iterations. [F:23623 S:6463 HI:9706]
> [watchdog] 40138 iterations. [F:31443 S:8694 HI:9706]
> [watchdog] 50215 iterations. [F:39370 S:10844 HI:9706]
> [watchdog] 60221 iterations. [F:47228 S:12992 HI:9706]
> [watchdog] 70225 iterations. [F:55100 S:15124 HI:9706]
> [watchdog] 80278 iterations. [F:63007 S:17270 HI:9706]
> [watchdog] 90287 iterations. [F:71013 S:19273 HI:9706]
> [main] Bailing main loop because Completed maximum number of operations..
> [watchdog] [2604] Watchdog exiting because Completed maximum number of operations..
> [init] Ran 100001 syscalls. Successes: 21199 Failures: 78802
>
> Fortunately, killing the job helped:
>
>
> $ ssh tfoerste@trinity kill 2723
It's puzzling that the watchdog exited while there were still children around.
It would also be interesting to attach a debugger to the
still-running pid and examine shm->running_childs.
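One way to do that, as a rough sketch: it assumes gdb is installed in the guest, that the trinity binary was built with debug symbols so gdb can resolve shm, and that you are allowed to ptrace the process. The helper name print_running_childs is made up for illustration.

```shell
#!/bin/sh
# print_running_childs is a hypothetical helper name.
# Attach gdb to a live trinity process and print shm->running_childs.
# Assumes gdb is available and trinity was built with debug symbols.
print_running_childs() {
    pid="$1"
    # kill -0 sends no signal; it only checks that the pid exists and is ours.
    if ! kill -0 "$pid" 2>/dev/null; then
        echo "cannot signal pid $pid (gone, or not ours)" >&2
        return 1
    fi
    # -batch runs the -ex commands and then exits, detaching from the process.
    gdb -p "$pid" -batch -ex 'print shm->running_childs'
}
```

For example: print_running_childs 2723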
Dave
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index cb45f59685e6..38133ddb8bb4 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1008,8 +1008,13 @@ config TRACE_IRQFLAGS
 	  either tracing or lock debugging.
 
 config STACKTRACE
-	bool
+	bool "Stack backtrace support"
 	depends on STACKTRACE_SUPPORT
+	help
+	  This option causes the kernel to create a /proc/pid/stack for
+	  every process, showing its current stack trace.
+	  It is also used by various kernel debugging features that require
+	  stack trace generation.
 
 config DEBUG_KOBJECT
 	bool "kobject debugging"