trinity.vger.kernel.org archive mirror
* trinity seems not to reap all childs
@ 2014-08-09 16:35 Toralf Förster
  2014-08-13 15:02 ` Dave Jones
From: Toralf Förster @ 2014-08-09 16:35 UTC
  To: trinity

Over the last few days I have observed that under a 32-bit Gentoo UML guest one trinity job sometimes survives although all of its parents are already gone.


The console output on the host system is:

[main] Bailing main loop because Completed maximum number of operations..
[watchdog] [2604] Watchdog exiting because Completed maximum number of operations..
[init] Ran 100001 syscalls. Successes: 21199  Failures: 78802


ps shows that there is still one job running in the guest:

$ ssh tfoerste@trinity "ps fx -eo pid,start_time,command | grep -e trinity -e sleep | grep -v grep"
 2723 17:55 trinity -C 2 -N 100000 -x mremap -q -V /mnt/ramdisk/victims/v1/v2
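
The same check can be scripted instead of eyeballing the ps output. The
following is only a minimal sketch: it assumes that a process whose parent
has exited gets reparented to pid 1 (true unless a subreaper is in use), and
that the process name in /proc/<pid>/stat starts with "trinity". The helper
names parse_stat and orphans are made up for this example and are not part
of trinity.

```python
import os

def parse_stat(text):
    """Parse the contents of /proc/<pid>/stat into (pid, comm, state, ppid)."""
    # comm is wrapped in parentheses and may itself contain spaces or ')',
    # so split around the LAST closing parenthesis.
    left, _, right = text.rpartition(")")
    pid_str, _, comm = left.partition("(")
    fields = right.split()
    return int(pid_str), comm, fields[0], int(fields[1])

def orphans(prefix="trinity", proc_root="/proc"):
    """Return [(pid, state)] for processes named prefix* whose parent is pid 1."""
    found = []
    try:
        entries = os.listdir(proc_root)
    except OSError:
        return found  # no procfs mounted at proc_root
    for entry in entries:
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                pid, comm, state, ppid = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            continue  # process exited while we were looking at it
        # Assumption: a ppid of 1 means the original parent is gone.
        if comm.startswith(prefix) and ppid == 1:
            found.append((pid, state))
    return found
```

Calling orphans() inside the guest after the run has finished should return
an empty list; anything else is a survivor worth looking at before killing it.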



Logging into the UML guest and looking at the trinity files gives:


tfoerste@trinity ~/t3 $ ls -l
total 480
-rw-r--r-- 1 tfoerste users 253943 Aug  9 18:15 trinity-child0.log
-rw-r--r-- 1 tfoerste users 206229 Aug  9 18:15 trinity-child1.log
-rw-r--r-- 1 tfoerste users  19077 Aug  9 18:15 trinity.log
-rw-r-S-wt 1 tfoerste users     11 Aug  9 18:15 trinity-testfile1
-rwSr--rwT 1 tfoerste users      0 Aug  9 17:49 trinity-testfile2
-rwx--Srwx 1 tfoerste users   1110 Aug  9 18:15 trinity-testfile3
------xrwt 1 tfoerste users  16384 Aug  9 18:15 trinity-testfile4

tfoerste@trinity ~/t3 $ tail -vf *log
==> trinity-child0.log <==
[child0:3003] [2727] unlinkat(dfd=175, pathname="/mnt/ramdisk/victims/v1/v2/f92", flag=0x1f3ff000) = -1 (Invalid argument)
[child0:3003] [2728] eventfd2(count=19, flags=0x1) = 336
[child0:3003] [2729] getpid() = 3003
[child0:3003] [2730] fcntl(fd=175, cmd=0xc, arg=0xf6abffff) = -1 (Bad file descriptor)
[child0:3003] [2731] swapon(path="/mnt/ramdisk/victims/v1/v2/d52", swap_flags=0x18000) = -1 (Operation not permitted)
[child0:3003] [2732] olduname(name=0xc0100220) = -1 (Bad address)
[child0:3003] [2733] getpriority(which=0x0, who=3012) = 1
[child0:3003] [2734] mlock(addr=0x40499000, len=0x100000) = -1 (Cannot allocate memory)
[child0:3003] [2735] time(tloc=0x0) = 0x53e6493b
[child0:3003] [2736] epoll_pwait(epfd=20, events=0x85c0340, maxevents=0xffff0000, timeout=0xff0067aa) = -1 (Invalid argument)

==> trinity-child1.log <==
[child1:3012] [2225] flock(fd=18, cmd=0xf41fffff) = 0
[child1:3012] [2226] getpgid(pid=1) = 1
[child1:3012] [2227] mq_notify(mqdes=175, u_notification=0x4) = -1 (Bad address)
[child1:3012] [2228] msync(start=0x40299000, len=0x100000, flags=0x4) = 0
[child1:3012] [2229] mkdirat(dfd=175, pathname="/mnt/ramdisk/victims/v1/v2/f58", mode=0) = -1 (Permission denied)
[child1:3012] [2230] write(fd=236, buf=0xffffff9b, count=1) = -1 (Bad file descriptor)
[child1:3012] [2231] fsync(fd=210) = 0
[child1:3012] [2232] rt_sigpending(set=0x80e4000[page_rand], sigsetsize=0xff6094ab) = -1 (Invalid argument)
[child1:3012] [2233] tee(fdin=7, fdout=4, len=0x6800, flags=0x3) = -1 (Bad file descriptor)
[child1:3012] [2234] utimes(filename="/mnt/ramdisk/victims/v1/v2/f12", utimes=0x1) = -1 (Bad address)

==> trinity.log <==
[watchdog] 30087 iterations. [F:23623 S:6463 HI:9706]
[watchdog] 40138 iterations. [F:31443 S:8694 HI:9706]
[watchdog] 50215 iterations. [F:39370 S:10844 HI:9706]
[watchdog] 60221 iterations. [F:47228 S:12992 HI:9706]
[watchdog] 70225 iterations. [F:55100 S:15124 HI:9706]
[watchdog] 80278 iterations. [F:63007 S:17270 HI:9706]
[watchdog] 90287 iterations. [F:71013 S:19273 HI:9706]
[main] Bailing main loop because Completed maximum number of operations..
[watchdog] [2604] Watchdog exiting because Completed maximum number of operations..
[init] Ran 100001 syscalls. Successes: 21199  Failures: 78802
^C

tfoerste@trinity ~/t3 $ sudo tail -vf *test*
==> trinity-testfile1 <==

==> trinity-testfile2 <==

==> trinity-testfile3 <==
EF ���P~zeoP ����2 ▒2.6.56#2 Mon Aug 4 21:47:01 CEST 2014i686(none)
==> trinity-testfile4 <==
^C



Fortunately, killing the job helped:


$ ssh tfoerste@trinity kill 2723



-- 
Toralf


* Re: trinity seems not to reap all childs
  2014-08-09 16:35 trinity seems not to reap all childs Toralf Förster
@ 2014-08-13 15:02 ` Dave Jones
From: Dave Jones @ 2014-08-13 15:02 UTC
  To: Toralf Förster; +Cc: trinity

On Sat, Aug 09, 2014 at 06:35:45PM +0200, Toralf Förster wrote:
 > Over the last few days I have observed that under a 32-bit Gentoo UML guest one trinity job sometimes survives although all of its parents are already gone.
 > 
 > 
 > The console output on the host system is:
 > 
 > [main] Bailing main loop because Completed maximum number of operations..
 > [watchdog] [2604] Watchdog exiting because Completed maximum number of operations..
 > [init] Ran 100001 syscalls. Successes: 21199  Failures: 78802
 > 
 > 
 > ps shows that there is still one job running in the guest:
 > 
 > $ ssh tfoerste@trinity "ps fx -eo pid,start_time,command | grep -e trinity -e sleep | grep -v grep"
 >  2723 17:55 trinity -C 2 -N 100000 -x mremap -q -V /mnt/ramdisk/victims/v1/v2

If it happens again, grab the output of /proc/2723/stack (substituting
the pid of whichever process survives).
(You might need to turn on some option that selects CONFIG_STACKTRACE in
 your kernel, or apply the patch below if nothing does. I still need to
 get that upstream.)
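
To make that easy to repeat, something along these lines could be run in the
guest before killing the process. This is a rough sketch, not a tested tool:
the function names are invented for the example, reading the stack file will
probably need root, and the file is simply absent when the kernel lacks
CONFIG_STACKTRACE. The wchan and status files are included only because they
are cheap to grab at the same time.

```python
import os

def read_proc(pid, name, proc_root="/proc"):
    """Return the text of <proc_root>/<pid>/<name>, or None if unreadable."""
    path = os.path.join(proc_root, str(pid), name)
    try:
        with open(path) as f:
            return f.read()
    except OSError:
        # File missing (no CONFIG_STACKTRACE, process gone) or no permission.
        return None

def snapshot(pid, proc_root="/proc"):
    """Collect a few per-process files useful in a 'stuck process' report."""
    return {name: read_proc(pid, name, proc_root)
            for name in ("stack", "wchan", "status")}

def format_snapshot(snap):
    """Render the snapshot in the same '==> name <==' style as tail -v."""
    parts = []
    for name in sorted(snap):
        text = snap[name]
        parts.append("==> %s <==" % name)
        parts.append(text.rstrip("\n") if text is not None else "(unavailable)")
    return "\n".join(parts)
```

For the case above, print(format_snapshot(snapshot(2723))) would give one
block of text that can be pasted into a reply.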


 > [watchdog] 30087 iterations. [F:23623 S:6463 HI:9706]
 > [watchdog] 40138 iterations. [F:31443 S:8694 HI:9706]
 > [watchdog] 50215 iterations. [F:39370 S:10844 HI:9706]
 > [watchdog] 60221 iterations. [F:47228 S:12992 HI:9706]
 > [watchdog] 70225 iterations. [F:55100 S:15124 HI:9706]
 > [watchdog] 80278 iterations. [F:63007 S:17270 HI:9706]
 > [watchdog] 90287 iterations. [F:71013 S:19273 HI:9706]
 > [main] Bailing main loop because Completed maximum number of operations..
 > [watchdog] [2604] Watchdog exiting because Completed maximum number of operations..
 > [init] Ran 100001 syscalls. Successes: 21199  Failures: 78802
 > 
 > Fortunately, killing the job helped:
 > 
 > 
 > $ ssh tfoerste@trinity kill 2723

Puzzling that the watchdog exited while there were still children around.

Something else that might be interesting would be to attach to the
still-running pid and examine shm->running_childs.
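
One way to do that without an interactive session is to let gdb attach,
print the value and detach again. The sketch below only wraps the gdb command
line. It assumes gdb is installed in the guest, that trinity was built with
debug symbols, and that shm is visible to gdb under that name; the two
function names are hypothetical.

```python
import subprocess

def gdb_print_command(pid, expression="shm->running_childs"):
    """Build a gdb invocation that attaches to pid and prints one expression."""
    # -batch makes gdb exit (and detach) once the -ex commands have run.
    return ["gdb", "-p", str(pid), "-batch", "-ex", "print " + expression]

def gdb_print(pid, expression="shm->running_childs"):
    """Run the command above and return gdb's combined output as text."""
    result = subprocess.run(gdb_print_command(pid, expression),
                            stdout=subprocess.PIPE,
                            stderr=subprocess.STDOUT,
                            universal_newlines=True)
    return result.stdout
```

gdb_print(2723) would then return whatever gdb reports, including its error
text if it cannot attach or cannot resolve the symbol.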

	Dave

diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index cb45f59685e6..38133ddb8bb4 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -1008,8 +1008,13 @@ config TRACE_IRQFLAGS
 	  either tracing or lock debugging.
 
 config STACKTRACE
-	bool
+	bool "Stack backtrace support"
 	depends on STACKTRACE_SUPPORT
+	help
+	  This option causes the kernel to create a /proc/pid/stack for
+	  every process, showing its current stack trace.
+	  It is also used by various kernel debugging features that require
+	  stack trace generation.
 
 config DEBUG_KOBJECT
 	bool "kobject debugging"
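
Whether a given kernel already has the option set can be checked from its
config text before deciding that the patch is needed. The snippet below is an
illustrative sketch: config_option and load_running_config are names made up
here, and /proc/config.gz only exists when the kernel was built with
CONFIG_IKCONFIG_PROC, so the build's .config file may have to be read instead.

```python
import gzip

def config_option(config_text, option):
    """Return the value of option ('y', 'm', ...) or None if it is not set."""
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        if line.startswith(prefix):
            return line[len(prefix):]
        if line == "# %s is not set" % option:
            return None
    return None  # option not mentioned at all

def load_running_config(path="/proc/config.gz"):
    """Read the running kernel's config (needs CONFIG_IKCONFIG_PROC)."""
    with gzip.open(path, "rt") as f:
        return f.read()
```

If config_option(load_running_config(), "CONFIG_STACKTRACE") comes back as
None, /proc/<pid>/stack will not be there.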
