* Zombies with 2.4.15pre5 (exit.c)
@ 2001-11-17  3:45 Jeff Long
  2001-11-20 16:08 ` [PATCH] " Dave McCracken
  0 siblings, 1 reply; 4+ messages in thread
From: Jeff Long @ 2001-11-17  3:45 UTC
  To: linux-kernel

Running 2.4.15pre5 (UP) on i386, running UML 2.4.14-2.
UML processes create threads on the host system that don't
die.  Threads are stuck at do_exit( ), so I backed out the
patch to kernel/exit.c @ 539 (in 2.4.15pre5 patch):

  p->state = TASK_DEAD;

and things work fine.  I do not see zombies with anything
other than UML processes/native threads.


* [PATCH] Re: Zombies with 2.4.15pre5 (exit.c)
  2001-11-17  3:45 Zombies with 2.4.15pre5 (exit.c) Jeff Long
@ 2001-11-20 16:08 ` Dave McCracken
  2001-11-20 18:40   ` OGAWA Hirofumi
  2001-11-21 18:10   ` Pau Aliagas
  0 siblings, 2 replies; 4+ messages in thread
From: Dave McCracken @ 2001-11-20 16:08 UTC
  To: Jeff Long, linux-kernel, Linus Torvalds


--On Saturday, November 17, 2001 03:45:49 +0000 Jeff Long
<jeffwlong@hotmail.com> wrote:

> Running 2.4.15pre5 (UP) on i386, running UML 2.4.14-2.
> UML processes create threads on the host system that don't
> die.  Threads are stuck at do_exit( ), so I backed out the
> patch to kernel/exit.c @ 539 (in 2.4.15pre5 patch):
> 
>   p->state = TASK_DEAD;
> 
> and things work fine.  I do not see zombies with anything
> other than UML processes/native threads.

The intent of the original patch was to make the task unfindable to other
waiters, which fixed a race condition in sys_wait4().  My assumption was
that the task was about to be cleaned up in release_task().  What I missed
was that there are a couple of code paths that don't release the task, but
assume it'll be cleaned up later.

The patch below should fix the problem.

Dave McCracken

======================================================================
Dave McCracken          IBM Linux Base Kernel Team      1-512-838-3059
dmccr@us.ibm.com                                        T/L   678-3059

-----------------------

--- linux-2.4.15-pre7/kernel/exit.c	Tue Nov 20 10:00:26 2001
+++ linux-2.4.15-pre7-patch/kernel/exit.c	Tue Nov 20 09:57:48 2001
@@ -544,8 +544,11 @@
 				retval = ru ? getrusage(p, RUSAGE_BOTH, ru) : 0;
 				if (!retval && stat_addr)
 					retval = put_user(p->exit_code, stat_addr);
-				if (retval)
+				if (retval) {
+					/* Reset state. We're not cleaning up yet */
+					p->state = TASK_ZOMBIE;
 					goto end_wait4; 
+				}
 				retval = p->pid;
 				if (p->p_opptr != p->p_pptr) {
 					write_lock_irq(&tasklist_lock);
@@ -553,6 +556,8 @@
 					p->p_pptr = p->p_opptr;
 					SET_LINKS(p);
 					do_notify_parent(p, SIGCHLD);
+					/* Reset state. We're not cleaning up yet */
+					p->state = TASK_ZOMBIE;
 					write_unlock_irq(&tasklist_lock);
 				} else
 					release_task(p);
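
For readers following the state handling, here is a minimal user-space
sketch of what the patch above appears to aim for: a waiter hides a
zombie by marking it TASK_DEAD, and any path that does not go on to
release the task puts it back to TASK_ZOMBIE. This is not kernel code.
struct task, enum wait_path and wait_on_zombie() are hypothetical names
made up for the illustration; only the two state values come from the
sched.h hunk quoted in this thread.

```c
#include <assert.h>
#include <stdbool.h>

/* State values as in the quoted sched.h hunk; everything else is a model. */
enum task_state { TASK_ZOMBIE = 4, TASK_DEAD = 16 };

struct task {
	enum task_state state;
	bool released;		/* models release_task() having run */
};

/* Hypothetical labels for the paths a waiter can take after claiming. */
enum wait_path { PATH_COPY_FAILED, PATH_REPARENTED, PATH_REAPED };

/*
 * Model of the TASK_ZOMBIE case: claim the task, then either reap it or
 * hand it back. Returns false if the task was not a zombie to begin with.
 */
static bool wait_on_zombie(struct task *p, enum wait_path path)
{
	if (p->state != TASK_ZOMBIE)
		return false;

	p->state = TASK_DEAD;	/* make it unfindable to other waiters */

	switch (path) {
	case PATH_COPY_FAILED:	/* getrusage()/put_user() reported an error */
	case PATH_REPARENTED:	/* task handed back to its original parent */
		p->state = TASK_ZOMBIE;	/* not cleaning up yet */
		break;
	case PATH_REAPED:
		p->released = true;	/* stands in for release_task(p) */
		break;
	}
	return true;
}
```

In this model, a task that goes through PATH_COPY_FAILED or
PATH_REPARENTED is still a zombie afterwards and can be picked up by a
later wait. Without the reset it would stay TASK_DEAD and never be
found again, which matches the symptom reported at the top of the
thread.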



* Re: [PATCH] Re: Zombies with 2.4.15pre5 (exit.c)
  2001-11-20 16:08 ` [PATCH] " Dave McCracken
@ 2001-11-20 18:40   ` OGAWA Hirofumi
  2001-11-21 18:10   ` Pau Aliagas
  1 sibling, 0 replies; 4+ messages in thread
From: OGAWA Hirofumi @ 2001-11-20 18:40 UTC
  To: Dave McCracken; +Cc: Jeff Long, linux-kernel, Linus Torvalds

Hi,

Dave McCracken <dmccr@us.ibm.com> writes:

> The intent of the original patch was to make the task unfindable to other
> waiters, which fixed a race condition in sys_wait4().  My assumption was
> that the task was about to be cleaned up in release_task().  What I missed
> was that there are a couple of code paths that don't release the task, but
> assume it'll be cleaned up later.

I think the original patch doesn't fix the race condition, because
tasklist_lock is only held with read_lock() there, and read locks are
shared between waiters. Furthermore, the threads that did not receive
the process status continue waiting, even when there is no child
process left.
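
The read_lock() point can be illustrated with a small deterministic
model. It is only a sketch: there are no real threads or locks here,
and struct waiter, waiter_check(), waiter_claim() and
count_claims_interleaved() are hypothetical names. The assumption is
that two waiters both hold the shared lock, so nothing orders one
waiter's check of p->state against the other's write.

```c
#include <assert.h>
#include <stdbool.h>

enum task_state { TASK_ZOMBIE = 4, TASK_DEAD = 16 };

struct task { enum task_state state; };
struct waiter { bool saw_zombie; };

/* Step 1: while holding the shared lock, look at the child's state. */
static void waiter_check(struct waiter *w, const struct task *p)
{
	w->saw_zombie = (p->state == TASK_ZOMBIE);
}

/* Step 2: still holding the shared lock, act on what step 1 saw. */
static bool waiter_claim(const struct waiter *w, struct task *p)
{
	if (!w->saw_zombie)
		return false;
	p->state = TASK_DEAD;
	return true;
}

/*
 * Replay one interleaving that a shared lock permits and count how many
 * waiters end up believing they own the zombie.
 */
static int count_claims_interleaved(void)
{
	struct task child = { TASK_ZOMBIE };
	struct waiter a = { false }, b = { false };
	int claims = 0;

	waiter_check(&a, &child);
	waiter_check(&b, &child);	/* allowed: read locks are shared */
	claims += waiter_claim(&a, &child);
	claims += waiter_claim(&b, &child);
	return claims;
}
```

With this interleaving both waiters claim the same child, so setting
TASK_DEAD under the read lock narrows the window but does not close it.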

I wrote the following patch, but I'm not sure whether it is right.

Thanks
-- 
OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>

--- linux-head/kernel/exit.c	Tue Nov 20 23:32:58 2001
+++ wait/kernel/exit.c	Tue Nov 20 17:56:12 2001
@@ -529,23 +529,27 @@
 				retval = ru ? getrusage(p, RUSAGE_BOTH, ru) : 0; 
 				if (!retval && stat_addr) 
 					retval = put_user((p->exit_code << 8) | 0x7f, stat_addr);
-				if (!retval) {
-					p->exit_code = 0;
-					retval = p->pid;
+				if (retval)
+					goto end_wait4;
+
+				/* exactly one thread return the process status */
+				task_lock(p);
+				if (p->exit_code == 0) {
+					task_unlock(p);
+					goto repeat;
 				}
+				p->exit_code = 0;
+				task_unlock(p);
+				retval = p->pid;
 				goto end_wait4;
 			case TASK_ZOMBIE:
-				/* Make sure no other waiter picks this task up */
-				p->state = TASK_DEAD;
-
-				current->times.tms_cutime += p->times.tms_utime + p->times.tms_cutime;
-				current->times.tms_cstime += p->times.tms_stime + p->times.tms_cstime;
 				read_unlock(&tasklist_lock);
 				retval = ru ? getrusage(p, RUSAGE_BOTH, ru) : 0;
 				if (!retval && stat_addr)
 					retval = put_user(p->exit_code, stat_addr);
 				if (retval)
-					goto end_wait4; 
+					goto end_wait4;
+
 				retval = p->pid;
 				if (p->p_opptr != p->p_pptr) {
 					write_lock_irq(&tasklist_lock);
@@ -554,8 +558,20 @@
 					SET_LINKS(p);
 					do_notify_parent(p, SIGCHLD);
 					write_unlock_irq(&tasklist_lock);
-				} else
-					release_task(p);
+					goto end_wait4;
+				}
+
+				/* exactly one thread return the process status */
+				task_lock(p);
+				if (p->pid == 0) {
+					task_unlock(p);
+					goto repeat;
+				}
+				p->pid = 0;
+				task_unlock(p);
+				current->times.tms_cutime += p->times.tms_utime + p->times.tms_cutime;
+				current->times.tms_cstime += p->times.tms_stime + p->times.tms_cstime;
+				release_task(p);
 				goto end_wait4;
 			default:
 				continue;
--- linux-head/include/linux/sched.h	Tue Nov 20 23:32:58 2001
+++ wait/include/linux/sched.h	Tue Nov 20 17:38:11 2001
@@ -88,7 +88,6 @@
 #define TASK_UNINTERRUPTIBLE	2
 #define TASK_ZOMBIE		4
 #define TASK_STOPPED		8
-#define TASK_DEAD		16
 
 #define __set_task_state(tsk, state_value)		\
 	do { (tsk)->state = (state_value); } while (0)
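
The "exactly one thread returns the process status" idea in the exit.c
hunks above can be sketched in user space as a test-and-clear done
under a per-task lock. This is an illustration under stated
assumptions, not the kernel's implementation: a C11 atomic_flag
spinlock stands in for task_lock()/task_unlock(), and
model_task_lock(), model_task_unlock() and claim_exit_status() are
hypothetical names.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

struct task {
	atomic_flag lock;	/* stand-in for the lock behind task_lock() */
	int pid;		/* 0 means "status already handed out" */
};

static void model_task_lock(struct task *p)
{
	while (atomic_flag_test_and_set_explicit(&p->lock,
						 memory_order_acquire))
		;		/* spin until the holder clears the flag */
}

static void model_task_unlock(struct task *p)
{
	atomic_flag_clear_explicit(&p->lock, memory_order_release);
}

/*
 * Hand the pid to exactly one caller. A return value of 0 means another
 * waiter got there first; in the patch that case does "goto repeat".
 */
static int claim_exit_status(struct task *p)
{
	int pid;

	model_task_lock(p);
	pid = p->pid;		/* check ... */
	p->pid = 0;		/* ... and mark, as one locked step */
	model_task_unlock(p);
	return pid;
}
```

Because the check and the marking happen inside one critical section,
a second waiter that arrives later always sees pid == 0 and rescans
instead of reaping the same task twice.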


* Re: [PATCH] Re: Zombies with 2.4.15pre5 (exit.c)
  2001-11-20 16:08 ` [PATCH] " Dave McCracken
  2001-11-20 18:40   ` OGAWA Hirofumi
@ 2001-11-21 18:10   ` Pau Aliagas
  1 sibling, 0 replies; 4+ messages in thread
From: Pau Aliagas @ 2001-11-21 18:10 UTC
  To: Dave McCracken; +Cc: lkml, Jeff Long, OGAWA Hirofumi

On Tue, 20 Nov 2001, Dave McCracken wrote:

> > Running 2.4.15pre5 (UP) on i386, running UML 2.4.14-2.
> > UML processes create threads on the host system that don't
> > die.  Threads are stuck at do_exit( ), so I backed out the
> > patch to kernel/exit.c @ 539 (in 2.4.15pre5 patch):
> > 
> >   p->state = TASK_DEAD;
> > 
> > and things work fine.  I do not see zombies with anything
> > other than UML processes/native threads.
> 
> The intent of the original patch was to make the task unfindable to other
> waiters, which fixed a race condition in sys_wait4().  My assumption was
> that the task was about to be cleaned up in release_task().  What I missed
> was that there are a couple of code paths that don't release the task, but
> assume it'll be cleaned up later.
> 
> The patch below should fix the problem.

It doesn't for me.
I'll try OGAWA Hirofumi's patch (posted to the list) and let you know.

Pau



