All of lore.kernel.org
 help / color / mirror / Atom feed
* [uml-devel] The current fix-kill patch
@ 2004-11-23  5:50 Jeff Dike
  2004-11-23 15:10 ` [uml-devel] " Bodo Stroesser
  2004-12-01 18:20 ` Jason Lunz
  0 siblings, 2 replies; 7+ messages in thread
From: Jeff Dike @ 2004-11-23  5:50 UTC (permalink / raw)
  To: Bodo Stroesser, Blaisorblade; +Cc: user-mode-linux devel

Here's the latest fix-kill patch in my tree.  Please test and report...

				Jeff

# This patch changes how UML kills ptraced processes in order to be more 
# correct in the presence of the ptrace changes in 2.6.9.  It used to be that
# ptrace stopped processes could simply be killed and they would go away.  Now,
# there's a new run state for ptraced processes which doesn't receive signals
# until they are PTRACE_KILLed or PTRACE_CONTinued.  So, this patch kills the
# process, as usual, then PTRACE_KILL and PTRACE_CONT.  This is done in
# os_kill_ptrace_process() for use from skas mode, and in tracer() when it
# sees a child process getting a SIGKILL.
Index: 2.6.9/arch/um/kernel/tt/exec_user.c
===================================================================
--- 2.6.9.orig/arch/um/kernel/tt/exec_user.c	2004-11-22 12:18:40.000000000 -0500
+++ 2.6.9/arch/um/kernel/tt/exec_user.c	2004-11-22 13:24:50.000000000 -0500
@@ -36,7 +36,7 @@
 		tracer_panic("do_exec failed to get registers - errno = %d",
 			     errno);
 
-	kill(old_pid, SIGKILL);
+	os_kill_ptraced_process(old_pid, 0);
 
 	if (ptrace(PTRACE_SETOPTIONS, new_pid, 0, (void *)PTRACE_O_TRACESYSGOOD) < 0)
 		tracer_panic("do_exec: PTRACE_SETOPTIONS failed, errno = %d", errno);
Index: 2.6.9/arch/um/kernel/tt/process_kern.c
===================================================================
--- 2.6.9.orig/arch/um/kernel/tt/process_kern.c	2004-11-18 21:24:08.000000000 -0500
+++ 2.6.9/arch/um/kernel/tt/process_kern.c	2004-11-22 12:18:43.000000000 -0500
@@ -65,7 +65,8 @@
 		panic("write of switch_pipe failed, err = %d", -err);
 
 	reading = 1;
-	if((from->exit_state == EXIT_ZOMBIE) || (from->exit_state == EXIT_DEAD))
+	if((from->exit_state == EXIT_ZOMBIE) ||
+	   (from->exit_state == EXIT_DEAD))
 		os_kill_process(os_getpid(), 0);
 
 	err = os_read_file(from->thread.mode.tt.switch_pipe[0], &c, sizeof(c));
@@ -82,7 +83,7 @@
 	prev_sched = current->thread.prev_sched;
 	if((prev_sched->exit_state == EXIT_ZOMBIE) ||
 	   (prev_sched->exit_state == EXIT_DEAD))
-		os_kill_ptraced_process(prev_sched->thread.mode.tt.extern_pid, 1);
+		os_kill_process(prev_sched->thread.mode.tt.extern_pid, 1);
 
 	/* This works around a nasty race with 'jail'.  If we are switching
 	 * between two threads of a threaded app and the incoming process 
Index: 2.6.9/arch/um/kernel/tt/tracer.c
===================================================================
--- 2.6.9.orig/arch/um/kernel/tt/tracer.c	2004-11-22 12:18:40.000000000 -0500
+++ 2.6.9/arch/um/kernel/tt/tracer.c	2004-11-22 13:28:01.000000000 -0500
@@ -320,7 +320,13 @@
 				case OP_HALT:
 					unmap_physmem();
 					kmalloc_ok = 0;
-					ptrace(PTRACE_KILL, pid, 0, 0);
+					os_kill_ptraced_process(pid, 0);
+					/* Now let's reap remaining zombies */
+					errno = 0;
+					do {
+						waitpid(-1, &status, 
+							WUNTRACED);
+					} while (errno != ECHILD);
 					return(op == OP_REBOOT);
 				case OP_NONE:
 					printf("Detaching pid %d\n", pid);
Index: 2.6.9/arch/um/os-Linux/process.c
===================================================================
--- 2.6.9.orig/arch/um/os-Linux/process.c	2004-11-18 21:24:08.000000000 -0500
+++ 2.6.9/arch/um/os-Linux/process.c	2004-11-22 12:18:43.000000000 -0500
@@ -95,9 +95,16 @@
 		
 }
 
+/* Kill off a ptraced child by all means available.  kill it normally first,
+ * then PTRACE_KILL it, then PTRACE_CONT it in case it's in a run state from
+ * which it can't exit directly.
+ */
+
 void os_kill_ptraced_process(int pid, int reap_child)
 {
+	kill(pid, SIGKILL);
 	ptrace(PTRACE_KILL, pid);
+	ptrace(PTRACE_CONT, pid);
 	if(reap_child)
 		CATCH_EINTR(waitpid(pid, NULL, 0));
 }



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now. 
http://productguide.itmanagersjournal.com/
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [uml-devel] Re: The current fix-kill patch
  2004-11-23  5:50 [uml-devel] The current fix-kill patch Jeff Dike
@ 2004-11-23 15:10 ` Bodo Stroesser
  2004-11-23 20:10   ` Blaisorblade
  2004-12-01 18:20 ` Jason Lunz
  1 sibling, 1 reply; 7+ messages in thread
From: Bodo Stroesser @ 2004-11-23 15:10 UTC (permalink / raw)
  To: Jeff Dike; +Cc: Blaisorblade, user-mode-linux devel

Jeff Dike wrote:
> Here's the latest fix-kill patch in my tree.  Please test and report...
Have tested based on 2.6.9-bb3. I removed "rework-uml-hang-on-2.6.9-host",
added BlaisorBlade's "uml-hang-on-2.6.9-host", and adapted iand added this
new patch. The resulting UML kernel for me works fine on host 2.6.7-skas3-v7
and 2.6.9-skas3-v7.
Next, I tested what happens if the "kill(pid, SIGKILL);" and
"ptrace(PTRACE_CONT, pid);" are removed from os_kill_ptraced_process(), and
it also worked fine.
Since PTRACE_KILL wakes up the ptraced process unconditionally, IMHO
"ptrace(PTRACE_CONT, pid);" has no effect here. And "kill(pid, SIGKILL);"
only is needed, if the ptraced process is not stopped on a ptrace-interception.
Since os_kill_ptraced_process() is called only then, when the ptraced process
is known to be on a ptrace-stop, I would like to remove "kill(pid, SIGKILL);",
too.
os_kill_ptraced_process() is called each time, a process in TT exec()s. Thus
avoiding unneeded calls is a little speedup.

Bodo
> 
> 				Jeff
> 
> # This patch changes how UML kills ptraced processes in order to be more 
> # correct in the presence of the ptrace changes in 2.6.9.  It used to be that
> # ptrace stopped processes could simply be killed and they would go away.  Now,
> # there's a new run state for ptraced processes which doesn't receive signals
> # until they are PTRACE_KILLed or PTRACE_CONTinued.  So, this patch kills the
> # process, as usual, then PTRACE_KILL and PTRACE_CONT.  This is done in
> # os_kill_ptrace_process() for use from skas mode, and in tracer() when it
> # sees a child process getting a SIGKILL.
> Index: 2.6.9/arch/um/kernel/tt/exec_user.c
> ===================================================================
> --- 2.6.9.orig/arch/um/kernel/tt/exec_user.c	2004-11-22 12:18:40.000000000 -0500
> +++ 2.6.9/arch/um/kernel/tt/exec_user.c	2004-11-22 13:24:50.000000000 -0500
> @@ -36,7 +36,7 @@
>  		tracer_panic("do_exec failed to get registers - errno = %d",
>  			     errno);
>  
> -	kill(old_pid, SIGKILL);
> +	os_kill_ptraced_process(old_pid, 0);
>  
>  	if (ptrace(PTRACE_SETOPTIONS, new_pid, 0, (void *)PTRACE_O_TRACESYSGOOD) < 0)
>  		tracer_panic("do_exec: PTRACE_SETOPTIONS failed, errno = %d", errno);
> Index: 2.6.9/arch/um/kernel/tt/process_kern.c
> ===================================================================
> --- 2.6.9.orig/arch/um/kernel/tt/process_kern.c	2004-11-18 21:24:08.000000000 -0500
> +++ 2.6.9/arch/um/kernel/tt/process_kern.c	2004-11-22 12:18:43.000000000 -0500
> @@ -65,7 +65,8 @@
>  		panic("write of switch_pipe failed, err = %d", -err);
>  
>  	reading = 1;
> -	if((from->exit_state == EXIT_ZOMBIE) || (from->exit_state == EXIT_DEAD))
> +	if((from->exit_state == EXIT_ZOMBIE) ||
> +	   (from->exit_state == EXIT_DEAD))
>  		os_kill_process(os_getpid(), 0);
>  
>  	err = os_read_file(from->thread.mode.tt.switch_pipe[0], &c, sizeof(c));
> @@ -82,7 +83,7 @@
>  	prev_sched = current->thread.prev_sched;
>  	if((prev_sched->exit_state == EXIT_ZOMBIE) ||
>  	   (prev_sched->exit_state == EXIT_DEAD))
> -		os_kill_ptraced_process(prev_sched->thread.mode.tt.extern_pid, 1);
> +		os_kill_process(prev_sched->thread.mode.tt.extern_pid, 1);
>  
>  	/* This works around a nasty race with 'jail'.  If we are switching
>  	 * between two threads of a threaded app and the incoming process 
> Index: 2.6.9/arch/um/kernel/tt/tracer.c
> ===================================================================
> --- 2.6.9.orig/arch/um/kernel/tt/tracer.c	2004-11-22 12:18:40.000000000 -0500
> +++ 2.6.9/arch/um/kernel/tt/tracer.c	2004-11-22 13:28:01.000000000 -0500
> @@ -320,7 +320,13 @@
>  				case OP_HALT:
>  					unmap_physmem();
>  					kmalloc_ok = 0;
> -					ptrace(PTRACE_KILL, pid, 0, 0);
> +					os_kill_ptraced_process(pid, 0);
> +					/* Now let's reap remaining zombies */
> +					errno = 0;
> +					do {
> +						waitpid(-1, &status, 
> +							WUNTRACED);
> +					} while (errno != ECHILD);
>  					return(op == OP_REBOOT);
>  				case OP_NONE:
>  					printf("Detaching pid %d\n", pid);
> Index: 2.6.9/arch/um/os-Linux/process.c
> ===================================================================
> --- 2.6.9.orig/arch/um/os-Linux/process.c	2004-11-18 21:24:08.000000000 -0500
> +++ 2.6.9/arch/um/os-Linux/process.c	2004-11-22 12:18:43.000000000 -0500
> @@ -95,9 +95,16 @@
>  		
>  }
>  
> +/* Kill off a ptraced child by all means available.  kill it normally first,
> + * then PTRACE_KILL it, then PTRACE_CONT it in case it's in a run state from
> + * which it can't exit directly.
> + */
> +
>  void os_kill_ptraced_process(int pid, int reap_child)
>  {
> +	kill(pid, SIGKILL);
>  	ptrace(PTRACE_KILL, pid);
> +	ptrace(PTRACE_CONT, pid);
>  	if(reap_child)
>  		CATCH_EINTR(waitpid(pid, NULL, 0));
>  }
> 



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now. 
http://productguide.itmanagersjournal.com/
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [uml-devel] Re: The current fix-kill patch
  2004-11-23 15:10 ` [uml-devel] " Bodo Stroesser
@ 2004-11-23 20:10   ` Blaisorblade
  2004-11-24 11:18     ` Bodo Stroesser
  0 siblings, 1 reply; 7+ messages in thread
From: Blaisorblade @ 2004-11-23 20:10 UTC (permalink / raw)
  To: user-mode-linux-devel; +Cc: Bodo Stroesser, Jeff Dike

On Tuesday 23 November 2004 16:10, Bodo Stroesser wrote:
> Jeff Dike wrote:
> > Here's the latest fix-kill patch in my tree.  Please test and report...

> Have tested based on 2.6.9-bb3. I removed "rework-uml-hang-on-2.6.9-host",
> added BlaisorBlade's "uml-hang-on-2.6.9-host", and adapted iand added this
> new patch.
Btw, if you send another patch, send it relative to "uml-hang-on-2.6.9-host", 
or to it + fix-kill from Jeff, so I can easily spot the differences.

This is a general rule, also (with a bit of taste, i.e. applies between 
developers, not between users).

> The resulting UML kernel for me works fine on host 
> 2.6.7-skas3-v7 and 2.6.9-skas3-v7.
Not yet tested this Jeff Dike's one.

However, for me, on a 2.6.10-rc2 SKAS patched host (with the update I posted 
here and on my site to make SKAS compile), UML hangs on shutdown in SKAS 
mode.

Nobody else reported this with -bb3, so I suppose there is something new in 
2.6.10-rc2 (I didn't yet test this with 

Please let's start testing it ASAP, so that it's fixed before 2.6.10 release. 
However, I have at least to apply the Jeff's fix before.

> Next, I tested what happens if the "kill(pid, SIGKILL);" and
> "ptrace(PTRACE_CONT, pid);" are removed from os_kill_ptraced_process(), and
> it also worked fine.

> Since PTRACE_KILL wakes up the ptraced process unconditionally, IMHO
> "ptrace(PTRACE_CONT, pid);" has no effect here.
You reported PTRACE_KILL not to work in some Ptrace states. PTRACE_CONT is 
intended to fix this (even if PTRACE_KILL is not needed).

> And "kill(pid, SIGKILL);" 
> only is needed, if the ptraced process is not stopped on a
> ptrace-interception.
Well, PTRACE_CONT is to resume if from the ptrace-interception. Let's start by 
distinguishing bugs (i.e. reggressions in 2.6.9 behaviour vs. 2.6.8 and 
earlier one) from strange APIs. Gdb, or any ptracer, is IMHO never supposed 
to SIGKILL the process directly.

> Since os_kill_ptraced_process() is called only then, 
> when the ptraced process is known to be on a ptrace-stop, I would like to
> remove "kill(pid, SIGKILL);", too.
> os_kill_ptraced_process() is called each time, a process in TT exec()s.
> Thus avoiding unneeded calls is a little speedup.

-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
http://www.user-mode-linux.org/~blaisorblade



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now. 
http://productguide.itmanagersjournal.com/
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [uml-devel] Re: The current fix-kill patch
  2004-11-23 20:10   ` Blaisorblade
@ 2004-11-24 11:18     ` Bodo Stroesser
  2004-11-25  4:09       ` Blaisorblade
  0 siblings, 1 reply; 7+ messages in thread
From: Bodo Stroesser @ 2004-11-24 11:18 UTC (permalink / raw)
  To: Blaisorblade; +Cc: user-mode-linux-devel, Jeff Dike

Blaisorblade wrote:
> On Tuesday 23 November 2004 16:10, Bodo Stroesser wrote:
> 
>>Jeff Dike wrote:
>>
>>>Here's the latest fix-kill patch in my tree.  Please test and report...
> 
> 
>>Have tested based on 2.6.9-bb3. I removed "rework-uml-hang-on-2.6.9-host",
>>added BlaisorBlade's "uml-hang-on-2.6.9-host", and adapted iand added this
>>new patch.
> 
> Btw, if you send another patch, send it relative to "uml-hang-on-2.6.9-host", 
> or to it + fix-kill from Jeff, so I can easily spot the differences.
> 
> This is a general rule, also (with a bit of taste, i.e. applies between 
> developers, not between users).
Maybe I was somewhat unclear. "this new patch" was Jeff's "current fix-kill".
So I only removed my "rework-uml-hang ..." and added your "uml-hang-..."
and Jeff's "current fix-kill" on top of it.
I just adapted current fix-kill to apply to 2.6.9, but didn't send these
trivial modifications.
> 
> 
>>The resulting UML kernel for me works fine on host 
>>2.6.7-skas3-v7 and 2.6.9-skas3-v7.
> 
> Not yet tested this Jeff Dike's one.
Yes. Exactly the test of Jeff's patch I did in the first place.
> 
> However, for me, on a 2.6.10-rc2 SKAS patched host (with the update I posted 
> here and on my site to make SKAS compile), UML hangs on shutdown in SKAS 
> mode.
> 
> Nobody else reported this with -bb3, so I suppose there is something new in 
> 2.6.10-rc2 (I didn't yet test this with 
> 
> Please let's start testing it ASAP, so that it's fixed before 2.6.10 release. 
> However, I have at least to apply the Jeff's fix before.
Maybe I could do this soon.
> 
> 
>>Next, I tested what happens if the "kill(pid, SIGKILL);" and
>>"ptrace(PTRACE_CONT, pid);" are removed from os_kill_ptraced_process(), and
>>it also worked fine.
> 
> 
>>Since PTRACE_KILL wakes up the ptraced process unconditionally, IMHO
>>"ptrace(PTRACE_CONT, pid);" has no effect here.
> 
> You reported PTRACE_KILL not to work in some Ptrace states. PTRACE_CONT is 
> intended to fix this (even if PTRACE_KILL is not needed).
No. PTRACE_CONT is not a fix for non working PTRACE_KILL.

What works and what doesn't:

The problem with PTRACE_KILL is not, that it doesn't wakeup the ptraced child,
but that it not always generates a SIGKILL for the child.
On all i386 hosts I have knowledge about (2.6.7, 2.6.9, 2.4.21) PTRACE_KILL wakes
up the ptraced child unconditionally (on 2.4 not tested, only read the code). Thus
PTRACE_CONT is not necessary after PTRCAE_KILL.
But PTRACE_KILL doesn't generate a SIGKILL for processes, that are not stopped on a
ptrace-interception. This is, because the parent doing PTRACE_KILL modifies child's
exit_code only, leaving the task of generating the signal to the child itself, which
does it in the routines that called ptrace_notify(). Why is it done that way: we
have to handle two different cases: on a "signal-received"-interception, the new
signal-number placed into child->exit_code by the parent replaces the signal being
processed by the child. On a singlestep-trap or syscall-interception, child
generates a new signal to itself, calling send_sig().
By the way: there are calls to ptrace_notify() that lack the code of
signal-generation. This at least is inconsistent. And PTRACE_KILL won't work for a
process, that is stopped on the first syscall-interception (PTRACE_SYSCALL). Since
PTRACE_KILL doesn't reset the TIF_SYSCALL_TRACE (and TIF_SYSCAL_EMU) flag, the
process will execute the current syscall and stop on the exit-interception again.
You would have to wait for that happen, and restart it again. Then the queued
SIGKILL can be processed.
Now assume a process is running in a user-space loop or is waiting in a syscall.
Then PTRACE_KILL will wake it up, but no SIGKILL is generated. For a process
waiting in a read(0,...) my test shows, that read() isn't even aborted. I guess, it
wakes up and sees no pending signal, thus it sleeps again. So, PTRACE_KILL has no
effect is cases like these, but doing a kill(SIGKILL) would work.

Considering all this, you could use ptrace(PTRACE_CONT, pid, 0, SIGKILL) instead of
ptrace(PTRACE_KILL, pid), and it would work even better, since it resets all TIF_XXX
flags! While other ptrace commands are restricted to the case, that the child is
stopped, PTRACE_KILL maybe used for a running child, but then it doesn't work.
Poor implementation!

On the other hand, kill(SIGKILL) doesn't kill a ptraced-process that is stopped on a
ptrace-interception since the new TASK_TRACED state is implemented in 2.6.9.
kill(SIGKILL) queues the signal only, after this the process must be resumed with
a PTRACE_CONT or PTRACE_KILL.

I guess, currently the best way to kill a process, not knowing it's state, is:
     kill(SIGKILL)
     ptrace(PTRACE_KILL)
     wait until process stops or exits
     if process is stopped
         ptrace(PTRACE_CONT)
But this is very expensive.

Now, let's look at UML. With Jeff's "current fix-kill", os_kill_ptraced_process()
only is called in cases, where we *know*, the ptraced child is stopped. Thus a
simple PTRACE_KILL will work. No kill(SIGKILL) and PTRACE_CONT are necessary.
On host 2.6.7 and 2.6.9
> 
> 
>>And "kill(pid, SIGKILL);" 
>>only is needed, if the ptraced process is not stopped on a
>>ptrace-interception.
> 
> Well, PTRACE_CONT is to resume if from the ptrace-interception. Let's start by 
> distinguishing bugs (i.e. reggressions in 2.6.9 behaviour vs. 2.6.8 and 
> earlier one) from strange APIs. Gdb, or any ptracer, is IMHO never supposed 
> to SIGKILL the process directly.
What does the ptracer do, if the ptraced process runs in an endlessuser-space-loop?
Maybe, it sends the process another signal (e.g. SIGTRAP) and waits for the
"signal-received"-interception. Then it successfully can do it's PTRACE_KILL.
> 
> 
>>Since os_kill_ptraced_process() is called only then, 
>>when the ptraced process is known to be on a ptrace-stop, I would like to
>>remove "kill(pid, SIGKILL);", too.
>>os_kill_ptraced_process() is called each time, a process in TT exec()s.
>>Thus avoiding unneeded calls is a little speedup.
> 
> 



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now. 
http://productguide.itmanagersjournal.com/
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [uml-devel] Re: The current fix-kill patch
  2004-11-24 11:18     ` Bodo Stroesser
@ 2004-11-25  4:09       ` Blaisorblade
  2004-11-25 10:13         ` Bodo Stroesser
  0 siblings, 1 reply; 7+ messages in thread
From: Blaisorblade @ 2004-11-25  4:09 UTC (permalink / raw)
  To: Bodo Stroesser; +Cc: user-mode-linux-devel, Jeff Dike

On Wednesday 24 November 2004 12:18, Bodo Stroesser wrote:
> Blaisorblade wrote:
> > On Tuesday 23 November 2004 16:10, Bodo Stroesser wrote:
> >>Jeff Dike wrote:
> >>>Here's the latest fix-kill patch in my tree.  Please test and report...
> >>
> >>Have tested based on 2.6.9-bb3. I removed
> >> "rework-uml-hang-on-2.6.9-host", added BlaisorBlade's
> >> "uml-hang-on-2.6.9-host", and adapted iand added this new patch.
> >
> > Btw, if you send another patch, send it relative to
> > "uml-hang-on-2.6.9-host", or to it + fix-kill from Jeff, so I can easily
> > spot the differences.
> >
> > This is a general rule, also (with a bit of taste, i.e. applies between
> > developers, not between users).

> Maybe I was somewhat unclear. "this new patch" was Jeff's "current
> fix-kill". So I only removed my "rework-uml-hang ..." and added your
> "uml-hang-..." and Jeff's "current fix-kill" on top of it.
> I just adapted current fix-kill to apply to 2.6.9, but didn't send these
> trivial modifications.
Ok, I had this clear. I was referring exactly to your rework-uml-hang..., even 
it was not the exact moment.

> >>The resulting UML kernel for me works fine on host
> >>2.6.7-skas3-v7 and 2.6.9-skas3-v7.
> >
> > Not yet tested this Jeff Dike's one.

> Yes. Exactly the test of Jeff's patch I did in the first place.

> > However, for me, on a 2.6.10-rc2 SKAS patched host (with the update I
> > posted here and on my site to make SKAS compile), UML hangs on shutdown
> > in SKAS mode.
Just retested with -bb3 on 2.6.9 and the hang on shutdown does not happen - 
99% sure I'm using the same Uml kernel.

About this, from patch-2.6.10-rc1-bk24.log somewhere near its -bk snapshot:

ChangeSet@1.2026.10.144, 2004-10-26 08:02:35-07:00, roland@redhat.com
  [PATCH] Wake up signalled tasks when exiting ptrace

  In general it is not safe to do any non-ptrace wakeup of a thread in
  TASK_TRACED, because the waking thread could race with a ptrace call
  that could be doing things like mucking directly with its kernel stack.

  AFAIK noone has established that whatever clobberation ptrace can do to
  a running thread is safe even if it will never return to user mode, so
  we can't allow this even for SIGKILL.

  What we _can_ safely do is make a thread switching out of TASK_TRACED
  resume rather than sitting in TASK_STOPPED if it has a pending SIGKILL
  or SIGCONT.  The following patch does this.

  This should be sufficient for the shutdown case.  When killing all
  processes, if the tracer gets killed first, the tracee goes into
  TASK_STOPPED and will be woken and killed by the SIGKILL (same as
  before).  If the tracee gets killed first, it gets a pending SIGKILL and
  doesn't wake up immediately--but, now, when the tracer gets killed, the
  tracee will then wake up to die.

  This will also fix the (same) situations that can arise now where you
  have used gdb (or whatever ptrace caller), killed -9 the gdb and the
  process being debugged, but still have to kill -CONT the process before
  it goes away (now it should just go away either the first time or when
  you kill gdb).

  Signed-off-by: Roland McGrath <roland@redhat.com>
  Signed-off-by: Linus Torvalds <torvalds@osdl.org>

It seems to partially improve the broken behaviour we observed in 2.6.9, so we 
might just report to Roland the hell on ptrace implementation we are seeing.

> > Nobody else reported this with -bb3, so I suppose there is something new
> > in 2.6.10-rc2 (I didn't yet test this with

> > Please let's start testing it ASAP, so that it's fixed before 2.6.10
> > release. However, I have at least to apply the Jeff's fix before.

> Maybe I could do this soon.

> What works and what doesn't:


> Considering all this, you could use ptrace(PTRACE_CONT, pid, 0, SIGKILL)
> instead of ptrace(PTRACE_KILL, pid), and it would work even better, since
> it resets all TIF_XXX flags!
Sorry, so why don't we actually do this?

> While other ptrace commands are restricted to 
> the case, that the child is stopped, PTRACE_KILL maybe used for a running
> child, but then it doesn't work. Poor implementation!
>
> On the other hand, kill(SIGKILL) doesn't kill a ptraced-process that is
> stopped on a ptrace-interception since the new TASK_TRACED state is
> implemented in 2.6.9. kill(SIGKILL) queues the signal only, after this the
> process must be resumed with a PTRACE_CONT or PTRACE_KILL.
>
> I guess, currently the best way to kill a process, not knowing it's state,
> is: kill(SIGKILL)
>      ptrace(PTRACE_KILL)
>      wait until process stops or exits
>      if process is stopped
>          ptrace(PTRACE_CONT)
> But this is very expensive.
>
> Now, let's look at UML. With Jeff's "current fix-kill",
> os_kill_ptraced_process() only is called in cases, where we *know*, the
> ptraced child is stopped. Thus a simple PTRACE_KILL will work. No
> kill(SIGKILL) and PTRACE_CONT are necessary. On host 2.6.7 and 2.6.9
>
> >>And "kill(pid, SIGKILL);"
> >>only is needed, if the ptraced process is not stopped on a
> >>ptrace-interception.
> >
> > Well, PTRACE_CONT is to resume if from the ptrace-interception. Let's
> > start by distinguishing bugs (i.e. reggressions in 2.6.9 behaviour vs.
> > 2.6.8 and earlier one) from strange APIs.

> > Gdb, or any ptracer, is IMHO 
> > never supposed to SIGKILL the process directly.
Well, I thought this, but given how counterintuitive ptrace() seems to be, 
I'll have to verify this.
> What does the ptracer do, if the ptraced process runs in an
> endlessuser-space-loop? Maybe, it sends the process another signal (e.g.
> SIGTRAP) and waits for the "signal-received"-interception. Then it
> successfully can do it's PTRACE_KILL.

-- 
Paolo Giarrusso, aka Blaisorblade
Linux registered user n. 292729
http://www.user-mode-linux.org/~blaisorblade


-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now. 
http://productguide.itmanagersjournal.com/
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [uml-devel] Re: The current fix-kill patch
  2004-11-25  4:09       ` Blaisorblade
@ 2004-11-25 10:13         ` Bodo Stroesser
  0 siblings, 0 replies; 7+ messages in thread
From: Bodo Stroesser @ 2004-11-25 10:13 UTC (permalink / raw)
  To: Blaisorblade; +Cc: user-mode-linux-devel, Jeff Dike

Blaisorblade wrote:
> On Wednesday 24 November 2004 12:18, Bodo Stroesser wrote:
> 
>>Blaisorblade wrote:
>>
>>>On Tuesday 23 November 2004 16:10, Bodo Stroesser wrote:
>>>
>>>>Jeff Dike wrote:
>>>>
>>>>>Here's the latest fix-kill patch in my tree.  Please test and report...
>>>>
>>>>Have tested based on 2.6.9-bb3. I removed
>>>>"rework-uml-hang-on-2.6.9-host", added BlaisorBlade's
>>>>"uml-hang-on-2.6.9-host", and adapted iand added this new patch.
>>>
>>>Btw, if you send another patch, send it relative to
>>>"uml-hang-on-2.6.9-host", or to it + fix-kill from Jeff, so I can easily
>>>spot the differences.
>>>
>>>This is a general rule, also (with a bit of taste, i.e. applies between
>>>developers, not between users).
> 
> 
>>Maybe I was somewhat unclear. "this new patch" was Jeff's "current
>>fix-kill". So I only removed my "rework-uml-hang ..." and added your
>>"uml-hang-..." and Jeff's "current fix-kill" on top of it.
>>I just adapted current fix-kill to apply to 2.6.9, but didn't send these
>>trivial modifications.
> 
> Ok, I had this clear. I was referring exactly to your rework-uml-hang..., even 
> it was not the exact moment.
When I did my rework, I didn't know, that your patch already was included in
2.6.10 mainline. Thus, since I wanted to undo big parts of your work, I thought
I would be a good idea to have a new patch. I was in error, sorry.
> 
> 
>>>>The resulting UML kernel for me works fine on host
>>>>2.6.7-skas3-v7 and 2.6.9-skas3-v7.
>>>
>>>Not yet tested this Jeff Dike's one.
> 
> 
>>Yes. Exactly the test of Jeff's patch I did in the first place.
> 
> 
>>>However, for me, on a 2.6.10-rc2 SKAS patched host (with the update I
>>>posted here and on my site to make SKAS compile), UML hangs on shutdown
>>>in SKAS mode.
> 
> Just retested with -bb3 on 2.6.9 and the hang on shutdown does not happen - 
> 99% sure I'm using the same Uml kernel.
Yes. I also tested. And I tried to test with a 2.6.10-bkX-mmY host. (tried
2.6.9-rc1 and 2.6.9-rc2-mm5, same result).
Unfortunately this version doesn't boot on my host. It freezes while loading
the driver for my DAC960 Mylex RAID. And the root-fs resides on a logical disk
of that controller ...
> 
> About this, from patch-2.6.10-rc1-bk24.log somewhere near its -bk snapshot:
> 
> ChangeSet@1.2026.10.144, 2004-10-26 08:02:35-07:00, roland@redhat.com
>   [PATCH] Wake up signalled tasks when exiting ptrace
> 
>   In general it is not safe to do any non-ptrace wakeup of a thread in
>   TASK_TRACED, because the waking thread could race with a ptrace call
>   that could be doing things like mucking directly with its kernel stack.
> 
>   AFAIK noone has established that whatever clobberation ptrace can do to
>   a running thread is safe even if it will never return to user mode, so
>   we can't allow this even for SIGKILL.
> 
>   What we _can_ safely do is make a thread switching out of TASK_TRACED
>   resume rather than sitting in TASK_STOPPED if it has a pending SIGKILL
>   or SIGCONT.  The following patch does this.
> 
>   This should be sufficient for the shutdown case.  When killing all
>   processes, if the tracer gets killed first, the tracee goes into
>   TASK_STOPPED and will be woken and killed by the SIGKILL (same as
>   before).  If the tracee gets killed first, it gets a pending SIGKILL and
>   doesn't wake up immediately--but, now, when the tracer gets killed, the
>   tracee will then wake up to die.
> 
>   This will also fix the (same) situations that can arise now where you
>   have used gdb (or whatever ptrace caller), killed -9 the gdb and the
>   process being debugged, but still have to kill -CONT the process before
>   it goes away (now it should just go away either the first time or when
>   you kill gdb).
> 
>   Signed-off-by: Roland McGrath <roland@redhat.com>
>   Signed-off-by: Linus Torvalds <torvalds@osdl.org>
> 
> It seems to partially improve the broken behaviour we observed in 2.6.9, so we 
> might just report to Roland the hell on ptrace implementation we are seeing.
Very interesting change. Good idea, generally!
For UML, this shouldn't make any difference. Our intention is to actively kill
all ptraced processes. Before the tracer exits, no more child should live.
If this change in the host produces a hang on UML shutdown, I guess, it's wrong.
> 
> 
>>>Nobody else reported this with -bb3, so I suppose there is something new
>>>in 2.6.10-rc2 (I didn't yet test this with
> 
> 
>>>Please let's start testing it ASAP, so that it's fixed before 2.6.10
>>>release. However, I have at least to apply the Jeff's fix before.
> 
> 
>>Maybe I could do this soon.
> 
> 
>>What works and what doesn't:
> 
> 
> 
>>Considering all this, you could use ptrace(PTRACE_CONT, pid, 0, SIGKILL)
>>instead of ptrace(PTRACE_KILL, pid), and it would work even better, since
>>it resets all TIF_XXX flags!
> 
> Sorry, so why don't we actually do this?
Because UML doesn't use ptrace(PTRACE_KILL), when the ptraced process is stopped
on the first syscall-interception. Thus no problem occurs in UML. We should use
PTRACE_KILL to have profit from all future enhancements of that function.
> 
> 
>>While other ptrace commands are restricted to 
>>the case, that the child is stopped, PTRACE_KILL maybe used for a running
>>child, but then it doesn't work. Poor implementation!
>>
>>On the other hand, kill(SIGKILL) doesn't kill a ptraced-process that is
>>stopped on a ptrace-interception since the new TASK_TRACED state is
>>implemented in 2.6.9. kill(SIGKILL) queues the signal only, after this the
>>process must be resumed with a PTRACE_CONT or PTRACE_KILL.
>>
>>I guess, currently the best way to kill a process, not knowing it's state,
>>is: kill(SIGKILL)
>>     ptrace(PTRACE_KILL)
>>     wait until process stops or exits
>>     if process is stopped
>>         ptrace(PTRACE_CONT)
>>But this is very expensive.
>>
>>Now, let's look at UML. With Jeff's "current fix-kill",
>>os_kill_ptraced_process() only is called in cases, where we *know*, the
>>ptraced child is stopped. Thus a simple PTRACE_KILL will work. No
>>kill(SIGKILL) and PTRACE_CONT are necessary. On host 2.6.7 and 2.6.9
>>
>>
>>>>And "kill(pid, SIGKILL);"
>>>>only is needed, if the ptraced process is not stopped on a
>>>>ptrace-interception.
>>>
>>>Well, PTRACE_CONT is to resume if from the ptrace-interception. Let's
>>>start by distinguishing bugs (i.e. reggressions in 2.6.9 behaviour vs.
>>>2.6.8 and earlier one) from strange APIs.
> 
> 
>>>Gdb, or any ptracer, is IMHO 
>>>never supposed to SIGKILL the process directly.
> 
> Well, I thought this, but given how counterintuitive ptrace() seems to be, 
> I'll have to verify this.
> 
>>What does the ptracer do, if the ptraced process runs in an
>>endlessuser-space-loop? Maybe, it sends the process another signal (e.g.
>>SIGTRAP) and waits for the "signal-received"-interception. Then it
>>successfully can do it's PTRACE_KILL.
> 
> 



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now. 
http://productguide.itmanagersjournal.com/
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [uml-devel] Re: The current fix-kill patch
  2004-11-23  5:50 [uml-devel] The current fix-kill patch Jeff Dike
  2004-11-23 15:10 ` [uml-devel] " Bodo Stroesser
@ 2004-12-01 18:20 ` Jason Lunz
  1 sibling, 0 replies; 7+ messages in thread
From: Jason Lunz @ 2004-12-01 18:20 UTC (permalink / raw)
  To: user-mode-linux-devel

jdike@addtoit.com said:
> Here's the latest fix-kill patch in my tree.  Please test and report...

Will there be anything similar put out to let 2.4 uml guests run on
2.6.9?

Jason



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now. 
http://productguide.itmanagersjournal.com/
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2004-12-01 18:21 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2004-11-23  5:50 [uml-devel] The current fix-kill patch Jeff Dike
2004-11-23 15:10 ` [uml-devel] " Bodo Stroesser
2004-11-23 20:10   ` Blaisorblade
2004-11-24 11:18     ` Bodo Stroesser
2004-11-25  4:09       ` Blaisorblade
2004-11-25 10:13         ` Bodo Stroesser
2004-12-01 18:20 ` Jason Lunz

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.