All of lore.kernel.org
 help / color / mirror / Atom feed
From: Bodo Stroesser <bstroesser@fujitsu-siemens.com>
To: Blaisorblade <blaisorblade_spam@yahoo.it>
Cc: user-mode-linux-devel@lists.sourceforge.net,
	Jeff Dike <jdike@addtoit.com>
Subject: Re: [uml-devel] Re: The current fix-kill patch
Date: Thu, 25 Nov 2004 11:13:15 +0100	[thread overview]
Message-ID: <41A5B03B.8020307@fujitsu-siemens.com> (raw)
In-Reply-To: <200411250509.09827.blaisorblade_spam@yahoo.it>

Blaisorblade wrote:
> On Wednesday 24 November 2004 12:18, Bodo Stroesser wrote:
> 
>>Blaisorblade wrote:
>>
>>>On Tuesday 23 November 2004 16:10, Bodo Stroesser wrote:
>>>
>>>>Jeff Dike wrote:
>>>>
>>>>>Here's the latest fix-kill patch in my tree.  Please test and report...
>>>>
>>>>Have tested based on 2.6.9-bb3. I removed
>>>>"rework-uml-hang-on-2.6.9-host", added BlaisorBlade's
>>>>"uml-hang-on-2.6.9-host", and adapted iand added this new patch.
>>>
>>>Btw, if you send another patch, send it relative to
>>>"uml-hang-on-2.6.9-host", or to it + fix-kill from Jeff, so I can easily
>>>spot the differences.
>>>
>>>This is a general rule, also (with a bit of taste, i.e. applies between
>>>developers, not between users).
> 
> 
>>Maybe I was somewhat unclear. "this new patch" was Jeff's "current
>>fix-kill". So I only removed my "rework-uml-hang ..." and added your
>>"uml-hang-..." and Jeff's "current fix-kill" on top of it.
>>I just adapted current fix-kill to apply to 2.6.9, but didn't send these
>>trivial modifications.
> 
> Ok, I had this clear. I was referring exactly to your rework-uml-hang..., even 
> it was not the exact moment.
When I did my rework, I didn't know, that your patch already was included in
2.6.10 mainline. Thus, since I wanted to undo big parts of your work, I thought
I would be a good idea to have a new patch. I was in error, sorry.
> 
> 
>>>>The resulting UML kernel for me works fine on host
>>>>2.6.7-skas3-v7 and 2.6.9-skas3-v7.
>>>
>>>Not yet tested this Jeff Dike's one.
> 
> 
>>Yes. Exactly the test of Jeff's patch I did in the first place.
> 
> 
>>>However, for me, on a 2.6.10-rc2 SKAS patched host (with the update I
>>>posted here and on my site to make SKAS compile), UML hangs on shutdown
>>>in SKAS mode.
> 
> Just retested with -bb3 on 2.6.9 and the hang on shutdown does not happen - 
> 99% sure I'm using the same Uml kernel.
Yes. I also tested. And I tried to test with a 2.6.10-bkX-mmY host. (tried
2.6.9-rc1 and 2.6.9-rc2-mm5, same result).
Unfortunately this version doesn't boot on my host. It freezes while loading
the driver for my DAC960 Mylex RAID. And the root-fs resides on a logical disk
of that controller ...
> 
> About this, from patch-2.6.10-rc1-bk24.log somewhere near its -bk snapshot:
> 
> ChangeSet@1.2026.10.144, 2004-10-26 08:02:35-07:00, roland@redhat.com
>   [PATCH] Wake up signalled tasks when exiting ptrace
> 
>   In general it is not safe to do any non-ptrace wakeup of a thread in
>   TASK_TRACED, because the waking thread could race with a ptrace call
>   that could be doing things like mucking directly with its kernel stack.
> 
>   AFAIK noone has established that whatever clobberation ptrace can do to
>   a running thread is safe even if it will never return to user mode, so
>   we can't allow this even for SIGKILL.
> 
>   What we _can_ safely do is make a thread switching out of TASK_TRACED
>   resume rather than sitting in TASK_STOPPED if it has a pending SIGKILL
>   or SIGCONT.  The following patch does this.
> 
>   This should be sufficient for the shutdown case.  When killing all
>   processes, if the tracer gets killed first, the tracee goes into
>   TASK_STOPPED and will be woken and killed by the SIGKILL (same as
>   before).  If the tracee gets killed first, it gets a pending SIGKILL and
>   doesn't wake up immediately--but, now, when the tracer gets killed, the
>   tracee will then wake up to die.
> 
>   This will also fix the (same) situations that can arise now where you
>   have used gdb (or whatever ptrace caller), killed -9 the gdb and the
>   process being debugged, but still have to kill -CONT the process before
>   it goes away (now it should just go away either the first time or when
>   you kill gdb).
> 
>   Signed-off-by: Roland McGrath <roland@redhat.com>
>   Signed-off-by: Linus Torvalds <torvalds@osdl.org>
> 
> It seems to partially improve the broken behaviour we observed in 2.6.9, so we 
> might just report to Roland the hell on ptrace implementation we are seeing.
Very interesting change. Good idea, generally!
For UML, this shouldn't make any difference. Our intention is to actively kill
all ptraced processes. Before the tracer exits, no more child should live.
If this change in the host produces a hang on UML shutdown, I guess, it's wrong.
> 
> 
>>>Nobody else reported this with -bb3, so I suppose there is something new
>>>in 2.6.10-rc2 (I didn't yet test this with
> 
> 
>>>Please let's start testing it ASAP, so that it's fixed before 2.6.10
>>>release. However, I have at least to apply the Jeff's fix before.
> 
> 
>>Maybe I could do this soon.
> 
> 
>>What works and what doesn't:
> 
> 
> 
>>Considering all this, you could use ptrace(PTRACE_CONT, pid, 0, SIGKILL)
>>instead of ptrace(PTRACE_KILL, pid), and it would work even better, since
>>it resets all TIF_XXX flags!
> 
> Sorry, so why don't we actually do this?
Because UML doesn't use ptrace(PTRACE_KILL), when the ptraced process is stopped
on the first syscall-interception. Thus no problem occurs in UML. We should use
PTRACE_KILL to have profit from all future enhancements of that function.
> 
> 
>>While other ptrace commands are restricted to 
>>the case, that the child is stopped, PTRACE_KILL maybe used for a running
>>child, but then it doesn't work. Poor implementation!
>>
>>On the other hand, kill(SIGKILL) doesn't kill a ptraced-process that is
>>stopped on a ptrace-interception since the new TASK_TRACED state is
>>implemented in 2.6.9. kill(SIGKILL) queues the signal only, after this the
>>process must be resumed with a PTRACE_CONT or PTRACE_KILL.
>>
>>I guess, currently the best way to kill a process, not knowing it's state,
>>is: kill(SIGKILL)
>>     ptrace(PTRACE_KILL)
>>     wait until process stops or exits
>>     if process is stopped
>>         ptrace(PTRACE_CONT)
>>But this is very expensive.
>>
>>Now, let's look at UML. With Jeff's "current fix-kill",
>>os_kill_ptraced_process() only is called in cases, where we *know*, the
>>ptraced child is stopped. Thus a simple PTRACE_KILL will work. No
>>kill(SIGKILL) and PTRACE_CONT are necessary. On host 2.6.7 and 2.6.9
>>
>>
>>>>And "kill(pid, SIGKILL);"
>>>>only is needed, if the ptraced process is not stopped on a
>>>>ptrace-interception.
>>>
>>>Well, PTRACE_CONT is to resume if from the ptrace-interception. Let's
>>>start by distinguishing bugs (i.e. reggressions in 2.6.9 behaviour vs.
>>>2.6.8 and earlier one) from strange APIs.
> 
> 
>>>Gdb, or any ptracer, is IMHO 
>>>never supposed to SIGKILL the process directly.
> 
> Well, I thought this, but given how counterintuitive ptrace() seems to be, 
> I'll have to verify this.
> 
>>What does the ptracer do, if the ptraced process runs in an
>>endlessuser-space-loop? Maybe, it sends the process another signal (e.g.
>>SIGTRAP) and waits for the "signal-received"-interception. Then it
>>successfully can do it's PTRACE_KILL.
> 
> 



-------------------------------------------------------
SF email is sponsored by - The IT Product Guide
Read honest & candid reviews on hundreds of IT Products from real users.
Discover which products truly live up to the hype. Start reading now. 
http://productguide.itmanagersjournal.com/
_______________________________________________
User-mode-linux-devel mailing list
User-mode-linux-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/user-mode-linux-devel

  reply	other threads:[~2004-11-25 10:13 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2004-11-23  5:50 [uml-devel] The current fix-kill patch Jeff Dike
2004-11-23 15:10 ` [uml-devel] " Bodo Stroesser
2004-11-23 20:10   ` Blaisorblade
2004-11-24 11:18     ` Bodo Stroesser
2004-11-25  4:09       ` Blaisorblade
2004-11-25 10:13         ` Bodo Stroesser [this message]
2004-12-01 18:20 ` Jason Lunz

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=41A5B03B.8020307@fujitsu-siemens.com \
    --to=bstroesser@fujitsu-siemens.com \
    --cc=blaisorblade_spam@yahoo.it \
    --cc=jdike@addtoit.com \
    --cc=user-mode-linux-devel@lists.sourceforge.net \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.