Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Oleg Nesterov <oleg@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Roland McGrath <roland@redhat.com>,
	jan.kratochvil@redhat.com, linux-kernel@vger.kernel.org,
	torvalds@linux-foundation.org, akpm@linux-foundation.org
Subject: Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
Date: Wed, 9 Feb 2011 22:25:26 +0100	[thread overview]
Message-ID: <20110209212526.GA9999@redhat.com> (raw)
In-Reply-To: <20110209141803.GH3770@htj.dyndns.org>

On 02/09, Tejun Heo wrote:
>
> We can make it behave like the following.  { | } denotes two
> alternative behaviors regarding SIGCONT.
>
>   If a group stop is initiated while, or was in progress when a task
>   is ptraced, the task will stop for group stop and notify the ptracer
>   accordingly.  Note that the task could be trapped elsewhere delaying
>   this from happening.  When the task stops for group stop, it
>   participates in group stop as if it is not ptraced and the real
>   parent is notified of group stop completion.

OK,

>   Note that { the task is put into TASK_TRACED state and group stop
>   resume by SIGCONT is ignored. | the task is put into TASK_STOPPED
>   state and the following PTRACE request will transition it into
>   TASK_TRACED.  If SIGCONT is received before transition to
>   TASK_TRACED is made, the task will resume execution.  If PTRACE
>   request faces with SIGCONT, PTRACE request may fail. }

To me, the first variant looks better. But, only because it is closer
to the current behaviour. I mean, it is better to change the things
incrementally.

But in the longer term - I do not know. Personally, I like the
TASK_STOPPED variant. To the point, I was thinking that (perhaps)
we can change ptrace_stop() so that it simply calls do_signal_stop()
if it notices ->group_stop_count != 0.

>   The ptracer may resume execution of the task using PTRACE_CONT
>   without affecting other tasks in the group.

And this is what I do not like. I just can't accept the fact there
is a running thread in the SIGNAL_STOP_STOPPED group.

But yes: this is what the current code does, I am not sure we can
change this, and both PTRACE_CONT-doesnt-resume-until-SIGCONT and
PTRACE_CONT-acts-as-SIGCONT are not "perfect" too.

>   On ptrace detach, if group stop is in effect, the task will be put
>   into TASK_STOPPED state and if it is the first time the task is
>   stopping for the current group stop, it will participate in group
>   stop completion.

Yes. (but depends on above).

> This can be phrased better but it seems well defined enough for me.  I
> take it that one of your concerns is direct transition into
> TASK_TRACED on group stop while ptraced which prevents the tracee from
> reacting to the following SIGCONT.

Yes,

> I'm not sure how much of an actual
> problem it is given that our notification to real parent hasn't worked
> at all till now

Yes! and this is very good argument in favour of all your objections ;)

Yes, this doesn't work anyway. But I _think_ this is the bug, if
we are going to change this code we should fix this bug as well.

Again, again, this is very subjective, I agree.

> but we can definitely implement proper TASK_STOPPED ->
> TRACED transition on the next PTRACE request.

I guess, you mean that the current code bypasses the
ptrace_stop()->arch_ptrace_stop_needed() code while doing s/STOPPED/TRACED ?

Oh, currently I am ignoring this, my only concern is how this all
looks to the userland. But this is the good point, and I have to admit
that I never realized this is just wrong. Yes, I agree, we should do
something, but this is not visible to user-space (except this should
fix the bug ;)

> There exists a
> fundamental race condition between SIGCONT and the next PTRACE call

Yes, and this race is already here, ptracer should take care.

> If we don't go that route, another solution would be to add a ptrace
> call which can listen to SIGCONT.  ie. PTRACE_WAIT_CONT or whatever
> which the ptracer can call once it knows the tracee entered group
> stop.

Perhaps... Or something else, but surely there is a room for improvements.
Fortunately, the changes like this are "safe". I mean, they can
break nothing. Just we should try to not make them wrong ;)

> In either case, the fundamentals of ptrace operation don't really
> change.  All ptrace operations are still per-task and ptracer almost
> always has control over execution of the tracee.  Sure, it allows
> ptraced task to escape group stop but it seems defined clear enough
> and IMHO actually is a helpful debugging feature.

Heh, I think we found the place where we can't convince each other.
What if we toss a coin?

> After all, it's not
> like stop signals can be used for absoultely reliable job control.
> There's an inherent race against SIGCONT.

Sure, if we are talking about SIGCONT from "nowhere". But, the same
way ^Z is not reliable too.

> > > What do you do about PTRACE requests while a task is group stopped?
> > > Reject them?  Block them?
> >
> > Yes, another known oddity. Of course we shouldn't reject or block.
> > Perhaps we can ignore this initially. If SIGCONT comes after another
> > request does STOPPED/TRACED it clears SIGNAL_STOP_STOPPED, but the
> > tracee won't run until the next PTRACE_CONT, this makes sense.
>
> That conceptually might make sense

I only meant, this makes sense initially.

> but other than the conceptual
> integrity it widely changes the assumptions and is less useful than
> the current behavior.

Hmm, this is what we currently have?

> I don't really see why we would want to do
> that.

No, I think we do not really want this in the longer term. But I
can't say what exactly we want.

> > Only if it attaches to every thread in the thread group. Otherwise,
> > if the non-thread has already initiated the group-stop, the tracee
> > will notice TIF_SIGPENDING eventually and call do_signal_stop(),
> > debugger can't control this.
>
> The debugger is still notified and can override it.

Hmm... no, it can't? Of course it is notified after the tracee
participates and calls do_signal_stop() and gdb can resume it then.
But it can't prevent the tracee from stopping.

> > > I don't think it's an extreme corner case
> > > we can break.  For example, if a user gdb's a program which raises one
> > > of the stop signals, currently the user expects to be able to continue
> > > the program from whithin the gdb.  If we make group stop override
> > > ptrace, there's no other recourse than sending signal from outside.
> >
> > Yes. Of course, gdb can be "fixed", it can send SIGCONT.
> >
> > But yes, this is the noticeable change, that is why I suggested
> > ptrace_resume-acts-as-SIGCONT logic. Ugly, yes, but more or less
> > compatible. (although let me repeat, _pesonally_ I'd prefer to
> > simply tell user-space to learn the new rules ;)
>
> I can't really agree there.  First, to me, it seems like too radical a
> change

(I assume you mean PTRACE_CONT-doesnt-resume variant?)

> and secondly the resulting behavior might look conceptually
> pleasing but is not as useful as the current one.  Why make a change
> which results in reduced usefulness while noticeably breaking existing
> users?

I don't really agree with "not as useful", but this doesn't matter.
I agree with "noticeably breaking", this is enough. (assuming my
guess above is correct).

> > Given that SIGCHLD doesn't queue and with or without your changes
> > we send it per-thread, it is not trivial for gdb to detect the
> > group-stop anyway. Again, the kernel should help somehow.
>
> Hmmm?  Isn't this discoverable from the exit code from wait?

Sure. Probably I misunderstood. I thought, you mean we need something
like per-process "the whole group is stopped" notification for the
debugger.

> > > and I'm not really sure whether that's something worth achieving
> > > at the cost of debugging capabilities especially when we don't _have_
> > > to lose them.
> >
> > But we do not? I mean, at least this is not worse than the current
> > behaviour.
>
> I think it's worse.  With your changes, debuggers can't diddle the
> tasks behind group stop's back which the current users already expect.

OK, I certainly misunderstood you, and now I can't restore the context.
Could you spell?

> > > I agree it adds more integrity to group stop but at the cost of
> > > debugging capabilities.  I'm not yet convinced integrity of group stop
> > > is that important.  Why is it such a big deal?
> >
> > Of course I can't "prove" it is that important. But I think so.
>
> Heh, I'm not asking for proof that it is more useful. :-) But I'm still
> curious why you think it's important because the benefits aren't
> apparent to me.  Roland and you seem to share this opinion without
> much dicussion so maybe I'm missing something?

I can't!

I hate this from the time when I noticed that the application doesn't
respond to ^Z under strace. And I used strace exactly because I wanted
do debug some (I can't recall exactly) problems with jctl. That is all.

But in any case. Some users run the services under ptrace. I mean,
the application borns/runs/dies under ptrace. That is why personally
I certainly do not like anything which delays until detach (say,
the-tracee-doesnt-participate-in-group-stop-until-detach logic).

> > > CLD_STOPPED is too but while ptrace is attached the notifications are
> > > made per-task and delivered to the tracer.
> >
> > No, there is a difference. Sure, CLD_STOPPED is per-process without
> > ptrace. But CLD_CONTINUED continues to be per-process even if all
> > threads are traced.
>
> Hmm... I need to think more about it.  I'm not fully following your
> point.

This is simple. No matter how many threads we have, no matter how
many of them are ptraced, we send a single CLD_CONTINUED notification.
The only difference ptrace can make is: we look at ->group_leader
to decide who will get this notification.

> > > I think this is the key question.  Whether to de-throne
> > > PTRACE_CONT such that it cannot override group stop.  As I've already
> > > said several times already, I think it is a pretty fundamental
> > > property of ptrace
> >
> > Again, I am a bit confused. Note that PTRACE_CONT overrides
> > group stop if we do the above. It should wake up the tracee, in
> > SIGCONT-compatible way (yes, the latter is not exactly clear).
>
> What do you mean?  Waking up in SIGCONT-compatible way?  Sending
> SIGCONT ending the whole group stop?

Yes. I do not mean we should literally do send_sig_info(SIGCONT)
of course.

> > But at least this should be visible to real parent. We shouldn't
> > silently make the stopped tracee running while its real_parent
> > thinks everything is stopped.
>
> I think maybe this is where our different POVs come from.

Yes, probably.

> To me, it
> isn't too objectionable to allow debuggers to diddle with tracees
> behind the real parent's back.  In fact, it would be quite useful when
> debugging job control related behaviors.  I wouldn't have much problem
> accepting the other way around - ie. strict job control even while
> being debugged, but given that it is already allowed and visible, I
> fail to see why we should change the behavior.  It doesn't seem to
> have enough benefits to warrant such visible change.

All I can say is: sure, I see your point, and perhaps you are right
and I am wrong.

I'd really like to force CC list to participate ;)

> If I change the patchset such that group stop while ptraced first
> enters TASK_STOPPED and then transitions into TASK_TRACED on the next
> PTRACE call,

Again, I am not sure I understand what exactly you mean... If you
mean that it is wrong to simply change the state of the tracee in
ptrace_check_attach() without arch_ptrace_stop() - I agree, this
probably should be fixed.

I am wondering, if there is a simpler change... probably not.

But. this looks a bit off-topic (I mean, this looks orthogonal
to the other things we are discussing), or I missed something else?

> there will be race window which would be visible

Personally, I think this is fine.

Oleg.

next prev parent reply	other threads:[~2011-02-09 21:34 UTC|newest]

Thread overview: 160+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
2011-01-28 15:08 ` [PATCH 01/10] signal: fix SIGCONT notification code Tejun Heo
2011-01-28 15:08 ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Tejun Heo
2011-01-28 18:46   ` Roland McGrath
2011-01-31 10:38     ` Tejun Heo
2011-02-01 10:26       ` [PATCH] ptrace: use safer wake up on ptrace_detach() Tejun Heo
2011-02-01 13:40         ` Oleg Nesterov
2011-02-01 15:07           ` Tejun Heo
2011-02-01 19:17             ` Oleg Nesterov
2011-02-02  5:31             ` Roland McGrath
2011-02-02 10:35               ` Tejun Heo
2011-02-02  0:27         ` Andrew Morton
2011-02-02  5:33           ` Roland McGrath
2011-02-02  5:38             ` Andrew Morton
2011-02-02 10:34               ` Tejun Heo
2011-02-02 19:33                 ` Andrew Morton
2011-02-02 20:01                   ` Tejun Heo
2011-02-02 21:40             ` Oleg Nesterov
2011-02-02  5:29         ` Roland McGrath
2011-02-02  5:28       ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Roland McGrath
2011-01-28 15:08 ` [PATCH 03/10] signal: remove superflous try_to_freeze() loop in do_signal_stop() Tejun Heo
2011-01-28 18:46   ` Roland McGrath
2011-01-28 15:08 ` [PATCH 04/10] ptrace: kill tracehook_notify_jctl() Tejun Heo
2011-01-28 21:09   ` Roland McGrath
2011-01-28 15:08 ` [PATCH 05/10] ptrace: add @why to ptrace_stop() Tejun Heo
2011-01-28 18:48   ` Roland McGrath
2011-01-28 15:08 ` [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace Tejun Heo
2011-01-28 21:22   ` Roland McGrath
2011-01-31 11:00     ` Tejun Heo
2011-02-02  5:44       ` Roland McGrath
2011-02-02 10:56         ` Tejun Heo
2011-01-28 15:08 ` [PATCH 07/10] signal: use GROUP_STOP_PENDING to stop once for a single group stop Tejun Heo
2011-01-28 15:08 ` [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for " Tejun Heo
2011-01-28 21:30   ` Roland McGrath
2011-01-31 11:26     ` Tejun Heo
2011-02-02  5:57       ` Roland McGrath
2011-02-02 10:53         ` Tejun Heo
2011-02-03 10:02           ` Tejun Heo
2011-02-01 19:36     ` Oleg Nesterov
2011-01-28 15:08 ` [PATCH 09/10] ptrace: make do_signal_stop() use ptrace_stop() if the task is being ptraced Tejun Heo
2011-01-28 15:08 ` [PATCH 10/10] ptrace: clean transitions between TASK_STOPPED and TRACED Tejun Heo
2011-02-03 20:41   ` [PATCH 0/1] (Was: ptrace: clean transitions between TASK_STOPPED and TRACED) Oleg Nesterov
2011-02-03 20:41     ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Oleg Nesterov
2011-02-03 21:36       ` Roland McGrath
2011-02-03 21:44         ` Oleg Nesterov
2011-02-04 10:53           ` Tejun Heo
2011-02-04 13:04             ` Oleg Nesterov
2011-02-04 14:48               ` Tejun Heo
2011-02-04 17:06                 ` Oleg Nesterov
2011-02-05 13:39                   ` Tejun Heo
2011-02-07 13:42                     ` Oleg Nesterov
2011-02-07 14:11                       ` Tejun Heo
2011-02-07 15:37                         ` Oleg Nesterov
2011-02-07 16:31                           ` Tejun Heo
2011-02-07 17:48                             ` Oleg Nesterov
2011-02-09 14:18                               ` Tejun Heo
2011-02-09 14:21                                 ` Tejun Heo
2011-02-09 21:25                                 ` Oleg Nesterov [this message]
2011-02-13 23:01                                   ` Denys Vlasenko
2011-02-14  9:03                                     ` Jan Kratochvil
2011-02-14 11:39                                       ` Denys Vlasenko
2011-02-14 17:32                                         ` Oleg Nesterov
2011-02-14 16:01                                       ` Oleg Nesterov
2011-02-26  3:59                                       ` Pavel Machek
2011-02-14 15:51                                     ` Oleg Nesterov
2011-02-14 14:50                                   ` Tejun Heo
2011-02-14 18:53                                     ` Oleg Nesterov
2011-02-13 22:25                                 ` Denys Vlasenko
2011-02-14 15:13                                   ` Tejun Heo
2011-02-14 16:15                                     ` Oleg Nesterov
2011-02-14 16:33                                       ` Tejun Heo
2011-02-14 17:23                                         ` Oleg Nesterov
2011-02-14 17:20                                     ` Denys Vlasenko
2011-02-14 17:30                                       ` Tejun Heo
2011-02-14 17:45                                         ` Oleg Nesterov
2011-02-14 17:54                                         ` Denys Vlasenko
2011-02-21 15:16                                           ` Tejun Heo
2011-02-21 15:28                                             ` Oleg Nesterov
2011-02-21 16:11                                               ` [pseudo patch] ptrace should respect the group stop Oleg Nesterov
2011-02-22 16:24                                               ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Tejun Heo
2011-02-24 21:08                                                 ` Oleg Nesterov
2011-02-25 15:45                                                   ` Tejun Heo
2011-02-25 17:42                                                     ` Roland McGrath
2011-02-28 15:23                                                     ` Oleg Nesterov
2011-02-14 17:51                                       ` Oleg Nesterov
2011-02-14 18:55                                         ` Denys Vlasenko
2011-02-14 19:01                                           ` Oleg Nesterov
2011-02-14 19:42                                             ` Denys Vlasenko
2011-02-14 20:01                                               ` Oleg Nesterov
2011-02-15 15:24                                                 ` Tejun Heo
2011-02-15 15:58                                                   ` Oleg Nesterov
2011-02-15 17:31                                                   ` Roland McGrath
2011-02-15 20:27                                                     ` Oleg Nesterov
2011-02-18 17:02                                                       ` Tejun Heo
2011-02-18 19:37                                                         ` Oleg Nesterov
2011-02-21 16:22                                                           ` Tejun Heo
2011-02-21 16:49                                                             ` Oleg Nesterov
2011-02-21 16:59                                                               ` Tejun Heo
2011-02-23 19:31                                                                 ` Oleg Nesterov
2011-02-25 15:10                                                                   ` Tejun Heo
2011-02-24 20:29                                                             ` Oleg Nesterov
2011-02-25 15:51                                                               ` Tejun Heo
2011-02-26  2:48                                                                 ` Denys Vlasenko
2011-02-28 12:56                                                                   ` Tejun Heo
2011-02-28 13:16                                                                     ` Denys Vlasenko
2011-02-28 13:29                                                                       ` Tejun Heo
2011-02-28 13:41                                                                         ` Denys Vlasenko
2011-02-28 13:53                                                                           ` Tejun Heo
2011-02-28 14:25                                                                             ` Denys Vlasenko
2011-02-28 14:39                                                                               ` Tejun Heo
2011-02-28 16:48                                                                                 ` Oleg Nesterov
2011-02-28 14:36                                                                   ` Oleg Nesterov
2011-02-16 21:51                                       ` Jan Kratochvil
2011-02-17  3:37                                         ` Denys Vlasenko
2011-02-17 19:19                                           ` Oleg Nesterov
2011-02-18 21:11                                             ` Jan Kratochvil
2011-02-19 20:16                                               ` Oleg Nesterov
2011-02-17 16:49                                         ` Oleg Nesterov
2011-02-17 18:58                                           ` Roland McGrath
2011-02-17 19:33                                             ` Oleg Nesterov
2011-02-18 21:34                                           ` Jan Kratochvil
2011-02-19 20:06                                             ` Oleg Nesterov
2011-02-20  9:40                                               ` Jan Kratochvil
2011-02-20 17:06                                                 ` Denys Vlasenko
2011-02-20 17:48                                                   ` Oleg Nesterov
2011-02-20 19:10                                                   ` Jan Kratochvil
2011-02-20 19:16                                                     ` Oleg Nesterov
2011-02-20 17:16                                                 ` Oleg Nesterov
2011-02-20 18:52                                                   ` Jan Kratochvil
2011-02-20 20:38                                                     ` Oleg Nesterov
2011-02-20 21:06                                                       ` `(T) stopped' preservation after _exit() [Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH] Jan Kratochvil
2011-02-20 21:19                                                         ` Oleg Nesterov
2011-02-20 21:20                                                       ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Jan Kratochvil
2011-02-21 14:23                                                         ` Oleg Nesterov
2011-02-23 16:44                                                           ` Jan Kratochvil
2011-02-14 15:31                                   ` Oleg Nesterov
2011-02-14 17:24                                     ` Denys Vlasenko
2011-02-14 17:39                                       ` Oleg Nesterov
2011-02-14 17:57                                         ` Denys Vlasenko
2011-02-14 18:00                                           ` Oleg Nesterov
2011-02-14 18:06                                             ` Oleg Nesterov
2011-02-14 18:59                                         ` Denys Vlasenko
2011-02-13 21:24                 ` Denys Vlasenko
2011-02-14 15:06                   ` Oleg Nesterov
2011-02-14 15:19                     ` Tejun Heo
2011-02-14 16:20                       ` Oleg Nesterov
2011-02-14 17:05                     ` Denys Vlasenko
2011-02-14 17:18                       ` Oleg Nesterov
2011-01-28 16:54 ` [PATCHSET] ptrace,signal: group stop / ptrace updates Ingo Molnar
2011-01-28 17:41   ` Thomas Gleixner
2011-01-28 18:04     ` Anca Emanuel
2011-01-28 18:36       ` Mathieu Desnoyers
2011-01-28 17:55   ` Oleg Nesterov
2011-01-28 18:29     ` Bash not reacting to Ctrl-C Ingo Molnar
2011-02-05 20:34       ` Oleg Nesterov
2011-02-07 13:08         ` Oleg Nesterov
2011-02-09  6:17           ` Michael Witten
2011-02-09 14:53             ` Ingo Molnar
2011-02-09 19:37               ` Michael Witten
2011-02-11 14:41           ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110209212526.GA9999@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=jan.kratochvil@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=roland@redhat.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox