All of lore.kernel.org
 help / color / mirror / Atom feed
From: Oleg Nesterov <oleg@redhat.com>
To: Tejun Heo <tj@kernel.org>
Cc: Roland McGrath <roland@redhat.com>,
	jan.kratochvil@redhat.com, linux-kernel@vger.kernel.org,
	torvalds@linux-foundation.org, akpm@linux-foundation.org
Subject: Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH
Date: Wed, 9 Feb 2011 22:25:26 +0100	[thread overview]
Message-ID: <20110209212526.GA9999@redhat.com> (raw)
In-Reply-To: <20110209141803.GH3770@htj.dyndns.org>

On 02/09, Tejun Heo wrote:
>
> We can make it behave like the following.  { | } denotes two
> alternative behaviors regarding SIGCONT.
>
>   If a group stop is initiated while, or was in progress when a task
>   is ptraced, the task will stop for group stop and notify the ptracer
>   accordingly.  Note that the task could be trapped elsewhere delaying
>   this from happening.  When the task stops for group stop, it
>   participates in group stop as if it is not ptraced and the real
>   parent is notified of group stop completion.

OK,

>   Note that { the task is put into TASK_TRACED state and group stop
>   resume by SIGCONT is ignored. | the task is put into TASK_STOPPED
>   state and the following PTRACE request will transition it into
>   TASK_TRACED.  If SIGCONT is received before transition to
>   TASK_TRACED is made, the task will resume execution.  If PTRACE
>   request faces with SIGCONT, PTRACE request may fail. }

To me, the first variant looks better. But, only because it is closer
to the current behaviour. I mean, it is better to change the things
incrementally.

But in the longer term - I do not know. Personally, I like the
TASK_STOPPED variant. To the point, I was thinking that (perhaps)
we can change ptrace_stop() so that it simply calls do_signal_stop()
if it notices ->group_stop_count != 0.

>   The ptracer may resume execution of the task using PTRACE_CONT
>   without affecting other tasks in the group.

And this is what I do not like. I just can't accept the fact there
is a running thread in the SIGNAL_STOP_STOPPED group.

But yes: this is what the current code does, I am not sure we can
change this, and both PTRACE_CONT-doesnt-resume-until-SIGCONT and
PTRACE_CONT-acts-as-SIGCONT are not "perfect" too.

>   On ptrace detach, if group stop is in effect, the task will be put
>   into TASK_STOPPED state and if it is the first time the task is
>   stopping for the current group stop, it will participate in group
>   stop completion.

Yes. (but depends on above).

> This can be phrased better but it seems well defined enough for me.  I
> take it that one of your concerns is direct transition into
> TASK_TRACED on group stop while ptraced which prevents the tracee from
> reacting to the following SIGCONT.

Yes,

> I'm not sure how much of an actual
> problem it is given that our notification to real parent hasn't worked
> at all till now

Yes! and this is very good argument in favour of all your objections ;)

Yes, this doesn't work anyway. But I _think_ this is the bug, if
we are going to change this code we should fix this bug as well.

Again, again, this is very subjective, I agree.

> but we can definitely implement proper TASK_STOPPED ->
> TRACED transition on the next PTRACE request.

I guess, you mean that the current code bypasses the
ptrace_stop()->arch_ptrace_stop_needed() code while doing s/STOPPED/TRACED ?

Oh, currently I am ignoring this, my only concern is how this all
looks to the userland. But this is the good point, and I have to admit
that I never realized this is just wrong. Yes, I agree, we should do
something, but this is not visible to user-space (except this should
fix the bug ;)

> There exists a
> fundamental race condition between SIGCONT and the next PTRACE call

Yes, and this race is already here, ptracer should take care.

> If we don't go that route, another solution would be to add a ptrace
> call which can listen to SIGCONT.  ie. PTRACE_WAIT_CONT or whatever
> which the ptracer can call once it knows the tracee entered group
> stop.

Perhaps... Or something else, but surely there is a room for improvements.
Fortunately, the changes like this are "safe". I mean, they can
break nothing. Just we should try to not make them wrong ;)

> In either case, the fundamentals of ptrace operation don't really
> change.  All ptrace operations are still per-task and ptracer almost
> always has control over execution of the tracee.  Sure, it allows
> ptraced task to escape group stop but it seems defined clear enough
> and IMHO actually is a helpful debugging feature.

Heh, I think we found the place where we can't convince each other.
What if we toss a coin?

> After all, it's not
> like stop signals can be used for absoultely reliable job control.
> There's an inherent race against SIGCONT.

Sure, if we are talking about SIGCONT from "nowhere". But, the same
way ^Z is not reliable too.

> > > What do you do about PTRACE requests while a task is group stopped?
> > > Reject them?  Block them?
> >
> > Yes, another known oddity. Of course we shouldn't reject or block.
> > Perhaps we can ignore this initially. If SIGCONT comes after another
> > request does STOPPED/TRACED it clears SIGNAL_STOP_STOPPED, but the
> > tracee won't run until the next PTRACE_CONT, this makes sense.
>
> That conceptually might make sense

I only meant, this makes sense initially.

> but other than the conceptual
> integrity it widely changes the assumptions and is less useful than
> the current behavior.

Hmm, this is what we currently have?

> I don't really see why we would want to do
> that.

No, I think we do not really want this in the longer term. But I
can't say what exactly we want.

> > Only if it attaches to every thread in the thread group. Otherwise,
> > if the non-thread has already initiated the group-stop, the tracee
> > will notice TIF_SIGPENDING eventually and call do_signal_stop(),
> > debugger can't control this.
>
> The debugger is still notified and can override it.

Hmm... no, it can't? Of course it is notified after the tracee
participates and calls do_signal_stop() and gdb can resume it then.
But it can't prevent the tracee from stopping.

> > > I don't think it's an extreme corner case
> > > we can break.  For example, if a user gdb's a program which raises one
> > > of the stop signals, currently the user expects to be able to continue
> > > the program from whithin the gdb.  If we make group stop override
> > > ptrace, there's no other recourse than sending signal from outside.
> >
> > Yes. Of course, gdb can be "fixed", it can send SIGCONT.
> >
> > But yes, this is the noticeable change, that is why I suggested
> > ptrace_resume-acts-as-SIGCONT logic. Ugly, yes, but more or less
> > compatible. (although let me repeat, _pesonally_ I'd prefer to
> > simply tell user-space to learn the new rules ;)
>
> I can't really agree there.  First, to me, it seems like too radical a
> change

(I assume you mean PTRACE_CONT-doesnt-resume variant?)

> and secondly the resulting behavior might look conceptually
> pleasing but is not as useful as the current one.  Why make a change
> which results in reduced usefulness while noticeably breaking existing
> users?

I don't really agree with "not as useful", but this doesn't matter.
I agree with "noticeably breaking", this is enough. (assuming my
guess above is correct).

> > Given that SIGCHLD doesn't queue and with or without your changes
> > we send it per-thread, it is not trivial for gdb to detect the
> > group-stop anyway. Again, the kernel should help somehow.
>
> Hmmm?  Isn't this discoverable from the exit code from wait?

Sure. Probably I misunderstood. I thought, you mean we need something
like per-process "the whole group is stopped" notification for the
debugger.

> > > and I'm not really sure whether that's something worth achieving
> > > at the cost of debugging capabilities especially when we don't _have_
> > > to lose them.
> >
> > But we do not? I mean, at least this is not worse than the current
> > behaviour.
>
> I think it's worse.  With your changes, debuggers can't diddle the
> tasks behind group stop's back which the current users already expect.

OK, I certainly misunderstood you, and now I can't restore the context.
Could you spell?

> > > I agree it adds more integrity to group stop but at the cost of
> > > debugging capabilities.  I'm not yet convinced integrity of group stop
> > > is that important.  Why is it such a big deal?
> >
> > Of course I can't "prove" it is that important. But I think so.
>
> Heh, I'm not asking for proof that it is more useful. :-) But I'm still
> curious why you think it's important because the benefits aren't
> apparent to me.  Roland and you seem to share this opinion without
> much dicussion so maybe I'm missing something?

I can't!

I hate this from the time when I noticed that the application doesn't
respond to ^Z under strace. And I used strace exactly because I wanted
do debug some (I can't recall exactly) problems with jctl. That is all.

But in any case. Some users run the services under ptrace. I mean,
the application borns/runs/dies under ptrace. That is why personally
I certainly do not like anything which delays until detach (say,
the-tracee-doesnt-participate-in-group-stop-until-detach logic).

> > > CLD_STOPPED is too but while ptrace is attached the notifications are
> > > made per-task and delivered to the tracer.
> >
> > No, there is a difference. Sure, CLD_STOPPED is per-process without
> > ptrace. But CLD_CONTINUED continues to be per-process even if all
> > threads are traced.
>
> Hmm... I need to think more about it.  I'm not fully following your
> point.

This is simple. No matter how many threads we have, no matter how
many of them are ptraced, we send a single CLD_CONTINUED notification.
The only difference ptrace can make is: we look at ->group_leader
to decide who will get this notification.

> > > I think this is the key question.  Whether to de-throne
> > > PTRACE_CONT such that it cannot override group stop.  As I've already
> > > said several times already, I think it is a pretty fundamental
> > > property of ptrace
> >
> > Again, I am a bit confused. Note that PTRACE_CONT overrides
> > group stop if we do the above. It should wake up the tracee, in
> > SIGCONT-compatible way (yes, the latter is not exactly clear).
>
> What do you mean?  Waking up in SIGCONT-compatible way?  Sending
> SIGCONT ending the whole group stop?

Yes. I do not mean we should literally do send_sig_info(SIGCONT)
of course.

> > But at least this should be visible to real parent. We shouldn't
> > silently make the stopped tracee running while its real_parent
> > thinks everything is stopped.
>
> I think maybe this is where our different POVs come from.

Yes, probably.

> To me, it
> isn't too objectionable to allow debuggers to diddle with tracees
> behind the real parent's back.  In fact, it would be quite useful when
> debugging job control related behaviors.  I wouldn't have much problem
> accepting the other way around - ie. strict job control even while
> being debugged, but given that it is already allowed and visible, I
> fail to see why we should change the behavior.  It doesn't seem to
> have enough benefits to warrant such visible change.

All I can say is: sure, I see your point, and perhaps you are right
and I am wrong.

I'd really like to force CC list to participate ;)

> If I change the patchset such that group stop while ptraced first
> enters TASK_STOPPED and then transitions into TASK_TRACED on the next
> PTRACE call,

Again, I am not sure I understand what exactly you mean... If you
mean that it is wrong to simply change the state of the tracee in
ptrace_check_attach() without arch_ptrace_stop() - I agree, this
probably should be fixed.

I am wondering, if there is a simpler change... probably not.

But. this looks a bit off-topic (I mean, this looks orthogonal
to the other things we are discussing), or I missed something else?

> there will be race window which would be visible

Personally, I think this is fine.

Oleg.


  parent reply	other threads:[~2011-02-09 21:34 UTC|newest]

Thread overview: 160+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2011-01-28 15:08 [PATCHSET] ptrace,signal: group stop / ptrace updates Tejun Heo
2011-01-28 15:08 ` [PATCH 01/10] signal: fix SIGCONT notification code Tejun Heo
2011-01-28 15:08 ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Tejun Heo
2011-01-28 18:46   ` Roland McGrath
2011-01-31 10:38     ` Tejun Heo
2011-02-01 10:26       ` [PATCH] ptrace: use safer wake up on ptrace_detach() Tejun Heo
2011-02-01 13:40         ` Oleg Nesterov
2011-02-01 15:07           ` Tejun Heo
2011-02-01 19:17             ` Oleg Nesterov
2011-02-02  5:31             ` Roland McGrath
2011-02-02 10:35               ` Tejun Heo
2011-02-02  0:27         ` Andrew Morton
2011-02-02  5:33           ` Roland McGrath
2011-02-02  5:38             ` Andrew Morton
2011-02-02 10:34               ` Tejun Heo
2011-02-02 19:33                 ` Andrew Morton
2011-02-02 20:01                   ` Tejun Heo
2011-02-02 21:40             ` Oleg Nesterov
2011-02-02  5:29         ` Roland McGrath
2011-02-02  5:28       ` [PATCH 02/10] ptrace: remove the extra wake_up_process() from ptrace_detach() Roland McGrath
2011-01-28 15:08 ` [PATCH 03/10] signal: remove superflous try_to_freeze() loop in do_signal_stop() Tejun Heo
2011-01-28 18:46   ` Roland McGrath
2011-01-28 15:08 ` [PATCH 04/10] ptrace: kill tracehook_notify_jctl() Tejun Heo
2011-01-28 21:09   ` Roland McGrath
2011-01-28 15:08 ` [PATCH 05/10] ptrace: add @why to ptrace_stop() Tejun Heo
2011-01-28 18:48   ` Roland McGrath
2011-01-28 15:08 ` [PATCH 06/10] signal: fix premature completion of group stop when interfered by ptrace Tejun Heo
2011-01-28 21:22   ` Roland McGrath
2011-01-31 11:00     ` Tejun Heo
2011-02-02  5:44       ` Roland McGrath
2011-02-02 10:56         ` Tejun Heo
2011-01-28 15:08 ` [PATCH 07/10] signal: use GROUP_STOP_PENDING to stop once for a single group stop Tejun Heo
2011-01-28 15:08 ` [PATCH 08/10] ptrace: participate in group stop from ptrace_stop() iff the task is trapping for " Tejun Heo
2011-01-28 21:30   ` Roland McGrath
2011-01-31 11:26     ` Tejun Heo
2011-02-02  5:57       ` Roland McGrath
2011-02-02 10:53         ` Tejun Heo
2011-02-03 10:02           ` Tejun Heo
2011-02-01 19:36     ` Oleg Nesterov
2011-01-28 15:08 ` [PATCH 09/10] ptrace: make do_signal_stop() use ptrace_stop() if the task is being ptraced Tejun Heo
2011-01-28 15:08 ` [PATCH 10/10] ptrace: clean transitions between TASK_STOPPED and TRACED Tejun Heo
2011-02-03 20:41   ` [PATCH 0/1] (Was: ptrace: clean transitions between TASK_STOPPED and TRACED) Oleg Nesterov
2011-02-03 20:41     ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Oleg Nesterov
2011-02-03 21:36       ` Roland McGrath
2011-02-03 21:44         ` Oleg Nesterov
2011-02-04 10:53           ` Tejun Heo
2011-02-04 13:04             ` Oleg Nesterov
2011-02-04 14:48               ` Tejun Heo
2011-02-04 17:06                 ` Oleg Nesterov
2011-02-05 13:39                   ` Tejun Heo
2011-02-07 13:42                     ` Oleg Nesterov
2011-02-07 14:11                       ` Tejun Heo
2011-02-07 15:37                         ` Oleg Nesterov
2011-02-07 16:31                           ` Tejun Heo
2011-02-07 17:48                             ` Oleg Nesterov
2011-02-09 14:18                               ` Tejun Heo
2011-02-09 14:21                                 ` Tejun Heo
2011-02-09 21:25                                 ` Oleg Nesterov [this message]
2011-02-13 23:01                                   ` Denys Vlasenko
2011-02-14  9:03                                     ` Jan Kratochvil
2011-02-14 11:39                                       ` Denys Vlasenko
2011-02-14 17:32                                         ` Oleg Nesterov
2011-02-14 16:01                                       ` Oleg Nesterov
2011-02-26  3:59                                       ` Pavel Machek
2011-02-14 15:51                                     ` Oleg Nesterov
2011-02-14 14:50                                   ` Tejun Heo
2011-02-14 18:53                                     ` Oleg Nesterov
2011-02-13 22:25                                 ` Denys Vlasenko
2011-02-14 15:13                                   ` Tejun Heo
2011-02-14 16:15                                     ` Oleg Nesterov
2011-02-14 16:33                                       ` Tejun Heo
2011-02-14 17:23                                         ` Oleg Nesterov
2011-02-14 17:20                                     ` Denys Vlasenko
2011-02-14 17:30                                       ` Tejun Heo
2011-02-14 17:45                                         ` Oleg Nesterov
2011-02-14 17:54                                         ` Denys Vlasenko
2011-02-21 15:16                                           ` Tejun Heo
2011-02-21 15:28                                             ` Oleg Nesterov
2011-02-21 16:11                                               ` [pseudo patch] ptrace should respect the group stop Oleg Nesterov
2011-02-22 16:24                                               ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Tejun Heo
2011-02-24 21:08                                                 ` Oleg Nesterov
2011-02-25 15:45                                                   ` Tejun Heo
2011-02-25 17:42                                                     ` Roland McGrath
2011-02-28 15:23                                                     ` Oleg Nesterov
2011-02-14 17:51                                       ` Oleg Nesterov
2011-02-14 18:55                                         ` Denys Vlasenko
2011-02-14 19:01                                           ` Oleg Nesterov
2011-02-14 19:42                                             ` Denys Vlasenko
2011-02-14 20:01                                               ` Oleg Nesterov
2011-02-15 15:24                                                 ` Tejun Heo
2011-02-15 15:58                                                   ` Oleg Nesterov
2011-02-15 17:31                                                   ` Roland McGrath
2011-02-15 20:27                                                     ` Oleg Nesterov
2011-02-18 17:02                                                       ` Tejun Heo
2011-02-18 19:37                                                         ` Oleg Nesterov
2011-02-21 16:22                                                           ` Tejun Heo
2011-02-21 16:49                                                             ` Oleg Nesterov
2011-02-21 16:59                                                               ` Tejun Heo
2011-02-23 19:31                                                                 ` Oleg Nesterov
2011-02-25 15:10                                                                   ` Tejun Heo
2011-02-24 20:29                                                             ` Oleg Nesterov
2011-02-25 15:51                                                               ` Tejun Heo
2011-02-26  2:48                                                                 ` Denys Vlasenko
2011-02-28 12:56                                                                   ` Tejun Heo
2011-02-28 13:16                                                                     ` Denys Vlasenko
2011-02-28 13:29                                                                       ` Tejun Heo
2011-02-28 13:41                                                                         ` Denys Vlasenko
2011-02-28 13:53                                                                           ` Tejun Heo
2011-02-28 14:25                                                                             ` Denys Vlasenko
2011-02-28 14:39                                                                               ` Tejun Heo
2011-02-28 16:48                                                                                 ` Oleg Nesterov
2011-02-28 14:36                                                                   ` Oleg Nesterov
2011-02-16 21:51                                       ` Jan Kratochvil
2011-02-17  3:37                                         ` Denys Vlasenko
2011-02-17 19:19                                           ` Oleg Nesterov
2011-02-18 21:11                                             ` Jan Kratochvil
2011-02-19 20:16                                               ` Oleg Nesterov
2011-02-17 16:49                                         ` Oleg Nesterov
2011-02-17 18:58                                           ` Roland McGrath
2011-02-17 19:33                                             ` Oleg Nesterov
2011-02-18 21:34                                           ` Jan Kratochvil
2011-02-19 20:06                                             ` Oleg Nesterov
2011-02-20  9:40                                               ` Jan Kratochvil
2011-02-20 17:06                                                 ` Denys Vlasenko
2011-02-20 17:48                                                   ` Oleg Nesterov
2011-02-20 19:10                                                   ` Jan Kratochvil
2011-02-20 19:16                                                     ` Oleg Nesterov
2011-02-20 17:16                                                 ` Oleg Nesterov
2011-02-20 18:52                                                   ` Jan Kratochvil
2011-02-20 20:38                                                     ` Oleg Nesterov
2011-02-20 21:06                                                       ` `(T) stopped' preservation after _exit() [Re: [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH] Jan Kratochvil
2011-02-20 21:19                                                         ` Oleg Nesterov
2011-02-20 21:20                                                       ` [PATCH 1/1] ptrace: make sure do_wait() won't hang after PTRACE_ATTACH Jan Kratochvil
2011-02-21 14:23                                                         ` Oleg Nesterov
2011-02-23 16:44                                                           ` Jan Kratochvil
2011-02-14 15:31                                   ` Oleg Nesterov
2011-02-14 17:24                                     ` Denys Vlasenko
2011-02-14 17:39                                       ` Oleg Nesterov
2011-02-14 17:57                                         ` Denys Vlasenko
2011-02-14 18:00                                           ` Oleg Nesterov
2011-02-14 18:06                                             ` Oleg Nesterov
2011-02-14 18:59                                         ` Denys Vlasenko
2011-02-13 21:24                 ` Denys Vlasenko
2011-02-14 15:06                   ` Oleg Nesterov
2011-02-14 15:19                     ` Tejun Heo
2011-02-14 16:20                       ` Oleg Nesterov
2011-02-14 17:05                     ` Denys Vlasenko
2011-02-14 17:18                       ` Oleg Nesterov
2011-01-28 16:54 ` [PATCHSET] ptrace,signal: group stop / ptrace updates Ingo Molnar
2011-01-28 17:41   ` Thomas Gleixner
2011-01-28 18:04     ` Anca Emanuel
2011-01-28 18:36       ` Mathieu Desnoyers
2011-01-28 17:55   ` Oleg Nesterov
2011-01-28 18:29     ` Bash not reacting to Ctrl-C Ingo Molnar
2011-02-05 20:34       ` Oleg Nesterov
2011-02-07 13:08         ` Oleg Nesterov
2011-02-09  6:17           ` Michael Witten
2011-02-09 14:53             ` Ingo Molnar
2011-02-09 19:37               ` Michael Witten
2011-02-11 14:41           ` Pavel Machek

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20110209212526.GA9999@redhat.com \
    --to=oleg@redhat.com \
    --cc=akpm@linux-foundation.org \
    --cc=jan.kratochvil@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=roland@redhat.com \
    --cc=tj@kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.