All of lore.kernel.org
 help / color / mirror / Atom feed
From: Pedro Alves <palves@redhat.com>
To: Oleg Nesterov <oleg@redhat.com>, Tejun Heo <tj@kernel.org>
Cc: Jan Kratochvil <jan.kratochvil@redhat.com>,
	Andrey Ryabinin <aryabinin@virtuozzo.com>,
	Roland McGrath <roland@hack.frob.com>,
	LKML <linux-kernel@vger.kernel.org>
Subject: Re: ptrace() hangs on attempt to seize/attach stopped & frozen task
Date: Thu, 19 Nov 2015 16:49:51 +0000	[thread overview]
Message-ID: <564DFDAF.3000402@redhat.com> (raw)
In-Reply-To: <20151117193419.GA9993@redhat.com>

On 11/17/2015 07:34 PM, Oleg Nesterov wrote:
> On 11/16, Tejun Heo wrote:

>>> And perhaps we can simply remove this logic? I forgot why do we hide this
>>> STOPPED -> RUNNING -> TRACED transition from the attaching thread. But the
>>> vague feeling tells me that we discussed this before and perhaps it was me
>>> who suggested to avoid the user-visible change when you introduced this
>>> transition...
>>
>> Heh, it was too long ago for me to remember much. :)
> 
> Same here...
> 
>>> Anyway, now I do not understand why do we want to hide it. Lets consider
>>> the following "test-case",
>>>
>>> 	void test(int pid)
>>> 	{
>>> 		kill(pid, SIGSTOP);
>>> 		waitpid(pid, NULL, WSTOPPED);
>>>
>>> 		ptrace(PTRACE_ATTACH-OR-PTRACE_SEIZE, pid, 0,0);
>>>
>>> 		assert(ptrace(PTRACE_DETACH, pid, 0,0) == 0);
>>> 	}
>>>
>>> Yes, it will fail if we remove JOBCTL_TRAPPING. But it can equally fail
>>> if SIGCONT comes before ATTACH, so perhaps we do not really care?
>>>
>>> Jan, Pedro, do you think the patch below can break gdb somehow? With this
>>> patch you can never assume that waitpid(WNOHANG) or ptrace(WHATEVER) will
>>> succeed right after PTRACE_ATTACH/PTRACE_SEIZE, even if you know that the
>>> tracee was TASK_STOPPED before attach.

Not sure, because I don't think I fully understand that proposed change.

Both GDB and gdbserver have special processing for attaching to already-stopped
processes.  (and neither use PTRACE_SEIZE yet.)

Here's the gdbserver version:

 https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/gdbserver/linux-low.c;h=41ab510fa4ac5654f101f08efb68e26b5bc5dbd7;hb=HEAD#l903

Copied here for convenience:

 907 linux_attach_lwp (ptid_t ptid)
 908 {
 909   struct lwp_info *new_lwp;
 910   int lwpid = ptid_get_lwp (ptid);
 911
 912   if (ptrace (PTRACE_ATTACH, lwpid, (PTRACE_TYPE_ARG3) 0, (PTRACE_TYPE_ARG4) 0)
 913       != 0)
 914     return errno;
 915
 916   new_lwp = add_lwp (ptid);
 917
 918   /* We need to wait for SIGSTOP before being able to make the next
 919      ptrace call on this LWP.  */
 920   new_lwp->must_set_ptrace_flags = 1;
 921
 922   if (linux_proc_pid_is_stopped (lwpid))
 923     {
 924       if (debug_threads)
 925         debug_printf ("Attached to a stopped process\n");
 926
 927       /* The process is definitely stopped.  It is in a job control
 928          stop, unless the kernel predates the TASK_STOPPED /
 929          TASK_TRACED distinction, in which case it might be in a
 930          ptrace stop.  Make sure it is in a ptrace stop; from there we
 931          can kill it, signal it, et cetera.
 932
 933          First make sure there is a pending SIGSTOP.  Since we are
 934          already attached, the process can not transition from stopped
 935          to running without a PTRACE_CONT; so we know this signal will
 936          go into the queue.  The SIGSTOP generated by PTRACE_ATTACH is
 937          probably already in the queue (unless this kernel is old
 938          enough to use TASK_STOPPED for ptrace stops); but since
 939          SIGSTOP is not an RT signal, it can only be queued once.  */
 940       kill_lwp (lwpid, SIGSTOP);
 941
 942       /* Finally, resume the stopped process.  This will deliver the
 943          SIGSTOP (or a higher priority signal, just like normal
 944          PTRACE_ATTACH), which we'll catch later on.  */
 945       ptrace (PTRACE_CONT, lwpid, (PTRACE_TYPE_ARG3) 0, (PTRACE_TYPE_ARG4) 0);
 946     }
 947
 948   /* The next time we wait for this LWP we'll see a SIGSTOP as PTRACE_ATTACH
 949      brings it to a halt.
 950

linux_proc_pid_is_stopped checks whether the state in /proc/pid/status is "T (stopped)".

Here's the equivalent in gdb:

  https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/linux-nat.c;h=841ec3949c37438dfba924d8db6b37ffc416dd29;hb=HEAD#l974

This queuing of a SIGSTOP + PTRACE_CONT was necessary because
otherwise when gdb attaches to a job stopped process, gdb would hang in the waitpid
after PTRACE_ATTACH, waiting for the initial SIGSTOP which would never arrive.

If the proposed change makes it so that a new intermediate state can be observed
right after PTRACE_ATTACH, and so linux_proc_pid_is_stopped can return false,
then there's potential for breakage.  But maybe not, if we're sure that
that when that happens, waitpid returns for the initial
PTRACE_ATTACH-induced SIGSTOP.

Thanks,
Pedro Alves


  parent reply	other threads:[~2015-11-19 16:49 UTC|newest]

Thread overview: 16+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-11-09 15:12 ptrace() hangs on attempt to seize/attach stopped & frozen task Andrey Ryabinin
2015-11-09 18:55 ` Oleg Nesterov
2015-11-09 18:02   ` Tejun Heo
2015-11-10 20:20     ` Oleg Nesterov
2015-11-16 18:45       ` Tejun Heo
2015-11-17 19:34         ` Oleg Nesterov
2015-11-17 18:57           ` Tejun Heo
2015-11-19 16:49           ` Pedro Alves [this message]
2015-11-19 17:47             ` Oleg Nesterov
2015-11-19 18:08               ` Pedro Alves
2015-11-10 20:20   ` Oleg Nesterov
2015-11-19 18:47 ` [PATCH 0/2] (Was: ptrace() hangs on attempt to seize/attach stopped & frozen task) Oleg Nesterov
2015-11-19 18:47   ` [PATCH 1/2] ptrace: make wait_on_bit(JOBCTL_TRAPPING_BIT) in ptrace_attach() killable Oleg Nesterov
2015-11-23 23:05     ` Tejun Heo
2015-11-19 18:47   ` [PATCH 2/2] ptrace: task_stopped_code(ptrace => true) can't see TASK_STOPPED task Oleg Nesterov
2015-11-23 23:15     ` Tejun Heo

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=564DFDAF.3000402@redhat.com \
    --to=palves@redhat.com \
    --cc=aryabinin@virtuozzo.com \
    --cc=jan.kratochvil@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleg@redhat.com \
    --cc=roland@hack.frob.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.