From mboxrd@z Thu Jan  1 00:00:00 1970
Return-Path: <linux-kernel-owner@vger.kernel.org>
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1759009AbbKSRrJ (ORCPT <rfc822;w@1wt.eu>);
	Thu, 19 Nov 2015 12:47:09 -0500
Received: from mx1.redhat.com ([209.132.183.28]:51469 "EHLO mx1.redhat.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752755AbbKSRrH (ORCPT <rfc822;linux-kernel@vger.kernel.org>);
	Thu, 19 Nov 2015 12:47:07 -0500
Date: Thu, 19 Nov 2015 18:47:54 +0100
From: Oleg Nesterov <oleg@redhat.com>
To: Pedro Alves <palves@redhat.com>
Cc: Tejun Heo <tj@kernel.org>, Jan Kratochvil <jan.kratochvil@redhat.com>,
        Andrey Ryabinin <aryabinin@virtuozzo.com>,
        Roland McGrath <roland@hack.frob.com>,
        LKML <linux-kernel@vger.kernel.org>
Subject: Re: ptrace() hangs on attempt to seize/attach stopped & frozen task
Message-ID: <20151119174754.GA13949@redhat.com>
References: <5640B7F2.70406@virtuozzo.com> <20151109185506.GA22744@redhat.com> <20151109180207.GA28507@mtj.duckdns.org> <20151110202017.GA2976@redhat.com> <20151116184516.GJ18894@mtj.duckdns.org> <20151117193419.GA9993@redhat.com> <564DFDAF.3000402@redhat.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <564DFDAF.3000402@redhat.com>
User-Agent: Mutt/1.5.18 (2008-05-17)
Sender: linux-kernel-owner@vger.kernel.org
List-ID: <linux-kernel.vger.kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

Thanks Pedro for your email,

I'll recheck tomorrow, but at first glance:

On 11/19, Pedro Alves wrote:
>
> Both GDB and gdbserver have special processing for attaching to already-stopped
> processes.

Yes, I am starting to recall that I have looked at this code years ago ;)

> linux_attach_lwp (ptid_t ptid)
> {
>   struct lwp_info *new_lwp;
>   int lwpid = ptid_get_lwp (ptid);
>
>   if (ptrace (PTRACE_ATTACH, lwpid, (PTRACE_TYPE_ARG3) 0, (PTRACE_TYPE_ARG4) 0)
>       != 0)
>     return errno;
>
>   new_lwp = add_lwp (ptid);
>
>   /* We need to wait for SIGSTOP before being able to make the next
>      ptrace call on this LWP.  */
>   new_lwp->must_set_ptrace_flags = 1;
>
>   if (linux_proc_pid_is_stopped (lwpid))

This can't happen today, at least since v3.0.

> This queuing of a SIGSTOP + PTRACE_CONT was necessary because
> otherwise when gdb attaches to a job stopped process, gdb would hang in the waitpid
> after PTRACE_ATTACH, waiting for the initial SIGSTOP which would never arrive.

Yes, because its exit code could already have been cleared, iirc. This was
fixed even earlier.

> If the proposed change makes it so that a new intermediate state can be observed
> right after PTRACE_ATTACH, and so linux_proc_pid_is_stopped can return false,
> then there's potential for breakage.

See above,

> But maybe not, if we're sure that
> when that happens, waitpid returns for the initial
> PTRACE_ATTACH-induced SIGSTOP.

Yes. You just can't assume that waitpid(WNOHANG) will succeed. Is that OK?
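
To illustrate the WNOHANG point, here is a rough userspace sketch, not the
actual gdb/gdbserver code. The helper names proc_pid_state() and
attach_and_wait_stop() are made up for this example. It assumes the tracer
is allowed to attach to the target and that blocking in waitpid() after
PTRACE_ATTACH is acceptable:

```c
#include <errno.h>
#include <stdio.h>
#include <sys/ptrace.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: return the state letter from the "State:" line of
   /proc/PID/status ('R', 'S', 'T', 't', ...), or 0 if it cannot be read.  */
char proc_pid_state(pid_t pid)
{
    char path[64], line[256];
    char state = 0;
    FILE *f;

    snprintf(path, sizeof path, "/proc/%d/status", (int)pid);
    f = fopen(path, "r");
    if (!f)
        return 0;
    while (fgets(line, sizeof line, f)) {
        if (sscanf(line, "State: %c", &state) == 1)
            break;
    }
    fclose(f);
    return state;
}

/* Hypothetical helper: attach to PID and block until waitpid() reports it.
   Returns 0 on success or an errno value on failure.  Deliberately no
   WNOHANG: the stop may not be reportable at the instant PTRACE_ATTACH
   returns, so a non-blocking wait can legitimately return 0.  */
int attach_and_wait_stop(pid_t pid, int *status)
{
    if (ptrace(PTRACE_ATTACH, pid, (void *)0, (void *)0) != 0)
        return errno;

    for (;;) {
        pid_t r = waitpid(pid, status, __WALL);

        if (r == pid)
            return 0;   /* caller should still check WIFSTOPPED(*status) */
        if (r < 0 && errno != EINTR)
            return errno;
    }
}
```

A caller would check WIFSTOPPED() and WSTOPSIG() on the returned status
rather than rely on the /proc state sampled right after the attach.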

Oleg.