* Re: Mutual debugging of 2 processes can stuck in unkillable stopped state
  [not found] <f2f32ffa-52ad-ff67-19d8-95305a70a6f8@omprussia.ru>
From: Oleg Nesterov @ 2021-03-29 16:49 UTC
To: Igor Zhbanov; +Cc: linux-trace-devel, linux-kernel

On 03/29, Igor Zhbanov wrote:
>
> Mutual debugging of 2 processes can stuck in unkillable stopped state

I can't reproduce this, and I don't understand it...

> Hi!
>
> When one process, let's say "A", is tracing another process "B", and
> process "B" is trying to attach to process "A", then both of them
> get stuck in the "t+" state. And they ignore all signals,
> including SIGKILL,

Why do you think so? What is your kernel version?

"t" means TASK_TRACED; SIGKILL should wake the task up and terminate it.

> so it is not possible to terminate them without
> a reboot.
>
> To reproduce:
> 1) Run two terminals
> 2) Attach with "strace -p ..." from the first terminal to the shell (bash) of
>    the second terminal.
> 3) In the second terminal run "exec strace -p ..." to attach to the PID of the
>    first strace.
>
> Then you'll see that the second strace hangs without any output. And the
> first strace will output the following and hang too:
> ptrace(PTRACE_SEIZE, 11795, NULL,
>        PTRACE_O_TRACESYSGOOD|PTRACE_O_TRACEEXEC|PTRACE_O_TRACEEXIT
>
> (The 11795 is the PID of the first strace itself.)
>
> And in the process list you will see the following:
> ps awux | grep strace
> user 11776 0.0 0.0 24752 2248 pts/3 t+ 13:53 0:00 strace -p 11795
> user 11795 0.0 0.0 24752 3888 pts/1 t+ 13:54 0:00 strace -p 11776

OK, maybe they sleep in PTRACE_EVENT_EXIT? After you tried to send SIGKILL?

Please show us the output from "cat /proc/{11795,11776}/stack". And
"cat /proc/{11795,11776}/status", just in case.

Oleg.

* Re: Mutual debugging of 2 processes can stuck in unkillable stopped state
From: Igor Zhbanov @ 2021-03-29 17:01 UTC
To: Oleg Nesterov; +Cc: linux-trace-devel, linux-kernel

Hi Oleg!

I've tried both 5.3.18 and 5.10.0. The behavior is the same.
The important thing is to run "exec strace -p ..." on the second terminal
to create the loop A->B->A.

So the last line we see from the first strace is:
ptrace(PTRACE_SEIZE, 1990, NULL, PTRACE_O_TRACESYSGOOD|PTRACE_O_TRACEEXEC|PTRACE_O_TRACEEXIT

I.e. it printed the syscall prior to its execution and hung after the
execution.

izh@suse2:~> ps awux|grep strace
izh 1891 0.0 0.0 24752 3828 pts/1 ts+ 19:52 0:00 strace -p 1990
izh 1990 0.0 0.0 24752 3628 pts/0 t+ 19:53 0:00 strace -p 1891

izh@suse2:~> kill 1990 1891
izh@suse2:~> kill -9 1990 1891

izh@suse2:~> sudo cat /proc/1891/stack
[sudo] password for root:
[<0>] ptrace_stop+0x14a/0x260
[<0>] ptrace_do_notify+0x91/0xb0
[<0>] ptrace_notify+0x4e/0x70
[<0>] do_exit+0x910/0xb70
[<0>] do_group_exit+0x3a/0xa0
[<0>] get_signal+0x124/0x800
[<0>] arch_do_signal_or_restart+0xa9/0x290
[<0>] exit_to_user_mode_prepare+0xe7/0x1a0
[<0>] syscall_exit_to_user_mode+0x18/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

izh@suse2:~> sudo cat /proc/1990/stack
[<0>] ptrace_stop+0x14a/0x260
[<0>] ptrace_do_notify+0x91/0xb0
[<0>] ptrace_notify+0x4e/0x70
[<0>] do_exit+0x910/0xb70
[<0>] do_group_exit+0x3a/0xa0
[<0>] get_signal+0x124/0x800
[<0>] arch_do_signal_or_restart+0xa9/0x290
[<0>] exit_to_user_mode_prepare+0xe7/0x1a0
[<0>] syscall_exit_to_user_mode+0x18/0x40
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

izh@suse2:~> cat /proc/1891/status
Name: strace
Umask: 0022
State: t (tracing stop)
Tgid: 1891
Ngid: 0
Pid: 1891
PPid: 1890
TracerPid: 1990
Uid: 1000 1000 1000 1000
Gid: 100 100 100 100
FDSize: 256
Groups: 100
NStgid: 1891
NSpid: 1891
NSpgid: 1891
NSsid: 1891
VmPeak: 24752 kB
VmSize: 24752 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 3828 kB
VmRSS: 3828 kB
RssAnon: 520 kB
RssFile: 3308 kB
RssShmem: 0 kB
VmData: 284 kB
VmStk: 132 kB
VmExe: 1108 kB
VmLib: 2828 kB
VmPTE: 80 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
THP_enabled: 1
Threads: 1
SigQ: 4/15639
SigPnd: 0000000000000000
ShdPnd: 0000000000014100
SigBlk: 0000000000002000
SigIgn: 0000000000300000
SigCgt: 0000000180007007
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Seccomp_filters: 0
Speculation_Store_Bypass: vulnerable
SpeculationIndirectBranch: always enabled
Cpus_allowed: 7
Cpus_allowed_list: 0-2
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 1561
nonvoluntary_ctxt_switches: 7

izh@suse2:~> cat /proc/1990/status
Name: strace
Umask: 0022
State: t (tracing stop)
Tgid: 1990
Ngid: 0
Pid: 1990
PPid: 1847
TracerPid: 1891
Uid: 1000 1000 1000 1000
Gid: 100 100 100 100
FDSize: 256
Groups: 100
NStgid: 1990
NSpid: 1990
NSpgid: 1990
NSsid: 1847
VmPeak: 24752 kB
VmSize: 24752 kB
VmLck: 0 kB
VmPin: 0 kB
VmHWM: 3628 kB
VmRSS: 3628 kB
RssAnon: 520 kB
RssFile: 3108 kB
RssShmem: 0 kB
VmData: 284 kB
VmStk: 132 kB
VmExe: 1108 kB
VmLib: 2828 kB
VmPTE: 88 kB
VmSwap: 0 kB
HugetlbPages: 0 kB
CoreDumping: 0
THP_enabled: 1
Threads: 1
SigQ: 4/15639
SigPnd: 0000000000000000
ShdPnd: 0000000000014100
SigBlk: 0000000000002000
SigIgn: 0000000000300000
SigCgt: 0000000180007007
CapInh: 0000000000000000
CapPrm: 0000000000000000
CapEff: 0000000000000000
CapBnd: 000001ffffffffff
CapAmb: 0000000000000000
NoNewPrivs: 0
Seccomp: 0
Seccomp_filters: 0
Speculation_Store_Bypass: vulnerable
SpeculationIndirectBranch: always enabled
Cpus_allowed: 7
Cpus_allowed_list: 0-2
Mems_allowed: 00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000000,00000001
Mems_allowed_list: 0
voluntary_ctxt_switches: 180
nonvoluntary_ctxt_switches: 848
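
The A->B->A loop described above is visible in the two status dumps: each process names the other in its TracerPid field. A minimal sketch of that check follows, assuming the text of /proc/<pid>/status has already been read into strings. The helper names `parse_status` and `traces_each_other` are hypothetical, and the sample strings carry only the fields needed here.

```python
def parse_status(text):
    """Parse /proc/<pid>/status-style text into a dict of field -> value."""
    fields = {}
    for line in text.splitlines():
        key, sep, value = line.partition(":")
        if sep:  # skip lines that do not contain "key: value"
            fields[key.strip()] = value.strip()
    return fields


def traces_each_other(status_a, status_b):
    """Return True if each process is the ptrace tracer of the other."""
    a, b = parse_status(status_a), parse_status(status_b)
    return a["TracerPid"] == b["Pid"] and b["TracerPid"] == a["Pid"]


# Abbreviated copies of the two dumps posted in this thread.
SAMPLE_1891 = "Name:\tstrace\nState:\tt (tracing stop)\nPid:\t1891\nTracerPid:\t1990\n"
SAMPLE_1990 = "Name:\tstrace\nState:\tt (tracing stop)\nPid:\t1990\nTracerPid:\t1891\n"

print(traces_each_other(SAMPLE_1891, SAMPLE_1990))
```

On a live system the same strings could be read from /proc/<pid>/status; this sketch only inspects text and attaches to nothing.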

* Re: Mutual debugging of 2 processes can stuck in unkillable stopped state
From: Oleg Nesterov @ 2021-03-29 17:38 UTC
To: Igor Zhbanov; +Cc: linux-trace-devel, linux-kernel

Hi Igor,

So. As expected, they sleep in EVENT_EXIT _after_ you have already
sent SIGKILL.

Oh. I can only repeat that PTRACE_EVENT_EXIT must die ;) Or at least
we should finally define its semantics.

Igor, thanks for your report, but (I think) this has nothing to do
with mutual debugging. I'll return to this problem in a couple of
days, I'm a bit busy right now.

Thanks,

Oleg.

On 03/29, Igor Zhbanov wrote:
> Hi Oleg!
>
> I've tried both 5.3.18 and 5.10.0. The behavior is the same.
> The important thing is to run "exec strace -p ..." on the second terminal
> to create the loop A->B->A.
>
> So the last line from the first strace we see is:
> ptrace(PTRACE_SEIZE, 1990, NULL, PTRACE_O_TRACESYSGOOD|PTRACE_O_TRACEEXEC|PTRACE_O_TRACEEXIT
>
> I.e. it printed the syscall prior to its execution and hung after the
> execution.

* Re: Mutual debugging of 2 processes can stuck in unkillable stopped state
From: Igor Zhbanov @ 2021-03-29 17:44 UTC
To: Oleg Nesterov; +Cc: linux-trace-devel, linux-kernel

Hi Oleg,

On 29.03.2021 20:38, Oleg Nesterov wrote:
> Hi Igor,
>
> So. As expected, they sleep in EVENT_EXIT _after_ you have already
> sent SIGKILL.

Here are the process stacks before sending any signals to them:

izh@suse2:~> sudo cat /proc/1751/stack
[<0>] ptrace_stop+0x14e/0x260
[<0>] ptrace_do_notify+0x91/0xb0
[<0>] ptrace_notify+0x4e/0x70
[<0>] syscall_slow_exit_work+0x90/0x150
[<0>] do_syscall_64+0x129/0x1f0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

izh@suse2:~> sudo cat /proc/1979/stack
[<0>] ptrace_stop+0x14e/0x260
[<0>] get_signal+0x4d5/0x840
[<0>] do_signal+0x30/0x6a0
[<0>] exit_to_usermode_loop+0x8b/0x120
[<0>] do_syscall_64+0x1c3/0x1f0
[<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9

> Igor, thanks for your report, but (I think) this has nothing to do
> with mutual debugging. I'll return to this problem in a couple of
> days, I'm a bit busy right now.

Sorry for the inexact description. I described it from the user's
perspective: the processes are stopped and can't be killed. :-)
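
The stack dumps in this thread differ in which frame sits above ptrace_stop: do_exit after the SIGKILL, and syscall_slow_exit_work or get_signal before any signal was sent. The sketch below sorts /proc/<pid>/stack text along those lines. It is a heuristic built only from the traces posted here; the function names and the returned labels are hypothetical, and kernel symbol names vary between versions.

```python
import re

# Matches lines such as "[<0>] ptrace_stop+0x14a/0x260" and captures the symbol.
FRAME = re.compile(r"\[<[0-9a-fx]+>\]\s+([A-Za-z_][A-Za-z0-9_.]*)\+")


def frames(stack_text):
    """Return the symbol names found in /proc/<pid>/stack text, innermost first."""
    return FRAME.findall(stack_text)


def classify_ptrace_stop(stack_text):
    """Guess which kind of ptrace stop a task is sleeping in (heuristic labels)."""
    names = frames(stack_text)
    if "ptrace_stop" not in names:
        return "not in ptrace_stop"
    if "do_exit" in names:
        return "exit-event stop"        # reached via do_exit(), as after SIGKILL
    if "syscall_slow_exit_work" in names:
        return "syscall-exit stop"      # symbol name as seen in the 5.3 trace
    if "get_signal" in names:
        return "signal-delivery stop"
    return "other ptrace stop"


AFTER_KILL = """[<0>] ptrace_stop+0x14a/0x260
[<0>] ptrace_do_notify+0x91/0xb0
[<0>] ptrace_notify+0x4e/0x70
[<0>] do_exit+0x910/0xb70
[<0>] do_group_exit+0x3a/0xa0
[<0>] get_signal+0x124/0x800
"""

BEFORE_KILL_1751 = """[<0>] ptrace_stop+0x14e/0x260
[<0>] ptrace_do_notify+0x91/0xb0
[<0>] ptrace_notify+0x4e/0x70
[<0>] syscall_slow_exit_work+0x90/0x150
"""

BEFORE_KILL_1979 = """[<0>] ptrace_stop+0x14e/0x260
[<0>] get_signal+0x4d5/0x840
[<0>] do_signal+0x30/0x6a0
"""

for sample in (AFTER_KILL, BEFORE_KILL_1751, BEFORE_KILL_1979):
    print(classify_ptrace_stop(sample))
```

The do_exit test comes first because the post-SIGKILL traces also contain get_signal further down the stack.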

* Re: Mutual debugging of 2 processes can stuck in unkillable stopped state
From: Igor Zhbanov @ 2021-04-12 7:25 UTC
To: Oleg Nesterov; +Cc: linux-trace-devel, linux-kernel

Hi Oleg,

So what is the cause of this problem?

Thank you.

On 29.03.2021 20:44, Igor Zhbanov wrote:
>
> Here is the processes stack before sending any signals to them:
>
> izh@suse2:~> sudo cat /proc/1751/stack
> [<0>] ptrace_stop+0x14e/0x260
> [<0>] ptrace_do_notify+0x91/0xb0
> [<0>] ptrace_notify+0x4e/0x70
> [<0>] syscall_slow_exit_work+0x90/0x150
> [<0>] do_syscall_64+0x129/0x1f0
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> izh@suse2:~> sudo cat /proc/1979/stack
> [<0>] ptrace_stop+0x14e/0x260
> [<0>] get_signal+0x4d5/0x840
> [<0>] do_signal+0x30/0x6a0
> [<0>] exit_to_usermode_loop+0x8b/0x120
> [<0>] do_syscall_64+0x1c3/0x1f0
> [<0>] entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
>> Igor, thanks for your report, but (I think) this has nothing to do
>> with mutual debugging. I'll return to this problem in a couple of
>> days, I'm a bit busy right now.

* Re: Mutual debugging of 2 processes can stuck in unkillable stopped state
From: Oleg Nesterov @ 2021-04-13 17:17 UTC
To: Igor Zhbanov; +Cc: linux-trace-devel, linux-kernel

Hi Igor,

sorry for the delay...

On 04/12, Igor Zhbanov wrote:
>
> Hi Oleg,
>
> So what is the cause of this problem?

The cause is clear. And well known ;) And again, this has almost nothing
to do with mutual debugging.

The tracee sleeps in ptrace_stop(). You send SIGKILL. This wakes the
tracee up, it dequeues the signal, calls do_exit(), and stops again in
PTRACE_EVENT_EXIT. With SIGKILL in signal->shared_pending.

This all looks as if the tracee doesn't react to SIGKILL.

The only problem is that any change can break something which relies on
the current behaviour :/

I'll write another email on this.

Oleg.
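
The pending SIGKILL mentioned above can be read off the ShdPnd field of the status dumps posted earlier in the thread (0000000000014100). A small sketch of decoding such a mask follows, assuming the usual /proc convention that bit N-1 of the hexadecimal mask stands for signal number N. The helper names `pending_signals` and `signal_name` are hypothetical.

```python
import signal


def pending_signals(mask_hex):
    """Return the signal numbers set in a /proc status mask such as ShdPnd."""
    mask = int(mask_hex, 16)
    # Bit 0 corresponds to signal 1, bit 1 to signal 2, and so on.
    return [bit + 1 for bit in range(64) if mask & (1 << bit)]


def signal_name(number):
    """Best-effort name lookup; numbering is platform dependent."""
    try:
        return signal.Signals(number).name
    except ValueError:
        return f"signal {number}"


SHDPND = "0000000000014100"  # value shown for both PIDs in the thread
for number in pending_signals(SHDPND):
    print(number, signal_name(number))
```

The mask decodes to signals 9, 15 and 17. Signals 9 (SIGKILL) and 15 (SIGTERM) match the `kill -9` and `kill` commands run in the report, and 17 is SIGCHLD on x86-64. All three sit in the shared pending set while the task stays in its tracing stop.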