From: Wei Fu <fuweid89@gmail.com>
To: oleg@redhat.com
Cc: Sudhanva.Huruli@microsoft.com, akpm@linux-foundation.org,
apais@linux.microsoft.com, axboe@kernel.dk, boqun.feng@gmail.com,
brauner@kernel.org, ebiederm@xmission.com, frederic@kernel.org,
fuweid89@gmail.com, j.granados@samsung.com,
jiangshanlai@gmail.com, joel@joelfernandes.org,
josh@joshtriplett.org, linux-kernel@vger.kernel.org,
mathieu.desnoyers@efficios.com, michael.christie@oracle.com,
mjguzik@gmail.com, neeraj.upadhyay@kernel.org,
paulmck@kernel.org, qiang.zhang1211@gmail.com,
rachelmenge@linux.microsoft.com, rcu@vger.kernel.org,
rostedt@goodmis.org, weifu@microsoft.com
Subject: Re: [RCU] zombie task hung in synchronize_rcu_expedited
Date: Fri, 7 Jun 2024 11:02:19 +0800
Message-ID: <20240607030219.2990306-1-fuweid89@gmail.com>
In-Reply-To: <20240606172848.GC22450@redhat.com>
>
> > ```
> > # unshare(CLONE_NEWPID | CLONE_NEWNS)
> >
> > npm start (pid 2522045)
> > |__npm run zombie (pid 2522605)
> > |__ sh -c "while true; do echo zombie; sleep 1; done" (pid 2522869)
> > ```
>
> only 3 processes? nothing is running? Is the last process 2522869 a
> zombie too?
Yes. When pid 2522045 exited, it sent SIGKILL to all the processes in that pid
namespace. The last process, 2522869, was a zombie as well. Sometimes
`npm start` exits before `npm run zombie` forks `sh`, so you might see only
two processes in that pid namespace.
>
> Could you show your .config? In particular, CONFIG_PREEMPT...
I'm using the [6.5.0-1021-azure][1] kernel. CONFIG_PREEMPT is not set; the
kernel is built with CONFIG_PREEMPT_VOLUNTARY.
The relevant parts of the .config:
```
$ cat /boot/config-6.5.0-1021-azure | grep _RCU
CONFIG_TREE_RCU=y
# CONFIG_RCU_EXPERT is not set
CONFIG_TASKS_RCU_GENERIC=y
CONFIG_TASKS_RUDE_RCU=y
CONFIG_TASKS_TRACE_RCU=y
CONFIG_RCU_STALL_COMMON=y
CONFIG_RCU_NEED_SEGCBLIST=y
CONFIG_RCU_NOCB_CPU=y
# CONFIG_RCU_NOCB_CPU_DEFAULT_ALL is not set
# CONFIG_RCU_LAZY is not set
CONFIG_MMU_GATHER_RCU_TABLE_FREE=y
# CONFIG_RCU_SCALE_TEST is not set
# CONFIG_RCU_TORTURE_TEST is not set
# CONFIG_RCU_REF_SCALE_TEST is not set
CONFIG_RCU_CPU_STALL_TIMEOUT=60
CONFIG_RCU_EXP_CPU_STALL_TIMEOUT=0
CONFIG_RCU_CPU_STALL_CPUTIME=y
# CONFIG_RCU_TRACE is not set
# CONFIG_RCU_EQS_DEBUG is not set
$ cat /boot/config-6.5.0-1021-azure | grep _PREEMPT
CONFIG_PREEMPT_VOLUNTARY_BUILD=y
# CONFIG_PREEMPT_NONE is not set
CONFIG_PREEMPT_VOLUNTARY=y
# CONFIG_PREEMPT is not set
# CONFIG_PREEMPT_DYNAMIC is not set
CONFIG_HAVE_PREEMPT_DYNAMIC=y
CONFIG_HAVE_PREEMPT_DYNAMIC_CALL=y
CONFIG_PREEMPT_NOTIFIERS=y
CONFIG_DRM_I915_PREEMPT_TIMEOUT=640
CONFIG_DRM_I915_PREEMPT_TIMEOUT_COMPUTE=7500
# CONFIG_PREEMPTIRQ_DELAY_TEST is not set
$ cat /boot/config-6.5.0-1021-azure | grep HZ
CONFIG_NO_HZ_COMMON=y
# CONFIG_HZ_PERIODIC is not set
# CONFIG_NO_HZ_IDLE is not set
CONFIG_NO_HZ_FULL=y
CONFIG_NO_HZ=y
# CONFIG_HZ_100 is not set
CONFIG_HZ_250=y
# CONFIG_HZ_300 is not set
# CONFIG_HZ_1000 is not set
CONFIG_HZ=250
CONFIG_MACHZ_WDT=m
```
>
> > The `npm start (pid 2522045)` was stuck in kernel_wait4. And its child,
>
> so this is the init task in this namespace,
Yes~
>
> > `npm run zombie (pid 2522605)`, has two threads. One of them was in D status.
> ...
> > $ sudo cat /proc/2522605/task/*/stack
> > [<0>] synchronize_rcu_expedited+0x177/0x1f0
> > [<0>] namespace_unlock+0xd6/0x1b0
> > [<0>] put_mnt_ns+0x73/0xa0
> > [<0>] free_nsproxy+0x1c/0x1b0
> > [<0>] switch_task_namespaces+0x5d/0x70
> > [<0>] exit_task_namespaces+0x10/0x20
> > [<0>] do_exit+0x2ce/0x500
> > [<0>] io_sq_thread+0x48e/0x5a0
> > [<0>] ret_from_fork+0x3c/0x60
> > [<0>] ret_from_fork_asm+0x1b/0x30
>
> so I guess this is the trace of its sub-thread 2522645.
Sorry for the unclear message.
Yes, it is.
>
> What about the process 2522605? Has it exited too?
Yes. Process 2522605 has two threads. The main thread (2522605) has exited
and is in zombie status. Only thread 2522645 is stuck in
synchronize_rcu_expedited().
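For anyone checking the same thing, the per-thread states can be listed with a
small helper. This is only a sketch: `list_thread_states` is a hypothetical
name, not an existing tool, and the optional second argument (an alternative
proc root) is there only to make the helper easy to test.

```shell
# Print "<tid> <state>" for every thread of a process.
# list_thread_states is a hypothetical helper, not an existing tool.
# $1: pid, $2: proc root (assumed default: /proc)
list_thread_states() {
    pid="$1"
    proc_root="${2:-/proc}"
    for task in "$proc_root/$pid"/task/*; do
        # Skip threads that vanished or are unreadable.
        [ -r "$task/status" ] || continue
        tid="${task##*/}"
        # Keep everything after "State:", e.g. "D (disk sleep)".
        state="$(sed -n 's/^State:[[:space:]]*//p' "$task/status")"
        printf '%s %s\n' "$tid" "$state"
    done
}
```

For example, `list_thread_states 2522605` prints one line per thread, which
makes it easy to see which thread is a zombie and which one is in D state.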
>
> > > But zap_pid_ns_processes() shouldn't cause the soft-lockup, it should
> > > sleep in kernel_wait4().
> >
> > I run `cat /proc/2522045/status` and found that the status was kept switching
> > between running and sleeping.
>
> OK, this shouldn't happen in this case. So it really looks like it spins
> in a busy-wait loop because TIF_NOTIFY_SIGNAL is not cleared. It can be
> reported as sleeping because do_wait() sets/clears TASK_INTERRUPTIBLE,
> although the window is small...
>
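The running/sleeping switching described above can be sampled more
systematically than with repeated `cat`. A minimal sketch, assuming a `sleep`
that accepts fractional seconds (GNU coreutils and busybox do);
`sample_task_state`, its parameters and the 10 ms interval are hypothetical
choices, not an existing tool:

```shell
# Sample the State field of a task N times and count each observed state.
# sample_task_state is a hypothetical helper, not an existing tool.
# $1: pid, $2: number of samples (assumed default: 100),
# $3: proc root (assumed default: /proc, overridable for testing)
sample_task_state() {
    pid="$1"
    samples="${2:-100}"
    proc_root="${3:-/proc}"
    i=0
    while [ "$i" -lt "$samples" ]; do
        # Prints e.g. "R (running)" or "S (sleeping)"; nothing if the task is gone.
        sed -n 's/^State:[[:space:]]*//p' "$proc_root/$pid/status" 2>/dev/null
        i=$((i + 1))
        sleep 0.01   # arbitrary sampling interval
    done | sort | uniq -c
}
```

For example, `sample_task_state 2522045 200` prints a count per observed
state. A mix of R and S lines would be consistent with the switching mentioned
above, although sampling like this cannot prove a busy-wait loop by itself.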
I can reproduce this issue on v5.15, v6.1, v6.5, v6.8, v6.9 and v6.10-rc2.
All of these kernels are built without CONFIG_PREEMPT and PREEMPT_RCU. It is
very easy to reproduce on v5.15.x with 8 vcores, within a few minutes. On the
other kernel versions it can take from 30 minutes to a few hours.
Rachel provides a [golang-repro][2], which is similar to the docker repro. It
can be built as a static binary, which makes it easy to run.
I hope this information helps.
Thanks,
Wei
[1]: https://gist.github.com/fuweid/ae8bad349fee3e00a4f1ce82397831ac
[2]: https://github.com/rlmenge/rcu-soft-lock-issue-repro?tab=readme-ov-file#golang-repro
Thread overview: 23+ messages
2024-06-05 23:42 [RCU] zombie task hung in synchronize_rcu_expedited Rachel Menge
2024-06-06 11:10 ` Oleg Nesterov
2024-06-06 15:45 ` Wei Fu
2024-06-06 17:28 ` Oleg Nesterov
2024-06-07 3:02 ` Wei Fu [this message]
2024-06-07 6:25 ` Oleg Nesterov
2024-06-07 15:04 ` Wei Fu
2024-06-07 21:22 ` Oleg Nesterov
2024-06-08 12:42 ` Oleg Nesterov
2024-06-10 0:07 ` Wei Fu
2024-06-08 12:06 ` [PATCH] zap_pid_ns_processes: clear TIF_NOTIFY_SIGNAL along with TIF_SIGPENDING Oleg Nesterov
2024-06-08 17:00 ` Boqun Feng
2024-06-09 14:12 ` Wei Fu
2024-06-12 16:57 ` Jens Axboe
2024-06-13 12:40 ` Eric W. Biederman
2024-06-13 14:02 ` Wei Fu
2024-06-13 14:49 ` Oleg Nesterov
2024-06-13 15:30 ` Oleg Nesterov
2024-06-08 15:48 ` [PATCH] zap_pid_ns_processes: don't send SIGKILL to sub-threads Oleg Nesterov
2024-06-13 13:01 ` Eric W. Biederman
2024-06-13 15:00 ` Oleg Nesterov
2024-06-13 16:23 ` Eric W. Biederman
2024-07-05 16:08 ` Oleg Nesterov