public inbox for linux-nfs@vger.kernel.org
From: Jeff Layton <jlayton@kernel.org>
To: "Tom Talpey" <tom@talpey.com>,
	"Benoît Gschwind" <benoit.gschwind@minesparis.psl.eu>,
	"Chuck Lever III" <chuck.lever@oracle.com>
Cc: Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: nfsd stuck in D (disk sleep) state
Date: Tue, 29 Oct 2024 09:49:59 -0400	[thread overview]
Message-ID: <4cf67d2f79f3ccb1a81875adb2f98f4e04b9fd1d.camel@kernel.org> (raw)
In-Reply-To: <3301c899e4d57f5a988373d76fdceb9ec2bee8af.camel@kernel.org>

On Mon, 2024-10-28 at 13:27 -0400, Jeff Layton wrote:
> On Mon, 2024-10-28 at 08:46 -0400, Tom Talpey wrote:
> > On 10/28/2024 5:18 AM, Benoît Gschwind wrote:
> > > Hello,
> > > 
> > > The issue triggered again. I have attached the result of:
> > > 
> > > # dmesg -W | tee dmesg.txt
> > > 
> > > using:
> > > 
> > > # echo t > /proc/sysrq-trigger
> > > 
> > > I have the following PID stuck:
> > > 
> > >      1474 D (disk sleep)       0:54:58.602 [nfsd]
> > >      1475 D (disk sleep)       0:54:58.602 [nfsd]
> > >      1484 D (disk sleep)       0:54:58.602 [nfsd]
> > >      1495 D (disk sleep)       0:54:58.602 [nfsd]
> > 
> > Hmm, 1495 is stuck in nfsd4_create_session
> > 
> >  > [427468.304955] task:nfsd            state:D stack:0     pid:1495  ppid:2      flags:0x00004000
> >  > [427468.304962] Call Trace:
> >  > [427468.304965]  <TASK>
> >  > [427468.304971]  __schedule+0x34d/0x9e0
> >  > [427468.304983]  schedule+0x5a/0xd0
> >  > [427468.304991]  schedule_timeout+0x118/0x150
> >  > [427468.305003]  wait_for_completion+0x86/0x160
> >  > [427468.305015]  __flush_workqueue+0x152/0x420
> >  > [427468.305031]  nfsd4_create_session+0x79f/0xba0 [nfsd]
> >  > [427468.305092]  nfsd4_proc_compound+0x34c/0x660 [nfsd]
> >  > [427468.305147]  nfsd_dispatch+0x1a1/0x2b0 [nfsd]
> >  > [427468.305199]  svc_process_common+0x295/0x610 [sunrpc]
> >  > [427468.305269]  ? svc_recv+0x491/0x810 [sunrpc]
> >  > [427468.305337]  ? nfsd_svc+0x370/0x370 [nfsd]
> >  > [427468.305389]  ? nfsd_shutdown_threads+0x90/0x90 [nfsd]
> >  > [427468.305437]  svc_process+0xad/0x100 [sunrpc]
> >  > [427468.305505]  nfsd+0x99/0x140 [nfsd]
> >  > [427468.305555]  kthread+0xda/0x100
> >  > [427468.305562]  ? kthread_complete_and_exit+0x20/0x20
> >  > [427468.305572]  ret_from_fork+0x22/0x30
> > 
> > and the other three are stuck in nfsd4_destroy_session
> > 
> 
> All 4 processes are stuck waiting on a flush_workqueue call. It's
> probably one of these:
> 
>     flush_workqueue(clp->cl_callback_wq);
> 
> So the question here is really -- why are the callback workqueue jobs
> stuck? I do see this:
> 
> [427468.316839] Workqueue: nfsd4_callbacks nfsd4_run_cb_work [nfsd]
> [427468.316899] Call Trace:
> [427468.316902]  <TASK>
> [427468.316908]  __schedule+0x34d/0x9e0
> [427468.316919]  schedule+0x5a/0xd0
> [427468.316927]  schedule_timeout+0x94/0x150
> [427468.316937]  ? __bpf_trace_tick_stop+0x10/0x10
> [427468.316947]  rpc_shutdown_client+0xf2/0x150 [sunrpc]
> [427468.317015]  ? cpuusage_read+0x10/0x10
> [427468.317026]  nfsd4_process_cb_update+0x4c/0x270 [nfsd]
> [427468.317097]  nfsd4_run_cb_work+0x9f/0x150 [nfsd]
> [427468.317146]  process_one_work+0x1c7/0x380
> [427468.317158]  worker_thread+0x4d/0x380
> [427468.317170]  ? rescuer_thread+0x3a0/0x3a0
> [427468.317177]  kthread+0xda/0x100
> [427468.317185]  ? kthread_complete_and_exit+0x20/0x20
> [427468.317195]  ret_from_fork+0x22/0x30
> [427468.317213]  </TASK>
> 
> Maybe the RPC client is having trouble clearing clnt->cl_tasks?
> 

Interestingly, I hit this today while testing the patches that widen
the backchannel:

[  484.550350] INFO: task nfsd:1168 blocked for more than 120 seconds.
[  484.551746]       Not tainted 6.12.0-rc5+ #188
[  484.552854] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  484.586316] task:nfsd            state:D stack:0     pid:1168  tgid:1168  ppid:2      flags:0x00004000
[  484.588417] Call Trace:
[  484.589070]  <TASK>
[  484.589624]  __schedule+0x4e4/0xb90
[  484.622306]  schedule+0x26/0xf0
[  484.623162]  schedule_timeout+0x121/0x150
[  484.624344]  ? __smp_call_single_queue+0xa7/0x110
[  484.625375]  ? x2apic_send_IPI+0x45/0x50
[  484.654299]  ? ttwu_queue_wakelist+0xbf/0x110
[  484.655454]  __wait_for_common+0x91/0x1c0
[  484.656372]  ? __pfx_schedule_timeout+0x10/0x10
[  484.657429]  __flush_workqueue+0x13a/0x400
[  484.686304]  nfsd4_create_session+0x9cd/0xd90 [nfsd]
[  484.687562]  nfsd4_proc_compound+0x396/0x750 [nfsd]
[  484.688746]  nfsd_dispatch+0xc6/0x210 [nfsd]
[  484.689835]  svc_process_common+0x4e6/0x6d0 [sunrpc]
[  484.722305]  ? __pfx_nfsd_dispatch+0x10/0x10 [nfsd]
[  484.723532]  svc_process+0x12d/0x1c0 [sunrpc]
[  484.724619]  svc_recv+0x7cf/0x980 [sunrpc]
[  484.725618]  ? __pfx_nfsd+0x10/0x10 [nfsd]
[  484.758308]  nfsd+0x9f/0x100 [nfsd]
[  484.759184]  kthread+0xdd/0x110
[  484.759905]  ? __pfx_kthread+0x10/0x10
[  484.760870]  ret_from_fork+0x30/0x50
[  484.761603]  ? __pfx_kthread+0x10/0x10
[  484.790302]  ret_from_fork_asm+0x1a/0x30
[  484.791151]  </TASK>

The host was hung after this, so I can't fully confirm the scenario
below, but I think it's probably correct:

The client might want to create a new session. When that happens, the
create_session call tries to flush_workqueue() on the client-wide
backchannel workqueue.

That workqueue is stuck, as it's waiting for all of the callback RPCs
to die off (as evidenced by the stack from Benoît's info that shows it
hung in rpc_shutdown_client). So the question I guess at this point is
why these RPC tasks aren't dying like they're supposed to when
rpc_killall_tasks() is run.

I'm guessing that the backchannel-widening patches may make this a bit
easier to reproduce? In any case, this problem doesn't seem to be
confined to v6.1. 

I'm still looking at this, so stay tuned...
-- 
Jeff Layton <jlayton@kernel.org>


Thread overview: 12+ messages
2024-10-23 19:27 nfsd stuck in D (disk sleep) state Benoît Gschwind
2024-10-23 19:38 ` Chuck Lever III
2024-10-24  8:07   ` Benoît Gschwind
2024-10-28  9:18   ` Benoît Gschwind
2024-10-28 12:46     ` Tom Talpey
2024-10-28 13:32       ` Chuck Lever III
2024-10-28 17:01       ` Benoît Gschwind
2024-10-28 17:38         ` Tom Talpey
2024-10-28 17:27       ` Jeff Layton
2024-10-29 13:49         ` Jeff Layton [this message]
2024-10-29 20:04           ` Jeff Layton
2025-01-20  9:22             ` Benoît Gschwind
