Re: Hung task panic as part of NFS RDMA Disconnect due to possible bug on 6.2.0-34-generic client

Linux RDMA and InfiniBand development
 help / color / mirror / Atom feed

* Re: Hung task panic as part of NFS RDMA Disconnect due to possible bug on 6.2.0-34-generic client
       [not found] <CAMyEAb9XbL55taNXD_MrTxJz62s6ByDWiK8m1Nxj1_G3pg-M6A@mail.gmail.com>
@ 2023-11-30  7:36 ` Bagas Sanjaya
  2023-11-30 12:15   ` Sukruth Sridharan (he/him)
  2023-11-30 12:23   ` Sukruth Sridharan (he/him)
  0 siblings, 2 replies; 4+ messages in thread
From: Bagas Sanjaya @ 2023-11-30  7:36 UTC (permalink / raw)
  To: Sukruth Sridharan (he/him), Linux Network File System, Linux RDMA
  Cc: Chuck Lever, Jeff Layton, Neil Brown, Olga Kornievskaia, Dai Ngo,
	Tom Talpey, Saeed Mahameed, Leon Romanovsky

[-- Attachment #1: Type: text/plain, Size: 3975 bytes --]

On Thu, Nov 30, 2023 at 10:52:59AM +0530, Sukruth Sridharan (he/him) wrote:
> I notice the following hung task panic on 6.2.0-34 kernel during RDMA disconnect
> 
> [Wed Nov  1 08:03:54 2023] INFO: task kworker/u16:5:2274646 blocked
> for more than 120 seconds.
> [Wed Nov  1 08:03:55 2023]       Tainted: G        W  OE
> 6.2.0-34-generic #34-Ubuntu
> [Wed Nov  1 08:03:55 2023] "echo 0 >
> /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> [Wed Nov  1 08:03:55 2023] task:kworker/u16:5   state:D stack:0
> pid:2274646 ppid:2      flags:0x00004000
> [Wed Nov  1 08:03:55 2023] Workqueue: xprtiod xprt_autoclose [sunrpc]
> [Wed Nov  1 08:03:55 2023] Call Trace:
> [Wed Nov  1 08:03:55 2023]  <TASK>
> [Wed Nov  1 08:03:55 2023]  __schedule+0x2aa/0x610
> [Wed Nov  1 08:03:55 2023]  schedule+0x63/0x110
> [Wed Nov  1 08:03:55 2023]  schedule_timeout+0x157/0x170
> [Wed Nov  1 08:03:55 2023]  wait_for_completion+0x88/0x150
> [Wed Nov  1 08:03:55 2023]  rpcrdma_xprt_disconnect+0x33f/0x350 [rpcrdma]
> [Wed Nov  1 08:03:55 2023]  xprt_rdma_close+0x12/0x40 [rpcrdma]
> [Wed Nov  1 08:03:55 2023]  xprt_autoclose+0x5c/0x120 [sunrpc]
> [Wed Nov  1 08:03:55 2023]  process_one_work+0x225/0x430
> [Wed Nov  1 08:03:55 2023]  worker_thread+0x50/0x3e0
> [Wed Nov  1 08:03:55 2023]  ? __pfx_worker_thread+0x10/0x10
> [Wed Nov  1 08:03:55 2023]  kthread+0xe9/0x110
> [Wed Nov  1 08:03:55 2023]  ? __pfx_kthread+0x10/0x10
> [Wed Nov  1 08:03:55 2023]  ret_from_fork+0x2c/0x50
> [Wed Nov  1 08:03:55 2023]  </TASK>
> 
> The flow which induced the bug is as follows:
> 1. Client initiates connection
> 2. Server hands off the response to the first RPC on the connection to
> the NIC (Mellanox ConnectX-5)
> 3. NIC tries to send the response around 6 times and fails the response with RNR
> 4. Client issues disconnect (possibly because it didn't receive a response)
> 5. Server cleans up the connection state
> 6. Client runs into the above panic as part of disconnect while draining the IOs
> 
> It looks like re_receiving is set only in rpcrdma_post_recvs, and the
> reason why it wouldn't be reset is if memory-region allocation code
> fails.
> That is possible if disconnect on the client somehow blocks allocation.
> 
> void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp)
> {
>         // ... (some initialization code)
> 
>     if (atomic_inc_return(&ep->re_receiving) > 1)
>         goto out;
> 
>         // ... (some allocation code)
> 
>     if (!wr) // <<<<<<<<<<<<<<<<<< PROBLEM HERE >>>>>>>>>>>>>>>>>>>
>         goto out;
> 
>         // ... (post recv code, and some error handling)
> 
>     if (atomic_dec_return(&ep->re_receiving) > 0)
>         complete(&ep->re_done);
> 
> out:
>     trace_xprtrdma_post_recvs(r_xprt, count);
>     ep->re_receive_count += count;
>     return;
> }
> 
> static void rpcrdma_xprt_drain(struct rpcrdma_xprt *r_xprt)
> {
>     struct rpcrdma_ep *ep = r_xprt->rx_ep;
>     struct rdma_cm_id *id = ep->re_id;
> 
>     /* Wait for rpcrdma_post_recvs() to leave its critical
>      * section.
>      */
>     if (atomic_inc_return(&ep->re_receiving) > 1) //
> <<<<<<<<<<<<<<<<<<< This is not reset, so wait gets stuck
> >>>>>>>>>>>>>>>>>
>         wait_for_completion(&ep->re_done);
> 
>     /* Flush Receives, then wait for deferred Reply work
>      * to complete.
>      */
>     ib_drain_rq(id->qp);
> 
>     /* Deferred Reply processing might have scheduled
>      * local invalidations.
>      */
>     ib_drain_sq(id->qp);
> 
>     rpcrdma_ep_put(ep);
> }
> 
> Can you help conclude if the above theory around the bug being in the
> client code is right? If not, can you help with steps/data points
> required to debug this further?
> 

Can you verify that the bug still occurs with latest vanilla kernel
(currently v6.7-rc3)?

Thanks.

-- 
An old man doll... just what I always wanted! - Clara

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 228 bytes --]

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Hung task panic as part of NFS RDMA Disconnect due to possible bug on 6.2.0-34-generic client
  2023-11-30  7:36 ` Hung task panic as part of NFS RDMA Disconnect due to possible bug on 6.2.0-34-generic client Bagas Sanjaya
@ 2023-11-30 12:15   ` Sukruth Sridharan (he/him)
  2023-11-30 12:17     ` Bagas Sanjaya
  2023-11-30 12:23   ` Sukruth Sridharan (he/him)
  1 sibling, 1 reply; 4+ messages in thread
From: Sukruth Sridharan (he/him) @ 2023-11-30 12:15 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Linux Network File System, Linux RDMA, Chuck Lever, Jeff Layton,
	Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Saeed Mahameed, Leon Romanovsky

The issue has been seen once in the past few weeks.
Unfortunately, we're yet to see a repro of the same.
We will try to repro it on the latest kernel.
Curious if there's any improvements gone in that you suspect would
have resolved the issue?

Thanks,
Sukruth


On Thu, Nov 30, 2023 at 1:06 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
> On Thu, Nov 30, 2023 at 10:52:59AM +0530, Sukruth Sridharan (he/him) wrote:
> > I notice the following hung task panic on 6.2.0-34 kernel during RDMA disconnect
> >
> > [Wed Nov  1 08:03:54 2023] INFO: task kworker/u16:5:2274646 blocked
> > for more than 120 seconds.
> > [Wed Nov  1 08:03:55 2023]       Tainted: G        W  OE
> > 6.2.0-34-generic #34-Ubuntu
> > [Wed Nov  1 08:03:55 2023] "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [Wed Nov  1 08:03:55 2023] task:kworker/u16:5   state:D stack:0
> > pid:2274646 ppid:2      flags:0x00004000
> > [Wed Nov  1 08:03:55 2023] Workqueue: xprtiod xprt_autoclose [sunrpc]
> > [Wed Nov  1 08:03:55 2023] Call Trace:
> > [Wed Nov  1 08:03:55 2023]  <TASK>
> > [Wed Nov  1 08:03:55 2023]  __schedule+0x2aa/0x610
> > [Wed Nov  1 08:03:55 2023]  schedule+0x63/0x110
> > [Wed Nov  1 08:03:55 2023]  schedule_timeout+0x157/0x170
> > [Wed Nov  1 08:03:55 2023]  wait_for_completion+0x88/0x150
> > [Wed Nov  1 08:03:55 2023]  rpcrdma_xprt_disconnect+0x33f/0x350 [rpcrdma]
> > [Wed Nov  1 08:03:55 2023]  xprt_rdma_close+0x12/0x40 [rpcrdma]
> > [Wed Nov  1 08:03:55 2023]  xprt_autoclose+0x5c/0x120 [sunrpc]
> > [Wed Nov  1 08:03:55 2023]  process_one_work+0x225/0x430
> > [Wed Nov  1 08:03:55 2023]  worker_thread+0x50/0x3e0
> > [Wed Nov  1 08:03:55 2023]  ? __pfx_worker_thread+0x10/0x10
> > [Wed Nov  1 08:03:55 2023]  kthread+0xe9/0x110
> > [Wed Nov  1 08:03:55 2023]  ? __pfx_kthread+0x10/0x10
> > [Wed Nov  1 08:03:55 2023]  ret_from_fork+0x2c/0x50
> > [Wed Nov  1 08:03:55 2023]  </TASK>
> >
> > The flow which induced the bug is as follows:
> > 1. Client initiates connection
> > 2. Server hands off the response to the first RPC on the connection to
> > the NIC (Mellanox ConnectX-5)
> > 3. NIC tries to send the response around 6 times and fails the response with RNR
> > 4. Client issues disconnect (possibly because it didn't receive a response)
> > 5. Server cleans up the connection state
> > 6. Client runs into the above panic as part of disconnect while draining the IOs
> >
> > It looks like re_receiving is set only in rpcrdma_post_recvs, and the
> > reason why it wouldn't be reset is if memory-region allocation code
> > fails.
> > That is possible if disconnect on the client somehow blocks allocation.
> >
> > void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp)
> > {
> >         // ... (some initialization code)
> >
> >     if (atomic_inc_return(&ep->re_receiving) > 1)
> >         goto out;
> >
> >         // ... (some allocation code)
> >
> >     if (!wr) // <<<<<<<<<<<<<<<<<< PROBLEM HERE >>>>>>>>>>>>>>>>>>>
> >         goto out;
> >
> >         // ... (post recv code, and some error handling)
> >
> >     if (atomic_dec_return(&ep->re_receiving) > 0)
> >         complete(&ep->re_done);
> >
> > out:
> >     trace_xprtrdma_post_recvs(r_xprt, count);
> >     ep->re_receive_count += count;
> >     return;
> > }
> >
> > static void rpcrdma_xprt_drain(struct rpcrdma_xprt *r_xprt)
> > {
> >     struct rpcrdma_ep *ep = r_xprt->rx_ep;
> >     struct rdma_cm_id *id = ep->re_id;
> >
> >     /* Wait for rpcrdma_post_recvs() to leave its critical
> >      * section.
> >      */
> >     if (atomic_inc_return(&ep->re_receiving) > 1) //
> > <<<<<<<<<<<<<<<<<<< This is not reset, so wait gets stuck
> > >>>>>>>>>>>>>>>>>
> >         wait_for_completion(&ep->re_done);
> >
> >     /* Flush Receives, then wait for deferred Reply work
> >      * to complete.
> >      */
> >     ib_drain_rq(id->qp);
> >
> >     /* Deferred Reply processing might have scheduled
> >      * local invalidations.
> >      */
> >     ib_drain_sq(id->qp);
> >
> >     rpcrdma_ep_put(ep);
> > }
> >
> > Can you help conclude if the above theory around the bug being in the
> > client code is right? If not, can you help with steps/data points
> > required to debug this further?
> >
>
> Can you verify that the bug still occurs with latest vanilla kernel
> (currently v6.7-rc3)?
>
> Thanks.
>
> --
> An old man doll... just what I always wanted! - Clara

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Hung task panic as part of NFS RDMA Disconnect due to possible bug on 6.2.0-34-generic client
  2023-11-30 12:15   ` Sukruth Sridharan (he/him)
@ 2023-11-30 12:17     ` Bagas Sanjaya
  0 siblings, 0 replies; 4+ messages in thread
From: Bagas Sanjaya @ 2023-11-30 12:17 UTC (permalink / raw)
  To: Sukruth Sridharan (he/him)
  Cc: Linux Network File System, Linux RDMA, Chuck Lever, Jeff Layton,
	Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Saeed Mahameed, Leon Romanovsky

On 11/30/23 19:15, Sukruth Sridharan (he/him) wrote:
> The issue has been seen once in the past few weeks.
> Unfortunately, we're yet to see a repro of the same.
> We will try to repro it on the latest kernel.
> Curious if there's any improvements gone in that you suspect would
> have resolved the issue?
> 

Please don't top-post; reply inline with appropriate context instead.

Sorry, I don't know about that question.

Thanks.

-- 
An old man doll... just what I always wanted! - Clara


^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: Hung task panic as part of NFS RDMA Disconnect due to possible bug on 6.2.0-34-generic client
  2023-11-30  7:36 ` Hung task panic as part of NFS RDMA Disconnect due to possible bug on 6.2.0-34-generic client Bagas Sanjaya
  2023-11-30 12:15   ` Sukruth Sridharan (he/him)
@ 2023-11-30 12:23   ` Sukruth Sridharan (he/him)
  1 sibling, 0 replies; 4+ messages in thread
From: Sukruth Sridharan (he/him) @ 2023-11-30 12:23 UTC (permalink / raw)
  To: Bagas Sanjaya
  Cc: Linux Network File System, Linux RDMA, Chuck Lever, Jeff Layton,
	Neil Brown, Olga Kornievskaia, Dai Ngo, Tom Talpey,
	Saeed Mahameed, Leon Romanovsky

On Thu, Nov 30, 2023 at 1:06 PM Bagas Sanjaya <bagasdotme@gmail.com> wrote:
>
> On Thu, Nov 30, 2023 at 10:52:59AM +0530, Sukruth Sridharan (he/him) wrote:
> > I notice the following hung task panic on 6.2.0-34 kernel during RDMA disconnect
> >
> > [Wed Nov  1 08:03:54 2023] INFO: task kworker/u16:5:2274646 blocked
> > for more than 120 seconds.
> > [Wed Nov  1 08:03:55 2023]       Tainted: G        W  OE
> > 6.2.0-34-generic #34-Ubuntu
> > [Wed Nov  1 08:03:55 2023] "echo 0 >
> > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > [Wed Nov  1 08:03:55 2023] task:kworker/u16:5   state:D stack:0
> > pid:2274646 ppid:2      flags:0x00004000
> > [Wed Nov  1 08:03:55 2023] Workqueue: xprtiod xprt_autoclose [sunrpc]
> > [Wed Nov  1 08:03:55 2023] Call Trace:
> > [Wed Nov  1 08:03:55 2023]  <TASK>
> > [Wed Nov  1 08:03:55 2023]  __schedule+0x2aa/0x610
> > [Wed Nov  1 08:03:55 2023]  schedule+0x63/0x110
> > [Wed Nov  1 08:03:55 2023]  schedule_timeout+0x157/0x170
> > [Wed Nov  1 08:03:55 2023]  wait_for_completion+0x88/0x150
> > [Wed Nov  1 08:03:55 2023]  rpcrdma_xprt_disconnect+0x33f/0x350 [rpcrdma]
> > [Wed Nov  1 08:03:55 2023]  xprt_rdma_close+0x12/0x40 [rpcrdma]
> > [Wed Nov  1 08:03:55 2023]  xprt_autoclose+0x5c/0x120 [sunrpc]
> > [Wed Nov  1 08:03:55 2023]  process_one_work+0x225/0x430
> > [Wed Nov  1 08:03:55 2023]  worker_thread+0x50/0x3e0
> > [Wed Nov  1 08:03:55 2023]  ? __pfx_worker_thread+0x10/0x10
> > [Wed Nov  1 08:03:55 2023]  kthread+0xe9/0x110
> > [Wed Nov  1 08:03:55 2023]  ? __pfx_kthread+0x10/0x10
> > [Wed Nov  1 08:03:55 2023]  ret_from_fork+0x2c/0x50
> > [Wed Nov  1 08:03:55 2023]  </TASK>
> >
> > The flow which induced the bug is as follows:
> > 1. Client initiates connection
> > 2. Server hands off the response to the first RPC on the connection to
> > the NIC (Mellanox ConnectX-5)
> > 3. NIC tries to send the response around 6 times and fails the response with RNR
> > 4. Client issues disconnect (possibly because it didn't receive a response)
> > 5. Server cleans up the connection state
> > 6. Client runs into the above panic as part of disconnect while draining the IOs
> >
> > It looks like re_receiving is set only in rpcrdma_post_recvs, and the
> > reason why it wouldn't be reset is if memory-region allocation code
> > fails.
> > That is possible if disconnect on the client somehow blocks allocation.
> >
> > void rpcrdma_post_recvs(struct rpcrdma_xprt *r_xprt, int needed, bool temp)
> > {
> >         // ... (some initialization code)
> >
> >     if (atomic_inc_return(&ep->re_receiving) > 1)
> >         goto out;
> >
> >         // ... (some allocation code)
> >
> >     if (!wr) // <<<<<<<<<<<<<<<<<< PROBLEM HERE >>>>>>>>>>>>>>>>>>>
> >         goto out;
> >
> >         // ... (post recv code, and some error handling)
> >
> >     if (atomic_dec_return(&ep->re_receiving) > 0)
> >         complete(&ep->re_done);
> >
> > out:
> >     trace_xprtrdma_post_recvs(r_xprt, count);
> >     ep->re_receive_count += count;
> >     return;
> > }
> >
> > static void rpcrdma_xprt_drain(struct rpcrdma_xprt *r_xprt)
> > {
> >     struct rpcrdma_ep *ep = r_xprt->rx_ep;
> >     struct rdma_cm_id *id = ep->re_id;
> >
> >     /* Wait for rpcrdma_post_recvs() to leave its critical
> >      * section.
> >      */
> >     if (atomic_inc_return(&ep->re_receiving) > 1) //
> > <<<<<<<<<<<<<<<<<<< This is not reset, so wait gets stuck
> > >>>>>>>>>>>>>>>>>
> >         wait_for_completion(&ep->re_done);
> >
> >     /* Flush Receives, then wait for deferred Reply work
> >      * to complete.
> >      */
> >     ib_drain_rq(id->qp);
> >
> >     /* Deferred Reply processing might have scheduled
> >      * local invalidations.
> >      */
> >     ib_drain_sq(id->qp);
> >
> >     rpcrdma_ep_put(ep);
> > }
> >
> > Can you help conclude if the above theory around the bug being in the
> > client code is right? If not, can you help with steps/data points
> > required to debug this further?
> >
>
> Can you verify that the bug still occurs with latest vanilla kernel
> (currently v6.7-rc3)?
>
> Thanks.
>
> --
> An old man doll... just what I always wanted! - Clara

The issue has been seen once in the past few weeks.
Unfortunately, we're yet to see a repro of the same.
We will try to repro it on the latest kernel.
Curious if there's any improvements gone in that you suspect would
have resolved the issue?

(Apologies for the top post earlier)

Thanks,
Sukruth

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2023-11-30 12:23 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CAMyEAb9XbL55taNXD_MrTxJz62s6ByDWiK8m1Nxj1_G3pg-M6A@mail.gmail.com>
2023-11-30  7:36 ` Hung task panic as part of NFS RDMA Disconnect due to possible bug on 6.2.0-34-generic client Bagas Sanjaya
2023-11-30 12:15   ` Sukruth Sridharan (he/him)
2023-11-30 12:17     ` Bagas Sanjaya
2023-11-30 12:23   ` Sukruth Sridharan (he/him)

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox