public inbox for linux-nfs@vger.kernel.org
From: Chuck Lever <chuck.lever@oracle.com>
To: Rik Theys <rik.theys@gmail.com>
Cc: Christian Herzog <herzog@phys.ethz.ch>,
	Salvatore Bonaccorso <carnil@debian.org>,
	linux-nfs@vger.kernel.org
Subject: Re: nfsd4 laundromat_main hung tasks
Date: Thu, 16 Jan 2025 09:12:53 -0500
Message-ID: <99ea3fa8-ffd6-4fbd-af73-c0dabb261973@oracle.com>
In-Reply-To: <CAPwv0JnM=Cz=sMazCSuuRbOjHURQ2bDox7F=OqQoT9DxbsaHzw@mail.gmail.com>

On 1/16/25 4:03 AM, Rik Theys wrote:

>> The laundromat failure mode is not blocked in rpc_shutdown_client, so
>> there aren't any outstanding callback RPCs to observe.
>>
>> The DESTROY_SESSION failure mode is blocking on the flush_workqueue call
>> in nfsd4_shutdown_callback(), while this failure mode appears to have
>> passed that call and blocked on the wait for in-flight RPCs to go to
>> zero (as Jeff analyzed a few days ago).
> 
> If I look at the trace, nfs4_laundromat calls
> nfs4_process_client_reaplist, which calls __destroy_client at some
> point.
> 
> When I look at the __destroy_client function in nfs4state.c, I see it
> does a spin_lock(&state_lock) and spin_unlock(&state_lock) to perform
> certain actions, but it seems the lock is not (again) acquired when
> the nfsd4_shutdown_callback() function is called? According to the
> comment above the nfsd4_shutdown_callback function in nfs4callback.c,
> the function must be called under the state lock? Is it possible that
> the function is called without this state lock? Or is the comment no
> longer relevant?

The comment is stale.

Commit b687f6863eed ("nfsd: remove the client_mutex and the nfs4_lock/
unlock_state wrappers") removed the mutex that used to wrap calls to
this function.


> Another thing I've noticed (but I'm not sure it's relevant here) is
> that there's a client in /proc/fs/nfsd/clients that has a states file
> that crashes nfsdclnts as the entry does not have a "superblock"
> field:
> 
> # cat 8536/{info,states}
> clientid: 0x6d0596d0675df2b3
> address: "10.87.29.32:864"
> status: courtesy
> seconds from last renew: 2807740
> name: "Linux NFSv4.2 betelgeuse.esat.kuleuven.be"
> minor version: 2
> Implementation domain: "kernel.org"
> Implementation name: "Linux 4.18.0-553.32.1.el8_10.x86_64 #1 SMP Wed
> Dec 11 16:33:48 UTC 2024 x86_64"
> Implementation time: [0, 0]
> callback state: UNKNOWN
> callback address: 10.87.29.32:0
> admin-revoked states: 0
> - 0x00000001b3f25d67d096056d19facf00: { type: deleg, access: w }
> 
> This is one of the clients that has multiple entries in the
> /proc/fs/nfsd/clients directory, but of all the clients that have
> duplicate entries, this is the only one where the "broken" client is
> in the "courtesy" state for a long time now. It's also the only
> "broken" client that still has an entry in the states file. The others
> are all in the "unconfirmed" state and the states file is empty.

Likely that client entry is pinned somehow by this bug.
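
For illustration only, a tool that reads the states file could tolerate
an entry like the one quoted above by treating "superblock" as optional.
The following is a minimal sketch in Python that assumes the line layout
shown in the quoted output; parse_state_line is a hypothetical helper,
not code taken from nfsdclnts, and it does not handle values that
themselves contain commas.

```python
import re

# Matches lines shaped like the quoted states entry:
#   - 0x<stateid>: { key: value, key: value }
STATE_LINE = re.compile(r"^- (0x[0-9a-f]+): \{(.*)\}\s*$")


def parse_state_line(line):
    """Return (stateid, fields) for one states line, or None if it does not match."""
    m = STATE_LINE.match(line.strip())
    if m is None:
        return None
    fields = {}
    for item in m.group(2).split(","):
        # Split on the first colon only, so values may contain colons.
        key, sep, value = item.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    return m.group(1), fields


# The entry from the quoted output, which carries no "superblock" key.
line = "- 0x00000001b3f25d67d096056d19facf00: { type: deleg, access: w }"
stateid, fields = parse_state_line(line)

# Use .get() so a missing key yields None rather than raising KeyError.
superblock = fields.get("superblock")
```

With a lookup like fields["superblock"], the same entry would raise
KeyError, which is one plausible way for a parser to crash on it.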

-- 
Chuck Lever
