From: Chuck Lever <chuck.lever@oracle.com>
To: Rik Theys <rik.theys@gmail.com>
Cc: Christian Herzog <herzog@phys.ethz.ch>,
Salvatore Bonaccorso <carnil@debian.org>,
linux-nfs@vger.kernel.org
Subject: Re: nfsd4 laundromat_main hung tasks
Date: Thu, 16 Jan 2025 09:12:53 -0500
Message-ID: <99ea3fa8-ffd6-4fbd-af73-c0dabb261973@oracle.com>
In-Reply-To: <CAPwv0JnM=Cz=sMazCSuuRbOjHURQ2bDox7F=OqQoT9DxbsaHzw@mail.gmail.com>
On 1/16/25 4:03 AM, Rik Theys wrote:
>> The laundromat failure mode is not blocked in rpc_shutdown_client, so
>> there aren't any outstanding callback RPCs to observe.
>>
>> The DESTROY_SESSION failure mode is blocking on the flush_workqueue call
>> in nfsd4_shutdown_callback(), while this failure mode appears to have
>> passed that call and blocked on the wait for in-flight RPCs to go to
>> zero (as Jeff analyzed a few days ago).
>
> If I look at the trace, nfs4_laundromat calls
> nfs4_process_client_reaplist, which calls __destroy_client at some
> point.
>
> When I look at the __destroy_client function in nfs4state.c, I see it
> does a spin_lock(&state_lock) and spin_unlock(&state_lock) to perform
> certain actions, but it seems the lock is not (again) acquired when
> the nfsd4_shutdown_callback() function is called? According to the
> comment above the nfsd4_shutdown_callback function in nfs4callback.c,
> the function must be called under the state lock? Is it possible that
> the function is called without this state lock? Or is the comment no
> longer relevant?

The comment is stale.

Commit b687f6863eed ("nfsd: remove the client_mutex and the
nfs4_lock/unlock_state wrappers") removed the mutex that used to wrap
calls to this function.

> Another thing I've noticed (but I'm not sure it's relevant here) is
> that there's a client in /proc/fs/nfsd/clients that has a states file
> that crashes nfsdclnts as the entry does not have a "superblock"
> field:
>
> # cat 8536/{info,states}
> clientid: 0x6d0596d0675df2b3
> address: "10.87.29.32:864"
> status: courtesy
> seconds from last renew: 2807740
> name: "Linux NFSv4.2 betelgeuse.esat.kuleuven.be"
> minor version: 2
> Implementation domain: "kernel.org"
> Implementation name: "Linux 4.18.0-553.32.1.el8_10.x86_64 #1 SMP Wed
> Dec 11 16:33:48 UTC 2024 x86_64"
> Implementation time: [0, 0]
> callback state: UNKNOWN
> callback address: 10.87.29.32:0
> admin-revoked states: 0
> - 0x00000001b3f25d67d096056d19facf00: { type: deleg, access: w }
>
> This is one of the clients that has multiple entries in the
> /proc/fs/nfsd/clients directory, but of all the clients that have
> duplicate entries, this is the only one where the "broken" client is
> in the "courtesy" state for a long time now. It's also the only
> "broken" client that still has an entry in the states file. The others
> are all in the "unconfirmed" state and the states file is empty.
Likely that client entry is pinned somehow by this bug.
--
Chuck Lever