Linux NFS development
From: 郭玲兴 <guolingxing@supcon.com>
To: "Rick Macklem" <rick.macklem@gmail.com>
Cc: "Lionel Cons" <lionelcons1972@gmail.com>,
	linux-nfs@vger.kernel.org,  linux-kernel@vger.kernel.org
Subject: Re: Re: [BUG] NFSv4.1 client hang in nfs4_drain_slot_tbl under concurrent workload against Windows NFS server
Date: Mon, 18 May 2026 09:06:15 +0800 (GMT+08:00)
Message-ID: <552f72d1.1c692.19e389e8ea4.Coremail.guolingxing@supcon.com>
In-Reply-To: <CAM5tNy4A-a4q-t_z7v_sHFW0VeyPLu_yEJ4RQ4DxXVAF-5kROg@mail.gmail.com>

Hi Rick, hi Lionel,

Below are the environment details.

Server:
  Windows Server 2022
  Version 10.0.20348.587

User/account setup:
  No user mapping is configured.
  No AD, LDAP, or passwd-based mapping is used.
  Unmapped users are handled by the default "Everyone" account.

Authentication:
  sec=sys (AUTH_SYS), as reported by nfsstat -m

Architecture:
  Linux clients: x86_64
  Windows server: x86_64

Memory:
  Each Linux client VM has 16 GB RAM

We also observed the following on two independent clients:

Client A:
  age: 498061
  lease_time: 120
  lease_expired: 497941

Client B:
  age: 69598
  lease_time: 120
  lease_expired: 69478

In both cases, lease_expired is exactly equal to
age - lease_time (498061 - 120 = 497941 and
69598 - 120 = 69478), which suggests that the lease expired
one lease period after mount and was never renewed afterward.
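
As a sanity check on that arithmetic, here is a minimal Python sketch.
It assumes that age and lease_expired are both in seconds (like
lease_time) and that lease_expired counts the time since the lease ran
out, i.e. lease_expired = age - (last_renewal + lease_time). The helper
name last_renewal_age is hypothetical and only used for illustration.

```python
def last_renewal_age(age, lease_time, lease_expired):
    """Seconds after mount at which the lease was last renewed.

    Assumed model (not confirmed against the kernel source):
        lease_expired = age - (last_renewal + lease_time)
    """
    return age - lease_time - lease_expired


# Values reported above for the two clients: (age, lease_time, lease_expired)
clients = {
    "Client A": (498061, 120, 497941),
    "Client B": (69598, 120, 69478),
}

for name, (age, lease_time, lease_expired) in clients.items():
    print(name, "last renewal at age", last_renewal_age(age, lease_time, lease_expired))
```

Under that assumed model, both clients yield a last renewal at age 0,
which is consistent with no successful renewal since the mount.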

At hang time:

- both clients hang under concurrent workload
- both clients are blocked in nfs4_drain_slot_tbl
- no NFS RPC traffic is observed, only TCP ACKs
- nfsstat reports retrans=0
- on the Windows server side, the session state is reported
  as "Initialized"

We are tracing the RPC lifecycle to identify which RPC does
not complete.

Regarding the "soft" mount option: understood. We will retest
with a hard mount as well.

One question is whether the observed behavior is expected.
Even if a soft mount contributes to the problem, is it expected
that a single RPC timeout can leave the client in a state with
no forward progress, blocked in nfs4_drain_slot_tbl, and with
lease renewal no longer occurring? Or would that more likely
indicate a client-side recovery bug?


Thanks,
Guo Lingxing


> -----Original Message-----
> From: "Rick Macklem" <rick.macklem@gmail.com>
> Sent: 2026-05-16 22:23:51 (Saturday)
> To: "Lionel Cons" <lionelcons1972@gmail.com>
> Cc: 郭玲兴 <guolingxing@supcon.com>, linux-nfs@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [BUG] NFSv4.1 client hang in nfs4_drain_slot_tbl under concurrent workload against Windows NFS server
> 
> On Wed, May 6, 2026 at 6:32 AM Lionel Cons <lionelcons1972@gmail.com> wrote:
> >
> > On Wed, 6 May 2026 at 09:49, 郭玲兴 <guolingxing@supcon.com> wrote:
> > >
> > > Hi,
> > >
> > >
> > > We encountered a reproducible NFSv4.1 client hang issue under concurrent workload.
> > >
> > >
> > > Environment:
> > > - Two independent Linux clients (VMs)
> > > - Both mount the same Windows NFS server (NFSv4.1)
> > > - Kernel version: 6.1.78
> > > - Mount options: vers=4.1,soft,proto=tcp,timeo=60,retrans=10
> Just fyi, "soft" mounts are often going to be troublesome for NFSv4.1.
> (Whenever an RPC times out and doesn't wait for a reply from the server,
> it will leave a session slot messed up.)
> 
> rick
> 
> >
> > Which version of Windows Server do you use, e.g. what does the "ver"
> > command in cmd.exe output? How did you set up the user accounts, and
> > which authentication (AUTH_SYS, GSS, ...) do you use?
> > Which CPU architecture do you use? How much memory do you have on the
> > Linux NFS client?
> >
> > Lionel
> >






