public inbox for linux-nfs@vger.kernel.org
* NFS stuck in nfs_lookup_revalidate
@ 2025-06-03 11:57 Lukáš Hejtmánek
  2025-06-04 17:40 ` Anna Schumaker
       [not found] ` <CAOuNp5kzbyVfbdumXJF3bb=RKxdE5P8aKJDeSoLtgEV9=9xU+g@mail.gmail.com>
  0 siblings, 2 replies; 5+ messages in thread
From: Lukáš Hejtmánek @ 2025-06-03 11:57 UTC
  To: linux-nfs@vger.kernel.org; +Cc: Zdenek Salvet


Hello,

We are experiencing repeated PostgreSQL process freezes when using NFSv3-mounted storage on clients running Ubuntu kernels in the 6.x series (specifically tested on 6.2, 6.8, and 6.11).

Setup:
    - Storage: All-flash disk array exported via NFS (v3)
    - Client OS: Ubuntu with kernel versions 6.2, 6.8, 6.11
    - Application: PostgreSQL using the NFS volume as its data directory

Symptoms:
On affected systems, PostgreSQL processes (particularly autovacuum workers) intermittently hang. 
The stack trace shows a consistent pattern involving __nfs_lookup_revalidate:
[<0>] __nfs_lookup_revalidate+0x113/0x160 [nfs]
[<0>] nfs_lookup_revalidate+0x15/0x30 [nfs]
[<0>] lookup_fast+0x87/0x100
[<0>] open_last_lookups+0x5f/0x400
[<0>] path_openat+0x99/0x2d0
[<0>] do_filp_open+0xaf/0x170
[<0>] do_sys_openat2+0xb3/0xe0
[<0>] __x64_sys_openat+0x55/0xa0
[<0>] x64_sys_call+0x1eb1/0x25a0
[<0>] do_syscall_64+0x7f/0x180
[<0>] entry_SYSCALL_64_after_hwframe+0x78/0x80

Process Tree Example:
863813 ?        Zsl   12:24  \_ [manager] <defunct>
3644504 ?        Ds     0:00      \_ postgres: mlflow: autovacuum worker template1

The autovacuum worker is most commonly affected.
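
For anyone trying to spot other tasks stuck the same way: the `D` in the `Ds` STAT column above means uninterruptible sleep, and the same state letter is the third field of /proc/<pid>/stat. Below is a minimal sketch in C of pulling that field out of one stat line. The function names proc_stat_state() and is_uninterruptible() are made up for illustration and are not part of any existing tool.

```c
#include <stddef.h>
#include <string.h>

/*
 * Return the state character from one /proc/<pid>/stat line,
 * or '\0' if the line does not look like "pid (comm) state ...".
 * Hypothetical helper, written only for this illustration.
 */
char proc_stat_state(const char *line)
{
    /* comm is wrapped in parentheses; locate the LAST ')' in the line */
    const char *close = strrchr(line, ')');

    if (close == NULL || close[1] != ' ' || close[2] == '\0')
        return '\0';
    return close[2];    /* state letter follows ") " */
}

/* Non-zero when the task is in uninterruptible sleep (state 'D'). */
int is_uninterruptible(const char *line)
{
    return proc_stat_state(line) == 'D';
}
```

Searching for the last ')' rather than the first matters because the comm field may itself contain spaces or parentheses.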

Workaround Attempt:
We observed some improvement by modifying the NFS client source fs/nfs/dir.c (around line 1833):

Change:
dentry->d_fsdata = NFS_FSDATA_BLOCKED;

To:
smp_store_release(&dentry->d_fsdata, NFS_FSDATA_BLOCKED);

While this mitigates the issue somewhat, it does not fully resolve the hangs.

Is this a known issue with NFS in 6.x kernels?
Is there a recommended patch or workaround?
Are there any known regressions related to __nfs_lookup_revalidate or dentry locking?

The problem may be related to the all-flash array, which can deliver about 30k IOPS over NFS and 5633 TPS in pgbench (pgbench -T 300 -c100 -j20 -r).

Other NFS connections to the same NFS servers are not affected and remain usable. However, the hung process cannot be killed, so a reboot of the client node is required.

I believe it was more stable with the 5.x kernel series.

--
Lukáš Hejtmánek

Linux Administrator only because
  Full Time Multitasking Ninja 
  is not an official job title



* Re: NFS stuck in nfs_lookup_revalidate
  2025-06-03 11:57 NFS stuck in nfs_lookup_revalidate Lukáš Hejtmánek
@ 2025-06-04 17:40 ` Anna Schumaker
       [not found] ` <CAOuNp5kzbyVfbdumXJF3bb=RKxdE5P8aKJDeSoLtgEV9=9xU+g@mail.gmail.com>
  1 sibling, 0 replies; 5+ messages in thread
From: Anna Schumaker @ 2025-06-04 17:40 UTC
  To: Lukáš Hejtmánek, linux-nfs@vger.kernel.org; +Cc: Zdenek Salvet

Hi Lukáš

On 6/3/25 7:57 AM, Lukáš Hejtmánek wrote:
> 
> Hello,
> 
> We are experiencing repeated PostgreSQL process freezes when using NFSv3-mounted storage on clients running Ubuntu kernels in the 6.x series (specifically tested on 6.2, 6.8, and 6.11).
> 
> Setup:
>     - Storage: All-flash disk array exported via NFS (v3)
>     - Client OS: Ubuntu with kernel versions 6.2, 6.8, 6.11
>     - Application: PostgreSQL using the NFS volume as its data directory
> 
> Symptoms:
> On affected systems, PostgreSQL processes (particularly autovacuum workers) intermittently hang. 
> The stack trace shows a consistent pattern involving __nfs_lookup_revalidate:
> [<0>] __nfs_lookup_revalidate+0x113/0x160 [nfs]
> [<0>] nfs_lookup_revalidate+0x15/0x30 [nfs]
> [<0>] lookup_fast+0x87/0x100
> [<0>] open_last_lookups+0x5f/0x400
> [<0>] path_openat+0x99/0x2d0
> [<0>] do_filp_open+0xaf/0x170
> [<0>] do_sys_openat2+0xb3/0xe0
> [<0>] __x64_sys_openat+0x55/0xa0
> [<0>] x64_sys_call+0x1eb1/0x25a0
> [<0>] do_syscall_64+0x7f/0x180
> [<0>] entry_SYSCALL_64_after_hwframe+0x78/0x80
> 
> Process Tree Example:
> 863813 ?        Zsl   12:24  \_ [manager] <defunct>
> 3644504 ?        Ds     0:00      \_ postgres: mlflow: autovacuum worker template1
> 
> The autovacuum worker is most commonly affected.
> 
> Workaround Attempt:
> We observed some improvement by modifying the NFS client source fs/nfs/dir.c (around line 1833):
> 
> Change:
> dentry->d_fsdata = NFS_FSDATA_BLOCKED;
> 
> To:
> smp_store_release(&dentry->d_fsdata, NFS_FSDATA_BLOCKED);
> 
> While this mitigates the issue somewhat, it does not fully resolve the hangs.
> 
> Is this a known issue with NFS in 6.x kernels?
> Is there a recommended patch or workaround?
> Are there any known regressions related to __nfs_lookup_revalidate or dentry locking?

I'm not aware of this being a known issue or any regressions in __nfs_lookup_revalidate(),
so I can't recommend a patch or workaround to try. Have you tried an upstream kernel
to verify if it's still an issue there? There were a handful of patches that went into
v6.14 that touch the lookup path, and I'm curious if they make a difference either way.

Anna

> 
> Problem can be related to the all-flash array, that is able to provide about 30k IOPS over NFS and 5633 TPS in pgbench (pgbench -T 300 -c100 -j20 -r).
> 
> Other NFS connections to the same NFS servers are not affected and are usable, however, the process cannot be killed obviously and the client node reboot is required.
> 
> I believe that in 5.x kernel series it was more stable.
> 
> --
> Lukáš Hejtmánek
> 
> Linux Administrator only because
>   Full Time Multitasking Ninja 
>   is not an official job title
> 



* Re: NFS stuck in nfs_lookup_revalidate
       [not found] ` <CAOuNp5kzbyVfbdumXJF3bb=RKxdE5P8aKJDeSoLtgEV9=9xU+g@mail.gmail.com>
@ 2025-06-04 19:11   ` Zdenek Salvet
  2025-07-18 22:19     ` Lukáš Hejtmánek
  0 siblings, 1 reply; 5+ messages in thread
From: Zdenek Salvet @ 2025-06-04 19:11 UTC
  To: Santosh Pradhan; +Cc: Lukáš Hejtmánek, linux-nfs

On Thu, Jun 05, 2025 at 12:00:44AM +0530, Santosh Pradhan wrote:
> I am not sure but I vaguely remember that there was some similar issue and
> Neil introduced store_release_wake_up() which puts a full barrier  smp_mb()
> before calling wake_up_var().
> 
> index d0e0b435a843..e754e3e478a5 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1830,6 +1830,7 @@ static void unblock_revalidate(struct dentry *dentry)
>  {
>         /* store_release ensures wait_var_event() sees the update */
>         smp_store_release(&dentry->d_fsdata, NULL);
> +       smp_mb();
>         wake_up_var(&dentry->d_fsdata);
>  }

Hello,
yes, upon rereading the exact definition of smp_store_release(),
I am reasonably sure the comment is wrong and smp_mb() should be used.
It is smp_mb(), not smp_store_release(&dentry->d_fsdata,...), that
ensures the right ordering between the store to d_fsdata and the wait
queue data.
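
A toy model of this ordering problem may help. It is a single-threaded simulation in plain C, not kernel code: each simulated CPU has a one-entry store buffer, a store sits there until a full barrier drains it, and a load reads shared memory unless the CPU's own buffer holds that address. All toy_* names are invented for the sketch; `blocked` stands in for d_fsdata holding NFS_FSDATA_BLOCKED and `waiters` for a non-empty wait queue. It assumes the waiter issues a full barrier between queueing itself and re-checking the condition, and that the waker only wakes when it sees a waiter.

```c
#include <stdbool.h>
#include <stddef.h>

/* Shared memory of the toy machine (illustrative names only). */
struct toy_mem {
    int blocked;    /* 1 while revalidation is blocked */
    int waiters;    /* 1 once a waiter has queued itself */
};

/* A CPU with a store buffer of depth one. */
struct toy_cpu {
    int *addr;      /* NULL when the buffer is empty */
    int value;
};

/* Full barrier: make the buffered store globally visible. */
static void toy_full_barrier(struct toy_cpu *cpu)
{
    if (cpu->addr != NULL) {
        *cpu->addr = cpu->value;
        cpu->addr = NULL;
    }
}

/* A store is buffered; an older buffered store is drained first. */
static void toy_store(struct toy_cpu *cpu, int *addr, int value)
{
    toy_full_barrier(cpu);
    cpu->addr = addr;
    cpu->value = value;
}

/* A load sees the CPU's own buffered store, otherwise shared memory. */
static int toy_load(const struct toy_cpu *cpu, const int *addr)
{
    return (cpu->addr == addr) ? cpu->value : *addr;
}

/*
 * Run one fixed interleaving and report whether the waiter goes to
 * sleep without any wake-up being sent.
 */
bool toy_lost_wakeup(bool waker_uses_full_barrier)
{
    struct toy_mem mem = { .blocked = 1, .waiters = 0 };
    struct toy_cpu waker = { 0 }, waiter = { 0 };

    /* Waker: clear the marker; the store may linger in the buffer. */
    toy_store(&waker, &mem.blocked, 0);
    if (waker_uses_full_barrier)
        toy_full_barrier(&waker);
    /* Waker: only wake if somebody appears to be waiting. */
    bool wake_sent = toy_load(&waker, &mem.waiters) != 0;

    /* Waiter: queue itself, full barrier, then re-check the marker. */
    toy_store(&waiter, &mem.waiters, 1);
    toy_full_barrier(&waiter);
    bool waiter_sleeps = toy_load(&waiter, &mem.blocked) != 0;

    /* The waker's store becomes visible eventually, but too late. */
    toy_full_barrier(&waker);

    return waiter_sleeps && !wake_sent;
}
```

In this model toy_lost_wakeup(false) returns true: the waker checks for waiters while its own store is still buffered, the waiter then reads the stale marker and sleeps, and nobody wakes it. toy_lost_wakeup(true) returns false because the store is drained before the waker looks at the wait queue.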

Best regards,
Zdenek Salvet                                              salvet@ics.muni.cz 
Institute of Computer Science of Masaryk University, Brno, Czech Republic
and CESNET, z.s.p.o., Prague, Czech Republic
Phone: ++420-549 49 6534                           Fax: ++420-541 212 747
----------------------------------------------------------------------------
      Teamwork is essential -- it allows you to blame someone else.



* Re: NFS stuck in nfs_lookup_revalidate
  2025-06-04 19:11   ` Zdenek Salvet
@ 2025-07-18 22:19     ` Lukáš Hejtmánek
  2025-07-19  0:02       ` Trond Myklebust
  0 siblings, 1 reply; 5+ messages in thread
From: Lukáš Hejtmánek @ 2025-07-18 22:19 UTC
  To: Santosh Pradhan; +Cc: linux-nfs@vger.kernel.org, salvet@ics.muni.cz

Hello,

> On 4. 6. 2025, at 21:11, Zdenek Salvet <salvet@ics.muni.cz> wrote:
> 
> On Thu, Jun 05, 2025 at 12:00:44AM +0530, Santosh Pradhan wrote:
>> I am not sure but I vaguely remember that there was some similar issue and
>> Neil introduced store_release_wake_up() which puts a full barrier  smp_mb()
>> before calling wake_up_var().
>> 
>> index d0e0b435a843..e754e3e478a5 100644
>> --- a/fs/nfs/dir.c
>> +++ b/fs/nfs/dir.c
>> @@ -1830,6 +1830,7 @@ static void unblock_revalidate(struct dentry *dentry)
>> {
>>        /* store_release ensures wait_var_event() sees the update */
>>        smp_store_release(&dentry->d_fsdata, NULL);
>> +      smp_mb();
>>        wake_up_var(&dentry->d_fsdata);
>> }

Any chance of getting this into mainline? It seems that the current git master does not include the smp_mb() in this code.

--
Lukáš Hejtmánek

Linux Administrator only because
  Full Time Multitasking Ninja 
  is not an official job title



* Re: NFS stuck in nfs_lookup_revalidate
  2025-07-18 22:19     ` Lukáš Hejtmánek
@ 2025-07-19  0:02       ` Trond Myklebust
  0 siblings, 0 replies; 5+ messages in thread
From: Trond Myklebust @ 2025-07-19  0:02 UTC
  To: Lukáš Hejtmánek, Santosh Pradhan
  Cc: linux-nfs@vger.kernel.org, salvet@ics.muni.cz

On Fri, 2025-07-18 at 22:19 +0000, Lukáš Hejtmánek wrote:
> Hello,
> 
> > On 4. 6. 2025, at 21:11, Zdenek Salvet <salvet@ics.muni.cz> wrote:
> > 
> > On Thu, Jun 05, 2025 at 12:00:44AM +0530, Santosh Pradhan wrote:
> > > I am not sure but I vaguely remember that there was some similar
> > > issue and
> > > Neil introduced store_release_wake_up() which puts a full
> > > barrier  smp_mb()
> > > before calling wake_up_var().
> > > 
> > > index d0e0b435a843..e754e3e478a5 100644
> > > --- a/fs/nfs/dir.c
> > > +++ b/fs/nfs/dir.c
> > > @@ -1830,6 +1830,7 @@ static void unblock_revalidate(struct
> > > dentry *dentry)
> > > {
> > >        /* store_release ensures wait_var_event() sees the update
> > > */
> > >        smp_store_release(&dentry->d_fsdata, NULL);
> > > > +      smp_mb();
> > >        wake_up_var(&dentry->d_fsdata);
> > > }
> 
> any chance to get this into mainline? It seems that the current git
> master does not include smp_mb(); in this code.
> 
> 
I'm adding a fix that will go in to the next merge window.

-- 
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@kernel.org, trond.myklebust@hammerspace.com

