* NFS stuck in nfs_lookup_revalidate
@ 2025-06-03 11:57 Lukáš Hejtmánek
2025-06-04 17:40 ` Anna Schumaker
[not found] ` <CAOuNp5kzbyVfbdumXJF3bb=RKxdE5P8aKJDeSoLtgEV9=9xU+g@mail.gmail.com>
0 siblings, 2 replies; 5+ messages in thread
From: Lukáš Hejtmánek @ 2025-06-03 11:57 UTC
To: linux-nfs@vger.kernel.org; +Cc: Zdenek Salvet
Hello,
We are experiencing repeated PostgreSQL process freezes when using NFSv3-mounted storage on clients running Ubuntu kernels in the 6.x series (specifically tested on 6.2, 6.8, and 6.11).
Setup:
- Storage: All-flash disk array exported via NFS (v3)
- Client OS: Ubuntu with kernel versions 6.2, 6.8, 6.11
- Application: PostgreSQL using the NFS volume as its data directory
Symptoms:
On affected systems, PostgreSQL processes (particularly autovacuum workers) intermittently hang.
The stack trace shows a consistent pattern involving __nfs_lookup_revalidate:
[<0>] __nfs_lookup_revalidate+0x113/0x160 [nfs]
[<0>] nfs_lookup_revalidate+0x15/0x30 [nfs]
[<0>] lookup_fast+0x87/0x100
[<0>] open_last_lookups+0x5f/0x400
[<0>] path_openat+0x99/0x2d0
[<0>] do_filp_open+0xaf/0x170
[<0>] do_sys_openat2+0xb3/0xe0
[<0>] __x64_sys_openat+0x55/0xa0
[<0>] x64_sys_call+0x1eb1/0x25a0
[<0>] do_syscall_64+0x7f/0x180
[<0>] entry_SYSCALL_64_after_hwframe+0x78/0x80
Process Tree Example:
863813 ? Zsl 12:24 \_ [manager] <defunct>
3644504 ? Ds 0:00 \_ postgres: mlflow: autovacuum worker template1
The autovacuum worker is most commonly affected.
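For anyone trying to reproduce this, here is a minimal C sketch for spotting such tasks. It relies only on the /proc/<pid>/stat layout, where the state character follows the parenthesised command name. The function names stat_state() and list_d_state() are made up for this illustration, and only top-level PIDs (not individual threads) are scanned. Once a PID in state D is found, its kernel stack can usually be read as root from /proc/<pid>/stack, which is the format of the trace above.

```c
#include <ctype.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Return the state character from one line of /proc/<pid>/stat, or '\0'
 * if the line is malformed. The command name is wrapped in parentheses
 * and may itself contain spaces or ')', so search from the last ')'. */
char stat_state(const char *line)
{
    const char *p = strrchr(line, ')');

    if (!p || p[1] != ' ' || p[2] == '\0')
        return '\0';
    return p[2];
}

/* Print the PID of every process currently in uninterruptible sleep ('D').
 * Returns the number found, or -1 if /proc cannot be opened. */
int list_d_state(void)
{
    DIR *proc = opendir("/proc");
    struct dirent *de;
    int found = 0;

    if (!proc)
        return -1;
    while ((de = readdir(proc)) != NULL) {
        char path[288], line[512];
        FILE *f;

        if (!isdigit((unsigned char)de->d_name[0]))
            continue;           /* not a PID directory */
        snprintf(path, sizeof(path), "/proc/%s/stat", de->d_name);
        f = fopen(path, "r");
        if (!f)
            continue;           /* process exited in the meantime */
        if (fgets(line, sizeof(line), f) && stat_state(line) == 'D') {
            printf("%s\n", de->d_name);
            found++;
        }
        fclose(f);
    }
    closedir(proc);
    return found;
}
```

Calling list_d_state() from a small main() prints one PID per line. A task that stays in the list across repeated runs is a candidate for the hang described here, whereas a task that appears only briefly is most likely just waiting on ordinary I/O.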
Workaround Attempt:
We observed some improvement by modifying the NFS client source fs/nfs/dir.c (around line 1833):
Change:
dentry->d_fsdata = NFS_FSDATA_BLOCKED;
To:
smp_store_release(&dentry->d_fsdata, NFS_FSDATA_BLOCKED);
While this mitigates the issue somewhat, it does not fully resolve the hangs.
Is this a known issue with NFS in 6.x kernels?
Is there a recommended patch or workaround?
Are there any known regressions related to __nfs_lookup_revalidate or dentry locking?
The problem may be related to the all-flash array, which can deliver about 30k IOPS over NFS and 5633 TPS in pgbench (pgbench -T 300 -c100 -j20 -r).
Other NFS connections to the same NFS servers are not affected and remain usable; however, the hung process cannot be killed, so a reboot of the client node is required.
I believe the 5.x kernel series was more stable.
--
Lukáš Hejtmánek
Linux Administrator only because
Full Time Multitasking Ninja
is not an official job title
* Re: NFS stuck in nfs_lookup_revalidate
2025-06-03 11:57 NFS stuck in nfs_lookup_revalidate Lukáš Hejtmánek
@ 2025-06-04 17:40 ` Anna Schumaker
[not found] ` <CAOuNp5kzbyVfbdumXJF3bb=RKxdE5P8aKJDeSoLtgEV9=9xU+g@mail.gmail.com>
1 sibling, 0 replies; 5+ messages in thread
From: Anna Schumaker @ 2025-06-04 17:40 UTC
To: Lukáš Hejtmánek, linux-nfs@vger.kernel.org; +Cc: Zdenek Salvet
Hi Lukáš
On 6/3/25 7:57 AM, Lukáš Hejtmánek wrote:
>
> Hello,
>
> We are experiencing repeated PostgreSQL process freezes when using NFSv3-mounted storage on clients running Ubuntu kernels in the 6.x series (specifically tested on 6.2, 6.8, and 6.11).
>
> Setup:
> - Storage: All-flash disk array exported via NFS (v3)
> - Client OS: Ubuntu with kernel versions 6.2, 6.8, 6.11
> - Application: PostgreSQL using the NFS volume as its data directory
>
> Symptoms:
> On affected systems, PostgreSQL processes (particularly autovacuum workers) intermittently hang.
> The stack trace shows a consistent pattern involving __nfs_lookup_revalidate:
> [<0>] __nfs_lookup_revalidate+0x113/0x160 [nfs]
> [<0>] nfs_lookup_revalidate+0x15/0x30 [nfs]
> [<0>] lookup_fast+0x87/0x100
> [<0>] open_last_lookups+0x5f/0x400
> [<0>] path_openat+0x99/0x2d0
> [<0>] do_filp_open+0xaf/0x170
> [<0>] do_sys_openat2+0xb3/0xe0
> [<0>] __x64_sys_openat+0x55/0xa0
> [<0>] x64_sys_call+0x1eb1/0x25a0
> [<0>] do_syscall_64+0x7f/0x180
> [<0>] entry_SYSCALL_64_after_hwframe+0x78/0x80
>
> Process Tree Example:
> 863813 ? Zsl 12:24 \_ [manager] <defunct>
> 3644504 ? Ds 0:00 \_ postgres: mlflow: autovacuum worker template1
>
> The autovacuum worker is most commonly affected.
>
> Workaround Attempt:
> We observed some improvement by modifying the NFS client source fs/nfs/dir.c (around line 1833):
>
> Change:
> dentry->d_fsdata = NFS_FSDATA_BLOCKED;
>
> To:
> smp_store_release(&dentry->d_fsdata, NFS_FSDATA_BLOCKED);
>
> While this mitigates the issue somewhat, it does not fully resolve the hangs.
>
> Is this a known issue with NFS in 6.x kernels?
> Is there a recommended patch or workaround?
> Are there any known regressions related to __nfs_lookup_revalidate or dentry locking?
I'm not aware of this being a known issue or any regressions in __nfs_lookup_revalidate(),
so I can't recommend a patch or workaround to try. Have you tried an upstream kernel
to verify if it's still an issue there? There were a handful of patches that went into
v6.14 that touch the lookup path, and I'm curious if they make a difference either way.
Anna
>
> The problem may be related to the all-flash array, which can deliver about 30k IOPS over NFS and 5633 TPS in pgbench (pgbench -T 300 -c100 -j20 -r).
>
> Other NFS connections to the same NFS servers are not affected and remain usable; however, the hung process cannot be killed, so a reboot of the client node is required.
>
> I believe the 5.x kernel series was more stable.
>
> --
> Lukáš Hejtmánek
>
> Linux Administrator only because
> Full Time Multitasking Ninja
> is not an official job title
>
* Re: NFS stuck in nfs_lookup_revalidate
[not found] ` <CAOuNp5kzbyVfbdumXJF3bb=RKxdE5P8aKJDeSoLtgEV9=9xU+g@mail.gmail.com>
@ 2025-06-04 19:11 ` Zdenek Salvet
2025-07-18 22:19 ` Lukáš Hejtmánek
0 siblings, 1 reply; 5+ messages in thread
From: Zdenek Salvet @ 2025-06-04 19:11 UTC
To: Santosh Pradhan; +Cc: Lukáš Hejtmánek, linux-nfs
On Thu, Jun 05, 2025 at 12:00:44AM +0530, Santosh Pradhan wrote:
> I am not sure but I vaguely remember that there was some similar issue and
> Neil introduced store_release_wake_up() which puts a full barrier smp_mb()
> before calling wake_up_var().
>
> index d0e0b435a843..e754e3e478a5 100644
> --- a/fs/nfs/dir.c
> +++ b/fs/nfs/dir.c
> @@ -1830,6 +1830,7 @@ static void unblock_revalidate(struct dentry *dentry)
>  {
>  	/* store_release ensures wait_var_event() sees the update */
>  	smp_store_release(&dentry->d_fsdata, NULL);
> +	smp_mb();
>  	wake_up_var(&dentry->d_fsdata);
>  }
Hello,
yes, upon rereading the exact definition of smp_store_release(),
I am reasonably sure the comment is wrong and smp_mb() should be used:
it is smp_mb(), not smp_store_release(&dentry->d_fsdata, ...), that
ensures the right ordering between the store to d_fsdata and the wait
queue data.
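The ordering in question can be sketched in userspace with C11 atomics. This is only an analogue under stated assumptions, not the kernel code: struct gate, gate_init(), gate_prepare_wait() and gate_unblock() are hypothetical names, "blocked" stands in for a non-NULL d_fsdata, and "waiters" stands in for a non-empty wait queue. The point is that a release store orders earlier accesses before the store, but does not stop the later load of "waiters" from being satisfied before the store becomes visible. Only a full fence on both sides guarantees that at least one side observes the other.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the dentry state and its wait queue. */
struct gate {
    atomic_int blocked;   /* non-zero while revalidation must wait */
    atomic_int waiters;   /* non-zero once a waiter has queued itself */
};

void gate_init(struct gate *g, int blocked)
{
    atomic_init(&g->blocked, blocked);
    atomic_init(&g->waiters, 0);
}

/* Waiter side: announce ourselves, full fence, then re-check the
 * condition. Returns true if the caller still has to sleep. */
bool gate_prepare_wait(struct gate *g)
{
    atomic_store_explicit(&g->waiters, 1, memory_order_relaxed);
    atomic_thread_fence(memory_order_seq_cst);
    return atomic_load_explicit(&g->blocked, memory_order_relaxed) != 0;
}

/* Waker side: clear the condition, full fence, then look for waiters.
 * Returns true if a wakeup has to be issued. Without the fence, the
 * load of "waiters" may be ordered before the store to "blocked", so
 * the waker can miss a waiter that in turn still sees blocked != 0. */
bool gate_unblock(struct gate *g)
{
    atomic_store_explicit(&g->blocked, 0, memory_order_release);
    atomic_thread_fence(memory_order_seq_cst);   /* plays the role of smp_mb() */
    return atomic_load_explicit(&g->waiters, memory_order_relaxed) != 0;
}
```

With both fences in place, it cannot happen that gate_prepare_wait() returns true (the waiter goes to sleep) while a concurrent gate_unblock() returns false (no wakeup is sent). That lost-wakeup outcome would match a task stuck forever in D state.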
Best regards,
Zdenek Salvet salvet@ics.muni.cz
Institute of Computer Science of Masaryk University, Brno, Czech Republic
and CESNET, z.s.p.o., Prague, Czech Republic
Phone: ++420-549 49 6534 Fax: ++420-541 212 747
----------------------------------------------------------------------------
Teamwork is essential -- it allows you to blame someone else.
* Re: NFS stuck in nfs_lookup_revalidate
2025-06-04 19:11 ` Zdenek Salvet
@ 2025-07-18 22:19 ` Lukáš Hejtmánek
2025-07-19 0:02 ` Trond Myklebust
0 siblings, 1 reply; 5+ messages in thread
From: Lukáš Hejtmánek @ 2025-07-18 22:19 UTC
To: Santosh Pradhan; +Cc: linux-nfs@vger.kernel.org, salvet@ics.muni.cz
Hello,
> On 4. 6. 2025, at 21:11, Zdenek Salvet <salvet@ics.muni.cz> wrote:
>
> On Thu, Jun 05, 2025 at 12:00:44AM +0530, Santosh Pradhan wrote:
>> I am not sure but I vaguely remember that there was some similar issue and
>> Neil introduced store_release_wake_up() which puts a full barrier smp_mb()
>> before calling wake_up_var().
>>
>> index d0e0b435a843..e754e3e478a5 100644
>> --- a/fs/nfs/dir.c
>> +++ b/fs/nfs/dir.c
>> @@ -1830,6 +1830,7 @@ static void unblock_revalidate(struct dentry *dentry)
>> {
>> /* store_release ensures wait_var_event() sees the update */
>> smp_store_release(&dentry->d_fsdata, NULL);
>> + smp_mb();
>> wake_up_var(&dentry->d_fsdata);
>> }
Is there any chance of getting this into mainline? It seems that the current git master does not include smp_mb() in this code.
--
Lukáš Hejtmánek
Linux Administrator only because
Full Time Multitasking Ninja
is not an official job title
* Re: NFS stuck in nfs_lookup_revalidate
2025-07-18 22:19 ` Lukáš Hejtmánek
@ 2025-07-19 0:02 ` Trond Myklebust
0 siblings, 0 replies; 5+ messages in thread
From: Trond Myklebust @ 2025-07-19 0:02 UTC
To: Lukáš Hejtmánek, Santosh Pradhan
Cc: linux-nfs@vger.kernel.org, salvet@ics.muni.cz
On Fri, 2025-07-18 at 22:19 +0000, Lukáš Hejtmánek wrote:
> Hello,
>
> > On 4. 6. 2025, at 21:11, Zdenek Salvet <salvet@ics.muni.cz> wrote:
> >
> > On Thu, Jun 05, 2025 at 12:00:44AM +0530, Santosh Pradhan wrote:
> > > I am not sure but I vaguely remember that there was some similar
> > > issue and
> > > Neil introduced store_release_wake_up() which puts a full
> > > barrier smp_mb()
> > > before calling wake_up_var().
> > >
> > > index d0e0b435a843..e754e3e478a5 100644
> > > --- a/fs/nfs/dir.c
> > > +++ b/fs/nfs/dir.c
> > > @@ -1830,6 +1830,7 @@ static void unblock_revalidate(struct
> > > dentry *dentry)
> > > {
> > > /* store_release ensures wait_var_event() sees the update
> > > */
> > > smp_store_release(&dentry->d_fsdata, NULL);
> > > + smp_mb();
> > > wake_up_var(&dentry->d_fsdata);
> > > }
>
> Is there any chance of getting this into mainline? It seems that the
> current git master does not include smp_mb() in this code.
>
>
I'm adding a fix that will go into the next merge window.
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
trondmy@kernel.org, trond.myklebust@hammerspace.com