Linux NFS development
 help / color / mirror / Atom feed
From: Jeff Layton <jlayton@kernel.org>
To: "Mkrtchyan, Tigran" <tigran.mkrtchyan@desy.de>,
	linux-nfs <linux-nfs@vger.kernel.org>
Subject: Re: Blocking stat calls during (p)NFS write
Date: Thu, 09 Oct 2025 07:48:57 -0400	[thread overview]
Message-ID: <3302602f30bb95a69af97b3adada8da0147f3781.camel@kernel.org> (raw)
In-Reply-To: <1050158977.25165493.1760007431022.JavaMail.zimbra@desy.de>

On Thu, 2025-10-09 at 12:57 +0200, Mkrtchyan, Tigran wrote:
> Dear NFS fellows,
> 
> We have noticed that when data is written into a large file, a stat call on that file blocks.
> This is quite simple to reproduce. Open two terminals, in one copy a 4GB file into NFS NFS-mounted
> directory, in another window run `stat /path/to/file`.
> 
> By looking at the perf output, I can see:
> 
> stat  301583 [007] 265242.393512:      sched:sched_wake_idle_without_ipi: cpu=2
>         ffffffffb5e0cfa0 call_function_single_prep_ipi+0x90 ([kernel.kallsyms])
>         ffffffffb5e0cfa0 call_function_single_prep_ipi+0x90 ([kernel.kallsyms])
>         ffffffffb5ef176b __smp_call_single_queue+0xdb ([kernel.kallsyms])
>         ffffffffb5ef1866 generic_exec_single+0x36 ([kernel.kallsyms])
>         ffffffffb5ef1bf2 smp_call_function_single_async+0x22 ([kernel.kallsyms])
>         ffffffffb5ec97f4 update_process_times+0xa4 ([kernel.kallsyms])
>         ffffffffb5ee279f tick_nohz_handler+0x8f ([kernel.kallsyms])
>         ffffffffb5eca7c0 __hrtimer_run_queues+0x110 ([kernel.kallsyms])
>         ffffffffb5ecb62c hrtimer_interrupt+0xfc ([kernel.kallsyms])
>         ffffffffb5d5deb5 __sysvec_apic_timer_interrupt+0x55 ([kernel.kallsyms])
>         ffffffffb6f6364c sysvec_apic_timer_interrupt+0x6c ([kernel.kallsyms])
>         ffffffffb5a0160a asm_sysvec_apic_timer_interrupt+0x1a ([kernel.kallsyms])
>         ffffffffb5e07b8f finish_task_switch.isra.0+0x9f ([kernel.kallsyms])
>         ffffffffb6f6ad91 __schedule+0x301 ([kernel.kallsyms])
>         ffffffffb6f6b277 schedule+0x27 ([kernel.kallsyms])
>         ffffffffb6f6b326 io_schedule+0x46 ([kernel.kallsyms])
>         ffffffffb608fddf folio_wait_bit+0xef ([kernel.kallsyms])
>         ffffffffb609c95e folio_wait_writeback+0x2e ([kernel.kallsyms])
>         ffffffffb608f0ff __filemap_fdatawait_range+0x7f ([kernel.kallsyms])
>         ffffffffb6091bc8 filemap_write_and_wait_range+0xc8 ([kernel.kallsyms])
>         ffffffffc1fc5cc7 nfs_getattr+0x567 ([kernel.kallsyms])
>         ffffffffb61eb3fe vfs_getattr_nosec+0xbe ([kernel.kallsyms])
>         ffffffffb61eb673 vfs_statx+0xa3 ([kernel.kallsyms])
>         ffffffffb61ec363 do_statx+0x63 ([kernel.kallsyms])
>         ffffffffb61ec5c0 __x64_sys_statx+0x90 ([kernel.kallsyms])
>         ffffffffb6f5e5ae do_syscall_64+0x7e ([kernel.kallsyms])
>         ffffffffb5a0012f entry_SYSCALL_64_after_hwframe+0x76 ([kernel.kallsyms])
>             7f17c3b1fb6e statx+0xe (/usr/lib64/libc.so.6)
>             55b6106f684f main+0x45f (/usr/bin/stat)
>             7f17c3a3a5b5 __libc_start_call_main+0x75 (/usr/lib64/libc.so.6)
>             7f17c3a3a668 __libc_start_main@@GLIBC_2.34+0x88 (/usr/lib64/libc.so.6)
>             55b6106f6bb5 _start+0x25 (/usr/bin/stat)
> 
> 
> So I assume that blocking comes from inode invalidation in inode.c#nfs_getattr call:
> 
> 1003         /* Flush out writes to the server in order to update c/mtime/version.  */
> 1004         if ((request_mask & (STATX_CTIME | STATX_MTIME | STATX_CHANGE_COOKIE)) &&
> 1005             S_ISREG(inode->i_mode)) {
> 1006                 if (nfs_have_delegated_mtime(inode))
> 1007                         filemap_fdatawrite(inode->i_mapping);
> 1008                 else
> 1009                         filemap_write_and_wait(inode->i_mapping);
> 1010         }
> 

stat() requires updated mtime and ctime. The client must write any
dirty data back before it can report those accurately since the server
is authoritative for those attributes. If the server supports delegated
timestamps however, then you don't need to wait for writeback to
finish.

> 
> The packets look as follows:
> 
> No.     Time           Rpc Time   Source                Destination           Protocol Info
>      16 21.114244808              10.1.0.71       10.1.0.34         NFS      V4 Call (Reply In 18) OPEN DH: 0xf359c2c5/f42.iso | LAYOUTGET
>      18 21.158017758   0.043772950 10.1.0.34         10.1.0.71       NFS      V4 Reply (Call In 16) OPEN StateID: 0x7ca4 | LAYOUTGET
>      29 42.354912088              10.1.0.71       10.1.0.34         NFS      V4 Call (Reply In 30) LAYOUTCOMMIT
>      30 42.357843007   0.002930919 10.1.0.34         10.1.0.71       NFS      V4 Reply (Call In 29) LAYOUTCOMMIT
>      32 42.357993362              10.1.0.71       10.1.0.34         NFS      V4 Call (Reply In 35) GETATTR FH: 0x7d6441d9
>      33 42.358016100              10.1.0.71       10.1.0.34         NFS      V4 Call (Reply In 36) LAYOUTRETURN
>      35 42.359324091   0.001330729 10.1.0.34         10.1.0.71       NFS      V4 Reply (Call In 32) GETATTR
>      36 42.378923839   0.020907739 10.1.0.34         10.1.0.71       NFS      V4 Reply (Call In 33) LAYOUTRETURN
>      38 42.379133795              10.1.0.71       10.1.0.34         NFS      V4 Call (Reply In 39) CLOSE StateID: 0x7ca4
>      39 42.380213999   0.001080204 10.1.0.34         10.1.0.71       NFS      V4 Reply (Call In 38) CLOSE 
> 
> So, GETATTR is sent after LAYOUTCOMMIT.
> This behavior is observed with 6.17 and RHEL kernels.

-- 
Jeff Layton <jlayton@kernel.org>

      reply	other threads:[~2025-10-09 11:48 UTC|newest]

Thread overview: 2+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-10-09 10:57 Blocking stat calls during (p)NFS write Mkrtchyan, Tigran
2025-10-09 11:48 ` Jeff Layton [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=3302602f30bb95a69af97b3adada8da0147f3781.camel@kernel.org \
    --to=jlayton@kernel.org \
    --cc=linux-nfs@vger.kernel.org \
    --cc=tigran.mkrtchyan@desy.de \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox