Linux XFS filesystem development
From: Carlos Maiolino <cem@kernel.org>
To: Andres Freund <andres@anarazel.de>
Cc: linux-xfs@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
	 Christian Brauner <brauner@kernel.org>,
	Pankaj <pankaj.raghav@linux.dev>
Subject: Re: Increase in XFS journal flushes with (direct_write;fdatasync)+
Date: Wed, 6 May 2026 17:05:11 +0200
Message-ID: <aftQP63r22Q9SWb9@andromeda.toxiclabs.cc>
In-Reply-To: <7ys6erh3nnyeerv2nybyfvp7dmaknuxrlxv74wx56ocdothkc6@ekfiadtkfn2r>

On Wed, May 06, 2026 at 09:26:25AM -0400, Andres Freund wrote:
> Hi,
> 
> While looking at performance issues on Samsung client drives due to slow FUA,
> I tried to reproduce older numbers on a recent kernel.  And couldn't, at first
> - but not because the problem went away, but because the fdatasync numbers
> (which shouldn't use FUA) got *much* worse.
> 
> These drives have FUA writes that are slower than full flushes, making
> O_DIRECT|O_DSYNC writes perform poorly and fdatasync() comparatively better.
> 
> 
> What I'm seeing is that with recent kernels the fdatasync() performance is
> roughly as bad as with O_DSYNC, whereas previously it was more than 2x as
> fast. blktrace showed that there are ongoing FUA writes during a workload
> that consists only of overwrites, with an fdatasync after every write.
> 
> 
> At first I thought it was a regression between 7.0..7.1-rc2, but that turned
> out to be only because the 7.0 machine did not have lazytime enabled. After
> fixing that discrepancy, the regression is also visible in 7.0.  I have
> confirmed it's not visible in 6.18.
> 
> Repro Workload:
> 
> fio --directory ${mountpoint}/fio/ --overwrite 1 --size=$((4096*123)) --buffered 0 --bs=4096 --rw=write --name write-fdatasync --fdatasync=1 |grep IOPS
> 
> On v7.1-rc2-5-g6d35786de2811:
> 
> mounted with lazytime:
> write: IOPS=158, BW=636KiB/s (651kB/s)(492KiB/774msec); 0 zone resets
> 
> mounted with nolazytime:
> write: IOPS=594, BW=2377KiB/s (2434kB/s)(492KiB/207msec); 0 zone resets
> 
> 
> 
> Running it with perf stat and a few events [1] shows:
> 
> using lazytime
>   write: IOPS=174, BW=697KiB/s (714kB/s)(492KiB/706msec); 0 zone resets
> 
>  Performance counter stats for 'fio --directory /srv/fio/ --overwrite 1 --size=503808 --buffered 0 --bs=4096 --rw=write --name write-fdatasync --fdatasync=1':
> 
>                123      syscalls:sys_enter_pwrite64
>                122      syscalls:sys_exit_fdatasync
>                121      xfs:xlog_iclog_write
>                121      xfs:xlog_iclog_sync
>                123      xfs:xfs_file_direct_write
>                122      xfs:xfs_update_time
>                122      xfs:xfs_log_reserve
>                123      xfs:xfs_trans_add_item
>                  8      writeback:writeback_dirty_inode
>                122      xfs:xfs_trans_commit
> 
>        1.170287744 seconds time elapsed
> 
>        0.192673000 seconds user
>        0.054510000 seconds sys
> 
> 
> using nolazytime
>   write: IOPS=672, BW=2689KiB/s (2753kB/s)(492KiB/183msec); 0 zone resets
> 
>  Performance counter stats for 'fio --directory /srv/fio/ --overwrite 1 --size=503808 --buffered 0 --bs=4096 --rw=write --name write-fdatasync --fdatasync=1':
> 
>                123      syscalls:sys_enter_pwrite64
>                122      syscalls:sys_exit_fdatasync
>                  1      xfs:xlog_iclog_write
>                  1      xfs:xlog_iclog_sync
>                123      xfs:xfs_file_direct_write
>                 55      xfs:xfs_update_time
>                 55      xfs:xfs_log_reserve
>                 55      xfs:xfs_trans_add_item
>                  7      writeback:writeback_dirty_inode
>                 55      xfs:xfs_trans_commit
> 
>        0.667253953 seconds time elapsed
> 
>        0.160385000 seconds user
>        0.061264000 seconds sys
> 
> 
> The relevant difference presumably is that lazytime has a lot more log
> flushes (xfs:xlog_iclog_sync): 121 versus 1 for nolazytime.
> 
> 
> ext4 does not show that behaviour.
> 
> 
> Presumably this happened as part of
> 
> commit 74554251dfc9374ebf1a9dfc54d6745d56bb9265
> Merge: 996812c453caf 77ef2c3ff5916
> Author: Linus Torvalds <torvalds@linux-foundation.org>
> Date:   2026-02-09 11:25:01 -0800
> 
>     Merge tag 'vfs-7.0-rc1.nonblocking_timestamps' of git://git.kernel.org/pub/scm/linux/kernel/git/vfs/vfs
> 
>     Pull vfs timestamp updates from Christian Brauner:
>      "This contains the changes to support non-blocking timestamp updates.
> 
> Or in one of the followup fixes.  Hence CCing the folks involved in that.

Thanks for the info. I'll look into it next week, unless somebody else
gets to it first.

> 
> 
> Greetings,
> 
> Andres Freund
> 
> 
> [1] mountpoint=/srv
>     for opt in lazytime nolazytime; do
>         echo "using $opt"
>         mount $mountpoint -o remount,$opt &&
>             perf stat \
>                 -e syscalls:sys_enter_pwrite64,syscalls:sys_exit_fdatasync \
>                 -e xfs:xlog_iclog_write,xfs:xlog_iclog_sync,xfs:xfs_file_direct_write \
>                 -e xfs:xfs_update_time,xfs:xfs_log_reserve,xfs:xfs_trans_add_item \
>                 -e writeback:writeback_dirty_inode,xfs:xfs_trans_commit \
>                 fio --directory ${mountpoint}/fio/ --overwrite 1 --size=$((4096*123)) --buffered 0 --bs=4096 --rw=write --name write-fdatasync --fdatasync=1 |
>             grep IOPS || break
>     done
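
For reference, the counters quoted above can be reduced to a single
figure, log flushes per 1000 fdatasync calls, which makes the gap between
the two mount options easier to see. This is only a sketch: the helper
name flushes_per_1000_syncs is made up for illustration, and the inputs
are the xfs:xlog_iclog_sync and syscalls:sys_exit_fdatasync counts copied
from the perf stat output in the report.

```shell
#!/bin/sh
# Hypothetical helper (not part of fio, perf, or xfsprogs):
# prints the number of iclog syncs per 1000 fdatasync calls,
# using integer arithmetic, so the result is rounded down.
flushes_per_1000_syncs() {
    iclog_syncs=$1
    fdatasyncs=$2
    echo $(( iclog_syncs * 1000 / fdatasyncs ))
}

# Counts taken from the perf stat output quoted in the report.
echo "lazytime:   $(flushes_per_1000_syncs 121 122) log flushes per 1000 fdatasync calls"
echo "nolazytime: $(flushes_per_1000_syncs 1 122) log flushes per 1000 fdatasync calls"
```

With the reported counts, lazytime comes out at roughly one log flush per
fdatasync, while nolazytime stays close to none.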

Thread overview: 10+ messages
2026-05-06 13:26 ` Increase in XFS journal flushes with (direct_write;fdatasync)+ Andres Freund
2026-05-06 15:05   ` Carlos Maiolino [this message]
2026-05-07 20:34   ` Pankaj Raghav (Samsung)
2026-05-08  8:10     ` Christoph Hellwig
2026-05-08  8:29       ` Pankaj Raghav
2026-05-08  8:43         ` Christoph Hellwig
2026-05-08 11:42           ` Jeff Layton
2026-05-08 11:47             ` Pankaj Raghav
2026-05-11  8:56               ` Christoph Hellwig
2026-05-11 10:31                 ` Pankaj Raghav
