From: Qu Wenruo <wqu@suse.com>
To: Mark Harmstone <mark@harmstone.com>, dsterba@suse.cz
Cc: linux-btrfs@vger.kernel.org, josef@toxicpanda.com, boris@bur.io
Subject: Re: [PATCH] btrfs: don't force DIO writes to be serialized
Date: Thu, 23 Apr 2026 19:50:43 +0930
Message-ID: <5a405a46-d210-453e-ae72-7730172cfe71@suse.com>
In-Reply-To: <850c7c2d-ed97-424f-8ede-4491bacb02ac@harmstone.com>

On 2026/4/23 19:34, Mark Harmstone wrote:
> On 22/04/2026 9.57 pm, David Sterba wrote:
>> On Wed, Apr 22, 2026 at 03:03:35PM +0100, Mark Harmstone wrote:
>>> Before btrfs switched to the new mount API in 2023, we were setting
>>> SB_NOSEC in btrfs_mount_root(). This flag tells the VFS that the
>>> filesystem may have files which don't have security xattrs, enabling it
>>> to do some optimizations.
>>>
>>> Unfortunately this was missed in the transition, meaning that IS_NOSEC
>>> will always return false for a btrfs inode. This means that
>>> btrfs_direct_write() calls will always get the inode lock exclusively,
>>> meaning that DIO writes to the same file will be serialized.
>>>
>>> On my machine, this one-line change results in a ~59% improvement in DIO
>>> throughput:
>>
>> That's quite an improvement. What's the actual fio script you've used?
>> Also the DIO depends on the block group profile wrt the buffered
>> fallback so that would be good to know too.
>
> It is. There's a big dropoff in DIO write performance in 6.8 that we
> never recovered from.

iomap already has a bounce page solution, which no longer falls back to
buffered IO but instead uses an extra page copy to make sure the final
bio's content won't change halfway through.

IIRC it's just one extra flag plus removing the btrfs-specific fallback
checks, but I haven't yet verified the behavior/code.

Thanks,
Qu

> I'm going to look into some sort of automated
> performance testing so this kind of thing can't happen casually.
>
> This was on a VM with 8 cores and 8GB of RAM, with a real NVMe exposed
> through PCI passthrough. The figures for XFS and ext4 in comparison are
> both about 3GB/s.
>
> # cat go
> #!/bin/bash
> mkfs.btrfs -f /dev/nvme0n1
> mount /dev/nvme0n1 /mnt/test
> mkdir /mnt/test/nocow
> chattr +C /mnt/test/nocow
> fio /root/test.fio
>
> # cat /root/test.fio
> [global]
> rw=randwrite
> ioengine=io_uring
> iodepth=64
> size=1g
> direct=1
> startdelay=20
> force_async=4
> ramp_time=5
> runtime=60
> group_reporting=1
> numjobs=32
> time_based
> disk_util=0
> clat_percentiles=0
> disable_lat=1
> disable_clat=1
> disable_slat=1
> filename=/mnt/test/nocow/fiofile
> [test]
> name=test
> bs=4k
> stonewall
>
>
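
For readers following the thread, the locking decision described in the
quoted patch description can be modeled roughly as below. This is a
simplified userspace sketch in Python, not kernel code: the class names,
the bit values, and the helper bodies are assumptions made for
illustration, and only the identifiers SB_NOSEC, S_NOSEC and IS_NOSEC
come from the discussion. It assumes the VFS marks an inode as S_NOSEC
only when its superblock carries SB_NOSEC, and that a DIO write may take
the inode lock shared only when it stays within EOF and the inode is
NOSEC.

```python
from dataclasses import dataclass

# Illustrative bit values only; these are not the kernel's definitions.
SB_NOSEC = 1 << 0  # superblock flag: files may lack security xattrs
S_NOSEC = 1 << 0   # per-inode flag: no security bits to strip on write


@dataclass
class SuperBlock:
    s_flags: int = 0


@dataclass
class Inode:
    sb: SuperBlock
    i_size: int
    i_flags: int = 0
    is_sxid: bool = False  # setuid/setgid files never qualify


def inode_has_no_xattr(inode: Inode) -> None:
    """Model: S_NOSEC is set only if the superblock opted in via SB_NOSEC."""
    if not inode.is_sxid and (inode.sb.s_flags & SB_NOSEC):
        inode.i_flags |= S_NOSEC


def is_nosec(inode: Inode) -> bool:
    """Model of the IS_NOSEC() check on the inode flags."""
    return bool(inode.i_flags & S_NOSEC)


def dio_write_lock_mode(inode: Inode, pos: int, count: int) -> str:
    """Shared lock only for writes within EOF on a NOSEC inode."""
    if pos + count <= inode.i_size and is_nosec(inode):
        return "shared"
    return "exclusive"


if __name__ == "__main__":
    for sb_flags in (0, SB_NOSEC):
        inode = Inode(sb=SuperBlock(s_flags=sb_flags), i_size=1 << 30)
        inode_has_no_xattr(inode)
        print(sb_flags, dio_write_lock_mode(inode, pos=4096, count=4096))
```

In this model, a superblock without SB_NOSEC means every in-place 4k
write takes the exclusive path, which is the serialization the patch
description refers to. A write extending past i_size stays exclusive
either way.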