From: Christian Brauner <brauner@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [GIT PULL 06/14 for v6.17] vfs fallocate
Date: Fri, 25 Jul 2025 13:27:17 +0200
Message-ID: <20250725-vfs-fallocate-91b9067277e8@brauner>
In-Reply-To: <20250725-vfs-617-1bcbd4ae2ea6@brauner>
Hey Linus,
/* Summary */
fallocate() currently supports creating preallocated files efficiently.
However, on most filesystems fallocate() will preallocate blocks in an
unwritten state even if FALLOC_FL_ZERO_RANGE is specified.
The extent state must later be converted to a written state when the
user writes data into this range, which can trigger numerous metadata
changes and journal I/O. This may lead to significant write
amplification and performance degradation in synchronous write mode.
At the moment, the only method to avoid this is to create an empty file
and write zero data into it (for example, using 'dd' with a large block
size). However, this method is slow and consumes a considerable amount
of disk bandwidth.
Now that more and more flash-based storage devices are available, it is
possible to efficiently write zeros to SSDs using the unmap write zeroes
command if the devices do not write physical zeroes to the media.
For example, if SCSI SSDs support the UNMAP bit or NVMe SSDs support the
DEAC bit[1], the write zeroes command does not write actual data to the
device. Instead, NVMe converts the zeroed range to a deallocated state,
which is fast and consumes almost no disk write bandwidth.
This series implements the BLK_FEAT_WRITE_ZEROES_UNMAP feature and the
BLK_FLAG_WRITE_ZEROES_UNMAP_DISABLED flag for the SCSI, NVMe and
device-mapper drivers, and adds FALLOC_FL_WRITE_ZEROES and
STATX_ATTR_WRITE_ZEROES_UNMAP support for ext4 and raw block devices.
fallocate() is subsequently extended with the FALLOC_FL_WRITE_ZEROES
flag. FALLOC_FL_WRITE_ZEROES zeroes a specified file range in such a way
that subsequent writes to that range do not require further changes to
the file mapping metadata. This flag is beneficial for subsequent pure
overwriting within this range, as it can save on block allocation and,
consequently, significant metadata changes.
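As a rough userspace illustration of the flag described above, the sketch
below tries FALLOC_FL_WRITE_ZEROES first and falls back to writing zeroes
by hand when the kernel or filesystem rejects it. The helper name
zero_fill_range() is hypothetical, and the fallback value 0x80 for the
flag is an assumption for older headers that should be checked against
include/uapi/linux/falloc.h. The fallback error codes are likewise a
guess at what unsupported setups return.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: older libc/kernel headers may not define the new flag yet.
 * 0x80 is believed to match the v6.17 uapi header; verify before use. */
#ifndef FALLOC_FL_WRITE_ZEROES
#define FALLOC_FL_WRITE_ZEROES 0x80
#endif

/*
 * Hypothetical helper: make [offset, offset + len) read back as zeroes
 * and be backed by written blocks.
 *
 * Returns 1 if fallocate(FALLOC_FL_WRITE_ZEROES) succeeded,
 *         0 if the slow path (explicit zero writes) was used,
 *        -1 on error with errno set.
 */
static int zero_fill_range(int fd, off_t offset, off_t len)
{
    static const char zeros[64 * 1024];
    off_t done = 0;

    if (fallocate(fd, FALLOC_FL_WRITE_ZEROES, offset, len) == 0)
        return 1;

    /* Unsupported flag, filesystem, or device: fall back to the slow
     * path. Any other error is reported to the caller. */
    if (errno != EOPNOTSUPP && errno != EINVAL && errno != ENOSYS)
        return -1;

    while (done < len) {
        size_t chunk = sizeof(zeros);
        ssize_t n;

        if ((off_t)chunk > len - done)
            chunk = (size_t)(len - done);

        n = pwrite(fd, zeros, chunk, offset + done);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        done += n;
    }
    return 0;
}
```

A caller would typically invoke zero_fill_range(fd, 0, size) once right
after creating the file and then issue only overwrites within that range.
The slow path is the 'dd'-style approach mentioned earlier, kept only so
the sketch still works where the new flag is unavailable.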
/* Testing */
gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)
No build failures or warnings were observed.
/* Conflicts */
Merge conflicts with mainline
=============================
No known conflicts.
Merge conflicts with other trees
================================
No known conflicts.
The following changes since commit e04c78d86a9699d136910cfc0bdcf01087e3267e:
Linux 6.16-rc2 (2025-06-15 13:49:41 -0700)
are available in the Git repository at:
git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.fallocate
for you to fetch changes up to 4f984fe7b4d9aea332c7ff59827a4e168f0e4e1b:
Merge patch series "fallocate: introduce FALLOC_FL_WRITE_ZEROES flag" (2025-06-23 12:45:32 +0200)
Please consider pulling these changes from the signed vfs-6.17-rc1.fallocate tag.
Thanks!
Christian
----------------------------------------------------------------
vfs-6.17-rc1.fallocate
----------------------------------------------------------------
Christian Brauner (1):
Merge patch series "fallocate: introduce FALLOC_FL_WRITE_ZEROES flag"
Zhang Yi (9):
block: introduce max_{hw|user}_wzeroes_unmap_sectors to queue limits
nvme: set max_hw_wzeroes_unmap_sectors if device supports DEAC bit
nvmet: set WZDS and DRB if device enables unmap write zeroes operation
scsi: sd: set max_hw_wzeroes_unmap_sectors if device supports SD_ZERO_*_UNMAP
dm: clear unmap write zeroes limits when disabling write zeroes
fs: introduce FALLOC_FL_WRITE_ZEROES to fallocate
block: factor out common part in blkdev_fallocate()
block: add FALLOC_FL_WRITE_ZEROES support
ext4: add FALLOC_FL_WRITE_ZEROES support
Documentation/ABI/stable/sysfs-block | 33 ++++++++++++++++++
block/blk-settings.c | 20 +++++++++--
block/blk-sysfs.c | 26 ++++++++++++++
block/fops.c | 44 +++++++++++++-----------
drivers/md/dm-table.c | 4 ++-
drivers/nvme/host/core.c | 20 ++++++-----
drivers/nvme/target/io-cmd-bdev.c | 4 +++
drivers/scsi/sd.c | 5 +++
fs/ext4/extents.c | 66 ++++++++++++++++++++++++++++++------
fs/open.c | 1 +
include/linux/blkdev.h | 10 ++++++
include/linux/falloc.h | 3 +-
include/trace/events/ext4.h | 3 +-
include/uapi/linux/falloc.h | 17 ++++++++++
14 files changed, 212 insertions(+), 44 deletions(-)