linux-fsdevel.vger.kernel.org archive mirror
From: Christian Brauner <brauner@kernel.org>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Christian Brauner <brauner@kernel.org>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [GIT PULL 06/14 for v6.17] vfs fallocate
Date: Fri, 25 Jul 2025 13:27:17 +0200
Message-ID: <20250725-vfs-fallocate-91b9067277e8@brauner>
In-Reply-To: <20250725-vfs-617-1bcbd4ae2ea6@brauner>

Hey Linus,

/* Summary */
fallocate() currently supports creating preallocated files efficiently.
However, on most filesystems fallocate() will preallocate blocks in an
unwritten state even if FALLOC_FL_ZERO_RANGE is specified.

The extent state must later be converted to a written state when the
user writes data into this range, which can trigger numerous metadata
changes and journal I/O. This may lead to significant write
amplification and performance degradation in synchronous write mode.

At the moment, the only method to avoid this is to create an empty file
and write zero data into it (for example, using 'dd' with a large block
size). However, this method is slow and consumes a considerable amount
of disk bandwidth.

Now that more and more flash-based storage devices are available, it is
possible to efficiently write zeroes to SSDs using the unmap write zeroes
command if the devices do not write physical zeroes to the media.

For example, if SCSI SSDs support the UNMAP bit or NVMe SSDs support the
DEAC bit[1], the write zeroes command does not write actual data to the
device. Instead, NVMe converts the zeroed range to a deallocated state,
which works fast and consumes almost no disk write bandwidth.

This series implements the BLK_FEAT_WRITE_ZEROES_UNMAP feature and
BLK_FLAG_WRITE_ZEROES_UNMAP_DISABLED flag for SCSI, NVMe and
device-mapper drivers, and adds the FALLOC_FL_WRITE_ZEROES and
STATX_ATTR_WRITE_ZEROES_UNMAP support for ext4 and raw bdev devices.

fallocate() is subsequently extended with the FALLOC_FL_WRITE_ZEROES
flag. FALLOC_FL_WRITE_ZEROES zeroes a specified file range in such a way
that subsequent writes to that range do not require further changes to
the file mapping metadata. This flag is beneficial for subsequent pure
overwriting within this range, as it can save on block allocation and,
consequently, significant metadata changes.
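
For illustration only, here is a minimal userspace sketch of how an
application might use the new flag. It is not taken from the series. The
helper name zero_range_written() is hypothetical, the fallback value 0x80
for FALLOC_FL_WRITE_ZEROES is an assumption for systems whose headers
predate the new uapi definition, and treating both EOPNOTSUPP and EINVAL as
"not supported" is a conservative choice rather than documented behaviour.
When the flag is unavailable, the sketch falls back to the slow method
described above, i.e. writing zero data explicitly.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE             /* needed for fallocate() in glibc */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: value of the new flag if the installed headers lack it. */
#ifndef FALLOC_FL_WRITE_ZEROES
#define FALLOC_FL_WRITE_ZEROES 0x80
#endif

/*
 * Hypothetical helper: zero [off, off + len) so that the range ends up
 * allocated and readable as zeroes.
 *
 * Returns  1 if FALLOC_FL_WRITE_ZEROES succeeded,
 *          0 if the flag was rejected and zeroes were written by hand,
 *         -1 on any other error (errno is set).
 */
int zero_range_written(int fd, off_t off, off_t len)
{
	static const char zeros[1 << 16];   /* zero-initialised buffer */

	assert(off >= 0 && len > 0);

	if (fallocate(fd, FALLOC_FL_WRITE_ZEROES, off, len) == 0)
		return 1;

	/* Anything other than "flag not supported" is a real error. */
	if (errno != EOPNOTSUPP && errno != EINVAL)
		return -1;

	/* Fallback: write zero data explicitly, one chunk at a time. */
	while (len > 0) {
		size_t chunk = len < (off_t)sizeof(zeros) ?
			       (size_t)len : sizeof(zeros);
		ssize_t n = pwrite(fd, zeros, chunk, off);

		if (n < 0) {
			if (errno == EINTR)
				continue;
			return -1;
		}
		off += n;
		len -= n;
	}
	return 0;
}
```

A caller would typically invoke this once after creating the file, for
example zero_range_written(fd, 0, size), and then issue only overwrites
within that range.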

/* Testing */

gcc (Debian 14.2.0-19) 14.2.0
Debian clang version 19.1.7 (3)

No build failures or warnings were observed.

/* Conflicts */

Merge conflicts with mainline
=============================

No known conflicts.

Merge conflicts with other trees
================================

No known conflicts.

The following changes since commit e04c78d86a9699d136910cfc0bdcf01087e3267e:

  Linux 6.16-rc2 (2025-06-15 13:49:41 -0700)

are available in the Git repository at:

  git@gitolite.kernel.org:pub/scm/linux/kernel/git/vfs/vfs tags/vfs-6.17-rc1.fallocate

for you to fetch changes up to 4f984fe7b4d9aea332c7ff59827a4e168f0e4e1b:

  Merge patch series "fallocate: introduce FALLOC_FL_WRITE_ZEROES flag" (2025-06-23 12:45:32 +0200)

Please consider pulling these changes from the signed vfs-6.17-rc1.fallocate tag.

Thanks!
Christian

----------------------------------------------------------------
vfs-6.17-rc1.fallocate

----------------------------------------------------------------
Christian Brauner (1):
      Merge patch series "fallocate: introduce FALLOC_FL_WRITE_ZEROES flag"

Zhang Yi (9):
      block: introduce max_{hw|user}_wzeroes_unmap_sectors to queue limits
      nvme: set max_hw_wzeroes_unmap_sectors if device supports DEAC bit
      nvmet: set WZDS and DRB if device enables unmap write zeroes operation
      scsi: sd: set max_hw_wzeroes_unmap_sectors if device supports SD_ZERO_*_UNMAP
      dm: clear unmap write zeroes limits when disabling write zeroes
      fs: introduce FALLOC_FL_WRITE_ZEROES to fallocate
      block: factor out common part in blkdev_fallocate()
      block: add FALLOC_FL_WRITE_ZEROES support
      ext4: add FALLOC_FL_WRITE_ZEROES support

 Documentation/ABI/stable/sysfs-block | 33 ++++++++++++++++++
 block/blk-settings.c                 | 20 +++++++++--
 block/blk-sysfs.c                    | 26 ++++++++++++++
 block/fops.c                         | 44 +++++++++++++-----------
 drivers/md/dm-table.c                |  4 ++-
 drivers/nvme/host/core.c             | 20 ++++++-----
 drivers/nvme/target/io-cmd-bdev.c    |  4 +++
 drivers/scsi/sd.c                    |  5 +++
 fs/ext4/extents.c                    | 66 ++++++++++++++++++++++++++++++------
 fs/open.c                            |  1 +
 include/linux/blkdev.h               | 10 ++++++
 include/linux/falloc.h               |  3 +-
 include/trace/events/ext4.h          |  3 +-
 include/uapi/linux/falloc.h          | 17 ++++++++++
 14 files changed, 212 insertions(+), 44 deletions(-)
