public inbox for linux-fsdevel@vger.kernel.org
* [PATCH -next v2 00/22]  ext4: use iomap for regular file's buffered I/O path
@ 2026-02-03  6:25 Zhang Yi
  2026-02-03  6:25 ` [PATCH -next v2 01/22] ext4: make ext4_block_zero_page_range() pass out did_zero Zhang Yi
                   ` (22 more replies)
  0 siblings, 23 replies; 56+ messages in thread
From: Zhang Yi @ 2026-02-03  6:25 UTC (permalink / raw)
  To: linux-ext4
  Cc: linux-fsdevel, linux-kernel, tytso, adilger.kernel, jack, ojaswin,
	ritesh.list, hch, djwong, yi.zhang, yi.zhang, yizhang089,
	libaokun1, yangerkun, yukuai

From: Zhang Yi <yi.zhang@huaweicloud.com>

Changes since V1:
 - Rebase this series on linux-next 20260122.
 - Refactor partial block zero range, stop passing handle to
   ext4_block_truncate_page() and ext4_zero_partial_blocks(), and move
   partial block zeroing operation outside an active journal transaction
   to prevent potential deadlocks because of the lock ordering of folio
   and transaction start.
 - Clarify the lock ordering of folio lock and transaction start, update
   the comments accordingly.
 - Fix some issues related to fast commit and to polluting the post-EOF
   folio.
 - Some minor code and comments optimizations.

v1:     https://lore.kernel.org/linux-ext4/20241022111059.2566137-1-yi.zhang@huaweicloud.com/
RFC v4: https://lore.kernel.org/linux-ext4/20240410142948.2817554-1-yi.zhang@huaweicloud.com/
RFC v3: https://lore.kernel.org/linux-ext4/20240127015825.1608160-1-yi.zhang@huaweicloud.com/
RFC v2: https://lore.kernel.org/linux-ext4/20240102123918.799062-1-yi.zhang@huaweicloud.com/
RFC v1: https://lore.kernel.org/linux-ext4/20231123125121.4064694-1-yi.zhang@huaweicloud.com/

Original Cover (Updated):

This series adds iomap buffered I/O path support for regular files.
It implements the core iomap APIs on ext4 and introduces two mount
options, 'buffered_iomap' and 'nobuffered_iomap', to enable and
disable the iomap buffered I/O path. This series supports the default
features, default mount options and the bigalloc feature for ext4. We do
not yet support online defragmentation, inline data, fsverity, fscrypt,
non-extent files, or data=journal mode; ext4 falls back to the
buffer_head I/O path automatically if these features and options are
used.
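
The fallback rule described above can be modelled in a few lines. This is
only an illustrative sketch in Python, not the kernel logic from the
series: the flag names, the InodeState type and pick_buffered_io_path()
are hypothetical, and online defragmentation is left out because it is an
operation rather than a per-inode feature.

```python
from dataclasses import dataclass

# Hypothetical flag names for illustration; they are not kernel identifiers.
# Each one stands for a feature or option the cover letter lists as
# "not yet supported" on the iomap buffered I/O path.
UNSUPPORTED = frozenset({
    "inline_data",
    "fsverity",
    "fscrypt",
    "non_extent",
    "data_journal",
})


@dataclass(frozen=True)
class InodeState:
    regular_file: bool
    features: frozenset  # features/options in effect for this inode


def pick_buffered_io_path(mount_buffered_iomap: bool, inode: InodeState) -> str:
    """Return "iomap" or "buffer_head" for an inode (toy model only)."""
    # The series only covers regular files, and only when the mount
    # option enables the iomap path.
    if not mount_buffered_iomap or not inode.regular_file:
        return "buffer_head"
    # Any unsupported feature forces the automatic fallback.
    if inode.features & UNSUPPORTED:
        return "buffer_head"
    return "iomap"


if __name__ == "__main__":
    plain = InodeState(regular_file=True, features=frozenset({"bigalloc"}))
    inline = InodeState(regular_file=True, features=frozenset({"inline_data"}))
    print(pick_buffered_io_path(True, plain))
    print(pick_buffered_io_path(True, inline))
```

In this model bigalloc does not trigger the fallback, matching the
statement that the series supports it.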

Key notes on the iomap implementation in this series:
 - Don't rely on ordered data mode to prevent exposing stale data when
   performing append writes and truncating down.
 - Override the dioread_nolock mount option and always allocate unwritten
   extents for new blocks.
 - When performing writeback, don't use a reserved journal handle, and
   postpone updating i_disksize until I/O is done.
 - The lock ordering of the folio lock and transaction start is the
   opposite of that in the buffer_head buffered write path.
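
To make the interaction between unwritten extents and the deferred
i_disksize update concrete, here is a toy Python model. It is a sketch
under simplifying assumptions (one file, whole-block state, no journal),
and ToyFile and its methods are hypothetical names, not code from the
series. The point it tries to show is that a new block is never both
below i_disksize and unwritten-but-exposed, which is one way to read why
ordered data mode is not needed on this path.

```python
from dataclasses import dataclass, field


@dataclass
class ToyFile:
    """Toy model of one file; block states are 'unwritten' or 'written'."""
    block_size: int = 4096
    i_size: int = 0       # in-memory file size
    i_disksize: int = 0   # size as recorded on disk
    blocks: dict = field(default_factory=dict)  # block index -> state

    def append_write(self, length: int) -> range:
        """Buffered append: new blocks start unwritten, only i_size grows."""
        first = self.i_size // self.block_size
        self.i_size += length
        last = (self.i_size - 1) // self.block_size
        for blk in range(first, last + 1):
            self.blocks.setdefault(blk, "unwritten")
        return range(first, last + 1)

    def writeback_end_io(self, blks: range) -> None:
        """I/O completion: convert blocks to written, then move i_disksize."""
        for blk in blks:
            self.blocks[blk] = "written"
        end = min(self.i_size, blks.stop * self.block_size)
        self.i_disksize = max(self.i_disksize, end)

    def visible_after_crash(self) -> list:
        """Blocks covered by i_disksize, i.e. reachable after a crash."""
        nblocks = -(-self.i_disksize // self.block_size)  # ceiling division
        return [(b, self.blocks[b]) for b in range(nblocks)]


if __name__ == "__main__":
    f = ToyFile()
    dirty = f.append_write(6000)
    print(f.i_size, f.i_disksize, f.visible_after_crash())
    f.writeback_end_io(dirty)
    print(f.i_size, f.i_disksize, f.visible_after_crash())
```

Before writeback completes, nothing past the old i_disksize is reachable;
afterwards every reachable block is in the written state.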

Series details:

Patch 01-08: Refactor partial block zeroing operation, move it out of an
             active running journal transaction, and handle post EOF
             partial block zeroing properly.
Patch 09-21: Implement the core iomap buffered read, write path, dirty
             folio write back path, mmap path and partial block zeroing
             path for ext4 regular files.
Patch 22:    Introduce 'buffered_iomap' and 'nobuffered_iomap' mount options
             to enable and disable the iomap buffered I/O path.

Tests and Performance:

I tested this series using xfstests-bld with auto configurations, as
well as fast_commit and 64k configurations. No new regressions were
observed.

I ran fio in a virtual machine backed by a 150 GB memory disk and
found an improvement of approximately 30% to 50% in large I/O write
performance, while read performance showed no significant difference.

 buffered write
 ==============

  buffer_head:
  bs      write cache    uncached write
  1k       423  MiB/s      36.3 MiB/s
  4k       1067 MiB/s      58.4 MiB/s
  64k      4321 MiB/s      869  MiB/s
  1M       4640 MiB/s      3158 MiB/s
  
  iomap:
  bs      write cache    uncached write
  1k       403  MiB/s      57   MiB/s
  4k       1093 MiB/s      61   MiB/s
  64k      6488 MiB/s      1206 MiB/s
  1M       7378 MiB/s      4818 MiB/s
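
The relative change per block size can be recomputed from the two
buffered write tables above. The snippet below is a small Python sketch
for that arithmetic; the numbers are copied from the tables, while
percent_change() and write_speedups() are helper names made up for this
illustration.

```python
# Throughput in MiB/s, copied from the buffered write tables above.
# Each tuple is (write cache, uncached write).
BUFFER_HEAD = {
    "1k": (423, 36.3),
    "4k": (1067, 58.4),
    "64k": (4321, 869),
    "1M": (4640, 3158),
}
IOMAP = {
    "1k": (403, 57),
    "4k": (1093, 61),
    "64k": (6488, 1206),
    "1M": (7378, 4818),
}


def percent_change(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0


def write_speedups() -> dict:
    """Map block size -> (cached change %, uncached change %)."""
    result = {}
    for bs, (bh_cached, bh_uncached) in BUFFER_HEAD.items():
        io_cached, io_uncached = IOMAP[bs]
        result[bs] = (
            percent_change(bh_cached, io_cached),
            percent_change(bh_uncached, io_uncached),
        )
    return result


if __name__ == "__main__":
    for bs, (cached, uncached) in write_speedups().items():
        print(f"{bs:>4}  write cache {cached:+6.1f}%  uncached write {uncached:+6.1f}%")
```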

 buffered read
 =============

  buffer_head:
  bs      read hole   read cache      read data
  1k       635  MiB/s    661  MiB/s    605  MiB/s
  4k       1987 MiB/s    2128 MiB/s    1761 MiB/s
  64k      6068 MiB/s    9472 MiB/s    4475 MiB/s
  1M       5471 MiB/s    8657 MiB/s    4405 MiB/s

  iomap:
  bs      read hole   read cache      read data
  1k       643  MiB/s    653  MiB/s    602  MiB/s
  4k       2075 MiB/s    2159 MiB/s    1716 MiB/s
  64k      6267 MiB/s    9545 MiB/s    4451 MiB/s
  1M       6072 MiB/s    9191 MiB/s    4467 MiB/s

Comments and suggestions are welcome!

Thanks,
Yi.


Zhang Yi (22):
  ext4: make ext4_block_zero_page_range() pass out did_zero
  ext4: make ext4_block_truncate_page() return zeroed length
  ext4: only order data when partially block truncating down
  ext4: factor out journalled block zeroing range
  ext4: stop passing handle to ext4_journalled_block_zero_range()
  ext4: don't zero partial block under an active handle when truncating
    down
  ext4: move ext4_block_zero_page_range() out of an active handle
  ext4: zero post EOF partial block before appending write
  ext4: add a new iomap aops for regular file's buffered IO path
  ext4: implement buffered read iomap path
  ext4: pass out extent seq counter when mapping da blocks
  ext4: implement buffered write iomap path
  ext4: implement writeback iomap path
  ext4: implement mmap iomap path
  iomap: correct the range of a partial dirty clear
  iomap: support invalidating partial folios
  ext4: implement partial block zero range iomap path
  ext4: do not order data for inodes using buffered iomap path
  ext4: add block mapping tracepoints for iomap buffered I/O path
  ext4: disable online defrag when inode using iomap buffered I/O path
  ext4: partially enable iomap for the buffered I/O path of regular
    files
  ext4: introduce a mount option for iomap buffered I/O path

 fs/ext4/ext4.h              |  21 +-
 fs/ext4/ext4_jbd2.c         |   1 +
 fs/ext4/ext4_jbd2.h         |   7 +-
 fs/ext4/extents.c           |  31 +-
 fs/ext4/file.c              |  40 +-
 fs/ext4/ialloc.c            |   1 +
 fs/ext4/inode.c             | 822 ++++++++++++++++++++++++++++++++----
 fs/ext4/move_extent.c       |  11 +
 fs/ext4/page-io.c           | 119 ++++++
 fs/ext4/super.c             |  32 +-
 fs/iomap/buffered-io.c      |  12 +-
 include/trace/events/ext4.h |  45 ++
 12 files changed, 1033 insertions(+), 109 deletions(-)

-- 
2.52.0



end of thread, other threads:[~2026-02-11 15:23 UTC | newest]

Thread overview: 56+ messages
2026-02-03  6:25 [PATCH -next v2 00/22] ext4: use iomap for regular file's buffered I/O path Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 01/22] ext4: make ext4_block_zero_page_range() pass out did_zero Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 02/22] ext4: make ext4_block_truncate_page() return zeroed length Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 03/22] ext4: only order data when partially block truncating down Zhang Yi
2026-02-03  9:59   ` Jan Kara
2026-02-04  6:42     ` Zhang Yi
2026-02-04 14:18       ` Jan Kara
2026-02-05  3:27         ` Baokun Li
2026-02-05 14:07           ` Jan Kara
2026-02-06  1:14             ` Baokun Li
2026-02-05  7:50         ` Zhang Yi
2026-02-05 15:05           ` Jan Kara
2026-02-06 11:09             ` Zhang Yi
2026-02-06 15:35               ` Jan Kara
2026-02-09  8:28                 ` Zhang Yi
2026-02-10 12:02                   ` Zhang Yi
2026-02-10 14:07                     ` Jan Kara
2026-02-10 16:11                       ` Zhang Yi
2026-02-11 11:42                         ` Jan Kara
2026-02-11 13:38                           ` Zhang Yi
2026-02-04  4:21   ` kernel test robot
2026-02-10  7:05   ` Ojaswin Mujoo
2026-02-10 15:57     ` Zhang Yi
2026-02-11 15:23       ` Ojaswin Mujoo
2026-02-03  6:25 ` [PATCH -next v2 04/22] ext4: factor out journalled block zeroing range Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 05/22] ext4: stop passing handle to ext4_journalled_block_zero_range() Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 06/22] ext4: don't zero partial block under an active handle when truncating down Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 07/22] ext4: move ext4_block_zero_page_range() out of an active handle Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 08/22] ext4: zero post EOF partial block before appending write Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 09/22] ext4: add a new iomap aops for regular file's buffered IO path Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 10/22] ext4: implement buffered read iomap path Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 11/22] ext4: pass out extent seq counter when mapping da blocks Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 12/22] ext4: implement buffered write iomap path Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 13/22] ext4: implement writeback " Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 14/22] ext4: implement mmap " Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 15/22] iomap: correct the range of a partial dirty clear Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 16/22] iomap: support invalidating partial folios Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 17/22] ext4: implement partial block zero range iomap path Zhang Yi
2026-02-04  0:21   ` kernel test robot
2026-02-03  6:25 ` [PATCH -next v2 18/22] ext4: do not order data for inodes using buffered " Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 19/22] ext4: add block mapping tracepoints for iomap buffered I/O path Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 20/22] ext4: disable online defrag when inode using " Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 21/22] ext4: partially enable iomap for the buffered I/O path of regular files Zhang Yi
2026-02-03  6:25 ` [PATCH -next v2 22/22] ext4: introduce a mount option for iomap buffered I/O path Zhang Yi
2026-02-03  6:43 ` [PATCH -next v2 00/22] ext4: use iomap for regular file's " Christoph Hellwig
2026-02-03  9:18   ` Zhang Yi
2026-02-03 13:14     ` Theodore Tso
2026-02-04  1:33       ` Zhang Yi
2026-02-04  1:59       ` Baokun Li
2026-02-04 14:23         ` Jan Kara
2026-02-05  2:06           ` Zhang Yi
2026-02-05  3:04             ` Baokun Li
2026-02-05 12:58             ` Jan Kara
2026-02-06  2:15               ` Zhang Yi
2026-02-05  2:55           ` Baokun Li
2026-02-05 12:46             ` Jan Kara
