Linux EXT4 FS development
* [PATCH 0/2] ext4: allow more DIO writes under shared i_rwsem
From: Baokun Li @ 2026-06-11 16:34 UTC
  To: linux-ext4
  Cc: tytso, adilger.kernel, jack, yi.zhang, ojaswin, ritesh.list,
	peng_wang

Hi all,

This series relaxes the i_rwsem requirements of ext4_dio_write_iter()
so that more direct I/O writes can proceed under the shared lock.

It continues the work started by Peng Wang's RFC [1]; I'm taking
over this effort going forward.

ext4_dio_write_checks() currently calls ext4_overwrite_io() to decide
whether the shared lock is sufficient. Its single ext4_map_blocks()
lookup only sees the first contiguous extent of the same type, which
forces the exclusive lock for two cases that are actually safe under
the shared lock (see individual patches for the full safety
argument):

  1. Aligned writes spanning multiple already-allocated extents (e.g.
     written + unwritten, or two discontiguous written extents).

  2. Unaligned writes whose head/tail partial blocks land on written
     extents but whose fully covered middle blocks include holes or
     unwritten extents.
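
The single-lookup limitation can be illustrated with a small userspace
model. This is only a sketch under stated assumptions, not the code in
fs/ext4/file.c: the toy_* names, the extent array, and the rule "the
lookup stops at the end of the extent containing the first block" are
all hypothetical stand-ins for what ext4_map_blocks() reports.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical block states; not the kernel's mapping flags. */
enum toy_state { TOY_HOLE, TOY_UNWRITTEN, TOY_WRITTEN };

/* One toy extent. Separate entries model extents that differ in type
 * or are physically discontiguous. */
struct toy_extent {
	unsigned long lblk;	/* first logical block */
	unsigned long len;	/* length in blocks */
	enum toy_state state;
};

/* Toy lookup: report how many of the `want` blocks starting at `lblk`
 * fall inside the single extent that contains `lblk`. */
static unsigned long toy_map_blocks(const struct toy_extent *ext, size_t n,
				    unsigned long lblk, unsigned long want,
				    enum toy_state *state)
{
	for (size_t i = 0; i < n; i++) {
		unsigned long end = ext[i].lblk + ext[i].len;

		if (lblk >= ext[i].lblk && lblk < end) {
			*state = ext[i].state;
			return (end - lblk < want) ? end - lblk : want;
		}
	}
	*state = TOY_HOLE;
	return 0;
}

/* Toy pre-check: the shared lock is allowed only if one lookup covers
 * the whole range with allocated blocks. */
static bool toy_single_lookup_ok(const struct toy_extent *ext, size_t n,
				 unsigned long lblk, unsigned long want)
{
	enum toy_state state;
	unsigned long got = toy_map_blocks(ext, n, lblk, want, &state);

	return state != TOY_HOLE && got == want;
}
```

With a layout of one written block followed by one unwritten block, a
two-block write is rejected by this model even though every block is
already allocated, which corresponds to case 1 above.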

Patch 1 skips the ext4_overwrite_io() pre-check entirely for aligned
non-extending writes, letting them proceed under the shared lock
regardless of extent state.
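
For illustration, the "aligned non-extending" condition can be
sketched as a plain predicate. This is a userspace approximation with
a hypothetical name; it assumes a power-of-two block size and compares
only against a single file size, whereas the kernel's own helpers may
check more (for example the on-disk size).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: true if [pos, pos + len) starts and ends on a
 * block boundary and does not reach past the current file size. */
static bool toy_aligned_non_extending(uint64_t pos, uint64_t len,
				      uint64_t i_size, uint32_t blocksize)
{
	uint64_t mask = (uint64_t)blocksize - 1;

	/* Assumption: blocksize is a non-zero power of two. */
	assert(blocksize != 0 && (blocksize & mask) == 0);

	bool aligned = ((pos | len) & mask) == 0;
	bool extending = pos + len > i_size;

	return aligned && !extending;
}
```

Assuming 4K blocks and a 1 GiB file, an 8K write at offset 0
qualifies, while a 14336-byte write at offset 512 (unaligned) or an 8K
write starting 4K before EOF (extending) does not.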

Patch 2 replaces ext4_overwrite_io() with ext4_dio_needs_zeroing(),
which directly answers the question driving the lock decision. It
checks only the head and tail partial blocks (at most two
ext4_map_blocks() calls), and ignores the state of middle blocks.
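
The idea of checking only the edge blocks can be sketched in
userspace as follows. This is a hedged model, not the patch itself:
the per-block state array, the blk_state enum, and the function name
are hypothetical, and it assumes that a partial block needs zeroing
whenever it is not already written.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-block states for the toy model. */
enum blk_state { BLK_HOLE, BLK_UNWRITTEN, BLK_WRITTEN };

/* Toy check: inspect at most the head and tail blocks of the write
 * [pos, pos + len). Fully covered middle blocks are never examined. */
static bool toy_dio_needs_zeroing(const enum blk_state *blocks,
				  size_t nblocks, uint64_t pos,
				  uint64_t len, uint32_t bs)
{
	uint64_t end = pos + len;
	uint64_t head = pos / bs;
	uint64_t tail = (end - 1) / bs;

	assert(bs != 0 && len > 0 && tail < nblocks);

	/* Head block is only partially covered and not yet written. */
	if (pos % bs != 0 && blocks[head] != BLK_WRITTEN)
		return true;
	/* Tail block is only partially covered and not yet written. */
	if (end % bs != 0 && blocks[tail] != BLK_WRITTEN)
		return true;
	return false;
}
```

For a [written][unwritten][unwritten][written] layout with 4K blocks,
a 14336-byte write at offset 512 touches partial blocks 0 and 3 only,
both written, so this model reports that no zeroing is needed.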


Testing
=======

"kvm-xfstests -c ext4/all -g auto" passes with no new failures.


Performance
===========

Hardware: /dev/sda (rotational disk, ~1 GB/s sustained write)
Filesystem: ext4 default mkfs

Test 1: aligned 8K DIO writes spanning written+unwritten extent
boundaries. Each thread writes its own 1G region sequentially; the
file is rebuilt between runs so every block is written exactly once.
Metric: IOPS.

  JOBS         base    +patch 1    +patch 1+2    speedup
  ----    ---------    --------    ----------    -------
     1       42,322      43,329        43,087      1.02x
     2       68,516      70,677        66,958      0.98x
     4       62,489      97,072       101,468      1.62x
     8       58,701     110,819       113,679      1.94x
    16       58,569     116,392       115,272      1.97x
    32       60,860     117,244       119,621      1.97x

Wall time at JOBS=32: 69.2s (base) -> 35.4s (patched), 1.96x faster.
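
As a rough cross-check of the wall times against the IOPS column, the
arithmetic can be sketched as below. It assumes that "1G" means 1 GiB
per thread, that "8K" means 8192 bytes, and that the IOPS figures are
aggregates over all jobs; none of this is stated explicitly above, and
the helper name is hypothetical.

```c
#include <stdint.h>

/* Estimated wall time: total number of writes divided by the
 * aggregate IOPS. */
static double toy_wall_seconds(uint64_t jobs, uint64_t region_bytes,
			       uint64_t io_bytes, double iops)
{
	uint64_t total_ios = jobs * (region_bytes / io_bytes);

	return (double)total_ios / iops;
}
```

Under those assumptions, 32 jobs issue 4,194,304 writes in total,
which gives roughly 69 s at 60,860 IOPS and roughly 35 s at 119,621
IOPS, in line with the measured wall times.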

Test 2: unaligned DIO writes (14336 bytes at +512 within each 16K
stripe). Each stripe is laid out as [written][unwritten][unwritten]
[written], so the head and tail partial blocks land on written
extents but the middle is unwritten. Metric: IOPS.

  JOBS         base    +patch 1    +patch 1+2    speedup
  ----    ---------    --------    ----------    -------
     1       15,547      15,975        17,381      1.12x
     2       15,910      14,808        34,172      2.15x
     4       15,014      14,828        57,567      3.83x
     8       15,022      14,648        81,947      5.46x
    16       14,586      14,262        99,126      6.80x
    32       14,047      13,809        92,519      6.59x

Wall time at JOBS=32: 149.3s (base) -> 22.7s (patched), 6.58x faster.

In test 2, patch 1 alone has no effect (the differences are within
noise) because it only touches the aligned write path. Patch 2
introduces ext4_dio_needs_zeroing(), which identifies precisely when
partial block zeroing is required, allowing the shared lock for the
much larger set of unaligned writes that don't actually trigger
zeroing.

Comments and questions are, as always, welcome.

Thanks,
Baokun

[1]: https://patch.msgid.link/20260607124935.6168-1-peng_wang@linux.alibaba.com


Baokun Li (2):
  ext4: skip overwrite check for aligned non-extending DIO writes
  ext4: base unaligned DIO lock decision on partial block zeroing

 fs/ext4/file.c | 132 +++++++++++++++++++++++++++++++++----------------
 1 file changed, 89 insertions(+), 43 deletions(-)

-- 
2.43.7

