From: Ojaswin Mujoo <ojaswin@linux.ibm.com>
To: linux-ext4@vger.kernel.org, "Theodore Ts'o" <tytso@mit.edu>
Cc: Jan Kara <jack@suse.cz>, Baokun Li <libaokun1@huawei.com>,
	Ritesh Harjani <ritesh.list@gmail.com>,
	Zhang Yi <yi.zhang@huawei.com>,
	linux-kernel@vger.kernel.org,
	"Darrick J . Wong" <djwong@kernel.org>,
	linux-fsdevel@vger.kernel.org
Subject: [RFC v4 0/7] ext4: Add extsize support
Date: Mon, 21 Jul 2025 02:27:26 +0530
Message-ID: <cover.1753044253.git.ojaswin@linux.ibm.com>

This is v4 of the series adding extsize support to ext4. extsize is
primarily being implemented as a building block to eventually support
multiblock atomic writes in ext4 without having to reformat the
filesystem with bigalloc. The long-term goal behind implementing extsize
is twofold:

1. We eventually want to give users a way to perform atomic writes
without needing to reformat the FS with bigalloc.
  - This can be achieved via configurations like extsize + software
    fallback or extsize + forcealign. (More about forcealign can be
    found in the previous RFC [1].)

2. We want to implement a software atomic write fallback for ext4 (just
like XFS) and at the same time give users the choice of whether they
want only HW-accelerated (fast) atomic writes or whether they are okay
with falling back to software emulation (slow). Wanting to opt out of
SW fallback was also a point raised by some attendees at LSFMM.
  a) For users wanting guaranteed HW atomic writes, we want to implement
  extsize + forcealign. This ensures atomic writes are always HW
  accelerated; however, the write is bound to fail if the allocator
  can't guarantee HW acceleration for any reason (e.g. no aligned blocks
  available).

  b) For users who prefer software fallback rather than failing the
  write, we want to implement extsize + software fallback. extsize
  ensures we try to get aligned blocks for HW-accelerated atomic writes
  on a best-effort basis, and SW fallback ensures we don't fail the
  write in case HW atomic writes are not possible. This is in line with
  how XFS has implemented multiblock atomic writes.

The above approach helps ext4 provide more choice to the user about how
they want to perform the write based on what is more suitable for their
workload.
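
To make the two configurations above concrete, here is a minimal sketch
of the decision each one implies when the allocator cannot provide
aligned blocks. All names here (aw_policy, aw_outcome, aw_decide) are
hypothetical and exist only for illustration; none of them are part of
this series or of ext4.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names, for illustration only; not part of ext4. */
enum aw_policy {
	AW_HW_ONLY,     /* (a) extsize + forcealign: never emulate */
	AW_SW_FALLBACK, /* (b) extsize + software fallback */
};

enum aw_outcome {
	AW_DO_HW, /* aligned blocks available: HW atomic write (fast) */
	AW_DO_SW, /* emulate the atomic write in software (slow) */
	AW_FAIL,  /* fail the write */
};

/*
 * Decide how an atomic write proceeds, given the user's policy and
 * whether the allocator managed to provide suitably aligned blocks.
 */
static enum aw_outcome aw_decide(enum aw_policy policy, bool aligned_blocks)
{
	if (aligned_blocks)
		return AW_DO_HW;

	/* No aligned blocks: only policy (b) may fall back to software. */
	return policy == AW_SW_FALLBACK ? AW_DO_SW : AW_FAIL;
}
```

With aligned blocks both policies take the HW path; they only differ
when alignment cannot be had, where (a) fails the write and (b) falls
back to software.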

Both approaches need extsize as a building block, hence we are posting
the extsize changes separately; once the community is happy with these,
we can work on the next steps.
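
For readers unfamiliar with the interface, below is a rough userspace
sketch of setting an extent size hint through the generic
FS_IOC_FSGETXATTR / FS_IOC_FSSETXATTR ioctls, which is the interface
patch 3 hooks into. struct fsxattr, fsx_extsize and FS_XFLAG_EXTSIZE
come from <linux/fs.h>; the helper name set_extsize_hint() is
hypothetical, and the unit (bytes) and the values ext4 would accept are
assumptions here, not something this cover letter specifies.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/ioctl.h>
#include <linux/fs.h> /* struct fsxattr, FS_IOC_FS{GET,SET}XATTR */

/*
 * Set an extent size hint on an open file.
 * Returns 0 on success, -1 with errno set on failure.
 * Hypothetical helper; extsize is assumed to be given in bytes.
 */
static int set_extsize_hint(int fd, uint32_t extsize)
{
	struct fsxattr fsx;

	/* Read current attributes so unrelated flags are preserved. */
	if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
		return -1;

	fsx.fsx_xflags |= FS_XFLAG_EXTSIZE;
	fsx.fsx_extsize = extsize;

	return ioctl(fd, FS_IOC_FSSETXATTR, &fsx) < 0 ? -1 : 0;
}
```

A caller would open the file and call e.g. set_extsize_hint(fd, 16384);
on a kernel or filesystem without extsize support the ioctl is expected
to fail and the helper returns -1 with errno set.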

Changes in v4:
- removed forcealign patches so we can independently review extsize and
  then build on that later
- refactored previous implementation of ext4_map_query/create_blocks to
  use EXT4_EX_QUERY_FILTER
- removed some extra WARN_ONs that were expected to trigger in certain
  cases

[1] RFC v3: https://lore.kernel.org/linux-ext4/cover.1742800203.git.ojaswin@linux.ibm.com/

Testing: I've tested with the xfstests auto group and don't see any
regressions. I've also tested with internal extsize-related tests that
I plan to upstream soon.

Ojaswin Mujoo (7):
  ext4: add aligned allocation hint in mballoc
  ext4: allow inode preallocation for aligned alloc
  ext4: support for extsize hint using FS_IOC_FS(GET/SET)XATTR
  ext4: pass lblk and len explicitly to ext4_split_extent*()
  ext4: add extsize hint support
  ext4: make extsize work with EOF allocations
  ext4: add ext4_map_blocks_extsize() wrapper to handle overwrites

 fs/ext4/ext4.h              |  15 +-
 fs/ext4/ext4_jbd2.h         |  15 ++
 fs/ext4/extents.c           | 229 ++++++++++++++---
 fs/ext4/inode.c             | 485 ++++++++++++++++++++++++++++++++----
 fs/ext4/ioctl.c             | 122 +++++++++
 fs/ext4/mballoc.c           | 123 +++++++--
 fs/ext4/super.c             |   1 +
 include/trace/events/ext4.h |   1 +
 8 files changed, 881 insertions(+), 110 deletions(-)

-- 
2.49.0

