From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: John Garry <john.g.garry@oracle.com>, linux-ext4@vger.kernel.org
Cc: Theodore Ts'o <tytso@mit.edu>, Jan Kara <jack@suse.cz>,
	djwong@kernel.org, Ojaswin Mujoo <ojaswin@linux.ibm.com>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v5 7/7] ext4: Add atomic block write documentation
Date: Fri, 16 May 2025 19:45:22 +0530
Message-ID: <87msbcwsjp.fsf@gmail.com>
In-Reply-To: <3b69be2c-51b7-4090-b267-0d213d0cecae@oracle.com>

John Garry <john.g.garry@oracle.com> writes:

> On 15/05/2025 20:50, Ritesh Harjani (IBM) wrote:
>
> thanks for adding this info
>
>> Application Interface
>
> Should we put this into a common file, as it is just not relevant to ext4?
>
> Or move this file to a common location, and have separate sections for 
> ext4 and xfs? This would save having scattered files for instructions.
>

The purpose of adding this documentation was mainly to note down some of
the implementation details around multi-fsblock atomic writes for ext4
using bigalloc, which are otherwise easy to miss. But since there was no
general documentation available on atomic writes, we added a bit more
info around it, just enough to cover ext4.

>> +~~~~~~~~~~~~~~~~~~~~~
>> +
>> +Applications can use the ``pwritev2()`` system call with the ``RWF_ATOMIC`` flag
>> +to perform atomic writes:
>> +
>> +.. code-block:: c
>> +
>> +    pwritev2(fd, iov, iovcnt, offset, RWF_ATOMIC);
>> +
>> +The write must be aligned to the filesystem's block size and not exceed the
>> +filesystem's maximum atomic write unit size.
>> +See ``generic_atomic_write_valid()`` for more details.
>> +
>> +``statx()`` system call with ``STATX_WRITE_ATOMIC`` flag can provides following
>> +details:
>> +
>> + * ``stx_atomic_write_unit_min``: Minimum size of an atomic write request.
>> + * ``stx_atomic_write_unit_max``: Maximum size of an atomic write request.
>> + * ``stx_atomic_write_segments_max``: Upper limit for segments. The number of
>> +   separate memory buffers that can be gathered into a write operation
>
> there will also be stx_atomic_write_unit_max_opt, as queued for 6.16
>
> For HW-only support, I think that it is ok to just return same as 
> stx_atomic_write_unit_max when we can atomic write > 1 filesystem block
>

Yes, so for HW-only support like ext4 it may not be strictly required.
To avoid a dependency on the XFS patch series, I think it will be better
if we add those changes after XFS multi-fsblock atomic write has landed :)


>> +   (e.g., the iovcnt parameter for IOV_ITER).
>
>
>> Currently, this is always set to one.
>
> JFYI, for xfs supporting filesystem-based atomic writes only, i.e. no HW 
> support, we could set this to a higher value
>

Yes. But again, that is an XFS-specific detail, not strictly relevant to
the ext4 atomic write documentation.

>> +
>> +The STATX_ATTR_WRITE_ATOMIC flag in ``statx->attributes`` is set if atomic
>> +writes are supported.
>> +
>> +.. _atomic_write_bdev_support:
>> +
>> +Hardware Support
>> +----------------
>> +
>> +The underlying storage device must support atomic write operations.
>> +Modern NVMe and SCSI devices often provide this capability.
>> +The Linux kernel exposes this information through sysfs:
>> +
>> +* ``/sys/block/<device>/queue/atomic_write_unit_min`` - Minimum atomic write size
>> +* ``/sys/block/<device>/queue/atomic_write_unit_max`` - Maximum atomic write size
>
> there is also the max bytes and boundary files. I am not sure if it was 
> intentional to omit them.
>

The intention of this section was mainly for a sysadmin to first check
whether the underlying block device supports atomic writes and what its
atomic write unit (awu) limits are, in order to decide on an appropriate
blocksize and/or clustersize for the ext4 filesystem.

See the section "Creating Filesystems with Atomic Write Support", which
refers to this section first.
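
As a rough illustration of that sysadmin check, here is a minimal sketch
in C. It is only a sketch under assumptions: the helper names
(read_sysfs_ulong, bdev_atomic_write_units) and the sysfs_root parameter
are made up for this example, and I am assuming the queue attributes
carry the "_bytes" suffix used in the block sysfs ABI
(atomic_write_unit_min_bytes, atomic_write_unit_max_bytes), whereas the
quoted text above uses the shorter names. Please verify the names against
your kernel before relying on this.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/*
 * Read one unsigned decimal value from a sysfs-style attribute file.
 * Returns 0 on success and stores the value in *out, -1 on failure.
 */
static int read_sysfs_ulong(const char *path, unsigned long *out)
{
	char buf[64];
	char *end;
	FILE *f;

	assert(path != NULL && out != NULL);

	f = fopen(path, "r");
	if (!f)
		return -1;
	if (!fgets(buf, sizeof(buf), f)) {
		fclose(f);
		return -1;
	}
	fclose(f);

	errno = 0;
	*out = strtoul(buf, &end, 10);
	if (errno != 0 || end == buf)
		return -1;
	return 0;
}

/*
 * Hypothetical helper: fetch the min/max atomic write unit (in bytes) for
 * a block device name such as "nvme0n1". sysfs_root is normally
 * "/sys/block"; it is a parameter only so the helper can be pointed at a
 * test directory. The attribute names are an assumption (see above).
 *
 * Returns 1 if both values are nonzero (device advertises atomic writes),
 * 0 if either value is zero, -1 if an attribute could not be read.
 */
static int bdev_atomic_write_units(const char *sysfs_root, const char *dev,
				   unsigned long *awu_min,
				   unsigned long *awu_max)
{
	char path[512];
	int n;

	n = snprintf(path, sizeof(path),
		     "%s/%s/queue/atomic_write_unit_min_bytes",
		     sysfs_root, dev);
	if (n < 0 || (size_t)n >= sizeof(path) ||
	    read_sysfs_ulong(path, awu_min))
		return -1;

	n = snprintf(path, sizeof(path),
		     "%s/%s/queue/atomic_write_unit_max_bytes",
		     sysfs_root, dev);
	if (n < 0 || (size_t)n >= sizeof(path) ||
	    read_sysfs_ulong(path, awu_max))
		return -1;

	return (*awu_min != 0 && *awu_max != 0) ? 1 : 0;
}
```

A caller would invoke something like
bdev_atomic_write_units("/sys/block", "nvme0n1", &min, &max) and, on a
return value of 1, pick a blocksize and/or clustersize that fits within
the reported min/max range.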

>> +
>> +Nonzero values for these attributes indicate that the device supports
>> +atomic writes.
>> +
>> +See Also
>
> thanks,
> John

Thanks for the review, John.

I think the current documentation mainly caters to ext4-specific
implementation notes on single and multi-fsblock atomic writes.

IMO, it is ok for us to keep this documentation as is for v6.16, and
let's work on a more general doc which can cover details like:
- block device driver support (scsi & nvme)
- block layer support (bio split & merge)
- Filesystem & iomap support (iomap, ext4, xfs)
- VFS layer support (statx, pwritev2...)

We can add this documentation in the respective subsystem directories
and add a common document where the VFS details are kept, which will
refer to these subsystem-specific details.

Thoughts?

-ritesh
