From: Ritesh Harjani (IBM) <ritesh.list@gmail.com>
To: Dave Chinner <david@fromorbit.com>, John Garry <john.g.garry@oracle.com>
Cc: John Garry <john.g.garry@oracle.com>,
	chandan.babu@oracle.com, djwong@kernel.org, dchinner@redhat.com,
	hch@lst.de, viro@zeniv.linux.org.uk, brauner@kernel.org,
	jack@suse.cz, linux-xfs@vger.kernel.org,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	catherine.hoang@oracle.com, martin.petersen@oracle.com
Subject: Re: [PATCH v4 00/14] forcealign for xfs
Date: Tue, 10 Sep 2024 18:03:12 +0530
Message-ID: <8734m7henr.fsf@gmail.com>
In-Reply-To: <ZtlQt/7VHbOtQ+gY@dread.disaster.area>

Dave Chinner <david@fromorbit.com> writes:

> On Thu, Sep 05, 2024 at 09:26:25AM +0530, Ritesh Harjani wrote:
>> Dave Chinner <david@fromorbit.com> writes:
>> > On Wed, Sep 04, 2024 at 11:44:29PM +0530, Ritesh Harjani wrote:
>> >> 3. It is the FORCEALIGN feature which _mandates_ both allocation
>> >> (by using extsize hint) and de-allocation to happen _only_ in
>> >> extsize chunks.
>> >>
>> >>    i.e. forcealign mandates -
>> >>    - the logical and physical start offset should be aligned as
>> >>    per args->alignment
>> >>    - extent length be aligned as per args->prod/mod.
>> >>      If above two cannot be satisfied then return -ENOSPC.
>> >
>> > Yes.
>> >
>> >> 
>> >>    - Does the unmapping of extents also happen only in extsize
>> >>    chunks (with forcealign)?
>> >
>> > Yes, via use of xfs_inode_alloc_unitsize() in the high level code
>> > aligning the fsbno ranges to be unmapped.
>> >
>> > Remember, force align requires both logical file offset and
>> > physical block number to be correctly aligned,
>> 
>> This is where I would like to double-check. Even the
>> extsize hint feature (w/o FORCEALIGN) will try to allocate an aligned
>> physical start, logical start file offset, and length, right?
>
> No.
>
>> (Or does the extsize hint only restrict alignment to the logical start file
>> offset + length and not the physical start?)
>
> Neither.
>
> extsize hint by itself (i.e. existing behaviour) has no alignment
> effect at all. All it affects is -size- of the extent. i.e. once
> the extent start is chosen, extent size hints will trim the length
> of the extent to a multiple of the extent size hint. Alignment is
> not considered at all.
>
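
To make the distinction above concrete, here is a minimal sketch of the
two behaviours as described in this thread. It is not the XFS allocator
code: the helper names trim_len_to_extsize() and forcealign_ok() are
hypothetical, and all values are assumed to be in filesystem blocks.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: the extsize hint alone only affects length.
 * Once the start is chosen, the length is trimmed down to a multiple
 * of the hint; the start itself is not checked.
 */
static inline uint64_t trim_len_to_extsize(uint64_t len, uint64_t extsize)
{
	if (extsize == 0)
		return len;		/* no hint set, nothing to trim */
	return len - (len % extsize);
}

/*
 * Hypothetical helper: forcealign additionally requires the logical
 * file offset, the physical block number, and the length to all be
 * multiples of the alignment. A caller would fail the allocation
 * (-ENOSPC in the discussion above) when this returns false.
 */
static inline bool forcealign_ok(uint64_t file_off, uint64_t phys_blk,
				 uint64_t len, uint64_t align)
{
	if (align == 0)
		return true;		/* no alignment requested */
	return file_off % align == 0 &&
	       phys_blk % align == 0 &&
	       len % align == 0;
}
```

With a hint of 4 blocks, a 10-block extent would be trimmed to 8 blocks
wherever it starts, whereas the forcealign check also rejects a start
block such as 18.
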
>> Also it looks like there is no difference between the ATOMIC_WRITE and
>> FORCEALIGN features in XFS, correct? (except that ATOMIC_WRITE
>> adds additional natural alignment restrictions on pos and len).
>
> Atomic write requires additional hardware support, and it restricts
> the valid sizes of extent size hints that can be set. Only atomic
> writes can be done on files marked as configured for atomic writes;
> force alignment can be done on any file...
>
>> So why maintain 2 separate on disk inode flags for FORCEALIGN AND
>> ATOMIC_WRITE?
>
> The atomic write flag indicates that a file has been set up
> correctly for atomic writes to be issued reliably. Force
> alignment doesn't guarantee that - it's just a mechanism that tells
> the allocator to behave a specific way.
>
>> - Do you foresee FORCEALIGN to be also used at other places w/o
>> ATOMIC_WRITE where feature differentiation between the two on an
>> inode is required?
>
> They already exist. For example, reliably allocating huge page
> mappings on DAX filesystems requires 2MB forced alignment. 
>
>> - Will the same reasoning hold for XFS_SB_FEAT_RO_COMPAT_FORCEALIGN
>> & XFS_SB_FEAT_RO_COMPAT_ATOMICWRITES too?
>
> Same as above.
>
>> - But why ro_compat for ATOMICWRITES? There aren't any on-disk metadata
>> changes within the XFS filesystem to support atomic writes, right?
>
> Because if you downgrade the kernel to something that doesn't
> support atomic writes, then non-atomic sized/aligned data can be
> written to the file and/or torn writes can occur.
>
> Worse, extent size hints that don't match the underlying hardware
> support could be set up for inodes, and when the kernel is upgraded
> again then atomic writes will fail on inodes that have atomic write
> flags set on them....
>
>> Is it something to just prevent users from destroying their own data
>> by not allowing a rw mount from an older kernel where users could do
>> unaligned writes to files marked for atomic writes?
>> Or is there any other reasoning to prevent the XFS filesystem from becoming
>> inconsistent if an older kernel does a rw mount here?
>
> The older kernel does not know what the unknown inode flag means
> (i.e. atomic writes) and so, by definition, we cannot allow it to
> modify metadata or file data because it may not modify it in the
> correct way for that flag being set on the inode.
>
> Kernels that don't understand feature flags need to treat the
> filesystem as read-only, no matter how trivial the feature addition
> might seem.
>
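
The rule in the last quoted paragraph can be sketched as a simple mask
test. This is only an illustration under stated assumptions: the bit
values, the FEAT_RO_COMPAT_* names and check_ro_compat() are made up for
the example and are not the real XFS on-disk definitions or mount code.

```c
#include <stdint.h>

/* Illustrative bit assignments only; not the real on-disk values. */
#define FEAT_RO_COMPAT_KNOWN_A	(1u << 0)
#define FEAT_RO_COMPAT_KNOWN_B	(1u << 1)

/* Every ro-compat feature this (hypothetical) kernel understands. */
#define FEAT_RO_COMPAT_SUPPORTED \
	(FEAT_RO_COMPAT_KNOWN_A | FEAT_RO_COMPAT_KNOWN_B)

enum mount_decision {
	MOUNT_RW_OK,	/* all ro-compat bits are understood */
	MOUNT_RO_ONLY,	/* unknown bit set: must not modify anything */
};

/*
 * Any ro-compat bit outside the supported mask means the kernel cannot
 * know how to modify metadata or file data correctly, so only a
 * read-only mount is allowed.
 */
static inline enum mount_decision check_ro_compat(uint32_t sb_ro_compat)
{
	uint32_t unknown = sb_ro_compat & ~FEAT_RO_COMPAT_SUPPORTED;

	return unknown ? MOUNT_RO_ONLY : MOUNT_RW_OK;
}
```

Under this model an older kernel that predates a new ro-compat bit sees
it as unknown and is limited to a read-only mount, however small the
feature is.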

1. Will it require a fresh formatting of the filesystem with mkfs.xfs to
enable atomic writes (/forcealign) on XFS?
  a. Is that because reflink is not supported with atomic writes
  (/forcealign) today?

As I understand it, setting the forcealign attr on any inode checks
xfs_has_forcealign(mp). That means forcealign can _only_ be
enabled at mkfs time, and it also needs reflink to be disabled with
-m reflink=0. Right?

-ritesh
