From: "Darrick J. Wong" <djwong@kernel.org>
To: Dave Chinner <david@fromorbit.com>
Cc: John Garry <john.g.garry@oracle.com>,
axboe@kernel.dk, kbusch@kernel.org, hch@lst.de, sagi@grimberg.me,
martin.petersen@oracle.com, viro@zeniv.linux.org.uk,
brauner@kernel.org, dchinner@redhat.com, jejb@linux.ibm.com,
linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-nvme@lists.infradead.org, linux-scsi@vger.kernel.org,
linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
linux-security-module@vger.kernel.org, paul@paul-moore.com,
jmorris@namei.org, serge@hallyn.com
Subject: Re: [PATCH RFC 03/16] xfs: Support atomic write for statx
Date: Fri, 5 May 2023 15:10:48 -0700
Message-ID: <20230505221048.GL15394@frogsfrogsfrogs>
In-Reply-To: <20230503221749.GF3223426@dread.disaster.area>

On Thu, May 04, 2023 at 08:17:49AM +1000, Dave Chinner wrote:
> On Wed, May 03, 2023 at 06:38:08PM +0000, John Garry wrote:
> > Support providing info on atomic write unit min and max.
> >
> > Darrick Wong originally authored this change.
> >
> > Signed-off-by: John Garry <john.g.garry@oracle.com>
> > ---
> > fs/xfs/xfs_iops.c | 10 ++++++++++
> > 1 file changed, 10 insertions(+)
> >
> > diff --git a/fs/xfs/xfs_iops.c b/fs/xfs/xfs_iops.c
> > index 24718adb3c16..e542077704aa 100644
> > --- a/fs/xfs/xfs_iops.c
> > +++ b/fs/xfs/xfs_iops.c
> > @@ -614,6 +614,16 @@ xfs_vn_getattr(
> >  			stat->dio_mem_align = bdev_dma_alignment(bdev) + 1;
> >  			stat->dio_offset_align = bdev_logical_block_size(bdev);
> >  		}
> > +		if (request_mask & STATX_WRITE_ATOMIC) {
> > +			struct xfs_buftarg *target = xfs_inode_buftarg(ip);
> > +			struct block_device *bdev = target->bt_bdev;
> > +
> > +			stat->atomic_write_unit_min = queue_atomic_write_unit_min(bdev->bd_queue);
> > +			stat->atomic_write_unit_max = queue_atomic_write_unit_max(bdev->bd_queue);
>
> I'm not sure this is right.
>
> Given that we may have a 4kB physical sector device, XFS will not
> allow IOs smaller than physical sector size. The initial values of
> queue_atomic_write_unit_min/max() will be (1 << SECTOR_SHIFT) which
> is 512 bytes. IOs done with 4kB sector size devices will fail in
> this case.
>
> Further, XFS has a software sector size - it can define the sector
> size for the filesystem to be 4kB on a 512 byte sector device. And
> in that case, the filesystem will reject 512 byte sized/aligned IOs
> as they are smaller than the filesystem sector size (i.e. a config
> that prevents sub-physical sector IO for 512 logical/4kB physical
> devices).

Yep. I'd forgotten about those.

> There may be other filesystem constraints - realtime devices have fixed
> minimum allocation sizes which may be larger than atomic write
> limits, which means that IO completion needs to split extents into
> multiple unwritten/written extents, extent size hints might be in
> use meaning we have different allocation alignment constraints to
> atomic write constraints, stripe alignment of extent allocation may
> throw out atomic write alignment, etc.
>
> These are all solvable, but we need to make sure that the
> filesystem constraints are taken into account here, not just the
> block device limits.
>
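To make the sector size point concrete, here is a minimal userspace
sketch of clamping the device's atomic write limits against the
filesystem sector size. It is not the patch and not XFS code:
struct atomic_limits, is_pow2() and clamp_atomic_limits() are
hypothetical names, and the power-of-two requirement and the "0 means
unsupported" convention are assumptions made for illustration. It covers
only the sector size constraint, not rt extents, extent size hints or
stripe alignment.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the advertised limits; not an XFS structure. */
struct atomic_limits {
	uint32_t unit_min;	/* bytes; 0 means atomic writes not advertised */
	uint32_t unit_max;	/* bytes; 0 means atomic writes not advertised */
};

static bool is_pow2(uint32_t v)
{
	return v != 0 && (v & (v - 1)) == 0;
}

/*
 * Combine the block device limits with the filesystem sector size.
 *
 * dev_min, dev_max: what the device queue reports, in bytes.
 * fs_sectsize: filesystem sector size in bytes, which may be larger
 *              than the device's logical block size.
 *
 * Assumption: all three values must be powers of two.
 */
static struct atomic_limits
clamp_atomic_limits(uint32_t dev_min, uint32_t dev_max, uint32_t fs_sectsize)
{
	struct atomic_limits lim = { 0, 0 };

	if (!is_pow2(dev_min) || !is_pow2(dev_max) || !is_pow2(fs_sectsize))
		return lim;
	if (dev_min > dev_max)
		return lim;

	/* The filesystem rejects IO smaller than its own sector size. */
	lim.unit_min = dev_min > fs_sectsize ? dev_min : fs_sectsize;
	lim.unit_max = dev_max;

	/* If raising the minimum pushed it past the maximum, advertise nothing. */
	if (lim.unit_min > lim.unit_max) {
		lim.unit_min = 0;
		lim.unit_max = 0;
	}
	return lim;
}
```

With a device reporting 512..65536 bytes and a 4kB filesystem sector
size, clamp_atomic_limits(512, 65536, 4096) yields 4096..65536. With a
device maximum of 2048 bytes the same call yields 0..0, i.e. nothing is
advertised.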
> As such, it is probably better to query these limits at filesystem
> mount time and add them to the xfs buftarg (same as we do for
> logical and physical sector sizes) and then use the xfs buftarg

I'm not sure that's right either. Device mapper can switch the
underlying storage out from under us, yes? That would be a dirty thing
to do in my book, but I've long wondered if we need to be more resilient
to that kind of evilness.

> values rather than having to go all the way to the device queue
> here. That way we can ensure at mount time that atomic write limits
> don't conflict with logical/physical IO limits, and we can further
> constrain atomic limits during mount without always having to
> recalculate those limits from first principles on every stat()
> call...

With Christoph's recent patchset to allow block devices to call back
into filesystems, we could add one for "device queue limits changed"
that would cause recomputation of those elements, solving what I was
just mumbling about above.
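
Roughly, the shape would be: compute the limits once at mount, cache
them next to the sector size, recompute them from the callback, and have
getattr only read the cached values. A minimal userspace sketch follows.
Every name in it (struct dev_limits, struct buftarg_sketch,
buftarg_mount(), buftarg_limits_changed(), buftarg_fill_stat()) is
hypothetical; none of this is the real xfs_buftarg or the interface from
Christoph's patchset, and locking against concurrent readers is left out
entirely.

```c
#include <stdint.h>

/* Hypothetical stand-in for the device queue limits; not the kernel's struct. */
struct dev_limits {
	uint32_t atomic_min;	/* bytes; 0 means unsupported */
	uint32_t atomic_max;	/* bytes; 0 means unsupported */
};

/* Hypothetical stand-in for the buftarg fields under discussion. */
struct buftarg_sketch {
	uint32_t sectorsize;	/* filesystem sector size, bytes */
	uint32_t awu_min;	/* cached atomic write unit min, bytes */
	uint32_t awu_max;	/* cached atomic write unit max, bytes */
};

/* Recompute the cached limits from the current device limits. */
static void
buftarg_recompute(struct buftarg_sketch *bt, const struct dev_limits *dl)
{
	/* Never advertise a minimum below the filesystem sector size. */
	uint32_t min = dl->atomic_min > bt->sectorsize ?
			dl->atomic_min : bt->sectorsize;

	if (dl->atomic_min == 0 || dl->atomic_max == 0 ||
	    min > dl->atomic_max) {
		bt->awu_min = 0;
		bt->awu_max = 0;
		return;
	}
	bt->awu_min = min;
	bt->awu_max = dl->atomic_max;
}

/* Mount time: record the sector size and compute the limits once. */
static void
buftarg_mount(struct buftarg_sketch *bt, uint32_t sectorsize,
	      const struct dev_limits *dl)
{
	bt->sectorsize = sectorsize;
	buftarg_recompute(bt, dl);
}

/* Hypothetical "device queue limits changed" callback. */
static void
buftarg_limits_changed(struct buftarg_sketch *bt,
		       const struct dev_limits *new_dl)
{
	buftarg_recompute(bt, new_dl);
}

/* What the getattr path would do: read the cached values, nothing else. */
static void
buftarg_fill_stat(const struct buftarg_sketch *bt,
		  uint32_t *unit_min, uint32_t *unit_max)
{
	*unit_min = bt->awu_min;
	*unit_max = bt->awu_max;
}
```

The point of the split is that the stat path never touches the device
queue; if the storage is swapped for something with smaller limits, the
callback is the only place that has to notice.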
--D

> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com