From: John Garry <john.g.garry@oracle.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: brauner@kernel.org, cem@kernel.org, dchinner@redhat.com,
hch@lst.de, linux-xfs@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
ojaswin@linux.ibm.com, ritesh.list@gmail.com,
martin.petersen@oracle.com
Subject: Re: [PATCH RFC 07/10] xfs: Add xfs_file_dio_write_atomic()
Date: Thu, 6 Feb 2025 10:43:08 +0000 [thread overview]
Message-ID: <2150abab-5c70-424b-ad83-74868f8afc8a@oracle.com> (raw)
In-Reply-To: <20250205195540.GY21808@frogsfrogsfrogs>
On 05/02/2025 19:55, Darrick J. Wong wrote:
> On Tue, Feb 04, 2025 at 12:01:24PM +0000, John Garry wrote:
>> Add xfs_file_dio_write_atomic() for dedicated handling of atomic writes.
>>
>> In case of -EAGAIN being returned from iomap_dio_rw(), reissue the write
>> in CoW-based atomic write mode.
>>
>> In the CoW-based atomic write mode, first unshare blocks so that we don't
>> have a cow fork for the data in the range which we are writing.
>>
>> Signed-off-by: John Garry <john.g.garry@oracle.com>
>> ---
>> fs/xfs/xfs_file.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++
>> 1 file changed, 51 insertions(+)
>>
>> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
>> index fd05b66aea3f..12af5cdc3094 100644
>> --- a/fs/xfs/xfs_file.c
>> +++ b/fs/xfs/xfs_file.c
>> @@ -619,6 +619,55 @@ xfs_file_dio_write_aligned(
>> return ret;
>> }
>>
>> +static noinline ssize_t
>> +xfs_file_dio_write_atomic(
>> + struct xfs_inode *ip,
>> + struct kiocb *iocb,
>> + struct iov_iter *from)
>> +{
>> + unsigned int iolock = XFS_IOLOCK_SHARED;
>> + bool use_cow = false;
>> + unsigned int dio_flags;
>> + ssize_t ret;
>> +
>> +retry:
>> + ret = xfs_ilock_iocb_for_write(iocb, &iolock);
>> + if (ret)
>> + return ret;
>> +
>> + ret = xfs_file_write_checks(iocb, from, &iolock);
>> + if (ret)
>> + goto out_unlock;
>> +
>> + if (use_cow) {
>> + ret = xfs_reflink_unshare(ip, iocb->ki_pos,
>> + iov_iter_count(from));
>
> Nit: continuation lines should be indented two tabs:
>
> ret = xfs_reflink_unshare(ip, iocb->ki_pos,
> iov_iter_count(from));
ok
>
>> + if (ret)
>> + goto out_unlock;
>> + }
>> +
>> + trace_xfs_file_direct_write(iocb, from);
>> + if (use_cow)
>> + dio_flags = IOMAP_DIO_ATOMIC_COW;
>> + else
>> + dio_flags = 0;
>
> I also think you could eliminate use_cow by initializing dio_flags to
> zero at the top, OR'ing in IOMAP_DIO_ATOMIC_COW in the retry clause
> below, and using (dio_flags & IOMAP_DIO_ATOMIC_COW) to determine if you
> should call unshare above.
ok, fine, if you think that it is better
>
> Note: This serializes all the software untorn direct writes. I think
> a more performant solution would allocate the cow staging blocks ondisk,
> attach them to the directio ioend context, and alter ->iomap_begin and
> the ioend remap to use the attached blocks, but that's a lot more
> surgery.
sure, that does sound like it's quite intrusive. But whatever we do I
would like to keep the behaviour that racing reads and atomic writes
mean that a read sees all old or all new data. That is how SCSI and NVMe
behaves, even though it is not an advertised atomic write feature.
Thanks,
John
next prev parent reply other threads:[~2025-02-06 10:43 UTC|newest]
Thread overview: 35+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-02-04 12:01 [PATCH RFC 00/10] large atomic writes for xfs with CoW John Garry
2025-02-04 12:01 ` [PATCH RFC 01/10] xfs: Switch atomic write size check in xfs_file_write_iter() John Garry
2025-02-04 12:01 ` [PATCH RFC 02/10] xfs: Refactor xfs_reflink_end_cow_extent() John Garry
2025-02-05 19:50 ` Darrick J. Wong
2025-02-06 10:35 ` John Garry
2025-02-06 21:38 ` Darrick J. Wong
2025-02-04 12:01 ` [PATCH RFC 03/10] iomap: Support CoW-based atomic writes John Garry
2025-02-05 20:11 ` Darrick J. Wong
2025-02-06 11:21 ` John Garry
2025-02-06 21:40 ` Darrick J. Wong
2025-02-04 12:01 ` [PATCH RFC 04/10] xfs: Make xfs_find_trim_cow_extent() public John Garry
2025-02-04 12:01 ` [PATCH RFC 05/10] xfs: Reflink CoW-based atomic write support John Garry
2025-02-04 12:01 ` [PATCH RFC 06/10] xfs: iomap " John Garry
2025-02-05 20:05 ` Darrick J. Wong
2025-02-06 11:10 ` John Garry
2025-02-06 21:44 ` Darrick J. Wong
2025-02-07 11:48 ` John Garry
2025-02-04 12:01 ` [PATCH RFC 07/10] xfs: Add xfs_file_dio_write_atomic() John Garry
2025-02-05 19:55 ` Darrick J. Wong
2025-02-06 10:43 ` John Garry [this message]
2025-02-10 16:59 ` John Garry
2025-02-04 12:01 ` [PATCH RFC 08/10] xfs: Commit CoW-based atomic writes atomically John Garry
2025-02-05 19:47 ` Darrick J. Wong
2025-02-06 10:27 ` John Garry
2025-02-06 21:50 ` Darrick J. Wong
2025-02-07 11:52 ` John Garry
2025-02-04 12:01 ` [PATCH RFC 09/10] xfs: Update atomic write max size John Garry
2025-02-05 19:41 ` Darrick J. Wong
2025-02-06 9:15 ` John Garry
2025-02-06 21:54 ` Darrick J. Wong
2025-02-07 11:53 ` John Garry
2025-02-04 12:01 ` [PATCH RFC 10/10] xfs: Allow block allocator to take an alignment hint John Garry
2025-02-05 19:20 ` Darrick J. Wong
2025-02-06 8:10 ` John Garry
2025-02-06 21:54 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=2150abab-5c70-424b-ad83-74868f8afc8a@oracle.com \
--to=john.g.garry@oracle.com \
--cc=brauner@kernel.org \
--cc=cem@kernel.org \
--cc=dchinner@redhat.com \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=martin.petersen@oracle.com \
--cc=ojaswin@linux.ibm.com \
--cc=ritesh.list@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox