From: John Garry <john.g.garry@oracle.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: hch@lst.de, cem@kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs: disallow atomic writes on DAX
Date: Fri, 18 Jul 2025 09:15:07 +0100 [thread overview]
Message-ID: <1b61ffa1-870d-4b30-9ba8-014a9ca5d33f@oracle.com> (raw)
In-Reply-To: <20250717160255.GP2672049@frogsfrogsfrogs>
On 17/07/2025 17:02, Darrick J. Wong wrote:
> On Thu, Jul 17, 2025 at 03:19:00PM +0000, John Garry wrote:
>> Atomic writes are not currently supported for DAX, but two problems exist:
>> - we may go down DAX write path for IOCB_ATOMIC, which does not handle
>> IOCB_ATOMIC properly
>> - we report non-zero atomic write limits in statx (for DAX inodes)
>>
>> We may want atomic writes support on DAX in future, but just disallow for
>> now.
>>
>> For this, ensure when IOCB_ATOMIC is set that we check the write size
>> versus the atomic write min and max before branching off to the DAX write
>> path. This is not strictly required for DAX, as we should not get this far
>> in the write path as FMODE_CAN_ATOMIC_WRITE should not be set.
>>
>> In addition, due to reflink being supported for DAX, we automatically get
>> CoW-based atomic writes support being advertised. Remedy this by
>> disallowing atomic writes for a DAX inode for both sw and hw modes.
>
> ...because it's fsdax and who's really going to test/use software atomic
> writes there ?
Right
>
>> Finally make atomic write size and DAX mount always mutually exclusive.
>
> Why? You could have a rt-on-pmem filesystem with S_DAX files, and still
> want to do atomic writes to files on the data device.
How can I test that, i.e. put something on data device?
I tried something like this:
$mkfs.xfs -f -m rmapbt=0,reflink=1 -d rtinherit=1 -r rtdev=/dev/pmem0
/dev/pmem1
$mount /dev/pmem1 mnt -o dax=always,rtdev=/dev/pmem0 -o
max_atomic_write=16k
$mkdir mnt/non_rt
$xfs_io -c "chattr -t" mnt/non_rt/ #make non-rt
$touch mnt/non_rt/file
$xfs_io -c "lsattr -v" mnt/non_rt/file
[has-xattr] mnt/non_rt/file
$xfs_io -c "statx -r -m 0x10000" mnt/non_rt/file
---
stat.atomic_write_unit_min = 0
stat.atomic_write_unit_max = 0
stat.atomic_write_segments_max = 0
---
I thought that losing the rtinherit flag would put the file on the data
device. From adding some kernel debug, it seems that this file is still
IS_DAX() == true
>
>> Reported-by: "Darrick J. Wong" <djwong@kernel.org>
>> Fixes: 9dffc58f2384 ("xfs: update atomic write limits")
>> Signed-off-by: John Garry <john.g.garry@oracle.com>
>>
>> diff --git a/fs/xfs/xfs_file.c b/fs/xfs/xfs_file.c
>> index db21b5a4b881..84876f41da93 100644
>> --- a/fs/xfs/xfs_file.c
>> +++ b/fs/xfs/xfs_file.c
>> @@ -1102,9 +1102,6 @@ xfs_file_write_iter(
>> if (xfs_is_shutdown(ip->i_mount))
>> return -EIO;
>>
>> - if (IS_DAX(inode))
>> - return xfs_file_dax_write(iocb, from);
>> -
>> if (iocb->ki_flags & IOCB_ATOMIC) {
>> if (ocount < xfs_get_atomic_write_min(ip))
>> return -EINVAL;
>> @@ -1117,6 +1114,9 @@ xfs_file_write_iter(
>> return ret;
>> }
>>
>> + if (IS_DAX(inode))
>> + return xfs_file_dax_write(iocb, from);
>> +
>> if (iocb->ki_flags & IOCB_DIRECT) {
>> /*
>> * Allow a directio write to fall back to a buffered
>> diff --git a/fs/xfs/xfs_inode.h b/fs/xfs/xfs_inode.h
>> index 07fbdcc4cbf5..b142cd4f446a 100644
>> --- a/fs/xfs/xfs_inode.h
>> +++ b/fs/xfs/xfs_inode.h
>> @@ -356,11 +356,22 @@ static inline bool xfs_inode_has_bigrtalloc(const struct xfs_inode *ip)
>> (XFS_IS_REALTIME_INODE(ip) ? \
>> (ip)->i_mount->m_rtdev_targp : (ip)->i_mount->m_ddev_targp)
>>
>> -static inline bool xfs_inode_can_hw_atomic_write(const struct xfs_inode *ip)
>> +static inline bool xfs_inode_can_hw_atomic_write(struct xfs_inode *ip)
>
> Why drop const here? VFS_IC() should be sufficient, I think.
>
I dropped that const as I got a complaint about ignoring the const when
passing to VFS_I(). But, as you say, I can use VFS_IC()
Thanks,
John
next prev parent reply other threads:[~2025-07-18 8:15 UTC|newest]
Thread overview: 4+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-07-17 15:19 [PATCH] xfs: disallow atomic writes on DAX John Garry
2025-07-17 16:02 ` Darrick J. Wong
2025-07-18 8:15 ` John Garry [this message]
2025-07-18 19:38 ` Darrick J. Wong
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1b61ffa1-870d-4b30-9ba8-014a9ca5d33f@oracle.com \
--to=john.g.garry@oracle.com \
--cc=cem@kernel.org \
--cc=djwong@kernel.org \
--cc=hch@lst.de \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).