From: Nirjhar Roy <nirjhar@linux.ibm.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Catherine Hoang <catherine.hoang@oracle.com>,
linux-xfs@vger.kernel.org, ritesh.list@gmail.com,
ojaswin@linux.ibm.com, fstests@vger.kernel.org, zlang@kernel.org
Subject: Re: [PATCH v2] xfs: add a test for atomic writes
Date: Fri, 20 Dec 2024 10:32:51 +0530 [thread overview]
Message-ID: <b6b5055f-e5ad-4e93-93b6-546584a6910a@linux.ibm.com> (raw)
In-Reply-To: <20241219172734.GE6160@frogsfrogsfrogs>
On 12/19/24 22:57, Darrick J. Wong wrote:
> On Thu, Dec 19, 2024 at 08:43:36PM +0530, Nirjhar Roy wrote:
>> On Mon, 2024-12-16 at 18:08 -0800, Catherine Hoang wrote:
>>> Add a test to validate the new atomic writes feature.
>>>
>>> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
>>> ---
>>> common/rc | 14 ++++++++
>>> tests/xfs/611 | 81
>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>> tests/xfs/611.out | 2 ++
>> Now that ext4 also has support for block atomic writes, do you think it
>> appropritate to put it under generic?
>>> 3 files changed, 97 insertions(+)
>>> create mode 100755 tests/xfs/611
>>> create mode 100644 tests/xfs/611.out
>>>
>>> diff --git a/common/rc b/common/rc
>>> index 2ee46e51..b9da749e 100644
>>> --- a/common/rc
>>> +++ b/common/rc
>>> @@ -5148,6 +5148,20 @@ _require_scratch_btime()
>>> _scratch_unmount
>>> }
>>>
>>> +_require_scratch_write_atomic()
>>> +{
>>> + _require_scratch
>>> + _scratch_mkfs > /dev/null 2>&1
>>> + _scratch_mount
>> Minor: Do we need the _scratch_mount and _scratch_unmount? We can
>> directly statx the underlying device too, right?
> Yes, we need the scratch fs, because the filesystem might not support
> untorn writes even if the underlying block device does. Or it can
> decide to constrain the supported io sizes.
Oh, right. We need both the file system and the device to support the
atomic write feature.
>
>>> +
>>> + export STATX_WRITE_ATOMIC=0x10000
>>> + $XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC" $SCRATCH_MNT
>>> \
>>> + | grep atomic >>$seqres.full 2>&1 || \
>>> + _notrun "write atomic not supported by this filesystem"
>> Are we assuming that the SCRATCH_DEV supports atomic writes here? If
>> not, do you think the idea of checking if the underlying device
>> supports atomic writes will be appropriate here?
>>
>> I tried running the test with a loop device (with no atomic writes
>> support) and this function did not execute _notrun. The test did fail
>> expectedly with "atomic write min 0, should be fs block size 4096".
> Oh, yeah, awu_min==awu_max==0 should be an automatic _notrun.
Yes.
>
>> However, the test shouldn't have begun or reached this stage if the
>> underlying device doesn't support atomic writes, right?
> _require* helpers decide if the test preconditions have been satisfied,
> so this is exactly where the test would bail out.
>> Maybe look at how scsi_debug is used? Tests like tests/generic/704 and
>> common/scsi_debug?
>>> +
>>> + _scratch_unmount
>>> +}
>>> +
>>> _require_inode_limits()
>>> {
>>> if [ $(_get_free_inode $TEST_DIR) -eq 0 ]; then
>>> diff --git a/tests/xfs/611 b/tests/xfs/611
>>> new file mode 100755
>>> index 00000000..a26ec143
>>> --- /dev/null
>>> +++ b/tests/xfs/611
>>> @@ -0,0 +1,81 @@
>>> +#! /bin/bash
>>> +# SPDX-License-Identifier: GPL-2.0
>>> +# Copyright (c) 2024 Oracle. All Rights Reserved.
>>> +#
>>> +# FS QA Test 611
>>> +#
>>> +# Validate atomic write support
>>> +#
>>> +. ./common/preamble
>>> +_begin_fstest auto quick rw
>>> +
>>> +_supported_fs xfs
>>> +_require_scratch
>>> +_require_scratch_write_atomic
>>> +
>>> +test_atomic_writes()
>>> +{
>>> + local bsize=$1
>>> +
>>> + _scratch_mkfs_xfs -b size=$bsize >> $seqres.full
>>> + _scratch_mount
>>> + _xfs_force_bdev data $SCRATCH_MNT
>>> +
>>> + testfile=$SCRATCH_MNT/testfile
>>> + touch $testfile
>>> +
>>> + file_min_write=$($XFS_IO_PROG -c "statx -r -m
>>> $STATX_WRITE_ATOMIC" $testfile | \
>>> + grep atomic_write_unit_min | cut -d ' ' -f 3)
>>> + file_max_write=$($XFS_IO_PROG -c "statx -r -m
>>> $STATX_WRITE_ATOMIC" $testfile | \
>>> + grep atomic_write_unit_max | cut -d ' ' -f 3)
>>> + file_max_segments=$($XFS_IO_PROG -c "statx -r -m
>>> $STATX_WRITE_ATOMIC" $testfile | \
>>> + grep atomic_write_segments_max | cut -d ' ' -f 3)
>>> +
>> Minor: A refactoring suggestion. Can we put the commands to fetch the
>> atomic_write_unit_min , atomic_write_unit_max and
>> atomic_write_segments_max in a function and re-use them? We are using
>> these commands to get bdev_min_write/bdev_max_write as well, so a
>> function might make the code look more compact. Some maybe something
>> like:
>>
>> _get_at_wr_unit_min()
> Don't reuse another English word ("at") as an abbreviation, please.
>
> _atomic_write_unit_min()
Okay.
>
>> {
>> $XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC" $1 | grep
>> atomic_write_unit_min | \
>> grep -o '[0-9]\+'
>> }
>>
>> _get_at_wr_unit_max()
>> {
>> $XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC" $1 | grep
>> atomic_write_unit_max | \
>> grep -o '[0-9]\+'
>> }
>> and then,
>> file_min_write=$(_get_at_wr_unit_min $testfile) and similarly for file_max_write, file_max_segments, bdev_min_write/bdev_max_write
>>
>>
>>> + # Check that atomic min/max = FS block size
>>> + test $file_min_write -eq $bsize || \
>>> + echo "atomic write min $file_min_write, should be fs block
>>> size $bsize"
>>> + test $file_min_write -eq $bsize || \
>>> + echo "atomic write max $file_max_write, should be fs block
>>> size $bsize"
>>> + test $file_max_segments -eq 1 || \
>>> + echo "atomic write max segments $file_max_segments, should
>>> be 1"
>>> +
>>> + # Check that we can perform an atomic write of len = FS block
>>> size
>>> + bytes_written=$($XFS_IO_PROG -dc "pwrite -A -D 0 $bsize"
>>> $testfile | \
>>> + grep wrote | awk -F'[/ ]' '{print $2}')
>> is "$XFS_IO_PROG -dc pwrite -A -D 0 $bsize" $testfile actually making a
>> pwritev2 syscall?
>>
>> Let's look at the output below:
>> (tested with latest master of xfsprogs-dev (commit 90d6da68) on
>> pagesize and block size 4k (x86_64 vm)
>>
>> mount /dev/sdc /mnt1/test
>> touch /mnt1/test/new
>> strace -f xfs_io -c "pwrite -A -D 0 4096" /mnt1/test/new
> You need to pass -d to xfs_io to get directio mode. The test does that,
> but your command line doesn't.
Yes. Sorry I missed that.
>
>> <last few lines>
>> openat(AT_FDCWD, "/mnt1/test/new", O_RDWR) = 3
>> ...
>> ...
>> pwrite64(3,
>> "\315\315\315\315\315\315\315\315\315\315\315\315\315\315\315\315\315\3
>> 15\315\315\315\315\315\315\315\315\315\315\315\315\315\315"..., 4096,
>> 0) = 4096
> That seems like a bug though. "pwrite -A -D -V1 0 4096"?
Sorry, what is the bug that you are pointing out here?
The command given by you i.e, xfs_io -dc "pwrite -A -D -V1 4096" <path>
works fine.
--
NR
>
>> newfstatat(1, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x1),
>> ...}, AT_EMPTY_PATH) = 0
>> write(1, "wrote 4096/4096 bytes at offset "..., 34wrote 4096/4096 bytes
>> at offset 0
>> ) = 34
>> write(1, "4 KiB, 1 ops; 0.0001 sec (23.819"..., 644 KiB, 1 ops; 0.0001
>> sec (23.819 MiB/sec and 6097.5610 ops/sec)
>> ) = 64
>> exit_group(0)
>>
>> So the issues are as follows:
>> 1. file /mnt1/test/new is NOT opened with O_DIRECT flag i.e, direct io
>> mode which is one of the requirements for atomic write (buffered io
>> doesn't support atomic write, correct me if I am wrong).
>> 2. pwrite64 doesn't take the RWF_ATOMIC flag and hence I think this
>> write is just a non-atomic write with no stdout output difference as
>> such.
>>
>> Also if you look at the function
>>
>> do_pwrite() in xfsprogs-dev/io/pwrite.c
>>
>> static ssize_t
>> do_pwrite(
>> int fd,
>> off_t offset,
>> long long count,
>> size_t buffer_size,
>> int pwritev2_flags)
>> {
>> if (!vectors)
>> return pwrite(fd, io_buffer, min(count, buffer_size),
>> offset);
>>
>> return do_pwritev(fd, offset, count, pwritev2_flags);
>> }
>>
>> it will not call pwritev/pwritev2 unless we have vectors for which you
>> will need -V parameter with pwrite subcommand of xfs_io.
>>
>>
>> So I think the correct way to do this would be the following:
>>
>> bytes_written=$($XFS_IO_PROGS -c "open -d $testfile" -c "pwrite -A -D
>> -V 1 0 $bsize" | grep wrote | awk -F'[/ ]' '{print $2}').
> Agreed.
>
>> This also bring us to 2 more test cases that we can add:
>>
>> a. Atomic write with vec count > 1
>> $XFS_IO_PROGS -c "open -d $testfile" -c "pwrite -A -D -V 2 0 $bsize"
>> (This should fail with Invalid argument since currently iovec count is
>> restricted to 1)
>>
>> b.
>> Open a file withOUT O_DIRECT and try to perform an atomic write. This
>> should fail with Operation not Supported (EOPNOTSUPP). So something
>> like
>> $XFS_IO_PROGS -c "open -f $testfile" -c "pwrite -A -V 1 0 $bsize"
> Yeah, those are good subcases.
>
>> 3. It is better to use -b $bsize with pwrite else, the write might be
>> spilitted into multiple atomic writes. For example try the following:
>>
>> $XFS_IO_PROGS -c "open -fd $testfile" -c "pwrite -A -D -V 1 0 $((
>> $bsize * 2 ))"
>> The above is expected to fail as the size of the atomic write is
>> greater than the limit i.e, 1 block but it will still succeed. Look at
>> the strace and you will see 2 pwritev2 system calls. However the
>> following will fail expectedly with -EINVAL:
>> $XFS_IO_PROGS -c "open -fd $testfile" -c "pwrite -A -D -V 1 -b $((
>> $bsize * 2 )) 0 $(( $bsize * 2 ))"
> Good catch.
>
>>
>>
>>> + test $bytes_written -eq $bsize || echo "atomic write len=$bsize
>>> failed"
>>> +
>>> + # Check that we can perform an atomic write on an unwritten
>>> block
>>> + $XFS_IO_PROG -c "falloc $bsize $bsize" $testfile
>>> + bytes_written=$($XFS_IO_PROG -dc "pwrite -A -D $bsize $bsize"
>>> $testfile | \
>>> + grep wrote | awk -F'[/ ]' '{print $2}')
>>> + test $bytes_written -eq $bsize || echo "atomic write to
>>> unwritten block failed"
>>> +
>>> + # Check that we can perform an atomic write on a sparse hole
>>> + $XFS_IO_PROG -c "fpunch 0 $bsize" $testfile
>>> + bytes_written=$($XFS_IO_PROG -dc "pwrite -A -D 0 $bsize"
>>> $testfile | \
>>> + grep wrote | awk -F'[/ ]' '{print $2}')
>>> + test $bytes_written -eq $bsize || echo "atomic write to sparse
>>> hole failed"
>>> +
>>> + # Reject atomic write if len is out of bounds
>>> + $XFS_IO_PROG -dc "pwrite -A -D 0 $((bsize - 1))" $testfile 2>>
>>> $seqres.full && \
>>> + echo "atomic write len=$((bsize - 1)) should fail"
>>> + $XFS_IO_PROG -dc "pwrite -A -D 0 $((bsize + 1))" $testfile 2>>
>>> $seqres.full && \
>>> + echo "atomic write len=$((bsize + 1)) should fail"
>> Have we covered the scenario where the offset % len != 0 Should fail -
>> Should fail with Invalid arguments -EINVAL.
> I think you're right.
>
>> Also do you think adding similar tests with raw writes to the
>> underlying devices bypassing the fs layer will add some value? There
>> are slight less strict or different rules in the block layer which IMO
>> worth to be tested. Please let me know your thoughts.
> That should be in blktests.
>
>>> +
>>> + _scratch_unmount
>>> +}
>>> +
>>> +bdev_min_write=$($XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC"
>>> $SCRATCH_DEV | \
>>> + grep atomic_write_unit_min | cut -d ' ' -f 3)
>>> +bdev_max_write=$($XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC"
>>> $SCRATCH_DEV | \
>>> + grep atomic_write_unit_max | cut -d ' ' -f 3)
>>> +
>> Similar comment before - Refactor this into a function.
>>> +for ((bsize=$bdev_min_write; bsize<=bdev_max_write; bsize*=2)); do
>>> + _scratch_mkfs_xfs_supported -b size=$bsize >> $seqres.full 2>&1
>>> && \
>>> + test_atomic_writes $bsize
>>> +done;
>> Minor: This might fail on some archs(x86_64) if the kernel isn't
>> compiled without CONFIG_TRANSPARENT_HUGEPAGE to enable block size
>> greater than 4k on x86_64.
> Oh, yeah, that's a good catch.
>
> --D
>
>> --
>> NR
>>> +
>>> +# success, all done
>>> +echo Silence is golden
>>> +status=0
>>> +exit
>>> diff --git a/tests/xfs/611.out b/tests/xfs/611.out
>>> new file mode 100644
>>> index 00000000..b8a44164
>>> --- /dev/null
>>> +++ b/tests/xfs/611.out
>>> @@ -0,0 +1,2 @@
>>> +QA output created by 611
>>> +Silence is golden
>>
--
---
Nirjhar Roy
Linux Kernel Developer
IBM, Bangalore
next prev parent reply other threads:[~2024-12-20 5:03 UTC|newest]
Thread overview: 11+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-12-17 2:08 [PATCH v2] xfs: add a test for atomic writes Catherine Hoang
2024-12-19 0:41 ` Darrick J. Wong
2024-12-19 10:48 ` John Garry
2024-12-19 17:32 ` Darrick J. Wong
2024-12-20 20:57 ` Catherine Hoang
2024-12-19 15:13 ` Nirjhar Roy
2024-12-19 17:27 ` Darrick J. Wong
2024-12-20 5:02 ` Nirjhar Roy [this message]
2024-12-23 17:26 ` Darrick J. Wong
2024-12-24 5:10 ` Nirjhar Roy
2024-12-23 12:13 ` Ritesh Harjani
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=b6b5055f-e5ad-4e93-93b6-546584a6910a@linux.ibm.com \
--to=nirjhar@linux.ibm.com \
--cc=catherine.hoang@oracle.com \
--cc=djwong@kernel.org \
--cc=fstests@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=ojaswin@linux.ibm.com \
--cc=ritesh.list@gmail.com \
--cc=zlang@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox