Re: [PATCH v2] xfs: add a test for atomic writes

public inbox for linux-xfs@vger.kernel.org

From: Nirjhar Roy <nirjhar@linux.ibm.com>
To: "Darrick J. Wong" <djwong@kernel.org>
Cc: Catherine Hoang <catherine.hoang@oracle.com>,
	linux-xfs@vger.kernel.org, ritesh.list@gmail.com,
	ojaswin@linux.ibm.com, fstests@vger.kernel.org, zlang@kernel.org
Subject: Re: [PATCH v2] xfs: add a test for atomic writes
Date: Tue, 24 Dec 2024 10:40:40 +0530
Message-ID: <a49db46c-d593-4cf8-bbe2-960b4b9eb820@linux.ibm.com>
In-Reply-To: <20241223172652.GL6160@frogsfrogsfrogs>


On 12/23/24 22:56, Darrick J. Wong wrote:
> On Fri, Dec 20, 2024 at 10:32:51AM +0530, Nirjhar Roy wrote:
>> On 12/19/24 22:57, Darrick J. Wong wrote:
>>> On Thu, Dec 19, 2024 at 08:43:36PM +0530, Nirjhar Roy wrote:
>>>> On Mon, 2024-12-16 at 18:08 -0800, Catherine Hoang wrote:
>>>>> Add a test to validate the new atomic writes feature.
>>>>>
>>>>> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
>>>>> ---
>>>>>    common/rc         | 14 ++++++++
>>>>>    tests/xfs/611     | 81
>>>>> +++++++++++++++++++++++++++++++++++++++++++++++
>>>>>    tests/xfs/611.out |  2 ++
>>>> Now that ext4 also has support for block atomic writes, do you think it
>>>> would be appropriate to put it under generic?
> Yeah, it ought to be a generic test.
>
>>>>>    3 files changed, 97 insertions(+)
>>>>>    create mode 100755 tests/xfs/611
>>>>>    create mode 100644 tests/xfs/611.out
>>>>>
>>>>> diff --git a/common/rc b/common/rc
>>>>> index 2ee46e51..b9da749e 100644
>>>>> --- a/common/rc
>>>>> +++ b/common/rc
>>>>> @@ -5148,6 +5148,20 @@ _require_scratch_btime()
>>>>>    	_scratch_unmount
>>>>>    }
>>>>> +_require_scratch_write_atomic()
>>>>> +{
>>>>> +	_require_scratch
>>>>> +	_scratch_mkfs > /dev/null 2>&1
>>>>> +	_scratch_mount
>>>> Minor: Do we need the _scratch_mount and _scratch_unmount? We can
>>>> directly statx the underlying device too, right?
>>> Yes, we need the scratch fs, because the filesystem might not support
>>> untorn writes even if the underlying block device does.  Or it can
>>> decide to constrain the supported io sizes.
>> Oh, right. We need both the file system and the device to support the atomic
>> write feature.
>>>>> +
>>>>> +	export STATX_WRITE_ATOMIC=0x10000
>>>>> +	$XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC" $SCRATCH_MNT
>>>>> \
>>>>> +		| grep atomic >>$seqres.full 2>&1 || \
>>>>> +		_notrun "write atomic not supported by this filesystem"
>>>> Are we assuming that the SCRATCH_DEV supports atomic writes here? If
>>>> not, do you think the idea of checking if the underlying device
>>>> supports atomic writes will be appropriate here?
>>>>
>>>> I tried running the test with a loop device (with no atomic writes
>>>> support) and this function did not execute _notrun. The test did fail
>>>> expectedly with "atomic write min 0, should be fs block size 4096".
>>> Oh, yeah, awu_min==awu_max==0 should be an automatic _notrun.
>> Yes.
>>>> However, the test shouldn't have begun or reached this stage if the
>>>> underlying device doesn't support atomic writes, right?
>>> _require* helpers decide if the test preconditions have been satisfied,
>>> so this is exactly where the test would bail out.
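
A minimal sketch of the zero-limit bail-out agreed on above. The function
name atomic_write_limits_usable is hypothetical, and the caller is assumed
to have already extracted the min/max values from the statx output:

```shell
# Hypothetical precondition check: succeed only when both reported
# atomic write limits are non-zero (awu_min == awu_max == 0 means the
# device/filesystem combination offers no atomic write support).
atomic_write_limits_usable()
{
	local awu_min=$1 awu_max=$2

	[ "$awu_min" -gt 0 ] && [ "$awu_max" -gt 0 ]
}
```

Inside a _require* helper this could be used as
atomic_write_limits_usable "$min" "$max" || _notrun "write atomic not supported",
so the test is skipped instead of failing later on the block size comparison.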
>>>> Maybe look at how scsi_debug is used? Tests like tests/generic/704 and
>>>> common/scsi_debug?
>>>>> +
>>>>> +	_scratch_unmount
>>>>> +}
>>>>> +
>>>>>    _require_inode_limits()
>>>>>    {
>>>>>    	if [ $(_get_free_inode $TEST_DIR) -eq 0 ]; then
>>>>> diff --git a/tests/xfs/611 b/tests/xfs/611
>>>>> new file mode 100755
>>>>> index 00000000..a26ec143
>>>>> --- /dev/null
>>>>> +++ b/tests/xfs/611
>>>>> @@ -0,0 +1,81 @@
>>>>> +#! /bin/bash
>>>>> +# SPDX-License-Identifier: GPL-2.0
>>>>> +# Copyright (c) 2024 Oracle.  All Rights Reserved.
>>>>> +#
>>>>> +# FS QA Test 611
>>>>> +#
>>>>> +# Validate atomic write support
>>>>> +#
>>>>> +. ./common/preamble
>>>>> +_begin_fstest auto quick rw
>>>>> +
>>>>> +_supported_fs xfs
>>>>> +_require_scratch
>>>>> +_require_scratch_write_atomic
>>>>> +
>>>>> +test_atomic_writes()
>>>>> +{
>>>>> +    local bsize=$1
>>>>> +
>>>>> +    _scratch_mkfs_xfs -b size=$bsize >> $seqres.full
>>>>> +    _scratch_mount
>>>>> +    _xfs_force_bdev data $SCRATCH_MNT
>>>>> +
>>>>> +    testfile=$SCRATCH_MNT/testfile
>>>>> +    touch $testfile
>>>>> +
>>>>> +    file_min_write=$($XFS_IO_PROG -c "statx -r -m
>>>>> $STATX_WRITE_ATOMIC" $testfile | \
>>>>> +        grep atomic_write_unit_min | cut -d ' ' -f 3)
>>>>> +    file_max_write=$($XFS_IO_PROG -c "statx -r -m
>>>>> $STATX_WRITE_ATOMIC" $testfile | \
>>>>> +        grep atomic_write_unit_max | cut -d ' ' -f 3)
>>>>> +    file_max_segments=$($XFS_IO_PROG -c "statx -r -m
>>>>> $STATX_WRITE_ATOMIC" $testfile | \
>>>>> +        grep atomic_write_segments_max | cut -d ' ' -f 3)
>>>>> +
>>>> Minor: A refactoring suggestion. Can we put the commands to fetch the
>>>> atomic_write_unit_min , atomic_write_unit_max and
>>>> atomic_write_segments_max in a function and re-use them? We are using
>>>> these commands to get bdev_min_write/bdev_max_write as well, so a
>>>> function might make the code look more compact. So maybe something
>>>> like:
>>>>
>>>> _get_at_wr_unit_min()
>>> Don't reuse another English word ("at") as an abbreviation, please.
>>>
>>> _atomic_write_unit_min()
>> Okay.
>>>> {
>>>> 	$XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC" $1 | grep
>>>> atomic_write_unit_min | \
>>>> 		grep -o '[0-9]\+'
>>>> }
>>>>
>>>> _get_at_wr_unit_max()
>>>> {
>>>> 	$XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC" $1 | grep
>>>> atomic_write_unit_max | \
>>>> 		grep -o '[0-9]\+'
>>>> }
>>>>    and then,
>>>> file_min_write=$(_get_at_wr_unit_min $testfile) and similarly for file_max_write, file_max_segments, bdev_min_write/bdev_max_write
>>>>
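
A sketch of a single parameterised helper instead of one function per
field. It assumes the raw statx output has the form
"stat.<field> = <value>", which is what the "cut -d ' ' -f 3" in the patch
implies. The name _atomic_write_field is hypothetical, and the helper reads
from stdin so it can be tried without a device:

```shell
# Hypothetical helper: print the value of one statx field.
# Expects lines of the (assumed) form "stat.<field> = <value>" on stdin.
_atomic_write_field()
{
	local field=$1

	# Match the field name exactly, so that unit_min and unit_max
	# cannot be confused with any longer field name.
	awk -v f="stat.$field" '$1 == f { print $3 }'
}
```

A caller would pipe the statx output into it, for example
$XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC" $testfile | _atomic_write_field atomic_write_unit_min,
and the same helper then serves the file and the block device queries.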
>>>>
>>>>> +    # Check that atomic min/max = FS block size
>>>>> +    test $file_min_write -eq $bsize || \
>>>>> +        echo "atomic write min $file_min_write, should be fs block
>>>>> size $bsize"
>>>>> +    test $file_max_write -eq $bsize || \
>>>>> +        echo "atomic write max $file_max_write, should be fs block
>>>>> size $bsize"
>>>>> +    test $file_max_segments -eq 1 || \
>>>>> +        echo "atomic write max segments $file_max_segments, should
>>>>> be 1"
>>>>> +
>>>>> +    # Check that we can perform an atomic write of len = FS block
>>>>> size
>>>>> +    bytes_written=$($XFS_IO_PROG -dc "pwrite -A -D 0 $bsize"
>>>>> $testfile | \
>>>>> +        grep wrote | awk -F'[/ ]' '{print $2}')
>>>> Is $XFS_IO_PROG -dc "pwrite -A -D 0 $bsize" $testfile actually making a
>>>> pwritev2 syscall?
>>>>
>>>> Let's look at the output below:
>>>> (tested with latest master of xfsprogs-dev (commit 90d6da68) on
>>>> pagesize and block size 4k (x86_64 vm))
>>>>
>>>> mount /dev/sdc  /mnt1/test
>>>> touch /mnt1/test/new
>>>> strace -f xfs_io -c "pwrite -A -D 0 4096" /mnt1/test/new
>>> You need to pass -d to xfs_io to get directio mode.  The test does that,
>>> but your command line doesn't.
>> Yes. Sorry I missed that.
>>>> <last few lines>
>>>> openat(AT_FDCWD, "/mnt1/test/new", O_RDWR) = 3
>>>> ...
>>>> ...
>>>> pwrite64(3,
>>>> "\315\315\315\315\315\315\315\315\315\315\315\315\315\315\315\315\315\3
>>>> 15\315\315\315\315\315\315\315\315\315\315\315\315\315\315"..., 4096,
>>>> 0) = 4096
>>> That seems like a bug though.  "pwrite -A -D -V1 0 4096"?
>> Sorry, what is the bug that you are pointing out here?
>>
>> The command you gave, i.e. xfs_io -dc "pwrite -A -D -V1 0 4096" <path>,
>> works fine.
> Yeah, sorry, I was clumsily agreeing with your version.  Zarro boogs
> here, as it were. :)
>
> --D

Okay got it.

--

NR

>
>> --
>>
>> NR
>>
>>>> newfstatat(1, "", {st_mode=S_IFCHR|0620, st_rdev=makedev(0x88, 0x1),
>>>> ...}, AT_EMPTY_PATH) = 0
>>>> write(1, "wrote 4096/4096 bytes at offset "..., 34wrote 4096/4096 bytes
>>>> at offset 0
>>>> ) = 34
>>>> write(1, "4 KiB, 1 ops; 0.0001 sec (23.819"..., 644 KiB, 1 ops; 0.0001
>>>> sec (23.819 MiB/sec and 6097.5610 ops/sec)
>>>> ) = 64
>>>> exit_group(0)
>>>>
>>>> So the issues are as follows:
>>>> 1. The file /mnt1/test/new is NOT opened with the O_DIRECT flag, i.e.
>>>> direct I/O mode, which is one of the requirements for atomic writes
>>>> (buffered I/O doesn't support atomic writes; correct me if I am wrong).
>>>> 2. pwrite64 doesn't take the RWF_ATOMIC flag and hence I think this
>>>> write is just a non-atomic write with no stdout output difference as
>>>> such.
>>>>
>>>> Also if you look at the function
>>>>
>>>> do_pwrite() in xfsprogs-dev/io/pwrite.c
>>>>
>>>> static ssize_t
>>>> do_pwrite(
>>>> 	int		fd,
>>>> 	off_t		offset,
>>>> 	long long	count,
>>>> 	size_t		buffer_size,
>>>> 	int		pwritev2_flags)
>>>> {
>>>> 	if (!vectors)
>>>> 		return pwrite(fd, io_buffer, min(count, buffer_size),
>>>> offset);
>>>>
>>>> 	return do_pwritev(fd, offset, count, pwritev2_flags);
>>>> }
>>>>
>>>> it will not call pwritev/pwritev2 unless we have vectors, for which you
>>>> need the -V parameter of the pwrite subcommand of xfs_io.
>>>>
>>>>
>>>> So I think the correct way to do this would be the following:
>>>>
>>>> bytes_written=$($XFS_IO_PROG -c "open -d $testfile" -c "pwrite -A -D
>>>> -V 1 0 $bsize" | grep wrote | awk -F'[/ ]' '{print $2}')
>>> Agreed.
>>>
>>>> This also bring us to 2 more test cases that we can add:
>>>>
>>>> a. Atomic write with vec count > 1
>>>> $XFS_IO_PROG -c "open -d $testfile" -c "pwrite -A -D -V 2 0 $bsize"
>>>> (This should fail with Invalid argument since currently iovec count is
>>>> restricted to 1)
>>>>
>>>> b.
>>>> Open a file withOUT O_DIRECT and try to perform an atomic write. This
>>>> should fail with Operation not Supported (EOPNOTSUPP). So something
>>>> like
>>>> $XFS_IO_PROG -c "open -f $testfile" -c "pwrite -A -V 1 0 $bsize"
>>> Yeah, those are good subcases.
>>>
>>>> 3. It is better to use -b $bsize with pwrite; otherwise the write might
>>>> be split into multiple atomic writes. For example, try the following:
>>>>
>>>> $XFS_IO_PROG -c "open -fd $testfile" -c "pwrite -A -D -V 1 0 $((
>>>> $bsize * 2 ))"
>>>> The above is expected to fail because the size of the atomic write is
>>>> greater than the limit, i.e. 1 block, but it still succeeds. Look at
>>>> the strace and you will see 2 pwritev2 system calls. However, the
>>>> following fails as expected with -EINVAL:
>>>> $XFS_IO_PROG -c "open -fd $testfile" -c "pwrite -A -D -V 1 -b $((
>>>> $bsize * 2 )) 0 $(( $bsize * 2 ))"
>>> Good catch.
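
The splitting described above is plain ceiling division: each call writes
at most one buffer, so a request larger than the buffer becomes several
calls. A small sketch of that arithmetic follows. The function name
write_calls_needed is illustrative only, and the 4096-byte block size is an
assumed example value:

```shell
# Illustrative only: number of write calls needed to cover "count" bytes
# when each call transfers at most "buffer_size" bytes.
write_calls_needed()
{
	local count=$1 buffer_size=$2

	echo $(( (count + buffer_size - 1) / buffer_size ))
}

bsize=4096	# assumed example block size

# Buffer of one block: a two-block request is issued as two writes,
# each of which is individually within the atomic write limit.
write_calls_needed $(( bsize * 2 )) $bsize

# Buffer of two blocks (-b $((bsize * 2))): a single two-block write,
# which is the one that is expected to be rejected.
write_calls_needed $(( bsize * 2 )) $(( bsize * 2 ))
```

The first call prints 2 and the second prints 1, which matches the two
pwritev2 calls seen in the strace and the single oversized call with -b.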
>>>
>>>>
>>>>> +    test $bytes_written -eq $bsize || echo "atomic write len=$bsize
>>>>> failed"
>>>>> +
>>>>> +    # Check that we can perform an atomic write on an unwritten
>>>>> block
>>>>> +    $XFS_IO_PROG -c "falloc $bsize $bsize" $testfile
>>>>> +    bytes_written=$($XFS_IO_PROG -dc "pwrite -A -D $bsize $bsize"
>>>>> $testfile | \
>>>>> +        grep wrote | awk -F'[/ ]' '{print $2}')
>>>>> +    test $bytes_written -eq $bsize || echo "atomic write to
>>>>> unwritten block failed"
>>>>> +
>>>>> +    # Check that we can perform an atomic write on a sparse hole
>>>>> +    $XFS_IO_PROG -c "fpunch 0 $bsize" $testfile
>>>>> +    bytes_written=$($XFS_IO_PROG -dc "pwrite -A -D 0 $bsize"
>>>>> $testfile | \
>>>>> +        grep wrote | awk -F'[/ ]' '{print $2}')
>>>>> +    test $bytes_written -eq $bsize || echo "atomic write to sparse
>>>>> hole failed"
>>>>> +
>>>>> +    # Reject atomic write if len is out of bounds
>>>>> +    $XFS_IO_PROG -dc "pwrite -A -D 0 $((bsize - 1))" $testfile 2>>
>>>>> $seqres.full && \
>>>>> +        echo "atomic write len=$((bsize - 1)) should fail"
>>>>> +    $XFS_IO_PROG -dc "pwrite -A -D 0 $((bsize + 1))" $testfile 2>>
>>>>> $seqres.full && \
>>>>> +        echo "atomic write len=$((bsize + 1)) should fail"
>>>> Have we covered the scenario where offset % len != 0? That should fail
>>>> with Invalid argument (-EINVAL).
>>> I think you're right.
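
A hedged sketch of the argument rules such subcases would exercise. The
function name atomic_write_args_ok is hypothetical. The length bounds and
the offset alignment come from this thread; the power-of-two requirement is
an assumption taken from the RWF_ATOMIC documentation rather than from the
discussion here:

```shell
# Hypothetical pre-check: return 0 if (offset, len) looks like a valid
# atomic write for the given limits, non-zero otherwise.
atomic_write_args_ok()
{
	local offset=$1 len=$2 awu_min=$3 awu_max=$4

	# A zero length is never valid and would break the modulo below.
	[ "$len" -gt 0 ] || return 1
	# Length must lie within the advertised limits.
	[ "$len" -ge "$awu_min" ] && [ "$len" -le "$awu_max" ] || return 1
	# Length must be a power of two (assumed from the RWF_ATOMIC docs).
	[ $(( len & (len - 1) )) -eq 0 ] || return 1
	# Offset must be naturally aligned to the length.
	[ $(( offset % len )) -eq 0 ]
}
```

With a 4096-byte block, atomic_write_args_ok 0 4096 4096 4096 succeeds,
while an offset of 512 or a length of 4095 is rejected, mirroring the
cases where the kernel is expected to return -EINVAL.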
>>>
>>>> Also, do you think adding similar tests with raw writes to the
>>>> underlying devices, bypassing the fs layer, would add some value? There
>>>> are slightly less strict or different rules in the block layer which IMO
>>>> are worth testing. Please let me know your thoughts.
>>> That should be in blktests.
>>>
>>>>> +
>>>>> +    _scratch_unmount
>>>>> +}
>>>>> +
>>>>> +bdev_min_write=$($XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC"
>>>>> $SCRATCH_DEV | \
>>>>> +    grep atomic_write_unit_min | cut -d ' ' -f 3)
>>>>> +bdev_max_write=$($XFS_IO_PROG -c "statx -r -m $STATX_WRITE_ATOMIC"
>>>>> $SCRATCH_DEV | \
>>>>> +    grep atomic_write_unit_max | cut -d ' ' -f 3)
>>>>> +
>>>> Same comment as before: refactor this into a function.
>>>>> +for ((bsize=$bdev_min_write; bsize<=bdev_max_write; bsize*=2)); do
>>>>> +    _scratch_mkfs_xfs_supported -b size=$bsize >> $seqres.full 2>&1
>>>>> && \
>>>>> +        test_atomic_writes $bsize
>>>>> +done;
>>>> Minor: This might fail on some archs (x86_64) if the kernel isn't
>>>> compiled with CONFIG_TRANSPARENT_HUGEPAGE, which is needed to enable
>>>> block sizes greater than 4k on x86_64.
>>> Oh, yeah, that's a good catch.
>>>
>>> --D
>>>
>>>> --
>>>> NR
>>>>> +
>>>>> +# success, all done
>>>>> +echo Silence is golden
>>>>> +status=0
>>>>> +exit
>>>>> diff --git a/tests/xfs/611.out b/tests/xfs/611.out
>>>>> new file mode 100644
>>>>> index 00000000..b8a44164
>>>>> --- /dev/null
>>>>> +++ b/tests/xfs/611.out
>>>>> @@ -0,0 +1,2 @@
>>>>> +QA output created by 611
>>>>> +Silence is golden
>> -- 
>> ---
>> Nirjhar Roy
>> Linux Kernel Developer
>> IBM, Bangalore
>>
>>
-- 
---
Nirjhar Roy
Linux Kernel Developer
IBM, Bangalore
