From: Ari Sundholm <ari@tuxera.com>
To: Dave Chinner <david@fromorbit.com>
Cc: fstests@vger.kernel.org, Emil Karlson <jkarlson@tuxera.com>,
linux-xfs@vger.kernel.org
Subject: Re: [PATCH] xfs_io: fix exitcode handling (was Re: generic/399 and xfs_io pwrite command)
Date: Thu, 14 Dec 2017 14:11:32 +0200 [thread overview]
Message-ID: <27b82a44-b889-212f-af74-cad35c103c86@tuxera.com> (raw)
In-Reply-To: <20171206002638.GB5858@dastard>
On 12/06/2017 02:26 AM, Dave Chinner wrote:
> On Wed, Dec 06, 2017 at 08:18:50AM +1100, Dave Chinner wrote:
>> On Tue, Dec 05, 2017 at 04:23:43PM +0200, Ari Sundholm wrote:
>>> On 12/05/2017 12:28 AM, Dave Chinner wrote:
>>>> On Thu, Nov 30, 2017 at 04:21:47PM +0200, Ari Sundholm wrote:
>>>>> Hi!
>>>>>
>>>>> While debugging an issue, we found out that generic/399 seems to
>>>>> rely on a behavior that is specific to ext4 in the following section
>>>>> of code:
>>>>> ----------------8<--------snip--------8<------------------------------
>>>>> #
>>>>> # Write files of 1 MB of all the same byte until we hit ENOSPC.
>>>>> Note that we
>>>>> # must not create sparse files, since the contents of sparse files are not
>>>>> # stored on-disk. Also, we create multiple files rather than one big file
>>>>> # because we want to test for reuse of per-file keys.
>>>>> #
>>>>> total_file_size=0
>>>>> i=1
>>>>> while true; do
>>>>> file=$SCRATCH_MNT/encrypted_dir/file$i
>>>>> if ! $XFS_IO_PROG -f $file -c 'pwrite 0 1M' &> $tmp.out; then
>>>>> if ! grep -q 'No space left on device' $tmp.out; then
>>>>> echo "FAIL: unexpected pwrite failure"
>>>>> cat $tmp.out
>>>>> elif [ -e $file ]; then
>>>>> total_file_size=$((total_file_size + $(stat -c %s $file)))
>>>>> fi
>>>>> break
>>>>> fi
>>>>> total_file_size=$((total_file_size + $(stat -c %s $file)))
>>>>> i=$((i + 1))
>>>>> if [ $i -gt $fs_size_in_mb ]; then
>>>>> echo "FAIL: filesystem never filled up!"
>>>>> break
>>>>> fi
>>>>> done
>>>>> ----------------8<--------snip--------8<------------------------------
>>>>>
>>>>> What happens with ext4 is that the xfs_io command gives a nonzero
>>>>> exit value not when the pwrite command fails with ENOSPC but during
>>>>> the *next* iteration when opening the file fails with ENOSPC. Turns
>>>>> out the pwrite command failing does not cause xfs_io to give a
>>>>> nonzero exit value.
>>>>
>>>> That implies ext4 is returning zero bytes written to the pwrite()
>>>> call rather than ENOSPC. i.e.:
>>>>
>>>> bytes = do_pwrite(file->fd, off, cnt, buffersize,
>>>> pwritev2_flags);
>>>> if (bytes == 0)
>>>> break;
>>>> if (bytes < 0) {
>>>> perror("pwrite");
>>>> return -1;
>>>> }
>>>>
>>>> So if it's exiting with no error, then we can't have got an error
>>> >from ext4 at ENOSPC. If that's the case, it probably should be
>>>> considered an ext4 bug, not an issue with xfs_io...
>>>>
>>>
>>> No, according to what we've observed, that is not what happens. The
>>> pwrite() call does fail and errno is ENOSPC after the call. The
>>> immediate problem is that xfs_io does not reflect this failure in
>>> its exit value and thus the check in generic/399 does not work in
>>> this case. Only when open() fails during the next iteration does
>>> xfs_io give a nonzero exit value and cause the check in the test
>>> case to allow the test case to end successfully.
>>
>> Ah, io/pwrite.c fails to set exitcode on failure iappropriately
>> before returning. Looks like there is a bunch of xfs_io commands
>> that fail to do this properly.
>>
>>> What is specific to ext4 here is, as stated in my original message,
>>> that open() fails.
>>
>> Because there are no more inodes to allocate (i.e. inode table is
>> full) and so attempts to create more fail with ENOSPC. The xfs_io
>> open command set exitcode appropriately, and that's why it finally
>> triggers a test abort.
>>
>> Looks like xfs_io does need fixing to set exitcode appropriately on
>> all failures...
>
> Try the patch below.
>
Thanks, I'll try it. Sorry for the late response (was on vacation).
> Cheers,
>
> Dave.
>
next prev parent reply other threads:[~2017-12-14 12:11 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
[not found] <7235f31f-5427-ff12-11f9-cba98431e71c@tuxera.com>
[not found] ` <20171204222842.GX4094@dastard>
[not found] ` <35d2712c-b036-d017-2d42-f74bbc4444a9@tuxera.com>
[not found] ` <20171205211850.GA5858@dastard>
2017-12-06 0:26 ` [PATCH] xfs_io: fix exitcode handling (was Re: generic/399 and xfs_io pwrite command) Dave Chinner
2017-12-07 14:05 ` Brian Foster
2017-12-07 23:46 ` Dave Chinner
2017-12-08 14:15 ` Brian Foster
2017-12-08 22:52 ` Dave Chinner
2017-12-24 19:51 ` Eric Sandeen
2017-12-14 12:11 ` Ari Sundholm [this message]
2017-12-15 14:26 ` Ari Sundholm
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=27b82a44-b889-212f-af74-cad35c103c86@tuxera.com \
--to=ari@tuxera.com \
--cc=david@fromorbit.com \
--cc=fstests@vger.kernel.org \
--cc=jkarlson@tuxera.com \
--cc=linux-xfs@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).