From: Jens Axboe <axboe@fb.com>
To: "Theodore Ts'o" <tytso@mit.edu>
Cc: Dave Chinner <david@fromorbit.com>, <linux-ext4@vger.kernel.org>,
<fstests@vger.kernel.org>, <tarasov@vasily.name>
Subject: Re: Test generic/299 stalling forever
Date: Thu, 20 Oct 2016 08:22:00 -0600
Message-ID: <1fb60e7c-a558-80df-09da-d3c36863a461@fb.com>
In-Reply-To: <20161019203233.mbbmskpn5ekgl7og@thunk.org>
On 10/19/2016 02:32 PM, Theodore Ts'o wrote:
> On Wed, Oct 19, 2016 at 11:49:12AM -0600, Jens Axboe wrote:
>>
>> Number of cores/nodes?
>> Memory size?
>
> I'm using a GCE n1-standard-2 VM. So that's two CPUs and 7680M.
>
> Each CPU is a virtual CPU implemented as a single hardware
> hyper-thread on a 2.3 GHz Intel Xeon E5 v3 (Haswell). (I was using a
> GCE zone that has Haswell processors; different GCE zones may have
> different processors. See [1] and [2] for more details.)
>
> [1] https://cloud.google.com/compute/docs/machine-types
> [2] https://cloud.google.com/compute/docs/regions-zones/regions-zones
>
>> Rough speed and size of the device?
>
> I'm using a GCE PD backed by a SSD. To a first approximation, you can
> think of it as a KVM qcow file stored on a fast flash device. I'm
> running LVM on the disk, and the fio is running on a 5 gig LVM volume.
>
>> Any special mkfs options?
>
> No. This particular error will trigger on 4k block file systems, 1k
> block file systems, 4k file systems with journals disabled, etc. It's
> fairly insensitive to the file system configuration.
>
>> And whatever else might be relevant.
>
> Note that the generic/299 test is running fio in an ENOSPC hitter
> configuration, where there is an antagonist thread which is
> constantly allocating all of the disk space available and then freeing
> it all:
>
> # FSQA Test No. 299
> #
> # AIO/DIO stress test
> # Run random AIO/DIO activity and fallocate/truncate simultaneously
> # Test will operate on huge sparsed files so ENOSPC is expected.
>
>
> So some of the AIO/DIO operations will be failing with an error, and
> I suspect that's very likely relevant to reproducing the failure.
>
> The actual guts of the test from generic/299[1]:
>
> [1] https://git.kernel.org/cgit/fs/xfs/xfstests-dev.git/tree/tests/generic/299
>
> _workout()
> {
>     echo ""
>     echo "Run fio with random aio-dio pattern"
>     echo ""
>     cat $fio_config >> $seqres.full
>     run_check $FIO_PROG $fio_config &
>     pid=$!
>     echo "Start fallocate/truncate loop"
>
>     for ((i=0; ; i++))
>     do
>         for ((k=1; k <= NUM_JOBS; k++))
>         do
>             $XFS_IO_PROG -f -c "falloc 0 $FILE_SIZE" \
>                 $SCRATCH_MNT/direct_aio.$k.0 >> $seqres.full 2>&1
>         done
>         for ((k=1; k <= NUM_JOBS; k++))
>         do
>             $XFS_IO_PROG -c "truncate 0" \
>                 $SCRATCH_MNT/direct_aio.$k.0 >> $seqres.full 2>&1
>         done
>         # The following line checks that fio is still running.
>         # Once fio exits we can stop the fallocate/truncate loop.
>         pgrep -f "$FIO_PROG" > /dev/null 2>&1 || break
>     done
>     wait $pid
> }
>
> So what's happening is that generic/299 is looping in the
> fallocate/truncate loop until fio exits, but since fio never exits,
> it ends up looping forever.
I'm setting up a GCE instance now. I've had the tests running for about
24h on another test box and haven't been able to trigger any hangs. I'll
match your setup as closely as I can; hopefully that will reproduce it.
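
For anyone trying to reproduce this without the test spinning forever, the
quoted fallocate/truncate loop could be given a deadline. The following is
only a rough sketch, not code from xfstests: the helper name
loop_while_alive and its arguments are hypothetical, and it checks the
recorded PID with kill -0 rather than matching on the fio program name with
pgrep. It does not address the underlying fio hang; it only bounds how long
the antagonist loop keeps running.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not part of xfstests): repeatedly run a command
# while a given PID is alive, but give up once a deadline has passed.
# Usage: loop_while_alive PID TIMEOUT_SECONDS COMMAND [ARGS...]
loop_while_alive()
{
    local pid=$1 timeout=$2
    shift 2
    # SECONDS is a bash builtin counting seconds since the shell started.
    local deadline=$((SECONDS + timeout))

    # kill -0 delivers no signal; it only tests whether the PID exists.
    while kill -0 "$pid" 2>/dev/null
    do
        if ((SECONDS >= deadline)); then
            echo "pid $pid still running after ${timeout}s, giving up" >&2
            return 1
        fi
        "$@"    # one pass of the antagonist workload
    done
    return 0
}
```

In a setup like the one quoted above, the command passed in would be one
fallocate/truncate pass, and a non-zero return would be the point at which
to collect state from the stuck fio process before killing it. Note that a
PID check like this can be fooled by PID reuse on a long-running system.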
--
Jens Axboe