From mboxrd@z Thu Jan 1 00:00:00 1970
Received: from mail-qt0-f194.google.com ([209.85.216.194]:36251 "EHLO mail-qt0-f194.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751689AbdK1RV5 (ORCPT ); Tue, 28 Nov 2017 12:21:57 -0500
Received: by mail-qt0-f194.google.com with SMTP id a16so804307qtj.3 for ; Tue, 28 Nov 2017 09:21:57 -0800 (PST)
Date: Tue, 28 Nov 2017 12:21:55 -0500
From: Josef Bacik
Subject: Re: [PATCH v3 10/13] fstests: crash consistency fsx test using dm-log-writes
Message-ID: <20171128172152.ktvpnwv233govfwl@destiny>
References: <1504638680-25682-1-git-send-email-amir73il@gmail.com> <1504638680-25682-11-git-send-email-amir73il@gmail.com> <20171127150439.i4tpt4yhooxnoaiz@destiny>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
Sender: fstests-owner@vger.kernel.org
To: Amir Goldstein
Cc: Josef Bacik , Josef Bacik , fstests , linux-fsdevel , Eryu Guan
List-ID:

On Tue, Nov 28, 2017 at 06:48:43PM +0200, Amir Goldstein wrote:
> On Mon, Nov 27, 2017 at 5:04 PM, Josef Bacik wrote:
> > On Mon, Nov 27, 2017 at 11:56:58AM +0200, Amir Goldstein wrote:
> >> On Tue, Sep 5, 2017 at 10:11 PM, Amir Goldstein wrote:
> >> > Cherry-picked the test from commit 70d41e17164b
> >> > in Josef Bacik's fstests tree (https://github.com/josefbacik/fstests).
> >> > Quoting from Josef's commit message:
> >> >
> >> > The test just runs some ops and exits, then finds all of the good buffers
> >> > in the directory we provided and:
> >> > - replays up to the mark given
> >> > - mounts the file system and compares the md5sum
> >> > - unmounts and fsck's to check for metadata integrity
> >> >
> >> > dm-log-writes will pretend to do discard and the replay-log tool will
> >> > replay it properly depending on the underlying device, either by writing
> >> > 0's or actually calling the discard ioctl, so I've enabled discard in the
> >> > test for maximum fun.
> >> >
> >> > [Amir:]
> >> > - Removed unneeded _test_falloc_support dynamic FSX_OPTS
> >> > - Fold repetitions into for loops
> >> > - Added place holders for using constant random seeds
> >> > - Add pre umount checkpoint
> >> > - Add test to new 'replay' group
> >> > - Address review comments by Eryu Guan
> >> >
> >> > Cc: Josef Bacik
> >> > Signed-off-by: Amir Goldstein
> >>
> >>
> >> Josef,
> >>
> >> As you know, this test is now merged to xfstest as generic/455.
> >> I have been running the test for a while on xfs and it occasionally
> >> reports inconsistencies which I try to investigate.
> >>
> >> In some of the reports, it appears that dm-log-writes may be exhibiting
> >> a reliability issue (see below).
> >>
> >
> > It's not a reliability issue, it's a caching issue.  dm-log-writes is just
> > issuing bio's to the log device, and our destructor waits for all pending io
> > blocks to complete before exiting, so unless I've missed how dm is destroying
> > devices everything should be on disk.
> >
> > However since we replay in userspace we are going through the blockdevice's
> > pagecache, so we could have stale pages left in place which is screwing us up.
> > Will you try this patch and see if it fixes the problem?  Thanks,
> >
> > Josef
> >
> >
> > diff --git a/drivers/md/dm-log-writes.c b/drivers/md/dm-log-writes.c
> > index 8b80a9ce9ea9..1c502930af5e 100644
> > --- a/drivers/md/dm-log-writes.c
> > +++ b/drivers/md/dm-log-writes.c
> > @@ -545,6 +545,8 @@ static void log_writes_dtr(struct dm_target *ti)
> >                    !atomic_read(&lc->pending_blocks));
> >         kthread_stop(lc->log_kthread);
> >
> > +       invalidate_bdev(lc->logdev->bdev);
> > +       invalidate_bdev(lc->dev->bdev);
> >         WARN_ON(!list_empty(&lc->logging_blocks));
> >         WARN_ON(!list_empty(&lc->unflushed_blocks));
> >         dm_put_device(ti, lc->dev);
>
> Josef,
>
> With your patch OR with my xfstest patch that adds "sync" I did not yet see
> another problem of garbage fs after _log_writes_remove.
>
> I did however, encounter this error (failure to verify read data during fsx)
> from scratch/log-writes device (see attached full log).
>
> I will keep running the test to collect more information.
>

That failure I'll lay at the feet of whatever fs you are testing ;).  I'm glad my
patch fixed the replay problem, I'll send that up.  Thanks,

Josef
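
For anyone who wants to rule out the same stale-pagecache effect from userspace, here is a rough sketch of the equivalent step: write back and drop cached pages for a path before re-reading it. It assumes Linux. For a block device it uses the BLKFLSBUF ioctl, which needs CAP_SYS_ADMIN; for a regular file it falls back to fsync() plus posix_fadvise(POSIX_FADV_DONTNEED), which is only advisory. The helper name drop_cached_pages() is hypothetical and is not part of fstests, replay-log, or the patch above.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/fs.h>   /* BLKFLSBUF */
#include <sys/ioctl.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only): write back and then drop the
 * cached pages for `path`, so a later read is served from the device
 * rather than from possibly stale page cache.
 * Returns 0 on success, -1 on error with errno set.
 */
int drop_cached_pages(const char *path)
{
	struct stat st;
	int fd, err, saved, ret = -1;

	assert(path != NULL);

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return -1;

	if (fstat(fd, &st) < 0)
		goto out;

	if (S_ISBLK(st.st_mode)) {
		/* Block device: flush and invalidate its buffer cache. */
		ret = ioctl(fd, BLKFLSBUF, 0);
	} else {
		/* Regular file: write back dirty pages first ... */
		if (fsync(fd) < 0)
			goto out;
		/* ... then ask the kernel to drop the now-clean pages. */
		err = posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED);
		if (err) {
			errno = err;	/* posix_fadvise returns the error number */
			goto out;
		}
		ret = 0;
	}
out:
	saved = errno;
	close(fd);
	errno = saved;
	return ret;
}
```

Calling drop_cached_pages() on the log device and the replay target between tearing down the log-writes target and running replay-log would be the place to try it. That is a guess at a workaround for kernels without the patch, not something tested in this thread.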