From: Carlos Maiolino <cmaiolino@redhat.com>
To: Zorro Lang <zlang@redhat.com>
Cc: fstests@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH] generic: test dm-thin running out of data space vs concurrent discard
Date: Mon, 2 Jul 2018 11:27:11 +0200
Message-ID: <20180702092711.budxkovx2ncpehhp@odin.usersys.redhat.com>
In-Reply-To: <20180629165738.8106-1-zlang@redhat.com>

On Sat, Jun 30, 2018 at 12:57:38AM +0800, Zorro Lang wrote:
> If a user constructs a test that loops repeatedly over the steps below
> on dm-thin, block allocation can fail due to discards not having
> completed yet (Fixed by a685557 dm thin: handle running out of data
> space vs concurrent discard):
> 1) fill thin device via filesystem file
> 2) remove file
> 3) fstrim
> 
> This may also cause a deadlock (a fast device such as a ramdisk helps
> a lot) when racing a fstrim with a filesystem (XFS) shutdown. (Fixed
> by 8c81dd46ef3c Force log to disk before reading the AGF during a
> fstrim)
> 
> This case can reproduce both bugs if they're not fixed. If only the
> dm-thin bug is fixed, the test will pass. If only the fs bug is
> fixed, the test will fail. If neither bug is fixed, the test will
> hang.
> 
> Signed-off-by: Zorro Lang <zlang@redhat.com>
> ---
> 
> Hi,
> 
> If neither bug is fixed, a loop device backed by tmpfs can help
> reproduce the XFS deadlock:
> 1) mount -t tmpfs tmpfs /tmp
> 2) dd if=/dev/zero of=/tmp/test.img bs=1M count=100
> 3) losetup /dev/loop0 /tmp/test.img
> 4) use /dev/loop0 as SCRATCH_DEV and run this case. The test will hang there.

Personally, I could never reproduce this bug on spindles or SSDs. I believe
many (if not most) people run xfstests on commodity hardware, not on very fast
disks, and the test doesn't reproduce the bug 100% of the time when running on
slow disks. So unless the test runs on a ramdisk by default, it is useless
IMHO.

> 
> A ramdisk can help trigger the race. Maybe an NVMe device can help too. But
> it's hard to reproduce on ordinary disks.
> 

I didn't test it on NVMe, so I can't tell =/

> If the XFS bug is fixed, the above steps can reproduce the dm-thin bug, and
> the test will fail.
> 
> Unfortunately, if the dm-thin bug is fixed, this case can't reproduce the
> XFS bug on its own.
> 
> Thanks,
> Zorro
> 
> +#! /bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +# Copyright (c) 2018 Red Hat Inc.  All Rights Reserved.
> +#
> +# FS QA Test 499
> +#
> +# Race test running out of data space with concurrent discard operation on
> +# dm-thin.
> +#
> +# If a user constructs a test that loops repeatedly over the steps below
> +# on dm-thin, block allocation can fail due to discards not having
> +# completed yet (Fixed by a685557 dm thin: handle running out of data
> +# space vs concurrent discard):
> +# 1) fill thin device via filesystem file
> +# 2) remove file
> +# 3) fstrim
> +#
> +# This may also cause a deadlock when racing a fstrim with a filesystem
> +# (XFS) shutdown. (Fixed by 8c81dd46ef3c Force log to disk before reading
> +# the AGF during a fstrim)
> +


> +# There are two bugs here: one is a dm-thin bug, the other is a filesystem
> +# (XFS especially) bug. With the dm-thin bug, running out of data space
> +# with a concurrent discard is not handled well. The dm-thin bug then causes
> +# the fs unmount to hang when racing a fstrim with a filesystem shutdown.
> +#
> +# If neither bug has been fixed, the test below may deadlock.
> +# If the fs bug has been fixed but the dm-thin bug hasn't, the test below
> +# will fail (no deadlock).
> +# Otherwise the test will pass.

The test looks mostly OK, although I believe it should run on a ramdisk by
default (or not run at all if $SCRATCH_DEV is not a ramdisk).
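
A rough sketch of what the "not run if $SCRATCH_DEV is not a ramdisk" check
could look like. Both helper names are hypothetical and are not existing
fstests functions. The first one only inspects the device name, and the second
assumes a loop device set up like the tmpfs reproducer quoted above. Neither
resolves symlinks or handles partitions, so treat this as an illustration only.

```shell
#!/bin/bash
# Hypothetical helpers, not existing fstests functions.

# True if the device name looks like a brd ramdisk (/dev/ramN).
# This only inspects the name; it does not probe the device.
is_brd_ramdisk()
{
	case "$(basename "$1")" in
	ram[0-9]*) return 0 ;;
	*) return 1 ;;
	esac
}

# True if the device is a loop device whose backing file lives on tmpfs.
is_loop_on_tmpfs()
{
	local name backing

	name=$(basename "$1")
	# sysfs exposes the backing file only for configured loop devices
	backing=$(cat "/sys/block/$name/loop/backing_file" 2>/dev/null) || return 1
	[ -n "$backing" ] || return 1
	# GNU stat: -f reports on the filesystem, %T prints its type name
	[ "$(stat -f -c %T "$backing" 2>/dev/null)" = "tmpfs" ]
}

# Possible use in the test, before any real work is done:
#   is_brd_ramdisk "$SCRATCH_DEV" || is_loop_on_tmpfs "$SCRATCH_DEV" || \
#	_notrun "this test needs a ramdisk-backed SCRATCH_DEV"
```

Skipping this way would keep the test from reporting a pass on slow disks where
the race was never actually exercised.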

-- 
Carlos
