public inbox for fstests@vger.kernel.org
 help / color / mirror / Atom feed
From: Filipe Manana <fdmanana@kernel.org>
To: Josef Bacik <josef@toxicpanda.com>
Cc: fstests@vger.kernel.org, linux-btrfs@vger.kernel.org
Subject: Re: [PATCH] fstests: UDEV_SETTLE_PROG before dmsetup create
Date: Thu, 2 Dec 2021 14:45:24 +0000	[thread overview]
Message-ID: <YajcBNMm3dR4YI/N@debian9.Home> (raw)
In-Reply-To: <ebf63c27065c5fa676701184501353a9f2014832.1638382705.git.josef@toxicpanda.com>

On Wed, Dec 01, 2021 at 01:18:52PM -0500, Josef Bacik wrote:
> We've been seeing transient errors with any test that uses a dm device
> for the entirety of the time that we've been running nightly xfstests

I have been having it on my tests vms since ever as well.
It's really annoying, but fortunatelly it doesn't happen too often.

> runs.  This turns out to be because sometimes we get EBUSY while trying
> to create our new dm device.  Generally this is because the test comes
> right after another test that messes with the dm device, and thus we
> still have udev messing around with the device when DM tries to O_EXCL
> the block device.
> 
> Add a UDEV_SETTLE_PROG before creating the device to make sure we can
> create our new dm device without getting this transient error.

I suspect this might only make it seem the problem goes away but does not
really fix it.

I say that for 2 reasons:

1) All tests that use dm end up calling _dmsetup_remove(), like through
   _log_writes_remove() or _cleanup_flakey() for example. Normally those
   are called in the _cleanup() function, which ensures it's done even if
   the test fails for some reason.

   So I don't understand why we need that UDEV_SETTLE_PROG at _dmsetup_create().

   And I've seen the ebusy failure happen even when the previous tests did
   not use any dm device;

2) Some tests fail after creating the dm device and using it. For example
   btrfs/206 often fails when it tries to fsck the filesystem:

   btrfs/206 3s ... [failed, exit status 1]- output mismatch (see /home/fdmanana/git/hub/xfstests/results//btrfs/206.out.bad)
        --- tests/btrfs/206.out     2020-10-16 23:13:46.554152652 +0100
        +++ /home/fdmanana/git/hub/xfstests/results//btrfs/206.out.bad      2021-12-01 21:09:46.317632589 +0000
        @@ -3,3 +3,5 @@
        XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
        wrote 8192/8192 bytes at offset 0
        XXX Bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
        +_check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
        +(see /home/fdmanana/git/hub/xfstests/results//btrfs/206.full for details)
        ...

       (Run 'diff -u /home/fdmanana/git/hub/xfstests/tests/btrfs/206.out /home/fdmanana/git/hub/xfstests/results//btrfs/206.out.bad'  to see the entire diff)

    In the .full file I got:

    (...)
    replaying 1239@11201: sector 2173408, size 16384, flags 0x10(METADATA)
    replaying 1240@11234: sector 0, size 0, flags 0x1(FLUSH)
    replaying 1241@11235: sector 128, size 4096, flags 0x12(FUA|METADATA)
    _check_btrfs_filesystem: filesystem on /dev/sdc is inconsistent
    *** fsck.btrfs output ***
    ERROR: cannot open device '/dev/sdc': Device or resource busy
    ERROR: cannot open file system
    Opening filesystem to check...
    *** end fsck.btrfs output
    *** mount output ***

   The ebusy failure is not when the test starts, but when somewhere in the middle
   of the replay loop when it calls fsck, or when it ends and the fstests framework
   calls fsck.

   I've seen that with btrfs/172 too, which also uses dm logwrites in a similar way.

So to me this suggests 2 things:

1) Calling UDEV_SETTLE_PROG at _dmsetup_create() doesn't solve that problem with
   btrfs/206 (and other tests) - the problem is fsck failing to open the scratch
   device after it called _log_writes_remove() -> _dmsetup_remove(), and not a
   failure to create the dm device;

2) The problem is likely something missing at _dmsetup_remove(). Perhaps add
   another UDEV_SETTLE_PROG there:

   diff --git a/common/rc b/common/rc
   index 8e351f17..22b34677 100644
   --- a/common/rc
   +++ b/common/rc
   @@ -4563,6 +4563,7 @@ _dmsetup_remove()
            $UDEV_SETTLE_PROG >/dev/null 2>&1
            $DMSETUP_PROG remove "$@" >>$seqres.full 2>&1
            $DMSETUP_PROG mknodes >/dev/null 2>&1
    +       $UDEV_SETTLE_PROG >/dev/null 2>&1
     }
 
    _dmsetup_create()

  I can't say if that change to _dmsetup_remove() is correct, or what it's
  needed, as I really haven't spent time trying to figure out why the issue
  happens.

    


> 
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>
> ---
>  common/rc | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/common/rc b/common/rc
> index 8e351f17..35e861ec 100644
> --- a/common/rc
> +++ b/common/rc
> @@ -4567,6 +4567,7 @@ _dmsetup_remove()
>  
>  _dmsetup_create()
>  {
> +	$UDEV_SETTLE_PROG >/dev/null 2>&1
>  	$DMSETUP_PROG create "$@" >>$seqres.full 2>&1 || return 1
>  	$DMSETUP_PROG mknodes >/dev/null 2>&1
>  	$UDEV_SETTLE_PROG >/dev/null 2>&1
> -- 
> 2.26.3
> 

  reply	other threads:[~2021-12-02 14:45 UTC|newest]

Thread overview: 6+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2021-12-01 18:18 [PATCH] fstests: UDEV_SETTLE_PROG before dmsetup create Josef Bacik
2021-12-02 14:45 ` Filipe Manana [this message]
2021-12-02 15:45   ` Josef Bacik
2021-12-06  5:47     ` Chandan Babu R
2021-12-06 14:07       ` Chandan Babu R
2021-12-06 19:28         ` Josef Bacik

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=YajcBNMm3dR4YI/N@debian9.Home \
    --to=fdmanana@kernel.org \
    --cc=fstests@vger.kernel.org \
    --cc=josef@toxicpanda.com \
    --cc=linux-btrfs@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox