public inbox for linux-btrfs@vger.kernel.org
From: Wang Yugui <wangyugui@e16-tech.com>
To: Anand Jain <anand.jain@oracle.com>
Cc: linux-btrfs@vger.kernel.org, Sherry Yang <sherry.yang@oracle.com>,
	kernel test robot <oliver.sang@intel.com>
Subject: Re: [PATCH] btrfs: fix mkfs/mount/check failures due to race with systemd-udevd scan
Date: Thu, 23 Mar 2023 21:27:46 +0800
Message-ID: <20230323212745.4342.409509F4@e16-tech.com>
In-Reply-To: <cbac9a8b-7db4-dc54-1f1d-4dc48e5dfcc9@oracle.com>

Hi,

> 
> 
> On 23/03/2023 19:57, Wang Yugui wrote:
> > Hi,
> >
> >> During the device scan initiated by systemd-udevd, other user space
> >> EXCL operations such as mkfs, mount, or check may get blocked and result
> >> in a "Device or resource busy" error. This is because the device
> >> scan process opens the device with the EXCL flag in the kernel.
> >>
> >> Two reports were received:
> >>
> >>   . One with the btrfs/179 testcase, where the fsck command failed with
> >>     the -EBUSY error; and
> >>
> >>   . Another with the LTP pwritev03 testcase, where mkfs.vfs failed with
> >>     the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem
> >>     on the device.
> >>
> >> In both cases, fsck and mkfs (respectively) were racing with a
> >> systemd-udevd device scan, and systemd-udevd won, resulting in the
> >> -EBUSY error for fsck and mkfs.
> >>
> >> Reproducing the problem has been difficult because there is a very
> >> small timeframe during which these userspace threads can race to
> >> acquire the exclusive device open. Even on the system where the problem
> >> was observed, the problem occurrences were anywhere between 10 and 400
> >> iterations, and the chances of reproducing it lessen with debug printk()s.
> >>
> >> However, an exclusive device open is unnecessary for the scan process,
> >> as there are no write operations on the device during scan. Furthermore,
> >> during the mount process, the superblock is re-read in the below
> >> function stack.
> >>
> >>    btrfs_mount_root
> >>     btrfs_open_devices
> >>      open_fs_devices
> >>       btrfs_open_one_device
> >>         btrfs_get_bdev_and_sb
> >>
> >> So, to fix this issue, this patch removes the FMODE_EXCL flag from the scan
> >> operation, and adds a comment.
> >>
> >> Reported-by: Sherry Yang <sherry.yang@oracle.com>
> >> Reported-by: kernel test robot <oliver.sang@intel.com>
> >> Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com
> >> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> >> ---
> >>
> >>   This patch should be cc-ed to stable-5.15.y and stable-6.1.y. As for
> >>   stable-5.10.y and stable-5.4.y, a conflict fix is necessary, which I
> >>   will send separately.
> >>
> >>   fs/btrfs/volumes.c | 11 ++++++++++-
> >>   1 file changed, 10 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >> index 93bc45001e68..cc1871767c8c 100644
> >> --- a/fs/btrfs/volumes.c
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -1366,8 +1366,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
> >>   	 * So, we need to add a special mount option to scan for
> >>   	 * later supers, using BTRFS_SUPER_MIRROR_MAX instead
> >>   	 */
> >> -	flags |= FMODE_EXCL;
> >> +	/*
> >> +	 * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may
> >> +	 * initiate the device scan which may race with the user's mount
> >> +	 * or mkfs command, resulting in failure.
> > 
> > For FMODE_READ | FMODE_EXCL, we need some sleep/retry;
> > for FMODE_WRITE | FMODE_EXCL, should we fail immediately?
> 
>   Sorry, I don't understand the context here. Who does "we" refer to?
> 
>   In the LTP testcase the two sides are
>    mkfs.<vfs|btrfs> side (FMODE_WRITE|FMODE_EXCL) and
>    device-scan side (now: FMODE_READ, before: FMODE_READ|FMODE_EXCL)
> 
> > Could a scan racing with mkfs make things worse?
> 
>   In the above example, the mkfs.<vfs|btrfs> failed immediately without
>   the patch and with the patch it is successful.

With the patch, device-scan can read the device while mkfs.<vfs|btrfs>
is still running, but the data it reads is meaningless. Isn't that worse?
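
The two cases discussed in this thread (scan with and without the
exclusive flag) can be modelled with a small userspace toy. This is only
a sketch of the exclusive-holder rule, not kernel code: toy_bdev,
toy_open, toy_release and race_scan_then_mkfs are hypothetical names made
up for illustration. It assumes the scan opens the device first and mkfs
then opens it exclusively.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a block device: tracks only the exclusive holder. */
struct toy_bdev {
	const void *holder;	/* NULL when nobody holds it exclusively */
};

/*
 * Toy open. With excl set, claim the device for `who`, or fail with
 * -EBUSY if a different holder already claimed it. Without excl, the
 * open always succeeds and claims nothing.
 */
int toy_open(struct toy_bdev *bdev, const void *who, bool excl)
{
	assert(bdev != NULL);
	if (!excl)
		return 0;
	if (bdev->holder && bdev->holder != who)
		return -EBUSY;
	bdev->holder = who;
	return 0;
}

/* Drop the exclusive claim, if `who` is the one holding it. */
void toy_release(struct toy_bdev *bdev, const void *who)
{
	assert(bdev != NULL);
	if (bdev->holder == who)
		bdev->holder = NULL;
}

/* Return what mkfs sees when the scan gets to the device first. */
int race_scan_then_mkfs(bool scan_excl)
{
	static const char scan[] = "scan", mkfs[] = "mkfs";
	struct toy_bdev bdev = { .holder = NULL };
	int ret;

	toy_open(&bdev, scan, scan_excl);	/* the scan wins the race */
	ret = toy_open(&bdev, mkfs, true);	/* mkfs opens exclusively */
	toy_release(&bdev, mkfs);
	toy_release(&bdev, scan);
	return ret;
}
```

In this model, race_scan_then_mkfs(true) returns -EBUSY (the behaviour
before the patch) and race_scan_then_mkfs(false) returns 0 (the behaviour
with the patch). The toy does not model what the scan reads while mkfs is
writing, which is the open question here.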

Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2023/03/23
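
For reference, the sleep/retry idea raised earlier in the thread could
look roughly like the sketch below. It is not part of the patch and is
not taken from any existing tool: open_retry_busy, open_fn and
busy_then_free are hypothetical names, and the callback stands in for
whatever open call the caller actually makes.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical callback: returns 0 on success or a negative errno. */
typedef int (*open_fn)(void *ctx);

/*
 * Call try_open up to max_tries times, sleeping delay_ms between
 * attempts, for as long as it keeps returning -EBUSY. Any other
 * result (success or a different error) is returned immediately.
 */
int open_retry_busy(open_fn try_open, void *ctx, int max_tries, long delay_ms)
{
	struct timespec delay = {
		.tv_sec = delay_ms / 1000,
		.tv_nsec = (delay_ms % 1000) * 1000000L,
	};
	int ret = -EBUSY;

	assert(try_open != NULL && max_tries > 0 && delay_ms >= 0);
	for (int i = 0; i < max_tries; i++) {
		ret = try_open(ctx);
		if (ret != -EBUSY)
			break;
		if (i + 1 < max_tries)
			nanosleep(&delay, NULL);
	}
	return ret;
}

/* Stand-in for a device that stays busy for the first *ctx attempts. */
int busy_then_free(void *ctx)
{
	int *busy_left = ctx;

	if (*busy_left > 0) {
		(*busy_left)--;
		return -EBUSY;
	}
	return 0;
}
```

With a counter of 2 and max_tries of 5, open_retry_busy(busy_then_free,
&counter, 5, 1) succeeds on the third attempt. If the device stays busy
for longer than max_tries attempts, the caller still gets -EBUSY, so a
retry only narrows the race window rather than closing it.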


Thread overview: 6+ messages
2023-03-23  7:56 [PATCH] btrfs: fix mkfs/mount/check failures due to race with systemd-udevd scan Anand Jain
2023-03-23 11:57 ` Wang Yugui
2023-03-23 13:14   ` Anand Jain
2023-03-23 13:27     ` Wang Yugui [this message]
2023-03-23 18:27       ` David Sterba
2023-03-23 18:24 ` David Sterba
