From: Wang Yugui <wangyugui@e16-tech.com>
To: Anand Jain <anand.jain@oracle.com>
Cc: linux-btrfs@vger.kernel.org, Sherry Yang <sherry.yang@oracle.com>,
kernel test robot <oliver.sang@intel.com>
Subject: Re: [PATCH] btrfs: fix mkfs/mount/check failures due to race with systemd-udevd scan
Date: Thu, 23 Mar 2023 21:27:46 +0800 [thread overview]
Message-ID: <20230323212745.4342.409509F4@e16-tech.com> (raw)
In-Reply-To: <cbac9a8b-7db4-dc54-1f1d-4dc48e5dfcc9@oracle.com>
Hi,
>
>
> On 23/03/2023 19:57, Wang Yugui wrote:
> > Hi,
> >
> >> During the device scan initiated by systemd-udevd, other user space
> >> EXCL operations such as mkfs, mount, or check may get blocked and result
> >> in a "Device or resource busy" error. This is because the device
> >> scan process opens the device with the EXCL flag in the kernel.
> >>
> >> Two reports were received:
> >>
> >> . One with the btrfs/179 testcase, where the fsck command failed with
> >> the -EBUSY error; and
> >>
> >> . Another with the LTP pwritev03 testcase, where mkfs.vfs failed with
> >> the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem
> >> on the device.
> >>
> >> In both cases, fsck and mkfs (respectively) were racing with a
> >> systemd-udevd device scan, and systemd-udevd won, resulting in the
> >> -EBUSY error for fsck and mkfs.
> >>
> >> Reproducing the problem has been difficult because there is a very
> >> small timeframe during which these userspace threads can race to
> >> acquire the exclusive device open. Even on the system where the problem
> >> was observed, the problem occurances were anywhere between 10 to 400
> >> iterations and chances of reproducing lessen with debug printk()s.
> >>
> >> However, an exclusive device open is unnecessary for the scan process,
> >> as there are no write operations on the device during scan. Furthermore,
> >> during the mount process, the superblock is re-read in the below
> >> function stack.
> >>
> >> btrfs_mount_root
> >> btrfs_open_devices
> >> open_fs_devices
> >> btrfs_open_one_device
> >> btrfs_get_bdev_and_sb
> >>
> >> So, to fix this issue, this patch removes the FMODE_EXCL flag from the scan
> >> operation, and adds a comment.
> >>
> >> Reported-by: Sherry Yang <sherry.yang@oracle.com>
> >> Reported-by: kernel test robot <oliver.sang@intel.com>
> >> Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com
> >> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> >> ---
> >>
> >> This patch should be cc-ed to stable-5.15.y and stable-6.1.y. As for
> >> stable-5.10.y and stable-5.4.y, a conflict fix is necessary, which I
> >> will send separately.
> >>
> >> fs/btrfs/volumes.c | 11 ++++++++++-
> >> 1 file changed, 10 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> >> index 93bc45001e68..cc1871767c8c 100644
> >> --- a/fs/btrfs/volumes.c
> >> +++ b/fs/btrfs/volumes.c
> >> @@ -1366,8 +1366,17 @@ struct btrfs_device *btrfs_scan_one_device(const char *path, fmode_t flags,
> >> * So, we need to add a special mount option to scan for
> >> * later supers, using BTRFS_SUPER_MIRROR_MAX instead
> >> */
> >> - flags |= FMODE_EXCL;
> >> >> + /*
> >> + * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may
> >> + * initiate the device scan which may race with the user's mount
> >> + * or mkfs command, resulting in failure.
> >
> > for FMODE_READ | FMODE_EXCL, we need some sleep/retry,
> > for FMODE_WRITE | FMODE_EXCL, we should fail immediately?
>
> Sorry I don't understand the context here what represents the we here?
>
> In the LTP testcase the two sides are
> mkfs.<vfs|btrfs> side (FMODE_WRITE|FMODE_EXCL) and
> device-scan side (now: FMODE_READ, before: FMODE_READ|FMODE_EXCL)
>
> > scan race with with mkfs may result worse?
>
> In the above example, the mkfs.<vfs|btrfs> failed immediately without
> the patch and with the patch it is successful.
With the patch, when mkfs.<vfs|btrfs> is still running,
device-scan can read, but the read data is meaningless, so it is worse?
Best Regards
Wang Yugui (wangyugui@e16-tech.com)
2023/03/23
next prev parent reply other threads:[~2023-03-23 13:27 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-03-23 7:56 [PATCH] btrfs: fix mkfs/mount/check failures due to race with systemd-udevd scan Anand Jain
2023-03-23 11:57 ` Wang Yugui
2023-03-23 13:14 ` Anand Jain
2023-03-23 13:27 ` Wang Yugui [this message]
2023-03-23 18:27 ` David Sterba
2023-03-23 18:24 ` David Sterba
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20230323212745.4342.409509F4@e16-tech.com \
--to=wangyugui@e16-tech.com \
--cc=anand.jain@oracle.com \
--cc=linux-btrfs@vger.kernel.org \
--cc=oliver.sang@intel.com \
--cc=sherry.yang@oracle.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox