From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6D60BC761AF for ; Mon, 3 Apr 2023 14:33:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233536AbjDCOdt (ORCPT ); Mon, 3 Apr 2023 10:33:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38272 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233593AbjDCOdm (ORCPT ); Mon, 3 Apr 2023 10:33:42 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C771E77 for ; Mon, 3 Apr 2023 07:33:31 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 1486EB81AD8 for ; Mon, 3 Apr 2023 14:33:30 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 836F7C433EF; Mon, 3 Apr 2023 14:33:28 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=linuxfoundation.org; s=korg; t=1680532408; bh=jzKs/L7fQQH1SNLrtLePASuflTrmr6yh7QSHbuAW07I=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dNSDFCL174B40tWLGpC9kYa1Bwj+goSOguvV+qt0zyTqD6a8b3EWNPyvNtR6z2kFG s6pLYbVDyhHmaLdwU0YPIyxgLuSimvCtc2CV+KF3ghENV7M/FrJ6Y6UL/hLtKc2m71 aDkZmNyyous+tBZ99SqcSTAnH/KvgSUu2Ngrw/og= From: Greg Kroah-Hartman To: stable@vger.kernel.org Cc: Greg Kroah-Hartman , patches@lists.linux.dev, Sherry Yang , kernel test robot , Anand Jain , David Sterba Subject: [PATCH 5.15 71/99] btrfs: scan device in non-exclusive mode Date: Mon, 3 Apr 2023 16:09:34 +0200 Message-Id: <20230403140406.157804109@linuxfoundation.org> X-Mailer: git-send-email 2.40.0 In-Reply-To: <20230403140356.079638751@linuxfoundation.org> References: <20230403140356.079638751@linuxfoundation.org> User-Agent: quilt/0.67 MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: stable@vger.kernel.org From: Anand Jain commit 50d281fc434cb8e2497f5e70a309ccca6b1a09f0 upstream. This fixes mkfs/mount/check failures due to race with systemd-udevd scan. During the device scan initiated by systemd-udevd, other user space EXCL operations such as mkfs, mount, or check may get blocked and result in a "Device or resource busy" error. This is because the device scan process opens the device with the EXCL flag in the kernel. Two reports were received: - btrfs/179 test case, where the fsck command failed with the -EBUSY error - LTP pwritev03 test case, where mkfs.vfs failed with the -EBUSY error, when mkfs.vfs tried to overwrite old btrfs filesystem on the device. In both cases, fsck and mkfs (respectively) were racing with a systemd-udevd device scan, and systemd-udevd won, resulting in the -EBUSY error for fsck and mkfs. Reproducing the problem has been difficult because there is a very small window during which these userspace threads can race to acquire the exclusive device open. Even on the system where the problem was observed, the problem occurrences were anywhere between 10 to 400 iterations and chances of reproducing decreases with debug printk()s. However, an exclusive device open is unnecessary for the scan process, as there are no write operations on the device during scan. Furthermore, during the mount process, the superblock is re-read in the below function call chain: btrfs_mount_root btrfs_open_devices open_fs_devices btrfs_open_one_device btrfs_get_bdev_and_sb So, to fix this issue, removes the FMODE_EXCL flag from the scan operation, and add a comment. The case where mkfs may still write to the device and a scan is running, the btrfs signature is not written at that time so scan will not recognize such device. Reported-by: Sherry Yang Reported-by: kernel test robot Link: https://lore.kernel.org/oe-lkp/202303170839.fdf23068-oliver.sang@intel.com CC: stable@vger.kernel.org # 5.4+ Signed-off-by: Anand Jain Reviewed-by: David Sterba Signed-off-by: David Sterba Signed-off-by: Greg Kroah-Hartman --- fs/btrfs/volumes.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) --- a/fs/btrfs/volumes.c +++ b/fs/btrfs/volumes.c @@ -1406,8 +1406,17 @@ struct btrfs_device *btrfs_scan_one_devi * So, we need to add a special mount option to scan for * later supers, using BTRFS_SUPER_MIRROR_MAX instead */ - flags |= FMODE_EXCL; + /* + * Avoid using flag |= FMODE_EXCL here, as the systemd-udev may + * initiate the device scan which may race with the user's mount + * or mkfs command, resulting in failure. + * Since the device scan is solely for reading purposes, there is + * no need for FMODE_EXCL. Additionally, the devices are read again + * during the mount process. It is ok to get some inconsistent + * values temporarily, as the device paths of the fsid are the only + * required information for assembling the volume. + */ bdev = blkdev_get_by_path(path, flags, holder); if (IS_ERR(bdev)) return ERR_CAST(bdev);