From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 76930CF5395 for ; Wed, 23 Oct 2024 14:32:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: MIME-Version:References:In-Reply-To:Message-ID:Date:Subject:Cc:To:From: Reply-To:Content-Type:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=FMtJo2vw9R8y5qTucg0NC5g+O5MCLfyCamYe6eBqKYY=; b=YWwvakTtG8/6Cn9O2fzJUhxKVe GCZQI5hNoTPqyfmq8UXmRrVQU6VwUVHvC/vncbx3PF6sWi/FcRhAbclw5XWCCYTd57XGwZ3CssmPk HMGTm7LbtJgAKx3G6aG+cIXeJnq4I+eairgTV+P8T2NZMz915ZJ6AKPezi9/eKe1SupZqC44iWfAb 48rNaEpxUcuGeOaysHKP67u8lhWdXG/VVFevFbxpn7LyKZXvrWocVKLJ7RwIy5O2EQIkmQKr1lKmn vELnBsXnZSuw9zbp2rl7Pr/u7dTCnljC+KAv9wUa5/GwRkAMeFrgyA6VxdskC4J3J7aqwSQbWa7vO 4hVKsDeQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1t3cPd-0000000EiYv-3JIV; Wed, 23 Oct 2024 14:32:37 +0000 Received: from nyc.source.kernel.org ([2604:1380:45d1:ec00::3]) by bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1t3cOp-0000000EiBz-1cUb for linux-nvme@lists.infradead.org; Wed, 23 Oct 2024 14:31:49 +0000 Received: from smtp.kernel.org (transwarp.subspace.kernel.org [100.75.92.58]) by nyc.source.kernel.org (Postfix) with ESMTP id 57A30A44F5C; Wed, 23 Oct 2024 14:31:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 199C7C4CEEE; Wed, 23 Oct 2024 14:31:45 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1729693906; bh=M95oN5BJW2ED7DYuRHRVyRxqpANuvn7SqeCMgOM4kyQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Ue9atXbpmUgJdILdqypU1R+Ep6n8bSfDc2ivKrtp9WHVCYG2KhqA2z7QzgsHLW+0z XQyuMCHbkPat6dkar73ummzMa3VgICSMJYUCFLAZgpYehYh3OIQeZ+Wld03hWi9n0Z IYI9CmdjuHPYPNFPtFQ4fjROg98J+s6tlHNW8HM8J7x0+RdfvkfwwCeyDu8VpHik3l aj0OZG/RI2BsOtzy4zXXFPxfNOPsCFYNh0HPhdk89cr/yAEREzJ2lRJw9DYlv4sW9v VJ5k4SMgYHsEJmtD1ESK3GwVZnmyxus21KPPRsmMtRCwLkIfEY7GBytzhsYCqq5oxa hb0FluwIfZ9JA== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Keith Busch , Hannes Reinecke , Christoph Hellwig , Sasha Levin , sagi@grimberg.me, linux-nvme@lists.infradead.org Subject: [PATCH AUTOSEL 6.6 18/23] nvme-multipath: defer partition scanning Date: Wed, 23 Oct 2024 10:31:02 -0400 Message-ID: <20241023143116.2981369-18-sashal@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20241023143116.2981369-1-sashal@kernel.org> References: <20241023143116.2981369-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore X-stable-base: Linux 6.6.58 Content-Transfer-Encoding: 8bit X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20241023_073147_598684_9FD37BB0 X-CRM114-Status: GOOD ( 16.03 ) X-BeenThere: linux-nvme@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "Linux-nvme" Errors-To: linux-nvme-bounces+linux-nvme=archiver.kernel.org@lists.infradead.org From: Keith Busch [ Upstream commit 1f021341eef41e77a633186e9be5223de2ce5d48 ] We need to suppress the partition scan from occuring within the controller's scan_work context. If a path error occurs here, the IO will wait until a path becomes available or all paths are torn down, but that action also occurs within scan_work, so it would deadlock. Defer the partion scan to a different context that does not block scan_work. Reported-by: Hannes Reinecke Reviewed-by: Christoph Hellwig Signed-off-by: Keith Busch Signed-off-by: Sasha Levin --- drivers/nvme/host/multipath.c | 33 +++++++++++++++++++++++++++++++++ drivers/nvme/host/nvme.h | 1 + 2 files changed, 34 insertions(+) diff --git a/drivers/nvme/host/multipath.c b/drivers/nvme/host/multipath.c index 37ea0fa421da8..ede2a14dad8be 100644 --- a/drivers/nvme/host/multipath.c +++ b/drivers/nvme/host/multipath.c @@ -499,6 +499,20 @@ static int nvme_add_ns_head_cdev(struct nvme_ns_head *head) return ret; } +static void nvme_partition_scan_work(struct work_struct *work) +{ + struct nvme_ns_head *head = + container_of(work, struct nvme_ns_head, partition_scan_work); + + if (WARN_ON_ONCE(!test_and_clear_bit(GD_SUPPRESS_PART_SCAN, + &head->disk->state))) + return; + + mutex_lock(&head->disk->open_mutex); + bdev_disk_changed(head->disk, false); + mutex_unlock(&head->disk->open_mutex); +} + static void nvme_requeue_work(struct work_struct *work) { struct nvme_ns_head *head = @@ -525,6 +539,7 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head) bio_list_init(&head->requeue_list); spin_lock_init(&head->requeue_lock); INIT_WORK(&head->requeue_work, nvme_requeue_work); + INIT_WORK(&head->partition_scan_work, nvme_partition_scan_work); /* * Add a multipath node if the subsystems supports multiple controllers. @@ -540,6 +555,16 @@ int nvme_mpath_alloc_disk(struct nvme_ctrl *ctrl, struct nvme_ns_head *head) return -ENOMEM; head->disk->fops = &nvme_ns_head_ops; head->disk->private_data = head; + + /* + * We need to suppress the partition scan from occuring within the + * controller's scan_work context. If a path error occurs here, the IO + * will wait until a path becomes available or all paths are torn down, + * but that action also occurs within scan_work, so it would deadlock. + * Defer the partion scan to a different context that does not block + * scan_work. + */ + set_bit(GD_SUPPRESS_PART_SCAN, &head->disk->state); sprintf(head->disk->disk_name, "nvme%dn%d", ctrl->subsys->instance, head->instance); @@ -589,6 +614,7 @@ static void nvme_mpath_set_live(struct nvme_ns *ns) return; } nvme_add_ns_head_cdev(head); + kblockd_schedule_work(&head->partition_scan_work); } mutex_lock(&head->lock); @@ -889,6 +915,12 @@ void nvme_mpath_shutdown_disk(struct nvme_ns_head *head) kblockd_schedule_work(&head->requeue_work); if (test_bit(NVME_NSHEAD_DISK_LIVE, &head->flags)) { nvme_cdev_del(&head->cdev, &head->cdev_device); + /* + * requeue I/O after NVME_NSHEAD_DISK_LIVE has been cleared + * to allow multipath to fail all I/O. + */ + synchronize_srcu(&head->srcu); + kblockd_schedule_work(&head->requeue_work); del_gendisk(head->disk); } } @@ -900,6 +932,7 @@ void nvme_mpath_remove_disk(struct nvme_ns_head *head) /* make sure all pending bios are cleaned up */ kblockd_schedule_work(&head->requeue_work); flush_work(&head->requeue_work); + flush_work(&head->partition_scan_work); put_disk(head->disk); } diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h index 799f8a2bb0b4f..14a867245c29f 100644 --- a/drivers/nvme/host/nvme.h +++ b/drivers/nvme/host/nvme.h @@ -476,6 +476,7 @@ struct nvme_ns_head { struct bio_list requeue_list; spinlock_t requeue_lock; struct work_struct requeue_work; + struct work_struct partition_scan_work; struct mutex lock; unsigned long flags; #define NVME_NSHEAD_DISK_LIVE 0 -- 2.43.0