From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-19.5 required=3.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED,USER_AGENT_GIT autolearn=unavailable autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 141F4C07E9C for ; Sat, 10 Jul 2021 23:51:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 03D5461154 for ; Sat, 10 Jul 2021 23:51:06 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232972AbhGJXxu (ORCPT ); Sat, 10 Jul 2021 19:53:50 -0400 Received: from mail.kernel.org ([198.145.29.99]:40316 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233182AbhGJXxM (ORCPT ); Sat, 10 Jul 2021 19:53:12 -0400 Received: by mail.kernel.org (Postfix) with ESMTPSA id 3532A6135F; Sat, 10 Jul 2021 23:50:25 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1625961026; bh=ZKkm3ed5F/+6ULuJXKWRc77JtNczmJ7XfH84J09nG/Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=T0iXjEG2pLF2G4Cmd52JvlJ2eUYE/E4nj7wN8IBUYe2XoW2P7VGK0vt3yPLSWOETW KpYWwUgSqjZ3qD4cWtPzyaadWl9tvEth0cBSVBb8hNgMltGA9xnQHGeJv1dK5cMDiZ ojYTIVLaHJRlPdKTyUxNFz3ATnhl18DmrOZjLoH6QB4B/p2MQKlYzgIjiPzeqZO8Dz pGwvq7sQPNiFTZ2bgTk5u+WFEsEbASYlpH44stLnfUNheJtLYmRm1DmpSX99rogQ2W C9+oPcDtGKurRVyTHlp/dnyTnHBDK9gOMm7uTRC3P2Hy70NbPj2qVm8VtH/XrJqTDo R52FFSuP+1GaA== From: Sasha Levin To: linux-kernel@vger.kernel.org, stable@vger.kernel.org Cc: Long Li , Lorenzo Pieralisi , Michael Kelley , Sasha Levin , linux-hyperv@vger.kernel.org, linux-pci@vger.kernel.org Subject: [PATCH AUTOSEL 5.10 07/37] PCI: hv: Fix a race condition when removing the device Date: Sat, 10 Jul 2021 19:49:45 -0400 Message-Id: <20210710235016.3221124-7-sashal@kernel.org> X-Mailer: git-send-email 2.30.2 In-Reply-To: <20210710235016.3221124-1-sashal@kernel.org> References: <20210710235016.3221124-1-sashal@kernel.org> MIME-Version: 1.0 X-stable: review X-Patchwork-Hint: Ignore Content-Transfer-Encoding: 8bit Precedence: bulk List-ID: X-Mailing-List: linux-hyperv@vger.kernel.org From: Long Li [ Upstream commit 94d22763207ac6633612b8d8e0ca4fba0f7aa139 ] On removing the device, any work item (hv_pci_devices_present() or hv_pci_eject_device()) scheduled on workqueue hbus->wq may still be running and race with hv_pci_remove(). This can happen because the host may send PCI_EJECT or PCI_BUS_RELATIONS(2) and decide to rescind the channel immediately after that. Fix this by flushing/destroying the workqueue of hbus before doing hbus remove. Link: https://lore.kernel.org/r/1620806800-30983-1-git-send-email-longli@linuxonhyperv.com Signed-off-by: Long Li Signed-off-by: Lorenzo Pieralisi Reviewed-by: Michael Kelley Signed-off-by: Sasha Levin --- drivers/pci/controller/pci-hyperv.c | 30 ++++++++++++++++++++++------- 1 file changed, 23 insertions(+), 7 deletions(-) diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index 03ed5cb1c4b2..4932d7677be2 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -444,7 +444,6 @@ enum hv_pcibus_state { hv_pcibus_probed, hv_pcibus_installed, hv_pcibus_removing, - hv_pcibus_removed, hv_pcibus_maximum }; @@ -3247,8 +3246,9 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) struct pci_packet teardown_packet; u8 buffer[sizeof(struct pci_message)]; } pkt; - struct hv_dr_state *dr; struct hv_pci_compl comp_pkt; + struct hv_pci_dev *hpdev, *tmp; + unsigned long flags; int ret; /* @@ -3260,9 +3260,16 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) if (!keep_devs) { /* Delete any children which might still exist. */ - dr = kzalloc(sizeof(*dr), GFP_KERNEL); - if (dr && hv_pci_start_relations_work(hbus, dr)) - kfree(dr); + spin_lock_irqsave(&hbus->device_list_lock, flags); + list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) { + list_del(&hpdev->list_entry); + if (hpdev->pci_slot) + pci_destroy_slot(hpdev->pci_slot); + /* For the two refs got in new_pcichild_device() */ + put_pcichild(hpdev); + put_pcichild(hpdev); + } + spin_unlock_irqrestore(&hbus->device_list_lock, flags); } ret = hv_send_resources_released(hdev); @@ -3305,13 +3312,23 @@ static int hv_pci_remove(struct hv_device *hdev) hbus = hv_get_drvdata(hdev); if (hbus->state == hv_pcibus_installed) { + tasklet_disable(&hdev->channel->callback_event); + hbus->state = hv_pcibus_removing; + tasklet_enable(&hdev->channel->callback_event); + destroy_workqueue(hbus->wq); + hbus->wq = NULL; + /* + * At this point, no work is running or can be scheduled + * on hbus-wq. We can't race with hv_pci_devices_present() + * or hv_pci_eject_device(), it's safe to proceed. + */ + /* Remove the bus from PCI's point of view. */ pci_lock_rescan_remove(); pci_stop_root_bus(hbus->pci_bus); hv_pci_remove_slots(hbus); pci_remove_root_bus(hbus->pci_bus); pci_unlock_rescan_remove(); - hbus->state = hv_pcibus_removed; } ret = hv_pci_bus_exit(hdev, false); @@ -3326,7 +3343,6 @@ static int hv_pci_remove(struct hv_device *hdev) irq_domain_free_fwnode(hbus->sysdata.fwnode); put_hvpcibus(hbus); wait_for_completion(&hbus->remove_event); - destroy_workqueue(hbus->wq); hv_put_dom_num(hbus->sysdata.domain); -- 2.30.2