Date: Wed, 9 Sep 2020 15:40:57 -0500
From: Bjorn Helgaas
To: Lukas Wunner, Rick Farrington
Cc: kernel test robot, Bjorn Helgaas, Alex Williamson, Boris Ostrovsky,
    Juergen Gross, Michael Haeuptle, Ian May, Keith Busch,
    linux-pci@vger.kernel.org, Cornelia Huck, kvm@vger.kernel.org,
    Derek Chickles, Satanand Burla, Felix Manlunas, Stefano Stabellini,
    xen-devel@lists.xenproject.org, Govinda Tatti, Konrad Rzeszutek Wilk,
    LKML, lkp@lists.01.org
Subject: Re: [PCI] 3233e41d3e: WARNING:at_drivers/pci/pci.c:#pci_reset_hotplug_slot
Message-ID: <20200909204057.GA724236@bjorn-Precision-5520>
In-Reply-To: <20200723095152.nf3fmfzrjlpoi35h@wunner.de>
X-Mailing-List: linux-pci@vger.kernel.org

On Thu, Jul 23, 2020 at 11:51:52AM +0200, Lukas Wunner wrote:
> On Thu, Jul 23, 2020 at 05:13:06PM +0800, kernel test robot wrote:
> > FYI, we noticed the following commit (built with gcc-9):
> [...]
> > commit: 3233e41d3e8ebcd44e92da47ffed97fd49b84278 ("[PATCH] PCI: pciehp: Fix AB-BA deadlock between reset_lock and device_lock")
> [...]
> > caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):
> >
> > [ 0.971752] WARNING: CPU: 0 PID: 1 at drivers/pci/pci.c:4905 pci_reset_hotplug_slot+0x70/0x80
>
> Thank you, trusty robot.
>
> I botched the call to lockdep_assert_held_write(), it should have been
> conditional on "if (probe)".
>
> Happy to respin the patch, but I'd like to hear opinions on the locking
> issues surrounding xen and octeon (and the patch in general).

I wish liquidio/octeon weren't a special case.  Why should that driver
reset the device when unbinding when no other drivers do?  Looks like
this was added by 70535350e26f ("liquidio: with embedded f/w, don't
reload f/w, issue pf flr at exit").  Maybe Rick will chime in.

> In particular, would a solution be entertained wherein the pci_dev is
> reset by the PCI core after driver unbinding, contingent on a flag which
> is set by a PCI driver to indicate that the pci_dev is returned to the
> core in an unclean state?

How would we do this?  The PCI core isn't called after unbinding, is
it?  So I guess we'd have to have a queue and a worker thread to
process it?

Device removal also has nasty locking issues, and a queue might help
solve those, too.  Might also help in the problematic case of
40f11adc7cd9 ("PCI: Avoid race while enabling upstream bridges"),
which we had to revert.

> Also, why does xen require a device reset on bind?
>
> Thanks!
>
> Lukas
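
To make the "flag plus queue" idea above a bit more concrete, here is a
rough userspace sketch in plain C.  It is not kernel code and not a
proposed patch: struct fake_pci_dev, struct reset_queue,
queue_reset_if_unclean() and process_reset_queue() are all made-up names
for illustration, the "reset" is just a counter bump, and the worker is
modeled as an ordinary function rather than a real thread, with no
locking at all.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct pci_dev; not the kernel's definition. */
struct fake_pci_dev {
	const char *name;
	bool unclean;		/* set by the driver if it hands the device back dirty */
	int reset_count;	/* how many times the "core" has reset this device */
	struct fake_pci_dev *next_pending;	/* link in the deferred-reset queue */
};

/* Simple FIFO of devices waiting for a deferred reset. */
struct reset_queue {
	struct fake_pci_dev *head;
	struct fake_pci_dev *tail;
};

/*
 * Would be called on the unbind path: queue the device only if the
 * driver flagged it as unclean.  Returns true if the device was queued.
 */
static bool queue_reset_if_unclean(struct reset_queue *q,
				   struct fake_pci_dev *dev)
{
	assert(q && dev);

	if (!dev->unclean)
		return false;

	dev->next_pending = NULL;
	if (q->tail)
		q->tail->next_pending = dev;
	else
		q->head = dev;
	q->tail = dev;
	return true;
}

/*
 * Worker body: drain the queue and "reset" each device outside of the
 * unbind path.  Returns the number of devices handled.
 */
static int process_reset_queue(struct reset_queue *q)
{
	int handled = 0;

	assert(q);

	while (q->head) {
		struct fake_pci_dev *dev = q->head;

		q->head = dev->next_pending;
		if (!q->head)
			q->tail = NULL;
		dev->next_pending = NULL;

		dev->reset_count++;	/* placeholder for an actual device reset */
		dev->unclean = false;
		handled++;
	}
	return handled;
}
```

The point of the split is only that the unbind path records the request
and returns, while the reset itself happens later from a different
context.  Which locks that context would have to take, and whether this
actually helps with the removal and bridge-enable cases mentioned above,
is exactly the open question and is not modeled here.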