Linux PCI subsystem development
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Jay Zhou <jianjay.zhou@huawei.com>
Cc: linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org,
	bhelgaas@google.com, alex.williamson@redhat.com,
	weidong.huang@huawei.com, Donald Dutile <ddutile@redhat.com>,
	Jason Gunthorpe <jgg@ziepe.ca>, Leon Romanovsky <leon@kernel.org>
Subject: Re: [PATCH] pci: avoid dead lock between device reset and sriov disable
Date: Wed, 11 May 2022 17:22:43 -0500	[thread overview]
Message-ID: <20220511222243.GA822874@bhelgaas> (raw)
In-Reply-To: <20220404062539.1710-1-jianjay.zhou@huawei.com>

[+cc Don, Jason, Leon, just FYI]

On Mon, Apr 04, 2022 at 02:25:39PM +0800, Jay Zhou wrote:
> Call trace of PF SRIOV disable:
> sriov_numvfs_store
>   device_lock <----------------- (1) get the device lock
>     ->sriov_configure # e.g. vfio_pci_sriov_configure
>       sriov_disable
>         pci_cfg_access_lock <--- (4) wait dev->block_cfg_access to be 0
> 
> Call trace of PF reset:
> reset_store
>   pci_reset_function
>     pci_dev_lock
>       pci_cfg_access_lock <----- (2) set dev->block_cfg_access = 1
>       device_lock <------------- (3) want to get the device lock
> 
> These two oprations would wait for each other forever if the
> code execution sequence is (1)(2)(3)(4).
> 
> Let's get the device lock and then the config access lock in
> pci_dev_lock().
> 
> Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>

Thanks, I (finally) applied Yicong's identical patch from a year ago
to pci/virtualization for v5.19, with a commit log combined and
adapted from both:

    PCI: Avoid pci_dev_lock() AB/BA deadlock with sriov_numvfs_store()
    
    The sysfs sriov_numvfs_store() path acquires the device lock before the
    config space access lock:
    
      sriov_numvfs_store
        device_lock                 # A (1) acquire device lock
        sriov_configure
          vfio_pci_sriov_configure  # (for example)
            vfio_pci_core_sriov_configure
              pci_disable_sriov
                sriov_disable
                  pci_cfg_access_lock
                    pci_wait_cfg    # B (4) wait for dev->block_cfg_access == 0
    
    Previously, pci_dev_lock() acquired the config space access lock before the
    device lock:
    
      pci_dev_lock
        pci_cfg_access_lock
          dev->block_cfg_access = 1 # B (2) set dev->block_cfg_access = 1
        device_lock                 # A (3) wait for device lock
    
    Any path that uses pci_dev_lock(), e.g., pci_reset_function(), may
    deadlock with sriov_numvfs_store() if the operations occur in the sequence
    (1) (2) (3) (4).
    
    Avoid the deadlock by reversing the order in pci_dev_lock() so it acquires
    the device lock before the config space access lock, the same as the
    sriov_numvfs_store() path.
    
    [bhelgaas: combined and adapted commit log from Jay Zhou's independent
    subsequent posting:
    https://lore.kernel.org/r/20220404062539.1710-1-jianjay.zhou@huawei.com]
    Link: https://lore.kernel.org/linux-pci/1583489997-17156-1-git-send-email-yangyicong@hisilicon.com/
    Also-posted-by: Jay Zhou <jianjay.zhou@huawei.com>
    Signed-off-by: Yicong Yang <yangyicong@hisilicon.com>
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>

> ---
>  drivers/pci/pci.c | 10 +++++-----
>  1 file changed, 5 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
> index 9ecce435fb3f..61a6db1d21f6 100644
> --- a/drivers/pci/pci.c
> +++ b/drivers/pci/pci.c
> @@ -5103,19 +5103,19 @@ static int pci_reset_bus_function(struct pci_dev *dev, bool probe)
>  
>  void pci_dev_lock(struct pci_dev *dev)
>  {
> -	pci_cfg_access_lock(dev);
>  	/* block PM suspend, driver probe, etc. */
>  	device_lock(&dev->dev);
> +	pci_cfg_access_lock(dev);
>  }
>  EXPORT_SYMBOL_GPL(pci_dev_lock);
>  
>  /* Return 1 on successful lock, 0 on contention */
>  int pci_dev_trylock(struct pci_dev *dev)
>  {
> -	if (pci_cfg_access_trylock(dev)) {
> -		if (device_trylock(&dev->dev))
> +	if (device_trylock(&dev->dev)) {
> +		if (pci_cfg_access_trylock(dev))
>  			return 1;
> -		pci_cfg_access_unlock(dev);
> +		device_unlock(&dev->dev);
>  	}
>  
>  	return 0;
> @@ -5124,8 +5124,8 @@ EXPORT_SYMBOL_GPL(pci_dev_trylock);
>  
>  void pci_dev_unlock(struct pci_dev *dev)
>  {
> -	device_unlock(&dev->dev);
>  	pci_cfg_access_unlock(dev);
> +	device_unlock(&dev->dev);
>  }
>  EXPORT_SYMBOL_GPL(pci_dev_unlock);
>  
> -- 
> 2.27.0
> 

      parent reply	other threads:[~2022-05-11 22:22 UTC|newest]

Thread overview: 3+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2022-04-04  6:25 [PATCH] pci: avoid dead lock between device reset and sriov disable Jay Zhou
2022-05-10  2:03 ` Yicong Yang
2022-05-11 22:22 ` Bjorn Helgaas [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20220511222243.GA822874@bhelgaas \
    --to=helgaas@kernel.org \
    --cc=alex.williamson@redhat.com \
    --cc=bhelgaas@google.com \
    --cc=ddutile@redhat.com \
    --cc=jgg@ziepe.ca \
    --cc=jianjay.zhou@huawei.com \
    --cc=leon@kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=weidong.huang@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox