linux-pci.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Bjorn Helgaas <helgaas@kernel.org>
To: Marek Vasut <marek.vasut@gmail.com>
Cc: linux-pci@vger.kernel.org, linux-renesas-soc@vger.kernel.org,
	Tho Vu <tho.vu.wh@rvc.renesas.com>,
	Bart Van Assche <bart.vanassche@wdc.com>,
	Tejun Heo <tj@kernel.org>
Subject: Re: [RFC][PATCH] PCI: Avoid PCI device removing/rescanning through sysfs triggers a deadlock
Date: Tue, 27 Nov 2018 16:55:06 -0600	[thread overview]
Message-ID: <20181127225506.GA168199@google.com> (raw)
In-Reply-To: <20181105232500.19146-1-marek.vasut+renesas@gmail.com>

[+cc Bart, Tejun]

On Tue, Nov 06, 2018 at 12:25:00AM +0100, Marek Vasut wrote:
> From: Tho Vu <tho.vu.wh@rvc.renesas.com>
> 
> This patch fixes deadlock warning in removing/rescanning through sysfs
> when CONFIG_PROVE_LOCKING is enabled.
> 
> The issue can be reproduced by these steps:
> 1. Enable CONFIG_PROVE_LOCKING via defconfig or menuconfig
> 2. Insert Ethernet card into PCIe CH0 and start up.
>    After kernel starting up, execute the following command.
>    echo 1 > /sys/class/pci_bus/0000\:00/device/0000\:00\:00.0/remove
> 3. Rescan PCI device by this command
>    echo 1 > /sys/class/pci_bus/0000\:00/rescan
> 
> The deadlock warnings will occur.
> ============================================
> WARNING: possible recursive locking detected
> 4.14.70-ltsi-yocto-standard #27 Not tainted
> --------------------------------------------
> sh/3402 is trying to acquire lock:
>  (kn->count#78){++++}, at: kernfs_remove_by_name_ns+0x50/0xa8
> 
> but task is already holding lock:
>  (kn->count#78){++++}, at: kernfs_remove_self+0xe0/0x130
> 
> other info that might help us debug this:
>  Possible unsafe locking scenario:
> 
>        CPU0
>        ----
>   lock(kn->count#78);
>   lock(kn->count#78);
> 
>  *** DEADLOCK ***
> 
>  May be due to missing lock nesting notation
> 
> 4 locks held by sh/3402:
>  #0:  (sb_writers#4){.+.+}, at: vfs_write+0x198/0x1b0
>  #1:  (&of->mutex){+.+.}, at: kernfs_fop_write+0x108/0x210
>  #2:  (kn->count#78){++++}, at: kernfs_remove_self+0xe0/0x130
>  #3:  (pci_rescan_remove_lock){+.+.}, at: pci_lock_rescan_remove+0x1c/0x28
> 
> stack backtrace:
> CPU: 3 PID: 3402 Comm: sh Not tainted 4.14.70-ltsi-yocto-standard #27
> Hardware name: Renesas Salvator-X 2nd version board based on r8a7795
> ES3.0+ with 8GiB (4 x 2 GiB) (DT)
> Call trace:
> dump_backtrace+0x0/0x3d8
> show_stack+0x14/0x20
> dump_stack+0xbc/0xf4
> __lock_acquire+0x930/0x18a8
> lock_acquire+0x48/0x68
> __kernfs_remove+0x280/0x2f8
> kernfs_remove_by_name_ns+0x50/0xa8
> remove_files.isra.0+0x38/0x78
> sysfs_remove_group+0x4c/0xa0
> sysfs_remove_groups+0x38/0x60
> device_remove_attrs+0x54/0x78
> device_del+0x1ac/0x308
> pci_remove_bus_device+0x78/0xf8
> pci_remove_bus_device+0x34/0xf8
> pci_stop_and_remove_bus_device_locked+0x24/0x38
> remove_store+0x6c/0x78
> dev_attr_store+0x18/0x28
> sysfs_kf_write+0x4c/0x78
> kernfs_fop_write+0x138/0x210
> __vfs_write+0x18/0x118
> vfs_write+0xa4/0x1b0
> SyS_write+0x48/0xb0

Indent quoted material.  I use two spaces.

> This warning occurs due to a self-deletion attribute using in the sysfs
> PCI device directory. This kind of attribute is really tricky,
> it does not allow pci framework drop this attribute until all active
> .show() and .store() callbacks have finished unless
> sysfs_break_active_protection() is called.
> Hence this patch avoids writing into this attribute triggers a deadlock.

Wrap paragraph correctly or, if this is supposed to be two paragraphs,
insert a blank line between them.

Use "PCI" (not "pci") consistently in English text.

> Referrence commit 5b55b24cec4c ("scsi: core: Avoid that SCSI device
> removal through sysfs triggers a deadlock")
> of scsi driver

s/Referrence/Reference/

5b55b24cec4c doesn't exist.  I suppose maybe you mean 0ee223b2e1f6
("scsi: core: Avoid that SCSI device removal through sysfs triggers a
deadlock")?

Wrap paragraph correctly.

Use "SCSI" (not "scsi") consistently in English text.

> Signed-off-by: Tho Vu <tho.vu.wh@rvc.renesas.com>

Missing your Signed-off-by (as Geert pointed out).

> ---
>  drivers/pci/pci-sysfs.c | 20 +++++++++++++++++++-
>  1 file changed, 19 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/pci/pci-sysfs.c b/drivers/pci/pci-sysfs.c
> index 9ecfe13157c0..d522bd8368d9 100644
> --- a/drivers/pci/pci-sysfs.c
> +++ b/drivers/pci/pci-sysfs.c
> @@ -470,12 +470,22 @@ static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
>  			    const char *buf, size_t count)
>  {
>  	unsigned long val;
> +	struct kernfs_node *kn;
> +
> +	kn = sysfs_break_active_protection(&dev->kobj, &attr->attr);

I'm awfully wary of interfaces with only a single user.  That suggests
that there's something very special about this path, and I doubt it's
*that* special.  I grant you that sysfs_break_active_protection() is
very new (added by 2afc9166f79b ("scsi: sysfs: Introduce
sysfs_{un,}break_active_protection()")).

> +	WARN_ON_ONCE(!kn);

I don't see the point of the WARN_ON_ONCE() (also pointed out already by
Geert).

>  	if (kstrtoul(buf, 0, &val) < 0)
>  		return -EINVAL;

This looks wrong because you may return -EINVAL without calling
sysfs_unbreak_active_protection().

There's no point in calling sysfs_break_active_protection() if either
kstrtoul() fails or val is zero, i.e., you could do:

  if (kstrtoul(..., &val) < 0)
    return -EINVAL;

  if (!val)
    return count;

  sysfs_break_active_protection(...);
  device_remove_file(...);
  pci_stop_and_remove_bus_device_locked(...);
  sysfs_unbreak_active_protection(...);

> -	if (val && device_remove_file_self(dev, attr))
> +	if (val) {
> +		device_remove_file(dev, attr);
>  		pci_stop_and_remove_bus_device_locked(to_pci_dev(dev));

We need some story about why we should use device_remove_file()
instead of device_remove_file_self().

We also need an explanation for why other callers of
device_remove_file_self() don't need similar changes.  They *look*
similar to this case, so we need to look at nvme_sysfs_delete(),
dcssblk_shared_store(), vfio remove_store(), s390 recover_store(),
etc, and explain why they are correct and this one is incorrect.  Or,
if they're all wrong, we need to fix them all.

Is there a way we can avoid this self-deletion situation by doing the
deletion via a workqueue item that is scheduled by remove_store() but
not executed in its context?

> +	}
> +
> +	if (kn)
> +		sysfs_unbreak_active_protection(kn);
> +
>  	return count;
>  }
>  static struct device_attribute dev_remove_attr = __ATTR(remove,
> @@ -487,11 +497,15 @@ static ssize_t dev_bus_rescan_store(struct device *dev,
>  				    const char *buf, size_t count)
>  {
>  	unsigned long val;
> +	struct kernfs_node *kn;
>  	struct pci_bus *bus = to_pci_bus(dev);
>  
>  	if (kstrtoul(buf, 0, &val) < 0)
>  		return -EINVAL;
>  
> +	kn = sysfs_break_active_protection(&dev->kobj, &attr->attr);
> +	WARN_ON_ONCE(!kn);
> +
>  	if (val) {
>  		pci_lock_rescan_remove();
>  		if (!pci_is_root_bus(bus) && list_empty(&bus->devices))
> @@ -500,6 +514,10 @@ static ssize_t dev_bus_rescan_store(struct device *dev,
>  			pci_rescan_bus(bus);
>  		pci_unlock_rescan_remove();
>  	}
> +
> +	if (kn)
> +		sysfs_unbreak_active_protection(kn);

What's the purpose of this dev_bus_rescan_store() change?  There's no
remove here, and the changelog doesn't mention this path.

>  	return count;
>  }
>  static DEVICE_ATTR(rescan, (S_IWUSR|S_IWGRP), NULL, dev_bus_rescan_store);
> -- 
> 2.18.0
> 

      parent reply	other threads:[~2018-11-27 22:55 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2018-11-05 23:25 [RFC][PATCH] PCI: Avoid PCI device removing/rescanning through sysfs triggers a deadlock Marek Vasut
2018-11-06  8:13 ` Geert Uytterhoeven
2018-11-06  9:45   ` Marek Vasut
2018-11-27 22:55 ` Bjorn Helgaas [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20181127225506.GA168199@google.com \
    --to=helgaas@kernel.org \
    --cc=bart.vanassche@wdc.com \
    --cc=linux-pci@vger.kernel.org \
    --cc=linux-renesas-soc@vger.kernel.org \
    --cc=marek.vasut@gmail.com \
    --cc=tho.vu.wh@rvc.renesas.com \
    --cc=tj@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).