From: Sinan Kaya <okaya@codeaurora.org>
To: Govindarajulu Varadarajan <gvaradar@cisco.com>
Cc: benve@cisco.com, bhelgaas@google.com, linux-pci@vger.kernel.org,
linux-kernel@vger.kernel.org, jlbec@evilplan.org, hch@lst.de,
mingo@redhat.com, peterz@infradead.org
Subject: Re: [PATCH 3/4] pci aer: fix deadlock in do_recovery
Date: Fri, 29 Sep 2017 09:32:42 -0400
Message-ID: <0a2a41c5-2872-fdb6-8ad2-97b0b6dc69b1@codeaurora.org>
In-Reply-To: <alpine.LNX.2.20.1709281624330.24635@cae-iprp-alln-lb.cisco.com>
On 9/28/2017 7:46 PM, Govindarajulu Varadarajan wrote:
>> How about releasing the device_lock here on CPU0?
>
> pci_device_add() is called by driver's pci probe function. device_lock(dev)
> should be held before calling pci driver probe function.
I see. The lock is held here to protect the probe() operation from being
disrupted, and I don't think we can change that either.
>
>> or in other words keep device_lock as short as possible?
>
> The problem is not the duration device_lock is held. It is the order two locks
> are acquired. We cannot control or implement a restriction that during
> device_lock() is held, driver probe should not call pci function which acquires
> pci_bus_sem. And in case of pci aer, aer handler needs to call driver err_handler()
> for which we need to hold device_lock() before calling err_handler(). In order
> to find all the devices on a pci bus, we should hold pci_bus_sem to do
> pci_walk_bus().
I was reacting to this statement of yours, to see whether there is a better way:
"Only fix I could think of is to lock &pci_bus_sem and try locking all
device->mutex under that pci_bus. If it fails, unlock all device->mutex
and &pci_bus_sem and try again."
How about gracefully returning from report_error_detected() when we cannot obtain
the device_lock() by replacing it with device_trylock()?
aer_pci_walk_bus() can still poll like you did until it gets the lock. That way, at
least, we avoid introducing a new API, new lock semantics and code refactoring.
__pci_bus_trylock() looks very powerful to me, but also flexible enough to make it
dangerously easy to introduce new bugs.
For instance, you called it like this.
+ down_read(&pci_bus_sem);
+ locked = __pci_bus_trylock(bus, pci_device_trylock,
+ pci_device_unlock);
pci_bus_trylock() would obtain the device + cfg locks, whereas pci_device_trylock() only
obtains the device lock. Can it race against the cfg lock? It depends on the caller.
It is a very subtle difference.
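To illustrate the concern, here is a small userspace model, not the kernel
implementation. Both locks are modelled as pthread mutexes, which is a
simplification I am assuming for the cfg side. struct two_lock_dev,
bus_trylock_generic() and the four callbacks are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device with two independent locks. */
struct two_lock_dev {
	pthread_mutex_t dev_lock;	/* stand-in for the device lock */
	pthread_mutex_t cfg_lock;	/* stand-in for config access exclusion */
};

typedef bool (*trylock_fn)(struct two_lock_dev *);
typedef void (*unlock_fn)(struct two_lock_dev *);

/* Takes only the device lock; the cfg side stays open. */
static bool trylock_dev_only(struct two_lock_dev *d)
{
	return pthread_mutex_trylock(&d->dev_lock) == 0;
}

static void unlock_dev_only(struct two_lock_dev *d)
{
	pthread_mutex_unlock(&d->dev_lock);
}

/* Takes the device lock and then the cfg lock, or neither. */
static bool trylock_dev_and_cfg(struct two_lock_dev *d)
{
	if (pthread_mutex_trylock(&d->dev_lock) != 0)
		return false;
	if (pthread_mutex_trylock(&d->cfg_lock) != 0) {
		pthread_mutex_unlock(&d->dev_lock);
		return false;
	}
	return true;
}

static void unlock_dev_and_cfg(struct two_lock_dev *d)
{
	pthread_mutex_unlock(&d->cfg_lock);
	pthread_mutex_unlock(&d->dev_lock);
}

/*
 * Generic walker: what "locked" means after a successful return depends
 * entirely on which callback pair the caller passed in.
 */
static bool bus_trylock_generic(struct two_lock_dev *devs, size_t n,
				trylock_fn lock, unlock_fn unlock)
{
	assert(devs != NULL && lock != NULL && unlock != NULL);
	for (size_t i = 0; i < n; i++) {
		if (!lock(&devs[i])) {
			while (i-- > 0)
				unlock(&devs[i]);
			return false;
		}
	}
	return true;
}
```

Both calls return true on success, but with the dev-only callbacks another path
can still take cfg_lock afterwards. Nothing in the generic signature tells the
reader which guarantee the caller actually got.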
--
Sinan Kaya
Qualcomm Datacenter Technologies, Inc. as an affiliate of Qualcomm Technologies, Inc.
Qualcomm Technologies, Inc. is a member of the Code Aurora Forum, a Linux Foundation Collaborative Project.