From: Yijing Wang <wangyijing@huawei.com>
To: "Kaneshige, Kenji" <kaneshige.kenji@jp.fujitsu.com>
Cc: Bjorn Helgaas <bhelgaas@google.com>,
Yinghai Lu <yinghai@kernel.org>, Rafael <rjw@sisk.pl>,
Rusty Russell <rusty@rustcorp.com.au>,
Mauro Carvalho Chehab <mchehab@redhat.com>,
Oliver Neukum <oneukum@suse.de>,
"jiang.liu@huawei.com" <jiang.liu@huawei.com>,
"linux-pci@vger.kernel.org" <linux-pci@vger.kernel.org>,
Hanjun Guo <guohanjun@huawei.com>
Subject: Re: [PATCH -v3] PCI, pciehp: make every slot have its own workqueue to avoid deadlock
Date: Tue, 13 Nov 2012 10:53:40 +0800 [thread overview]
Message-ID: <50A1B634.8000402@huawei.com> (raw)
In-Reply-To: <4A338DB2991D2A44B9A44B8718AECF650A52062A@G01JPEXMBYT03>
On 2012/11/13 10:35, Kaneshige, Kenji wrote:
>> -----Original Message-----
>> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
>> Sent: Tuesday, November 13, 2012 9:18 AM
>> To: Yijing Wang
>> Cc: Kaneshige, Kenji/金重 憲治; Yinghai Lu; Rafael; Rusty Russell; Mauro Carvalho Chehab; Oliver Neukum;
>> jiang.liu@huawei.com; linux-pci@vger.kernel.org; Hanjun Guo
>> Subject: Re: [PATCH -v3] PCI, pciehp: make every slot have its own workqueue to avoid deadlock
>>
>> On Mon, Nov 12, 2012 at 2:17 AM, Yijing Wang <wangyijing@huawei.com> wrote:
>>> Currently, pciehp use global pciehp_wq to handle hotplug event from hardware.
>>> Hot remove path will be blocked if a hotplug slot connected a IO-BOX(composed of PCIe
>>> Switch and some slots which support hotplug). The hot removed work was queued into
>>> pciehp_wq. But in the hot-remove path, pciehp driver would flush pciehp_wq when
>>> the pcie port(support pciehp) was removed. In this case the hot-remove path blocked.
>>> This patch remove the global pciehp_wq and create a new workqueue for every slot to
>>> avoid above problem.
>>
>> Can you include the actual call path that leads to the deadlock? I
>> assume it's related to a PCIe device removal that happens at about the
>> same time as the pcie_port_service_unregister(&hpdriver_portdrv)?
>>
>> I just want to make sure that the new per-slot workqueue arrangement
>> doesn't lead to any case where the slot workqueue item depends on
>> something removed by the pcied_cleanup() as the module exits.
>
> Just for your information, here is my understanding of the problem.
>
> ====
> - Your hardware has nested PCIe hotplug slots like below.
>
> --<slot A>--<Slot B>
>
> - Hot-removal request (attention button event) comes to <slot A>
> - Pciehp driver queue the hot-removal work
> - This hot-removal work try to remove <slot B>
> - To remove <slot B>, pciehp flush the pciehp_wq, but it never finishes
> because hot-removal work is in progress. ===> deadlock
> ====
>
Yes, the deadlock situation as Kaneshige's description above, I will add deadlock call path
description into patch.
>>
>>> --- a/drivers/pci/hotplug/pciehp_hpc.c
>>> +++ b/drivers/pci/hotplug/pciehp_hpc.c
>>> @@ -773,23 +773,32 @@ static void pcie_shutdown_notification(struct controller *ctrl)
>>> static int pcie_init_slot(struct controller *ctrl)
>>> {
>>> struct slot *slot;
>>> + char name[32];
>>>
>>> slot = kzalloc(sizeof(*slot), GFP_KERNEL);
>>> if (!slot)
>>> return -ENOMEM;
>>> +
>>> + snprintf(name, sizeof(name), "pciehp-%u", PSN(ctrl));
>>> + slot->wq = create_singlethread_workqueue(name);
>>
>> You're using create_singlethread_workqueue() instead of the previous
>> alloc_workqueue(). This is a slight change in semantics since we
>> previously called "alloc_workqueue(name, 0, 0)" and
>> create_singlethread_workqueue() calls "alloc_workqueue(name,
>> WQ_UNBOUND | WQ_MEM_RECLAIM, 1)."
>>
>> I don't know whether this is necessary in order to have per-slot
>> workqueues or not. If not, please split it into a separate patch to
>> make it more visible.
>
> Good catch.
> I think we should use 0 for max_active.
> Hot-add/remove operation take a long time. So if we use 1 for max_active,
> we cannot handle events during hot-add/remove operation. As a result, for
> example, push button event and cancel event that are issued during
> hot-add/remove operation might not be handled properly, I think.
>
> Any comments, Yijing?
This is an issue, I will test this case in my hot-plug machine and confirm it.
Thanks!
Yijing
>
> Regards,
> Kenji Kaneshige
>
>
> .
>
--
Thanks!
Yijing
next prev parent reply other threads:[~2012-11-13 2:56 UTC|newest]
Thread overview: 6+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-11-12 9:17 [PATCH -v3] PCI, pciehp: make every slot have its own workqueue to avoid deadlock Yijing Wang
2012-11-12 10:12 ` Kaneshige, Kenji
2012-11-13 0:17 ` Bjorn Helgaas
2012-11-13 2:35 ` Kaneshige, Kenji
2012-11-13 2:53 ` Yijing Wang [this message]
2012-11-13 8:00 ` Yijing Wang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=50A1B634.8000402@huawei.com \
--to=wangyijing@huawei.com \
--cc=bhelgaas@google.com \
--cc=guohanjun@huawei.com \
--cc=jiang.liu@huawei.com \
--cc=kaneshige.kenji@jp.fujitsu.com \
--cc=linux-pci@vger.kernel.org \
--cc=mchehab@redhat.com \
--cc=oneukum@suse.de \
--cc=rjw@sisk.pl \
--cc=rusty@rustcorp.com.au \
--cc=yinghai@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.