All of lore.kernel.org
 help / color / mirror / Atom feed
From: Yijing Wang <wangyijing@huawei.com>
To: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jiang Liu <liuj97@gmail.com>, Daniel J Blueman <daniel@quora.org>,
	Jesse Barnes <jbarnes@virtuousgeek.org>,
	Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>,
	Yinghai Lu <yinghai@kernel.org>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	Linux PCI <linux-pci@vger.kernel.org>
Subject: Re: 3.8-rc2: pciehp waitqueue hang...
Date: Wed, 9 Jan 2013 15:40:27 +0800	[thread overview]
Message-ID: <50ED1EEB.8000405@huawei.com> (raw)
In-Reply-To: <CAErSpo7Qo_=d1dLqEFwFcpYjkncST-c1Lx8=3fThtLOaG_P8tQ@mail.gmail.com>

Hi Bjorn,
   I will send the shpchp patch soon.

Thanks!
Yijing

>>> Yijing, please check for the same problem in other hotplug drivers.
>>> Questions I have after a quick look:
>>>
>>
>> Hi Bjorn,
>>    Sorry for delay reply. There are some busy work these days.
>>
>>>   - shpchp_wq looks like it might have the same deadlock issue.
>>
>> shpchp driver uses two workqueues shpchp_wq and shpchp_ordered_wq, they are created by alloc_ordered_workqueue
>> which set the "max_active" parameter to 1. So only one pci hotplug slot can do hotplug at the same time.
>> shpchp introduced these workqueue to remove the use of flush_scheduled_work() which is deprecated and scheduled for removal.
>>
>> hot remove path is:
>>  button press
>>        shpc_isr(interrupt handler)
>>             shpchp_handle_attention_button
>>                 queue_interrupt_event
>>                    queue_work "interrupt_event_handler" into "shpchp_wq"
>>                        interrupt_event_handler
>>                              handle_button_press_event
>>                                    queue_delayed_work "shpchp_queue_pushbutton_work" into "shpchp_wq"
>>                                          queue_work "shpchp_pushbutton_thread" into "shpchp_ordered_wq"
>>                                                shpchp_pushbutton_thread
>>                                                       shpchp_disable_slot
>>                                                             pci_stop_and_remove_bus_device
>>                                                                 ......
>>                                                                shpc_remove()   if the hotplug slot connected a iobox which contains some hotplug pcieport, shpc_remove will be called when remove pcie port device.
>>                                                                    hpc_release_ctlr
>>                                                                        flush_workqueue(shpchp_wq);
>>                                                                        flush_workqueue(shpchp_ordered_wq);
>>                                                                        So hotplug task hang.
>> shpchp driver has the same deadlock issue like pciehp driver, I think we should fix the issue, I will send out the patch if you agree this, but I have no machine support shpchp hotplug,
>> so I can't test this patch in real machine.
> 
> That's OK.  You've tested pciehp, and I don't want to leave shpchp
> broken the same way just because we can't test a similar fix there, so
> please do send the shpchp patch, too.
> 
>>>   - pciehp_wq (and your per-slot replacement) are allocated with
>>> alloc_workqueue().  shpchp_wq is allocated with
>>> alloc_ordered_workqueue().  Why the difference?
>>
>> alloc_workqueue(name, 0, 0) set max_active to 0(0 is default value used and support 256 work items of the wq can be executing at the same time per CPU).
>> So pciehp driver can handle push button event asynchronously.
>>
>> alloc_ordered_workqueue can only one handle push button event at the same time.
> 
> pciehp and shpchp should work the same in this respect unless there's
> a reason they can't, so it sounds like we should make shpchp work like
> pciehp.
> 
>>>   - The alloc/alloc_ordered difference might be related to 486b10b9f4,
>>> where Kenji removed alloc_ordered from pciehp.  Should a similar
>>> change be made to shpchp?
>>
>> Yes, I agree, we can use per-slot workqueue to fix this issue.
>>
>>>
>>>   - acpiphp uses the global kacpi_hotplug_wq.  We never flush or drain
>>> kacpi_hotplug_wq, so I doubt there's a deadlock issue, but I wonder if
>>> there are any ordering issues there because we *don't* ever wait for
>>> things in that queue to be completed.
>>
>> acpiphp driver is not attach to a pci device, so when hot remove pci device, driver will not to flush or drain kacpi_hotplug_wq.
>> But if we do acpiphp hot remove in sequence like this, there maybe cause some unexpected errors, I think.
>> slot(A)------pcie port----slot(B)
>> slot A and slot B both support acpiphp hotplug.
>> 1、press attention button on slot A;
>> 2、press attention button on slot B quickly after step 1;
>> Because kacpi_hotplug_wq is a ordered workqueue, slot B hot remove won't run unless slot A hot remove action completed.
>> After Slot B hot remove completed, some resources of slot A also has been destroyed. So slot B hot remove will cause some unexpected errors.
>> Because my hotplug machine's bios don't support iobox hotplug(slot-connected-slot), I can't verify this situation.
> 
> Hmm.  That definitely sounds like a potential problem.  But I think
> it's beyond the scope of the issue you're trying to fix, and any fix
> would look much different from your current pciehp patch, so I think
> we can treat it separately.
> 
> Bjorn
> 
> .
> 


-- 
Thanks!
Yijing


      reply	other threads:[~2013-01-09  7:40 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2013-01-03 15:11 3.8-rc2: pciehp waitqueue hang Daniel J Blueman
2013-01-03 15:41 ` Jiang Liu
2013-01-04  1:08   ` Daniel J Blueman
2013-01-04 16:57     ` Jiang Liu
2013-01-04 20:01   ` Bjorn Helgaas
2013-01-04 21:50     ` Bjorn Helgaas
2013-01-05  1:28       ` Yijing Wang
2013-01-06 12:13       ` Yijing Wang
2013-01-08 18:11         ` Bjorn Helgaas
2013-01-09  7:40           ` Yijing Wang [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=50ED1EEB.8000405@huawei.com \
    --to=wangyijing@huawei.com \
    --cc=bhelgaas@google.com \
    --cc=daniel@quora.org \
    --cc=jbarnes@virtuousgeek.org \
    --cc=kaneshige.kenji@jp.fujitsu.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pci@vger.kernel.org \
    --cc=liuj97@gmail.com \
    --cc=yinghai@kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.