From: Yijing Wang <wangyijing@huawei.com>
To: Bjorn Helgaas <bhelgaas@google.com>
Cc: Jiang Liu <liuj97@gmail.com>, Daniel J Blueman <daniel@quora.org>,
Jesse Barnes <jbarnes@virtuousgeek.org>,
Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>,
Yinghai Lu <yinghai@kernel.org>,
Linux Kernel <linux-kernel@vger.kernel.org>,
Linux PCI <linux-pci@vger.kernel.org>
Subject: Re: 3.8-rc2: pciehp waitqueue hang...
Date: Sat, 5 Jan 2013 09:28:07 +0800
Message-ID: <50E781A7.8070607@huawei.com>
In-Reply-To: <CAErSpo6vTS6MnWoHJgXgmVtq=jqrudOHW47+8hVpc8GfiY100A@mail.gmail.com>
On 2013/1/5 5:50, Bjorn Helgaas wrote:
> [+to Yijing, +cc Kenji]
>
> On Fri, Jan 4, 2013 at 1:01 PM, Bjorn Helgaas <bhelgaas@google.com> wrote:
>> On Thu, Jan 3, 2013 at 8:41 AM, Jiang Liu <liuj97@gmail.com> wrote:
>>> Hi Daniel,
>>> It seems like an issue caused by recursive PCIe HPC.
>>> Could you please help to try the patch from:
>>> http://www.spinics.net/lists/linux-pci/msg18625.html
>>
>> Hi Gerry,
>>
>> I'm working on merging this patch. Seems like something that might be
>> appropriate for stable as well.
>>
>> Did you look for similar problems in other hotplug drivers?
>
> Oops, sorry, I forgot that Yijing is the author of the patch in question.
>
> Yijing, please check for the same problem in other hotplug drivers.
> Questions I have after a quick look:
>
OK, I will check for similar problems in the other hotplug drivers. My pleasure.
Thanks!
Yijing.
> - shpchp_wq looks like it might have the same deadlock issue.
>
> - pciehp_wq (and your per-slot replacement) are allocated with
> alloc_workqueue(). shpchp_wq is allocated with
> alloc_ordered_workqueue(). Why the difference?
>
> - The alloc/alloc_ordered difference might be related to 486b10b9f4,
> where Kenji removed alloc_ordered from pciehp. Should a similar
> change be made to shpchp?
>
> - acpiphp uses the global kacpi_hotplug_wq. We never flush or drain
> kacpi_hotplug_wq, so I doubt there's a deadlock issue, but I wonder if
> there are any ordering issues there because we *don't* ever wait for
> things in that queue to be completed.
>
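For illustration only, here is a rough userspace model of the pattern in the
trace below: a work item that, while still running, flushes the queue it was
queued on. This is a sketch in Python, not kernel code. The WorkQueue class,
hot_remove() and the timeout are all hypothetical stand-ins; the kernel's
flush_workqueue() has no timeout, which is why the real task hangs instead of
reporting "stuck". With one shared queue the flush waits on its own caller;
with a separate queue per slot it does not.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, wait


class WorkQueue:
    """Toy stand-in for a kernel workqueue (hypothetical, userspace only)."""

    def __init__(self, name, workers=4):
        self.name = name
        self._pool = ThreadPoolExecutor(max_workers=workers)
        self._lock = threading.Lock()
        self._queued = []

    def queue_work(self, fn, *args):
        # Submit and record under the lock so that a flush issued by the
        # work item itself always sees its own entry.
        with self._lock:
            fut = self._pool.submit(fn, *args)
            self._queued.append(fut)
        return fut

    def flush(self, timeout):
        """Wait for every item queued so far; False if some never finished.

        The timeout exists only so this demo terminates.
        """
        with self._lock:
            snapshot = list(self._queued)
        _done, not_done = wait(snapshot, timeout=timeout)
        return not not_done

    def destroy(self):
        self._pool.shutdown(wait=True)


def hot_remove(per_slot_queues, timeout=0.5):
    """Model removing a slot whose child device is itself a hotplug controller."""
    parent_wq = WorkQueue("parent-slot")
    child_wq = WorkQueue("child-slot") if per_slot_queues else parent_wq

    def release_child_ctrl():
        # Models the nested controller's teardown flushing its workqueue.
        return child_wq.flush(timeout)

    def power_work():
        # Models the parent's slot-disable work item: removing the devices
        # below the slot tears down the nested controller.
        return "flushed" if release_child_ctrl() else "stuck"

    try:
        return parent_wq.queue_work(power_work).result()
    finally:
        parent_wq.destroy()
        if child_wq is not parent_wq:
            child_wq.destroy()


if __name__ == "__main__":
    print("shared queue:   ", hot_remove(per_slot_queues=False))
    print("per-slot queues:", hot_remove(per_slot_queues=True))
```

Note that the shared-queue case gets stuck here even with several workers,
because the flush waits for all queued work, including the item that called it.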
>>> Thanks!
>>> Gerry
>>> On 01/03/2013 11:11 PM, Daniel J Blueman wrote:
>>>> When the Apple thunderbolt ethernet adapter comes loose on my Macbook
>>>> Pro Retina (Intel DSL3510), we see pci_slot_name return
>>>> non-deterministic data (i.e. varying each boot), and we see pciehp_wq
>>>> remain armed with events causing the kthread to get stuck:
>>>>
>>>> tg3 0000:0a:00.0 eth0: Link is up at 1000 Mbps, full duplex
>>>> tg3 0000:0a:00.0 eth0: Flow control is on for TX and on for RX
>>>> <thunderbolt adapter comes loose>
>>>> pciehp 0000:06:03.0:pcie24: Card not present on Slot(3)
>>>> tg3 0000:0a:00.0: tg3_abort_hw timed out, TX_MODE_ENABLE will not
>>>> clear MAC_TX_MODE=ffffffff
>>>> tg3 0000:0a:00.0 eth0: No firmware running
>>>> tg3 0000:0a:00.0 eth0: Link is down
>>>> pcieport 0000:00:01.1: System wakeup enabled by ACPI
>>>> pciehp 0000:09:00.0:pcie24: unloading service driver pciehp
>>>> pciehp 0000:09:00.0:pcie24: Latch open on
>>>> Slot(\xfffffff89\xffffffbbe\x02\xffffff88\xffffffff\xffffffff\xffffffe09\xffffffbbe\x02\xffffff88\xffffffff\xfffffffffbcon)
>>>> pciehp 0000:09:00.0:pcie24: Button pressed on
>>>> Slot(\xfffffff89\xffffffbbe\x02\xffffff88\xffffffff\xffffffff\xffffffe09\xffffffbbe\x02\xffffff88\xffffffff\xfffffffffbcon)
>>>> pciehp 0000:09:00.0:pcie24: Card present on
>>>> Slot(\xfffffff89\xffffffbbe\x02\xffffff88\xffffffff\xffffffff\xffffffe09\xffffffbbe\x02\xffffff88\xffffffff\xfffffffffbcon)
>>>> pciehp 0000:09:00.0:pcie24: Power fault on slot
>>>> \xfffffff89\xffffffbbe\x02\xffffff88\xffffffff\xffffffff\xffffffe09\xffffffbbe\x02\xffffff88\xffffffff\xfffffffffbcon
>>>> pciehp 0000:09:00.0:pcie24: Power fault bit 0 set
>>>> pciehp 0000:09:00.0:pcie24: PCI slot
>>>> #\xfffffff89\xffffffbbe\x02\xffffff88\xffffffff\xffffffff\xffffffe09\xffffffbbe\x02\xffffff88\xffffffff\xfffffffffbcon
>>>> - powering on due to button press.
>>>> pciehp 0000:09:00.0:pcie24: Link Training Error occurs
>>>> pciehp 0000:09:00.0:pcie24: Failed to check link status
>>>> INFO: task kworker/0:1:52 blocked for more than 120 seconds.
>>>> "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
>>>> kworker/0:1 D ffff880265893090 0 52 2 0x00000000
>>>> ffff8802655456f8 0000000000000046 ffffffff81a21a60 ffff880265545fd8
>>>> 0000000000004000 ffff880265545fd8 ffff880265892bb0 ffff880265adc8d0
>>>> 000000000000059e 0000000000000082 ffff880265545668 ffffffff810415aa
>>>> Call Trace:
>>>> [<ffffffff810415aa>] ? console_unlock+0x1fa/0x4a0
>>>> [<ffffffff8108d16d>] ? trace_hardirqs_off+0xd/0x10
>>>> [<ffffffff81041b19>] ? vprintk_emit+0x1c9/0x510
>>>> [<ffffffff81558db4>] schedule+0x24/0x70
>>>> [<ffffffff8155653c>] schedule_timeout+0x19c/0x1e0
>>>> [<ffffffff81558c43>] wait_for_common+0xe3/0x180
>>>> [<ffffffff8105adc1>] ? flush_workqueue+0x111/0x4d0
>>>> [<ffffffff81071140>] ? try_to_wake_up+0x2d0/0x2d0
>>>> [<ffffffff81558d88>] wait_for_completion+0x18/0x20
>>>> [<ffffffff8105ae86>] flush_workqueue+0x1d6/0x4d0
>>>> [<ffffffff8105acb0>] ? flush_workqueue_prep_cwqs+0x200/0x200
>>>> [<ffffffff8125e909>] pciehp_release_ctrl+0x39/0x90
>>>> [<ffffffff8125b945>] pciehp_remove+0x25/0x30
>>>> [<ffffffff81255bf2>] pcie_port_remove_service+0x52/0x70
>>>> [<ffffffff81306a27>] __device_release_driver+0x77/0xe0
>>>> [<ffffffff81306ab9>] device_release_driver+0x29/0x40
>>>> [<ffffffff813064b1>] bus_remove_device+0xf1/0x140
>>>> [<ffffffff81303fe7>] device_del+0x127/0x1c0
>>>> [<ffffffff81255d70>] ? resume_iter+0x40/0x40
>>>> [<ffffffff81304091>] device_unregister+0x11/0x20
>>>> [<ffffffff81255da5>] remove_iter+0x35/0x40
>>>> [<ffffffff81302eb6>] device_for_each_child+0x36/0x70
>>>> [<ffffffff81256341>] pcie_port_device_remove+0x21/0x40
>>>> [<ffffffff81256588>] pcie_portdrv_remove+0x28/0x50
>>>> [<ffffffff8124a821>] pci_device_remove+0x41/0xc0
>>>> [<ffffffff81306a27>] __device_release_driver+0x77/0xe0
>>>> [<ffffffff81306ab9>] device_release_driver+0x29/0x40
>>>> [<ffffffff813064b1>] bus_remove_device+0xf1/0x140
>>>> [<ffffffff81303fe7>] device_del+0x127/0x1c0
>>>> [<ffffffff81304091>] device_unregister+0x11/0x20
>>>> [<ffffffff8124566c>] pci_stop_bus_device+0x8c/0xa0
>>>> [<ffffffff81245615>] pci_stop_bus_device+0x35/0xa0
>>>> [<ffffffff81245811>] pci_stop_and_remove_bus_device+0x11/0x20
>>>> [<ffffffff8125cc91>] pciehp_unconfigure_device+0x91/0x190
>>>> [<ffffffff8125c76d>] ? pciehp_power_thread+0x2d/0x110
>>>> [<ffffffff8125c591>] pciehp_disable_slot+0x71/0x220
>>>> [<ffffffff8125c826>] pciehp_power_thread+0xe6/0x110
>>>> [<ffffffff8105d203>] process_one_work+0x193/0x550
>>>> [<ffffffff8105d1a1>] ? process_one_work+0x131/0x550
>>>> [<ffffffff8125c740>] ? pciehp_disable_slot+0x220/0x220
>>>> [<ffffffff8105d96d>] worker_thread+0x15d/0x400
>>>> [<ffffffff8109213d>] ? trace_hardirqs_on+0xd/0x10
>>>> [<ffffffff8105d810>] ? rescuer_thread+0x210/0x210
>>>> [<ffffffff81062bd6>] kthread+0xd6/0xe0
>>>> [<ffffffff8155a18b>] ? _raw_spin_unlock_irq+0x2b/0x50
>>>> [<ffffffff81062b00>] ? __init_kthread_worker+0x70/0x70
>>>> [<ffffffff8155ae6c>] ret_from_fork+0x7c/0xb0
>>>> [<ffffffff81062b00>] ? __init_kthread_worker+0x70/0x70
>>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-pci" in
>>> the body of a message to majordomo@vger.kernel.org
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> .
>
--
Thanks!
Yijing