From: Jens Axboe <jens.axboe@oracle.com>
To: Kenji Kaneshige <kaneshige.kenji@jp.fujitsu.com>
Cc: Alex Chiang <achiang@hp.com>, Mark Lord <lkml@rtr.ca>,
	Greg KH <greg@kroah.com>,
	Linux Kernel <linux-kernel@vger.kernel.org>,
	jbarnes@virtuousgeek.org, linux-pci@vger.kernel.org
Subject: Re: pci-express hotplug
Date: Thu, 29 Oct 2009 10:24:53 +0100
Message-ID: <20091029092453.GD10727@kernel.dk>
In-Reply-To: <4AE95EFA.7000009@jp.fujitsu.com>

On Thu, Oct 29 2009, Kenji Kaneshige wrote:
> Jens Axboe wrote:
>> On Thu, Oct 29 2009, Kenji Kaneshige wrote:
>>> Jens Axboe wrote:
>>>> On Wed, Oct 28 2009, Kenji Kaneshige wrote:
>>>>> Jens Axboe wrote:
>>>>>> On Tue, Oct 27 2009, Kenji Kaneshige wrote:
>>>>>>> Jens Axboe wrote:
>>>>>>>> On Tue, Oct 20 2009, Alex Chiang wrote:
>>>>>>>>> * Jens Axboe <jens.axboe@oracle.com>:
>>>>>>>>>> On Tue, Oct 13 2009, Alex Chiang wrote:
>>>>>>>>>>>>> Can you modprobe acpiphp with debug=1? And send the output?
>>>>>>>>>>>> acpiphp: ACPI Hot Plug PCI Controller Driver version: 0.5
>>>>>>>>>>>> acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:00:05.0
>>>>>>>>>>>> acpiphp_glue: found ACPI PCI Hotplug slot 1 at PCI 0000:08:00
>>>>>>>>>>>> acpiphp: Slot [1] registered
>>>>>>>>>>>> acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:00:07.0
>>>>>>>>>>>> acpiphp_glue: found ACPI PCI Hotplug slot 2 at PCI 0000:0b:00
>>>>>>>>>>>> acpiphp: Slot [2] registered
>>>>>>>>>>>> acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:80:07.0
>>>>>>>>>>>> acpiphp_glue: found ACPI PCI Hotplug slot 6 at PCI 0000:84:00
>>>>>>>>>>>> acpiphp: Slot [6] registered
>>>>>>>>>>>> acpiphp_glue: found PCI-to-PCI bridge at PCI 0000:80:09.0
>>>>>>>>>>>> acpiphp_glue: found ACPI PCI Hotplug slot 7 at PCI 0000:87:00
>>>>>>>>>>>> acpiphp: Slot [7] registered
>>>>>>>>>>>> acpiphp_glue: Bus 0000:87 has 1 slot
>>>>>>>>>>>> acpiphp_glue: Bus 0000:84 has 1 slot
>>>>>>>>>>>> acpiphp_glue: Bus 0000:0b has 1 slot
>>>>>>>>>>>> acpiphp_glue: Bus 0000:08 has 1 slot
>>>>>>>>>>>> acpiphp_glue: Total 4 slots
>>>>>>>>>>> You mentioned in another mail that you echoed 1 into the various
>>>>>>>>>>> slots' power files.
>>>>>>>>>>>
>>>>>>>>>>> Did you do that after modprobing acpiphp with debug=1?
>>>>>>>>>>>
>>>>>>>>>>> If so, there should be debug output when you try and turn them
>>>>>>>>>>> on.
>>>>>>>>>> It produces:
>>>>>>>>>>
>>>>>>>>>> acpiphp: enable_slot - physical_slot = 1
>>>>>>>>>> acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
>>>>>>>>>> acpiphp: enable_slot - physical_slot = 2
>>>>>>>>>> acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
>>>>>>>>>> acpiphp: enable_slot - physical_slot = 6
>>>>>>>>>> acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
>>>>>>>>>> acpiphp: enable_slot - physical_slot = 7
>>>>>>>>>> acpiphp_glue: acpiphp_enable_slot: Slot status is not ACPI_STA_ALL
>>>>>>>>> Hm, so for some reason, firmware on your machine is telling us
>>>>>>>>> that it doesn't think cards are present and/or enabled.
>>>>>>>>>
>>>>>>>>> Unfortunately, I don't know why your firmware would be saying
>>>>>>>>> that. We could add some more debug printks to see what firmware
>>>>>>>>> thinks about your system... Or we could just wait and see what
>>>>>>>>> happens after you get your hardware replaced.
>>>>>>>> New board, the exact same thing happens.
>>>>>>>>
>>>>>>>>>> I have a card in one of the slots only this time.
>>>>>>>>>>
>>>>>>>>>>> Also, quick dummy check, you are trying to power on populated
>>>>>>>>>>> slots, right? :)
>>>>>>>>>> Yes :-)
>>>>>>>>>>
>>>>>>>>>>> Can you send the output of lspci -vv? And I like the output of
>>>>>>>>>>> lspci -vt as well... Both before and after loading acpiphp
>>>>>>>>>>> please.
>>>>>>>>>> Sent privately.
>>>>>>>>> No difference in before and after. Odd.
>>>>>>>>>
>>>>>>>>> If you want to poke us again after your hardware swap, please do
>>>>>>>>> so. Sorry for being not so helpful. :-/
>>>>>>>> Poke :-)
>>>>>>>>
>>>>>>>> One more thing I tried was pushing the power button on the slot
>>>>>>>> manually. With acpiphp, I get the same messages as above. Using pciehp,
>>>>>>>> I get the same power fault bit interrupt storm. So no difference from
>>>>>>>> using the sysfs interface or doing it on the box side, doesn't work
>>>>>>>> either way.
>>>>>>>>
>>>>>>> I'd like to confirm power fault interrupt storm, just in case.
>>>>>>> Could you get /proc/interrupts information after power fault
>>>>>>> problem happens and send it to me?
>>>>>> The box pretty much hangs when I try to power on a slot with pciehp, so
>>>>>> it's not easy to do... It doesn't hang with acpiphp, but doesn't work
>>>>>> either (see previous reply to Alex).
>>>>>>
>>>>> Could you try the attached debugging patch? With this patch, the power
>>>>> fault interrupt would be disabled after 100 power faults are detected
>>>>> (I hope so). You can get /proc/interrupts after that.
>>>> Here is the output of doing the power on with that patch applied.
>>>>
>>>> pciehp 0000:00:05.0:pcie04: enable_slot: physical_slot = 1
>>>> pciehp 0000:00:05.0:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 77b
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 10
>>>> pciehp 0000:00:05.0:pcie04: pciehp_power_on_slot: SLOTCTRL a8 write cmd 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 10
>>>> pciehp 0000:00:05.0:pcie04: pciehp_green_led_blink: SLOTCTRL a8 write cmd 200
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: Power fault interrupt received
>>>> pciehp 0000:00:05.0:pcie04: Power fault on Slot(1)
>>>> pciehp 0000:00:05.0:pcie04: Power fault bit 0 set
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 2
>>>> pciehp 0000:00:05.0:pcie04: Data Link Layer Link Active not set in 1000 msec
>>>> pciehp 0000:00:05.0:pcie04: pciehp_check_link_status: lnk_status = 1001
>>>> pciehp 0000:00:05.0:pcie04: Link Training Error occurs
>>>> pciehp 0000:00:05.0:pcie04: Failed to check link status
>>>> pciehp 0000:00:05.0:pcie04: Command not completed in 1000 msec
>>>> pciehp 0000:00:05.0:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 12
>>>> pciehp 0000:00:05.0:pcie04: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
>>>> pciehp 0000:00:05.0:pcie04: pcie_isr: intr_loc 12
>>>> pciehp 0000:00:05.0:pcie04: pciehp_power_off_slot: SLOTCTRL a8 write cmd 400
>>>> pciehp 0000:00:05.0:pcie04: Command not completed in 1000 msec
>>>> pciehp 0000:00:05.0:pcie04: pciehp_green_led_off: SLOTCTRL a8 write cmd 300
>>>> pciehp 0000:00:05.0:pcie04: Command not completed in 1000 msec
>>>> pciehp 0000:00:05.0:pcie04: pciehp_set_attention_status: SLOTCTRL a8 write cmd 40
>>>> pciehp 0000:00:05.0:pcie04: pciehp_get_power_status: SLOTCTRL a8 value read 779
>>>> pciehp 0000:00:05.0:pcie04: pciehp_get_attention_status: SLOTCTRL a8, value read 779
>>>>
>>> From the console log, it seems that my debug patch worked as I expected
>>> (power fault event interrupts were disabled after 100 power fault events).
>>> But for some reasons, /proc/interrupts indicates only 5 interrupts of
>>> pciehp. Just in case, did you get /proc/interrupts after doing power on?
>>
>> Nope, it was captured post the power on attempt and the above log dump.
>>
>
> Can I confirm that? (sorry for my poor English skill)
>
> The /proc/interrupts was captured *before* the power on attempt and the log.
> Correct?

No, the /proc/interrupts output was captured AFTER the power on attempt
and the log capture shown above.

-- 
Jens Axboe
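
Editorial sketch, not part of the original message: the open question in
this exchange is how many pciehp interrupts actually fired, so one way to
cross-check the console log against /proc/interrupts is to tally the
pcie_isr lines by their intr_loc value. The script below assumes that
intr_loc is printed in hexadecimal and is a mask of PCI Express Slot Status
event bits (so 2 would be Power Fault Detected, 10 Command Completed, and
12 both). The names tally() and decode() are hypothetical helpers invented
for this illustration.

```python
import re
from collections import Counter

# Slot Status event bits from the PCI Express Base Specification.
# Assumption: pciehp's intr_loc is a hex mask built from these bits.
SLOT_STATUS_BITS = {
    0x0001: "Attention Button Pressed",
    0x0002: "Power Fault Detected",
    0x0004: "MRL Sensor Changed",
    0x0008: "Presence Detect Changed",
    0x0010: "Command Completed",
}

# Matches lines such as "... pcie_isr: intr_loc 12", with or without
# leading quote markers or the device prefix.
INTR_LOC = re.compile(r"pcie_isr: intr_loc ([0-9a-fA-F]+)")


def decode(value):
    """Return the names of the Slot Status bits set in value."""
    return [name for bit, name in SLOT_STATUS_BITS.items() if value & bit]


def tally(lines):
    """Count pcie_isr log lines, keyed by the numeric intr_loc value."""
    counts = Counter()
    for line in lines:
        match = INTR_LOC.search(line)
        if match:
            counts[int(match.group(1), 16)] += 1
    return counts


if __name__ == "__main__":
    import sys

    # Usage: dmesg | python3 tally_intr_loc.py
    for value, count in sorted(tally(sys.stdin).items()):
        print(f"intr_loc {value:#x}: {count} ({', '.join(decode(value))})")
```

Comparing the per-value counts with the pciehp row of /proc/interrupts
taken before and after the power-on attempt would show whether the two
sources agree; the script only counts log lines and makes no claim about
why they might differ.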

