* Hot add a PCIe device driver upon hotplug event
@ 2015-01-12 11:42 Paulo Fortuna Carvalho
  2015-01-12 16:58 ` Bjorn Helgaas
  0 siblings, 1 reply; 16+ messages in thread
From: Paulo Fortuna Carvalho @ 2015-01-12 11:42 UTC (permalink / raw)
  To: linux-pci

Hello,

I want to automatically load/unload a PCIe device driver when a card
is inserted into or removed from the system. I can see in the system
log (dmesg) that the interrupt event is captured and acknowledged by
the pciehp hotplug service driver.

What I want next is for the operating system to load/unload the
corresponding PCIe device driver for that card.

Can anyone give me some information, if possible?

Thx,
Paulo.

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Hot add a PCIe device driver upon hotplug event
  2015-01-12 11:42 Hot add a PCIe device driver upon hotplug event Paulo Fortuna Carvalho
@ 2015-01-12 16:58 ` Bjorn Helgaas
  2015-01-12 17:26   ` Paulo Fortuna Carvalho
       [not found]   ` <CAH9N0t8s+kcG_Qok0wpoDz9jzwvPk_QmBK_p-qbACZJjrr+iVQ@mail.gmail.com>
  0 siblings, 2 replies; 16+ messages in thread
From: Bjorn Helgaas @ 2015-01-12 16:58 UTC (permalink / raw)
  To: Paulo Fortuna Carvalho; +Cc: linux-pci@vger.kernel.org

On Mon, Jan 12, 2015 at 5:42 AM, Paulo Fortuna Carvalho
<pricardofc@gmail.com> wrote:
> Hello,
> I want to automatically load/unload a PCIe device driver when a card
> is inserted/removed from the system. I can see in the system logger
> with dmesg that the interrupt event is captured and acknowledged by
> the pciehp hotplug service driver.
> What I want to do next is that the operating system load/unload the
> corresponding PCIe device driver for that card.

When pciehp receives the interrupt, it should enumerate the device,
and you should see a line in dmesg similar to this (of course, it will
have different bus/device/function and different vendor/device IDs):

  pci 0000:00:16.0: [8086:9c3a] type 00 class 0x078000

The PCI core should then add the device using device_add(), and part
of that is to emit a uevent, which can be read by user-space.
Generally udev would handle the event and load the appropriate driver.
I don't know the details of how the user-space side works.

Bjorn

^ permalink raw reply [flat|nested] 16+ messages in thread
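For readers who want to watch the uevent step described above from user space, here is a minimal Python sketch. It is not how udev itself is implemented. It assumes a Linux host, that raw kernel uevents arrive on the NETLINK_KOBJECT_UEVENT socket (protocol 15, multicast group 1) as NUL-separated KEY=VALUE strings after an "action@devpath" header, and that the process is allowed to bind that socket. The names parse_uevent and listen are made up for this illustration.

```python
import socket

NETLINK_KOBJECT_UEVENT = 15  # protocol number from <linux/netlink.h>
KERNEL_GROUP = 1             # multicast group that carries raw kernel uevents


def parse_uevent(data: bytes) -> dict:
    """Split one raw kernel uevent datagram into its KEY=VALUE fields."""
    parts = data.split(b"\0")
    # The first string is the "action@devpath" summary, e.g. "add@/devices/...".
    fields = {"_header": parts[0].decode(errors="replace")}
    for part in parts[1:]:
        if b"=" in part:
            key, _, value = part.partition(b"=")
            fields[key.decode(errors="replace")] = value.decode(errors="replace")
    return fields


def listen():
    """Print PCI uevents as they arrive (Linux only; loops until interrupted)."""
    sock = socket.socket(socket.AF_NETLINK, socket.SOCK_DGRAM,
                         NETLINK_KOBJECT_UEVENT)
    sock.bind((0, KERNEL_GROUP))  # (pid 0 = kernel picks one, group bitmask)
    while True:
        event = parse_uevent(sock.recv(65536))
        if event.get("SUBSYSTEM") == "pci":
            print(event.get("ACTION"), event.get("DEVPATH"),
                  event.get("MODALIAS"))
```

Calling listen() and then inserting a card should print an "add" event if the PCI core got as far as device_add(). If nothing is printed, the problem is on the kernel side rather than in udev.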
* Re: Hot add a PCIe device driver upon hotplug event 2015-01-12 16:58 ` Bjorn Helgaas @ 2015-01-12 17:26 ` Paulo Fortuna Carvalho 2015-01-12 17:41 ` Bjorn Helgaas [not found] ` <CAH9N0t8s+kcG_Qok0wpoDz9jzwvPk_QmBK_p-qbACZJjrr+iVQ@mail.gmail.com> 1 sibling, 1 reply; 16+ messages in thread From: Paulo Fortuna Carvalho @ 2015-01-12 17:26 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: linux-pci@vger.kernel.org Hello Helgas, I cannot see that line in dmesg. I think that something else is missing... Do you know what may cause that not appearing in dmesg? Hotplug Surprise? Thx, Paulo. 2015-01-12 16:58 GMT, Bjorn Helgaas <bhelgaas@google.com>: > On Mon, Jan 12, 2015 at 5:42 AM, Paulo Fortuna Carvalho > <pricardofc@gmail.com> wrote: >> Hello, >> I want to automatically load/unload a PCIe device driver when a card >> is inserted/removed from the system. I can see in the system logger >> with dmesg that the interrupt event is captured and acknowledged by >> the pciehp hotplug service driver. >> What I want to do next is that the operating system load/unload the >> corresponding PCIe device driver for that card. > > When pciehp receives the interrupt, it should enumerate the device, > and you should see a line in dmesg similar to this (of course, it will > have different bus/device/function and different vendor/device IDs): > > pci 0000:00:16.0: [8086:9c3a] type 00 class 0x078000 > > The PCI core should then add the device using device_add(), and part > of that is to emit a uevent, which can be read by user-space. > Generally udev would handle the event and load the appropriate driver. > I don't know the details of how the user-space side works. > > Bjorn > ^ permalink raw reply [flat|nested] 16+ messages in thread
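One way to check a captured log for the enumeration line Bjorn quotes is to scan it with a regular expression. The sketch below assumes the line layout shown in the thread ("pci <domain:bus:dev.fn>: [vendor:device] type NN class 0x...") and may miss lines on kernels that print it differently. ENUM_RE and find_enumerated are names invented for this example.

```python
import re

# Matches lines such as:
#   pci 0000:00:16.0: [8086:9c3a] type 00 class 0x078000
ENUM_RE = re.compile(
    r"pci (?P<bdf>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"\[(?P<vendor>[0-9a-f]{4}):(?P<device>[0-9a-f]{4})\] "
    r"type (?P<type>[0-9a-f]+) class (?P<class>0x[0-9a-f]+)"
)


def find_enumerated(dmesg_text: str) -> list:
    """Return one dict per enumeration line found in the given log text."""
    return [m.groupdict() for m in ENUM_RE.finditer(dmesg_text)]
```

Feeding it the output of dmesg captured before and after inserting the card, and comparing the two result lists, shows whether the new device was enumerated at all.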
* Re: Hot add a PCIe device driver upon hotplug event 2015-01-12 17:26 ` Paulo Fortuna Carvalho @ 2015-01-12 17:41 ` Bjorn Helgaas [not found] ` <CAH9N0t-EP9p2V3JMht1F_EyK+bOdiiH1krPLrsTMDQ_961damQ@mail.gmail.com> 0 siblings, 1 reply; 16+ messages in thread From: Bjorn Helgaas @ 2015-01-12 17:41 UTC (permalink / raw) To: Paulo Fortuna Carvalho; +Cc: linux-pci@vger.kernel.org On Mon, Jan 12, 2015 at 11:26 AM, Paulo Fortuna Carvalho <pricardofc@gmail.com> wrote: > Hello Helgas, > I cannot see that line in dmesg. I think that something else is missing... > Do you know what may cause that not appearing in dmesg? Hotplug Surprise? Can you just collect the entire dmesg log, so I can see what controller is involved and what is happening? Surprise hotplug should work. Are you inserting an ExpressCard, or is it some other form factor? Exactly what hardware are you adding, and what are you adding it to? Bjorn > 2015-01-12 16:58 GMT, Bjorn Helgaas <bhelgaas@google.com>: >> On Mon, Jan 12, 2015 at 5:42 AM, Paulo Fortuna Carvalho >> <pricardofc@gmail.com> wrote: >>> Hello, >>> I want to automatically load/unload a PCIe device driver when a card >>> is inserted/removed from the system. I can see in the system logger >>> with dmesg that the interrupt event is captured and acknowledged by >>> the pciehp hotplug service driver. >>> What I want to do next is that the operating system load/unload the >>> corresponding PCIe device driver for that card. >> >> When pciehp receives the interrupt, it should enumerate the device, >> and you should see a line in dmesg similar to this (of course, it will >> have different bus/device/function and different vendor/device IDs): >> >> pci 0000:00:16.0: [8086:9c3a] type 00 class 0x078000 >> >> The PCI core should then add the device using device_add(), and part >> of that is to emit a uevent, which can be read by user-space. >> Generally udev would handle the event and load the appropriate driver. 
>> I don't know the details of how the user-space side works. >> >> Bjorn >> ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Hot add a PCIe device driver upon hotplug event [not found] ` <CAH9N0t-EP9p2V3JMht1F_EyK+bOdiiH1krPLrsTMDQ_961damQ@mail.gmail.com> @ 2015-01-13 17:29 ` Bjorn Helgaas 0 siblings, 0 replies; 16+ messages in thread From: Bjorn Helgaas @ 2015-01-13 17:29 UTC (permalink / raw) To: Paulo Fortuna Carvalho; +Cc: linux-pci@vger.kernel.org On Tue, Jan 13, 2015 at 7:03 AM, Paulo Fortuna Carvalho <pricardofc@gmail.com> wrote: > Hello Bjorn, > Im sending a picture of lspci command for yuo t see the pcie switch > and endpoints status when i boot the system with everything connected. > DPIO cards are the 2 endpoints (6114 is a carrier containing a pcie > switch) and 6014 is the DAq card endpoint (both use a FPGA Virtex 6 > from XILINX). > We have in the system an OSS card that is the second PCIe switch in > the system hierarchy closest to the root complex that is the chipset > in the motherboard and then the processor. > The OS is redhat v6.3 and the kernel we are using is 3.0.9 realtime. Can you try a more recent kernel? There have been significant hotplug changes since 3.0, and I don't want to spend time debugging issues that have already been fixed. If it turns out that hotplug works as you expect in, say, v3.18, then you can figure out whether it's better to backport the hotplug fixes, or to update to a newer RT kernel. Bjorn > Im sending also the dmesg information after system boot > (dmesg_output1.txt) and after I perform a switch off of the DAq card > (6014) handle, switch on and switch off again. > As you can see from dmesg_output2.txt the hot-add event is detected by > the pciehp on slot (8) bu no further print messages are sent to the > system log. At this point hotplug surprise is off. I will turn it on > on next test. > > The PCIe cards (both carrier and DAq) are ATCA with PCIExpress > capabilities for Linux PCI device drivers perform required operations. > > Regards, > Paulo. 
* Re: Hot add a PCIe device driver upon hotplug event [not found] ` <CAH9N0t8s+kcG_Qok0wpoDz9jzwvPk_QmBK_p-qbACZJjrr+iVQ@mail.gmail.com> @ 2015-01-22 17:41 ` Bjorn Helgaas [not found] ` <CAH9N0t-+gog9wNFo7hqhzrWuttrRyf5HjjHciFrDGz1rZUiUfw@mail.gmail.com> 0 siblings, 1 reply; 16+ messages in thread From: Bjorn Helgaas @ 2015-01-22 17:41 UTC (permalink / raw) To: Paulo Fortuna Carvalho; +Cc: linux-pci@vger.kernel.org On Thu, Jan 22, 2015 at 8:35 AM, Paulo Fortuna Carvalho <pricardofc@gmail.com> wrote: > Hello Bjorn, > I can add and remove an ATCA card with a PCIExpress endpoint. > UDEV device manager runs hotplug scripts ok and device files are created and > removed from system accdordingly. > Im trying now to run a script that on remove kills running applications > using a certain device before it is removed from system. > To do so I implement in my .rules file an entry regarding "remove" that > calls a script to perform that task. > My problem is that before script runs the device node is no longer there and > my system crashes. > Do you have any idea on how can run my script stopping the applciations > before device remove procedure occurs? If I understand correctly, you want to run a script while your device is still in the system, *before* it is removed. I'm not familiar with the ATCA hotplug slot details. What event would trigger the script to run? Is there an attention button used to request card removal? Is there a sysfs or similar software interface to request removal? > 2015-01-12 16:58 GMT+00:00 Bjorn Helgaas <bhelgaas@google.com>: >> >> On Mon, Jan 12, 2015 at 5:42 AM, Paulo Fortuna Carvalho >> <pricardofc@gmail.com> wrote: >> > Hello, >> > I want to automatically load/unload a PCIe device driver when a card >> > is inserted/removed from the system. I can see in the system logger >> > with dmesg that the interrupt event is captured and acknowledged by >> > the pciehp hotplug service driver. 
>> > What I want to do next is that the operating system load/unload the >> > corresponding PCIe device driver for that card. >> >> When pciehp receives the interrupt, it should enumerate the device, >> and you should see a line in dmesg similar to this (of course, it will >> have different bus/device/function and different vendor/device IDs): >> >> pci 0000:00:16.0: [8086:9c3a] type 00 class 0x078000 >> >> The PCI core should then add the device using device_add(), and part >> of that is to emit a uevent, which can be read by user-space. >> Generally udev would handle the event and load the appropriate driver. >> I don't know the details of how the user-space side works. >> >> Bjorn > > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Hot add a PCIe device driver upon hotplug event [not found] ` <CAH9N0t-+gog9wNFo7hqhzrWuttrRyf5HjjHciFrDGz1rZUiUfw@mail.gmail.com> @ 2015-01-22 22:20 ` Bjorn Helgaas 2015-01-23 11:35 ` Paulo Fortuna Carvalho 0 siblings, 1 reply; 16+ messages in thread From: Bjorn Helgaas @ 2015-01-22 22:20 UTC (permalink / raw) To: Paulo Fortuna Carvalho; +Cc: linux-pci, Greg Kroah-Hartman [+cc Greg] It's better if you put your responses inline, e.g., http://en.wikipedia.org/wiki/Posting_style#Interleaved_style . That's the convention on Linux lists because it makes it easier for new participants to enter the discussion. Also, some of your previous messages didn't make it to the mailing list, probably because they exceeded 100K; see http://vger.kernel.org/majordomo-info.html . I extracted the dmesg information below from your "dmesg_output2.txt" attachment from Jan 13. On Thu, Jan 22, 2015 at 08:07:14PM +0000, Paulo Fortuna Carvalho wrote: > Hi, > Yes but i want that udev .rules file launches the script on "remove" ACTION. > The event that triggers the removal procedure is an ATCA card handle > switch. the corredponding hot-add event is caught by pciehp and sent to > udev that runs a rule which calls my script on "remove" ACTION. Let me summarize what I think you're trying to do. Please correct me if I don't understand correctly. You're interested in device 0c:00.0: [1556:6014], which is here in the topology: pci 0000:07:08.0: PCI bridge to [bus 0c] pci 0000:0c:00.0: [1556:6014] type 0 class 0x001100 You're looking for a way to run a script when a user expresses her intent to remove a card, but before the card is actually removed. Linux currently emits a KOBJ_REMOVE uevent when it tears down the internal device data structure. You can use udev to do things when that occurs, but that's too late for what you want to do because the device is already unusable. 

07:08.0, the Downstream Port leading to 0c:00.0, contains a hotplug
controller with several indicators and sensors:

  pciehp 0000:07:08.0:pcie24: Physical Slot Number : 8
  pciehp 0000:07:08.0:pcie24: Attention Button     : yes
  pciehp 0000:07:08.0:pcie24: Power Controller     : yes
  pciehp 0000:07:08.0:pcie24: MRL Sensor           : yes
  pciehp 0000:07:08.0:pcie24: Attention Indicator  : yes
  pciehp 0000:07:08.0:pcie24: Power Indicator      : yes
  pciehp 0000:07:08.0:pcie24: Hot-Plug Surprise    : no
  pciehp 0000:07:08.0:pcie24: EMI Present          : no
  pciehp 0000:07:08.0:pcie24: Command Completed    : yes

It might be possible for Linux to emit other uevents, for example, when the
Attention Button is pressed. I don't know how such a uevent would fit into
the udev framework.

You're using an ATCA card handle to physically remove the device, and you
mention a card handle switch that triggers the removal. Is the card handle
something like this:
http://www.cbttechnology.com/resources/pdf/Datasheets/Sep2014/ATCA-Handles.pdf?

Is the microswitch mentioned there wired up as the Attention Button? If
not, do you know how the state of the ATCA card handle switch is
communicated to the OS?

In your dmesg log, I see this:

  pciehp 0000:07:08.0:pcie24: pcie_isr: intr_loc 8
  pciehp 0000:07:08.0:pcie24: Presence/Notify input change
  pciehp 0000:07:08.0:pcie24: Card not present on Slot(8)
  pciehp 0000:07:08.0:pcie24: pcie_isr: intr_loc 8
  pciehp 0000:07:08.0:pcie24: Presence/Notify input change
  pciehp 0000:07:08.0:pcie24: Card present on Slot(8)

That means we only saw Presence Detect Changed interrupts: one for card
removal and another for card insertion. We didn't see an Attention Button
interrupt at all.

You can look at the Slot Status directly with "lspci -vvs07:08.0". If you
do that while removing the device, e.g., run it while the handle is in
position #1, again in position #2, and again in position #3, you should see
whether there's any signal that could potentially be used to do what you
need.
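
To compare the Slot Status at the three handle positions without reading the lspci output by eye, a small parser along these lines might help. It is only a sketch: it assumes the "SltSta:" block is printed as a "Status:" line followed by a "Changed:" line of flags ending in "+" or "-", which is how recent pciutils versions print it but may differ on older ones. parse_sltsta is a name invented here.

```python
import re

# A flag is a word immediately followed by "+" (set) or "-" (clear).
FLAG_RE = re.compile(r"\b([A-Za-z]+)([+-])(?=\s|$)")


def parse_sltsta(lspci_text: str) -> dict:
    """Return {"Status": {...}, "Changed": {...}} from 'lspci -vv' text."""
    result = {}
    in_sltsta = False
    for line in lspci_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("SltSta:"):
            in_sltsta = True
            stripped = stripped[len("SltSta:"):].strip()
        elif in_sltsta and not stripped.startswith("Changed:"):
            in_sltsta = False  # next register block has started
        if in_sltsta and ":" in stripped:
            group, _, rest = stripped.partition(":")
            result[group.strip()] = {
                name: sign == "+" for name, sign in FLAG_RE.findall(rest)
            }
    return result
```

Running "lspci -vvs07:08.0" at each handle position and diffing the three parsed dictionaries would show whether any flag other than PresDet changes before the card loses its link.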
* Re: Hot add a PCIe device driver upon hotplug event 2015-01-22 22:20 ` Bjorn Helgaas @ 2015-01-23 11:35 ` Paulo Fortuna Carvalho 2015-01-23 14:36 ` Bjorn Helgaas 0 siblings, 1 reply; 16+ messages in thread From: Paulo Fortuna Carvalho @ 2015-01-23 11:35 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: linux-pci, Greg Kroah-Hartman 2015-01-22 22:20 GMT, Bjorn Helgaas: > [+cc Greg] > It's better if you put your responses inline, e.g., > http://en.wikipedia.org/wiki/Posting_style#Interleaved_style . That's the > convention on Linux lists because it makes it easier for new participants > to enter the discussion. Ok. > Also, some of your previous messages didn't make it to the mailing list, > probably because they exceeded 100K; see > http://vger.kernel.org/majordomo-info.html . Ok, although the response I get from the server is something about my email containing html format characters... > I extracted the dmesg information below from your "dmesg_output2.txt" > attachment from Jan 13. > On Thu, Jan 22, 2015 at 08:07:14PM +0000, Paulo Fortuna Carvalho wrote: >> Hi, >> Yes but i want that udev .rules file launches the script on "remove" >> ACTION. >> The event that triggers the removal procedure is an ATCA card handle >> switch. the corredponding hot-add event is caught by pciehp and sent to >> udev that runs a rule which calls my script on "remove" ACTION. > Let me summarize what I think you're trying to do. Please correct me if I > don't understand correctly. > You're interested in device 0c:00.0: [1556:6014], which is here in the > topology: > pci 0000:07:08.0: PCI bridge to [bus 0c] > pci 0000:0c:00.0: [1556:6014] type 0 class 0x001100 Yes. > You're looking for a way to run a script when a user expresses her intent > to remove a card, but before the card is actually removed. > Linux currently emits a KOBJ_REMOVE uevent when it tears down the internal > device data structure. 
> You can use udev to do things when that occurs, but that's too late
> for what you want to do because the device is already unusable.

Yes.

> 07:08.0, the Downstream Port leading to 0c:00.0, contains a hotplug
> controller with several indicators and sensors:
>
>   pciehp 0000:07:08.0:pcie24: Physical Slot Number : 8
>   pciehp 0000:07:08.0:pcie24: Attention Button     : yes
>   pciehp 0000:07:08.0:pcie24: Power Controller     : yes
>   pciehp 0000:07:08.0:pcie24: MRL Sensor           : yes
>   pciehp 0000:07:08.0:pcie24: Attention Indicator  : yes
>   pciehp 0000:07:08.0:pcie24: Power Indicator      : yes
>   pciehp 0000:07:08.0:pcie24: Hot-Plug Surprise    : no
>   pciehp 0000:07:08.0:pcie24: EMI Present          : no
>   pciehp 0000:07:08.0:pcie24: Command Completed    : yes

Yes, although the configuration is now a little different (we don't have
some of the system components, such as the Attention Button):

==============================================
pciehp 0000:07:08.0:pcie24: Physical Slot Number : 8
pciehp 0000:07:08.0:pcie24: Attention Button     : no
pciehp 0000:07:08.0:pcie24: Power Controller     : no
pciehp 0000:07:08.0:pcie24: MRL Sensor           : no
pciehp 0000:07:08.0:pcie24: Attention Indicator  : no
pciehp 0000:07:08.0:pcie24: Power Indicator      : no
pciehp 0000:07:08.0:pcie24: Hot-Plug Surprise    : yes
pciehp 0000:07:08.0:pcie24: EMI Present          : no
pciehp 0000:07:08.0:pcie24: Command Completed    : yes
==============================================

> It might be possible for Linux to emit other uevents, for example, when the
> Attention Button is pressed. I don't know how such a uevent would fit into
> the udev framework.

Yes. We don't have the Attention Button physically installed, but maybe
we can use another way to trigger the interrupt.

> You're using an ATCA card handle to physically remove the device, and you
> mention a card handle switch that triggers the removal. Is the card handle
> something like this:
> http://www.cbttechnology.com/resources/pdf/Datasheets/Sep2014/ATCA-Handles.pdf?

Yes, it's a handle similar to the one in the .pdf file you sent.

> Is the microswitch mentioned there wired up as the Attention Button? If
> not, do you know how the state of the ATCA card handle switch is
> communicated to the OS?

> In your dmesg log, I see this:
>
>   pciehp 0000:07:08.0:pcie24: pcie_isr: intr_loc 8
>   pciehp 0000:07:08.0:pcie24: Presence/Notify input change
>   pciehp 0000:07:08.0:pcie24: Card not present on Slot(8)
>   pciehp 0000:07:08.0:pcie24: pcie_isr: intr_loc 8
>   pciehp 0000:07:08.0:pcie24: Presence/Notify input change
>   pciehp 0000:07:08.0:pcie24: Card present on Slot(8)
>
> That means we only saw Presence Detect Changed interrupts: one for card
> removal and another for card insertion. We didn't see an Attention Button
> interrupt at all.

Yes, that's it. The Presence Detect Change signal is what triggers the
uevent. We don't have an attention button in our ATCA system, so we don't
use it.

> You can look at the Slot Status directly with "lspci -vvs07:08.0". If you
> do that while removing the device, e.g., run it while the handle is in
> position #1, again in position #2, and again in position #3, you should see
> whether there's any signal that could potentially be used to do what you
> need.

Yes. I will try to see if the handle switch can trigger a uevent before
the device removal procedure occurs. I will let you know the result.

^ permalink raw reply [flat|nested] 16+ messages in thread
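The two capability dumps in this exchange (Bjorn's from dmesg_output2.txt and Paulo's newer one) can be compared mechanically. The Python sketch below copies both dumps from the thread and assumes each pciehp capability line ends in "yes" or "no". parse_caps and diff_caps are illustrative names, not part of any tool.

```python
import re

CAP_RE = re.compile(
    r"pciehp \S+:\s+(?P<name>[A-Za-z][A-Za-z -]*?)\s*:\s*(?P<value>yes|no)\s*$"
)

OLD = """\
pciehp 0000:07:08.0:pcie24: Attention Button : yes
pciehp 0000:07:08.0:pcie24: Power Controller : yes
pciehp 0000:07:08.0:pcie24: MRL Sensor : yes
pciehp 0000:07:08.0:pcie24: Attention Indicator : yes
pciehp 0000:07:08.0:pcie24: Power Indicator : yes
pciehp 0000:07:08.0:pcie24: Hot-Plug Surprise : no
pciehp 0000:07:08.0:pcie24: EMI Present : no
pciehp 0000:07:08.0:pcie24: Command Completed : yes
"""

NEW = """\
pciehp 0000:07:08.0:pcie24: Attention Button : no
pciehp 0000:07:08.0:pcie24: Power Controller : no
pciehp 0000:07:08.0:pcie24: MRL Sensor : no
pciehp 0000:07:08.0:pcie24: Attention Indicator : no
pciehp 0000:07:08.0:pcie24: Power Indicator : no
pciehp 0000:07:08.0:pcie24: Hot-Plug Surprise : yes
pciehp 0000:07:08.0:pcie24: EMI Present : no
pciehp 0000:07:08.0:pcie24: Command Completed : yes
"""


def parse_caps(dmesg_text: str) -> dict:
    """Map each yes/no capability name to True/False."""
    caps = {}
    for line in dmesg_text.splitlines():
        m = CAP_RE.search(line)
        if m:
            caps[m.group("name")] = m.group("value") == "yes"
    return caps


def diff_caps(old: dict, new: dict) -> dict:
    """Return {name: (old_value, new_value)} for capabilities that changed."""
    return {k: (old[k], new[k]) for k in old if k in new and old[k] != new[k]}
```

diff_caps(parse_caps(OLD), parse_caps(NEW)) reports six changed entries: every indicator, sensor and button went from yes to no, and Hot-Plug Surprise went from no to yes. That leaves Presence Detect as the slot's remaining input.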
* Re: Hot add a PCIe device driver upon hotplug event 2015-01-23 11:35 ` Paulo Fortuna Carvalho @ 2015-01-23 14:36 ` Bjorn Helgaas 2015-01-27 16:31 ` Paulo Fortuna Carvalho [not found] ` <54C5EF11.8070604@desy.de> 0 siblings, 2 replies; 16+ messages in thread From: Bjorn Helgaas @ 2015-01-23 14:36 UTC (permalink / raw) To: Paulo Fortuna Carvalho; +Cc: linux-pci@vger.kernel.org, Greg Kroah-Hartman On Fri, Jan 23, 2015 at 5:35 AM, Paulo Fortuna Carvalho <pricardofc@gmail.com> wrote: > 2015-01-22 22:20 GMT, Bjorn Helgaas: >> In your dmesg log, I see this: > >> pciehp 0000:07:08.0:pcie24: pcie_isr: intr_loc 8 >> pciehp 0000:07:08.0:pcie24: Presence/Notify input change >> pciehp 0000:07:08.0:pcie24: Card not present on Slot(8) >> pciehp 0000:07:08.0:pcie24: pcie_isr: intr_loc 8 >> pciehp 0000:07:08.0:pcie24: Presence/Notify input change >> pciehp 0000:07:08.0:pcie24: Card present on Slot(8) > >> That means we only saw Presence Detect Changed interrupts: one for card >> removal and another for card insertion. We didn't see an Attention Button >> interrupt at all. > > Yes, thats it. The Presence Detect Change signal is what triggers the > uevent. We dont have an attention button in our ATCA system so we dont > use it. If the only signal you have is Presence Detect, I think you're out of luck, because if Presence Detect State is "false" (see PCIe spec r3.0, sec 7.8.11), the card is already gone and it's too late to do anything with it. If that's the case, you'd have to look for a software solution, e.g., run a script when you decide to remove the card, before you physically touch the card. >> You can look at the Slot Status directly with "lspci -vvs07:08.0". If you >> do that while removing the device, e.g., run it while the handle is in >> position #1, again in position #2, and again in position #3, you should see >> whether there's any signal that could potentially be used to do what you >> need. > > Yes. 
> I will try to see if the handle switch can trigger a uevent
> before the remove device procedure from the system occurs. I will let
> you know the result.

Sec 6.7.3 lists the events pciehp has to work with:

  - Slot events:
    - Attention Button
    - Power Fault Detected
    - MRL Sensor Changed
    - Presence Detect Changed
  - Command Completed Events (this is internal to the hotplug controller)
  - Data Link Layer State Changed Events

These are really the only inputs to the pciehp driver. You apparently
don't have an Attention Button. You do have Presence Detect and
possibly others, but I don't think they normally (other than an
Attention Button) will give you any warning before the card is
removed.

The only possibility I see is the Power Fault handling (see sec
6.7.1.8). Depending on the form factor, there is the possibility of
independent main and auxiliary power faults. An auxiliary power fault
can be detected and reported without affecting main power. And it
says "For example, one form factor may remove auxiliary power when the
MRL for the slot is opened." If your form factor does that, we might
get a Power Fault when the latch is opened, and the card would still
have main power and software should still be able to operate it.

I don't know how we would distinguish such an auxiliary power fault
from a main power fault. Maybe a form factor spec would talk about
that. Do you have any pointers to something like that? If we could
figure that out, it might be possible to emit a uevent for that case.

Bjorn

^ permalink raw reply [flat|nested] 16+ messages in thread
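The event list above corresponds to individual bits of the PCIe Slot Status register. As a rough illustration, the Python sketch below decodes an event mask using the bit positions the Linux headers define as PCI_EXP_SLTSTA_ABP, _PFD, _MRLSC, _PDC, _CC and _DLLSC. It assumes the "intr_loc" value in the pciehp debug lines is that mask printed in hexadecimal. decode_events is a name made up for this example.

```python
# Event bits of the PCIe Slot Status register (mask, description).
SLOT_STATUS_EVENTS = [
    (0x0001, "Attention Button Pressed"),
    (0x0002, "Power Fault Detected"),
    (0x0004, "MRL Sensor Changed"),
    (0x0008, "Presence Detect Changed"),
    (0x0010, "Command Completed"),
    (0x0100, "Data Link Layer State Changed"),
]


def decode_events(value: int) -> list:
    """Return the names of all event bits set in a Slot Status value."""
    return [name for mask, name in SLOT_STATUS_EVENTS if value & mask]
```

Under that assumption, decode_events(0x8) yields only "Presence Detect Changed", which agrees with the "Presence/Notify input change" lines in the log. An auxiliary power fault of the kind discussed above would show up as bit 0x0002 instead.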
* Re: Hot add a PCIe device driver upon hotplug event 2015-01-23 14:36 ` Bjorn Helgaas @ 2015-01-27 16:31 ` Paulo Fortuna Carvalho 2015-01-27 16:43 ` Bjorn Helgaas 2015-01-27 16:47 ` Greg Kroah-Hartman [not found] ` <54C5EF11.8070604@desy.de> 1 sibling, 2 replies; 16+ messages in thread From: Paulo Fortuna Carvalho @ 2015-01-27 16:31 UTC (permalink / raw) To: Bjorn Helgaas; +Cc: linux-pci@vger.kernel.org, Greg Kroah-Hartman Hello Bjorn, Is it possible to cancel somehow the remove procedure if the device is in use? When we are using the device and remove occurs the kernel crashes. Regards, Paulo. 2015-01-23 14:36 GMT, Bjorn Helgaas <bhelgaas@google.com>: > On Fri, Jan 23, 2015 at 5:35 AM, Paulo Fortuna Carvalho > <pricardofc@gmail.com> wrote: >> 2015-01-22 22:20 GMT, Bjorn Helgaas: > >>> In your dmesg log, I see this: >> >>> pciehp 0000:07:08.0:pcie24: pcie_isr: intr_loc 8 >>> pciehp 0000:07:08.0:pcie24: Presence/Notify input change >>> pciehp 0000:07:08.0:pcie24: Card not present on Slot(8) >>> pciehp 0000:07:08.0:pcie24: pcie_isr: intr_loc 8 >>> pciehp 0000:07:08.0:pcie24: Presence/Notify input change >>> pciehp 0000:07:08.0:pcie24: Card present on Slot(8) >> >>> That means we only saw Presence Detect Changed interrupts: one for card >>> removal and another for card insertion. We didn't see an Attention >>> Button >>> interrupt at all. >> >> Yes, thats it. The Presence Detect Change signal is what triggers the >> uevent. We dont have an attention button in our ATCA system so we dont >> use it. > > If the only signal you have is Presence Detect, I think you're out of > luck, because if Presence Detect State is "false" (see PCIe spec r3.0, > sec 7.8.11), the card is already gone and it's too late to do anything > with it. If that's the case, you'd have to look for a software > solution, e.g., run a script when you decide to remove the card, > before you physically touch the card. > >>> You can look at the Slot Status directly with "lspci -vvs07:08.0". 
If >>> you >>> do that while removing the device, e.g., run it while the handle is in >>> position #1, again in position #2, and again in position #3, you should >>> see >>> whether there's any signal that could potentially be used to do what you >>> need. >> >> Yes. I will try to see if the handle switch can trigger an uvent >> before the remove device procedure from the system occurs. I will let >> you know the result. > > Sec 6.7.3 lists the events pciehp has to work with: > > - Slot events: > - Attention Button > - Power Fault Detected > - MRL Sensor Changed > - Presence Detect Changed > - Command Completed Events (this is internal to the hotplug controller) > - Data Link Layer State Changed Events > > These are really the only inputs to the pciehp driver. You apparently > don't have an Attention Button. You do have Presence Detect and > possibly others, but I don't think they normally (other than an > Attention Button) will give you any warning before the card is > removed. > > The only possibility I see is the Power Fault handling (see sec > 6.7.1.8). Depending on the form factor, there is the possibility of > independent main and auxiliary power faults. An auxiliary power fault > can be detected and reported without affecting main power. And it > says "For example, one form factor may remove auxiliary power when the > MRL for the slot is opened." If your form factor does that, we might > get a Power Fault when the latch is opened, and the card would still > have main power and software should still be able to operate it. > > I don't know how we would distinguish such an auxiliary power fault > from a main power fault. Maybe a form factor spec would talk about > that. Do you have any pointers to something like that? If we could > figure that out, it might be possible to emit a uevent for that case. > > Bjorn > ^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Hot add a PCIe device driver upon hotplug event 2015-01-27 16:31 ` Paulo Fortuna Carvalho @ 2015-01-27 16:43 ` Bjorn Helgaas 2015-01-27 16:47 ` Greg Kroah-Hartman 1 sibling, 0 replies; 16+ messages in thread From: Bjorn Helgaas @ 2015-01-27 16:43 UTC (permalink / raw) To: Paulo Fortuna Carvalho; +Cc: linux-pci@vger.kernel.org, Greg Kroah-Hartman

On Tue, Jan 27, 2015 at 10:31 AM, Paulo Fortuna Carvalho <pricardofc@gmail.com> wrote:
> Hello Bjorn,
>
> Is it possible to cancel somehow the remove procedure if the device is in use?
> When we are using the device and remove occurs the kernel crashes.

As far as I can tell, the only signal the kernel gets from your
platform is the Presence Detect Changed interrupt that tells us the
device is already gone. So there's nothing the kernel can do to
prevent the removal.

But the kernel *should* detach the driver from the now-missing device.
If the kernel crashes in this case, it's probably because the driver
doesn't handle the removal gracefully, and the driver can probably be
improved to handle that better.

If you post details about the crash, we should be able to figure out
if there's a PCI core issue or a driver issue.

> 2015-01-23 14:36 GMT, Bjorn Helgaas <bhelgaas@google.com>:
>> On Fri, Jan 23, 2015 at 5:35 AM, Paulo Fortuna Carvalho
>> <pricardofc@gmail.com> wrote:
>>> 2015-01-22 22:20 GMT, Bjorn Helgaas:
>>
>>>> In your dmesg log, I see this:
>>>
>>>>   pciehp 0000:07:08.0:pcie24: pcie_isr: intr_loc 8
>>>>   pciehp 0000:07:08.0:pcie24: Presence/Notify input change
>>>>   pciehp 0000:07:08.0:pcie24: Card not present on Slot(8)
>>>>   pciehp 0000:07:08.0:pcie24: pcie_isr: intr_loc 8
>>>>   pciehp 0000:07:08.0:pcie24: Presence/Notify input change
>>>>   pciehp 0000:07:08.0:pcie24: Card present on Slot(8)
>>>
>>>> That means we only saw Presence Detect Changed interrupts: one for card
>>>> removal and another for card insertion. We didn't see an Attention
>>>> Button interrupt at all.
>>>
>>> Yes, that's it. The Presence Detect Change signal is what triggers the
>>> uevent. We don't have an attention button in our ATCA system so we don't
>>> use it.
>>
>> If the only signal you have is Presence Detect, I think you're out of
>> luck, because if Presence Detect State is "false" (see PCIe spec r3.0,
>> sec 7.8.11), the card is already gone and it's too late to do anything
>> with it. If that's the case, you'd have to look for a software
>> solution, e.g., run a script when you decide to remove the card,
>> before you physically touch the card.
>>
>>>> You can look at the Slot Status directly with "lspci -vvs07:08.0". If
>>>> you do that while removing the device, e.g., run it while the handle
>>>> is in position #1, again in position #2, and again in position #3, you
>>>> should see whether there's any signal that could potentially be used
>>>> to do what you need.
>>>
>>> Yes. I will try to see if the handle switch can trigger a uevent
>>> before the remove device procedure from the system occurs. I will let
>>> you know the result.
>>
>> Sec 6.7.3 lists the events pciehp has to work with:
>>
>>   - Slot events:
>>     - Attention Button
>>     - Power Fault Detected
>>     - MRL Sensor Changed
>>     - Presence Detect Changed
>>   - Command Completed Events (this is internal to the hotplug controller)
>>   - Data Link Layer State Changed Events
>>
>> These are really the only inputs to the pciehp driver. You apparently
>> don't have an Attention Button. You do have Presence Detect and
>> possibly others, but I don't think they normally (other than an
>> Attention Button) will give you any warning before the card is
>> removed.
>>
>> The only possibility I see is the Power Fault handling (see sec
>> 6.7.1.8). Depending on the form factor, there is the possibility of
>> independent main and auxiliary power faults. An auxiliary power fault
>> can be detected and reported without affecting main power. And it
>> says "For example, one form factor may remove auxiliary power when the
>> MRL for the slot is opened." If your form factor does that, we might
>> get a Power Fault when the latch is opened, and the card would still
>> have main power and software should still be able to operate it.
>>
>> I don't know how we would distinguish such an auxiliary power fault
>> from a main power fault. Maybe a form factor spec would talk about
>> that. Do you have any pointers to something like that? If we could
>> figure that out, it might be possible to emit a uevent for that case.
>>
>> Bjorn
>>
^ permalink raw reply [flat|nested] 16+ messages in thread
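As an illustration of the event list quoted above, the following is a minimal sketch of how a Slot Status value maps onto those events. It assumes the bit positions of the PCIe Slot Status register (bit 0 Attention Button Pressed through bit 8 Data Link Layer State Changed) and assumes that the "intr_loc" value in the dmesg log is printed in hexadecimal; the function name decode_slot_events is hypothetical and is not part of pciehp.

```python
# Sketch only: decode the event bits of a PCIe Slot Status value.
# Bit positions are assumed from the PCIe Slot Status register layout;
# decode_slot_events is a hypothetical helper, not a kernel or lspci API.
SLOT_STATUS_EVENT_BITS = {
    0: "Attention Button Pressed",
    1: "Power Fault Detected",
    2: "MRL Sensor Changed",
    3: "Presence Detect Changed",
    4: "Command Completed",
    8: "Data Link Layer State Changed",
}


def decode_slot_events(value):
    """Return the names of the event bits set in a Slot Status value."""
    return [
        name
        for bit, name in sorted(SLOT_STATUS_EVENT_BITS.items())
        if value & (1 << bit)
    ]


# "intr_loc 8" from the log, read as hexadecimal 0x8 (an assumption).
print(decode_slot_events(0x8))  # ['Presence Detect Changed']
```

Under those assumptions, the logged value has only the Presence Detect Changed bit set, which matches the "Presence/Notify input change" messages that follow it in the log.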
* Re: Hot add a PCIe device driver upon hotplug event 2015-01-27 16:31 ` Paulo Fortuna Carvalho 2015-01-27 16:43 ` Bjorn Helgaas @ 2015-01-27 16:47 ` Greg Kroah-Hartman 2015-01-27 17:10 ` Paulo Fortuna Carvalho [not found] ` <CAH9N0t-M2QfW84Jd3fRaLr+NdFvZEkjXzrXsgncoLhtYv-xQ3g@mail.gmail.com> 1 sibling, 2 replies; 16+ messages in thread From: Greg Kroah-Hartman @ 2015-01-27 16:47 UTC (permalink / raw) To: Paulo Fortuna Carvalho; +Cc: Bjorn Helgaas, linux-pci@vger.kernel.org

On Tue, Jan 27, 2015 at 04:31:04PM +0000, Paulo Fortuna Carvalho wrote:
> Hello Bjorn,
>
> Is it possible to cancel somehow the remove procedure if the device is in use?

Nope, you could have physically removed the device, so we can't cancel
that :)

> When we are using the device and remove occurs the kernel crashes.

Then we need to fix the kernel, what driver is crashing, and where?

thanks,

greg k-h

^ permalink raw reply [flat|nested] 16+ messages in thread
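Since a physical removal cannot be cancelled, the software route suggested earlier in the thread (run a script before physically touching the card) is one way to get an orderly shutdown. The following is a minimal sketch of such a script, assuming the standard sysfs "remove" attribute under /sys/bus/pci/devices/ and root privileges. The function name request_pci_remove and the address 0000:08:00.0 are hypothetical; the real address of the card has to be looked up, e.g., with lspci.

```python
import os


def request_pci_remove(bdf, sysfs_root="/sys"):
    """Ask the kernel to detach the driver and remove one PCI device.

    bdf is the device address, e.g. "0000:08:00.0" (hypothetical here).
    sysfs_root is a parameter only so the function can be exercised
    against a scratch directory instead of the real /sys.
    """
    path = os.path.join(sysfs_root, "bus", "pci", "devices", bdf, "remove")
    with open(path, "w") as attr:
        attr.write("1")
    return path


# Intended use, as root, before pulling the card (address is an assumption):
#   request_pci_remove("0000:08:00.0")
```

Writing to this attribute gives the driver's remove path a chance to run while the hardware is still present, which is the situation a surprise removal cannot provide.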
* Re: Hot add a PCIe device driver upon hotplug event 2015-01-27 16:47 ` Greg Kroah-Hartman @ 2015-01-27 17:10 ` Paulo Fortuna Carvalho [not found] ` <CAH9N0t-M2QfW84Jd3fRaLr+NdFvZEkjXzrXsgncoLhtYv-xQ3g@mail.gmail.com> 1 sibling, 0 replies; 16+ messages in thread From: Paulo Fortuna Carvalho @ 2015-01-27 17:10 UTC (permalink / raw) To: Greg Kroah-Hartman; +Cc: Bjorn Helgaas, linux-pci@vger.kernel.org

>Is it possible to cancel somehow the remove procedure if the device is in use?

>Nope, you could have physically removed the device, so we can't cancel that :)

ok.

>When we are using the device and remove occurs the kernel crashes.

>Then we need to fix the kernel, what driver is crashing, and where?

The driver of the ATCA acquisition card crashes in a read data
function through DMA accesses.

^ permalink raw reply [flat|nested] 16+ messages in thread
[parent not found: <CAH9N0t-M2QfW84Jd3fRaLr+NdFvZEkjXzrXsgncoLhtYv-xQ3g@mail.gmail.com>]
[parent not found: <CAH9N0t-_JbwtZeU+Kyds8=NX=CBT-+_ecKiUWPKr1zXiLKm0vQ@mail.gmail.com>]
* Re: Hot add a PCIe device driver upon hotplug event [not found] ` <CAH9N0t-_JbwtZeU+Kyds8=NX=CBT-+_ecKiUWPKr1zXiLKm0vQ@mail.gmail.com> @ 2015-01-27 17:11 ` Greg Kroah-Hartman 0 siblings, 0 replies; 16+ messages in thread From: Greg Kroah-Hartman @ 2015-01-27 17:11 UTC (permalink / raw) To: Paulo Fortuna Carvalho; +Cc: Bjorn Helgaas, linux-pci@vger.kernel.org

On Tue, Jan 27, 2015 at 05:06:33PM +0000, Paulo Fortuna Carvalho wrote:
> >Is it possible to cancel somehow the remove procedure if the device is
> >in use?
>
> >Nope, you could have physically removed the device, so we can't cancel
> >that :)
>
> ok.
>
> >When we are using the device and remove occurs the kernel crashes.
> >Then we need to fix the kernel, what driver is crashing, and where?
>
> The driver of the ATCA acquisition card crashes in a read data function
> through DMA accesses.

Do you have an oops callback? All read functions need to be able to
handle a 0xff read when the device is removed and react properly.

thanks,

greg k-h

^ permalink raw reply [flat|nested] 16+ messages in thread
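The "0xff read" point can be illustrated with a small model. Reads from a PCIe device that is no longer present typically come back as all ones, so a read loop should check for that pattern and stop instead of processing it as data. The sketch below is a user-space Python model of that check only, not the acquisition card's driver. The names looks_removed and read_samples are hypothetical, and since all ones can also be legitimate data, a real driver would normally confirm the removal by other means before giving up.

```python
def looks_removed(value, width_bits=32):
    """Return True if a register read came back as all ones."""
    all_ones = (1 << width_bits) - 1
    return (value & all_ones) == all_ones


def read_samples(read_reg, count):
    """Read up to `count` values through read_reg().

    Stops at the first all-ones value and reports the device as gone.
    Returns (samples, device_present).
    """
    samples = []
    for _ in range(count):
        value = read_reg()
        if looks_removed(value):
            return samples, False
        samples.append(value)
    return samples, True


# Simulated device that disappears after two reads.
fake_reads = iter([0x0001, 0x0002, 0xFFFFFFFF])
samples, present = read_samples(lambda: next(fake_reads), 3)
print(samples, present)  # [1, 2] False
```

The design choice is that the caller receives the data read so far plus an explicit "device gone" flag, so it can return an error to its own caller rather than continue touching hardware that is no longer there.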
[parent not found: <54C5EF11.8070604@desy.de>]
* Re: Hot add a PCIe device driver upon hotplug event [not found] ` <54C5EF11.8070604@desy.de> @ 2015-01-27 16:50 ` Bjorn Helgaas 2015-01-28 7:51 ` Ludwig Petrosyan 0 siblings, 1 reply; 16+ messages in thread From: Bjorn Helgaas @ 2015-01-27 16:50 UTC (permalink / raw) To: Ludwig Petrosyan Cc: Paulo Fortuna Carvalho, linux-pci@vger.kernel.org, Greg Kroah-Hartman

Hi Ludwig,

Thanks a lot for the pointers to MTCA information. I found
presentation slides here:
https://indico.desy.de/getFile.py/access?contribId=33&sessionId=0&resId=1&materialId=slides&confId=10329,
which I assume is what you're referring to.

On Mon, Jan 26, 2015 at 1:38 AM, Ludwig Petrosyan
<ludwig.petrosyan@desy.de> wrote:
>
> Hello,
>
> (I sent this email two days ago but it came back as undeliverable, so I
> am sending it again.)
>
> We had the same problem on the MTCA system years ago, but now the problem
> is solved.
> The MTCA system is a kind of ATCA, and I think the Attention Button
> problem could be solved in the same way.
>
> Here is what we have done. The main difference in an ATCA or MTCA PCIe
> system is that the PCIe Switch responsible for hotplug (I mean the Switch
> that is connected to the crate slots, whose hotplug controllers are used
> for hotplugging) and the Attention Button of the PCIe slot reside on
> different boards (usually the PCIe Switch is on the Crate Manager board
> and the Attention Button is the AMC module latch), so there is no wired
> connection between the Attention Button and the PCIe Switch.
>
> When the user pulls out the AMC module latch, the PCIe Switch has no idea
> about that, BUT the AMC module controller (MMC) sends an IPMI message to
> the Crate Management Controller (MCMC), and the MCMC starts the AMC
> power-down procedure. The idea was: usually the MCMC and the PCIe Switch
> reside on the same board, so when the MCMC gets the IPMI message about the
> state change of the AMC latch (latch pulled out or pressed), it sets the
> appropriate registers of the PCIe Switch or drives the appropriate lines
> of the PCIe Switch high. The PCIe Switch then gets the information that
> the Attention Button is pressed and sends a hotplug interrupt to the
> hotplug driver.
>
> This approach works in our MTCA systems and we have fully working PCIe
> hotplug.

It sounds like you basically have some IPMI glue between the AMC
latch and the PCIe Attention Button, and from the point of view of the
pciehp (PCIe native hotplug) driver, it just sees a normal Attention
Button. That sounds like a reasonable thing to do.

In Paulo's case, it sounds like there is some sort of switch related
to the card, but the IPMI or similar glue that could potentially lead
to the Attention Button line doesn't exist. In that case, pciehp
doesn't know anything about the switch. If there's some other way,
e.g., IPMI, to learn about the switch, maybe that could be done via a
separate driver.

> More detailed information can be found in the documents of the MTCA
> Workshop (http://mtcaws.desy.de/); the tutorial section had a presentation
> about PCIe hotplug. Or look in the PICMG recommendations (there is a new
> recommendation about PCIe hotplug for MTCA.4).
>
> With best regards
>
> Ludwig Petrosyan
>
>

^ permalink raw reply [flat|nested] 16+ messages in thread
* Re: Hot add a PCIe device driver upon hotplug event 2015-01-27 16:50 ` Bjorn Helgaas @ 2015-01-28 7:51 ` Ludwig Petrosyan 0 siblings, 0 replies; 16+ messages in thread From: Ludwig Petrosyan @ 2015-01-28 7:51 UTC (permalink / raw) To: Bjorn Helgaas Cc: Paulo Fortuna Carvalho, linux-pci@vger.kernel.org, Greg Kroah-Hartman

Hi Bjorn,

Yes, that is what I was referring to. A more detailed version was
presented at the previous workshop:
https://indico.desy.de/getFile.py/access?contribId=5&sessionId=1&resId=0&materialId=slides&confId=7866
and there was also a presentation about the driver part:
https://indico.desy.de/getFile.py/access?contribId=32&sessionId=9&resId=0&materialId=slides&confId=10329

I hope this will help, and maybe I will get some feedback to improve
both and fix bugs.

Cheers,
Ludwig

On 01/27/2015 05:50 PM, Bjorn Helgaas wrote:
> Hi Ludwig,
>
> Thanks a lot for the pointers to MTCA information. I found
> presentation slides here:
> https://indico.desy.de/getFile.py/access?contribId=33&sessionId=0&resId=1&materialId=slides&confId=10329,
> which I assume is what you're referring to.
>
> On Mon, Jan 26, 2015 at 1:38 AM, Ludwig Petrosyan
> <ludwig.petrosyan@desy.de> wrote:
>> Hello,
>>
>> (I sent this email two days ago but it came back as undeliverable, so I
>> am sending it again.)
>>
>> We had the same problem on the MTCA system years ago, but now the problem
>> is solved.
>> The MTCA system is a kind of ATCA, and I think the Attention Button
>> problem could be solved in the same way.
>>
>> Here is what we have done. The main difference in an ATCA or MTCA PCIe
>> system is that the PCIe Switch responsible for hotplug (I mean the Switch
>> that is connected to the crate slots, whose hotplug controllers are used
>> for hotplugging) and the Attention Button of the PCIe slot reside on
>> different boards (usually the PCIe Switch is on the Crate Manager board
>> and the Attention Button is the AMC module latch), so there is no wired
>> connection between the Attention Button and the PCIe Switch.
>>
>> When the user pulls out the AMC module latch, the PCIe Switch has no idea
>> about that, BUT the AMC module controller (MMC) sends an IPMI message to
>> the Crate Management Controller (MCMC), and the MCMC starts the AMC
>> power-down procedure. The idea was: usually the MCMC and the PCIe Switch
>> reside on the same board, so when the MCMC gets the IPMI message about the
>> state change of the AMC latch (latch pulled out or pressed), it sets the
>> appropriate registers of the PCIe Switch or drives the appropriate lines
>> of the PCIe Switch high. The PCIe Switch then gets the information that
>> the Attention Button is pressed and sends a hotplug interrupt to the
>> hotplug driver.
>>
>> This approach works in our MTCA systems and we have fully working PCIe
>> hotplug.
> It sounds like you basically have some IPMI glue between the AMC
> latch and the PCIe Attention Button, and from the point of view of the
> pciehp (PCIe native hotplug) driver, it just sees a normal Attention
> Button. That sounds like a reasonable thing to do.
>
> In Paulo's case, it sounds like there is some sort of switch related
> to the card, but the IPMI or similar glue that could potentially lead
> to the Attention Button line doesn't exist. In that case, pciehp
> doesn't know anything about the switch. If there's some other way,
> e.g., IPMI, to learn about the switch, maybe that could be done via a
> separate driver.
>
>> More detailed information can be found in the documents of the MTCA
>> Workshop (http://mtcaws.desy.de/); the tutorial section had a presentation
>> about PCIe hotplug. Or look in the PICMG recommendations (there is a new
>> recommendation about PCIe hotplug for MTCA.4).
>>
>> With best regards
>>
>> Ludwig Petrosyan
>>
>>

^ permalink raw reply [flat|nested] 16+ messages in thread