* Re: bug in pci_try_reset_bus [not found] ` <2f0fb394-f1c9-d782-ffec-dac04b1ac0a9@intel.com> @ 2018-08-27 19:08 ` Sinan Kaya 2018-08-27 19:52 ` Dennis Dalessandro 0 siblings, 1 reply; 5+ messages in thread From: Sinan Kaya @ 2018-08-27 19:08 UTC (permalink / raw) To: Dennis Dalessandro, bhelgaas, linux-pci On 8/27/2018 2:59 PM, Dennis Dalessandro wrote: > On 8/27/2018 2:21 PM, Sinan Kaya wrote: >> [+linux-pci] > > I'm not seeing a CC? sorry, I fat fingered it. fixing now. > >> On 8/27/2018 2:12 PM, Dennis Dalessandro wrote: >>> On 8/27/2018 1:27 PM, Sinan Kaya wrote: >>>> On 8/27/2018 1:21 PM, Dennis Dalessandro wrote: >>>>> + */ >>>>> +int pci_try_reset_bus(struct pci_dev *pdev) >>>>> +{ >>>>> + return pci_probe_reset_slot(pdev->slot) ? >>>>> + __pci_try_reset_slot(pdev->slot) : __pci_try_reset_bus(pdev->bus); >>>>> +} >>>> >>>> Code needs to go to __pci_try_reset_bus() when pci_probe_reset_slot() fails. >>>> >>>> It looks like we are missing a ! in front of pci_probe_reset_slot(). >>>> >>>> Can you try this instead? >>>> >>>> return !pci_probe_reset_slot(pdev->slot) ? >>>> __pci_try_reset_slot(pdev->slot) : __pci_try_reset_bus(pdev->bus); >>> >>> Right so I did that and it lands where I want it to, but the hang I get seems >>> to happen after it bails out of pci_bus_reset() because probe is set to 1. >> >> Is it returning from here? >> >> static int __pci_reset_bus(struct pci_bus *bus) >> { >> int rc; >> >> rc = pci_bus_reset(bus, 1); >> if (rc) >> return rc; >> >> ... >> } >> >> If yes, this means that your bus is not resettable or your bus pointer >> is null? >> >> Can you find out which one it is? >> >> https://elixir.bootlin.com/linux/v4.19-rc1/source/drivers/pci/pci.c#L5138 >> >> It might be also good to capture the return value of pci_bus_reset() to >> see if it returns 0 or not. > > It is returning 0 from there because pci_bus_reset is being called with probe = 1. > > The call stack from the driver is: > > pci_reset_bus(pdev) > __pci_reset_bus(pdev->bus) <-- after fixing first issue > pci_bus_reset(bus, 1) > Checks ptr and if resetable. > if (probe) > return 0 <-- here is where it returns > Probe is there to query if this particular device supports reset or not. Which is the very first thing this code does to ensure we are allowed to touch it and it is correct. If probe is returning 0, this code won't return from here. It will continue to trylock piece. rc = pci_bus_reset(bus, 1); if (rc) return rc; My guess is you are failing below with -EAGAIN because lock fails: if (pci_bus_trylock(bus)) { } else { rc = -EAGAIN; } can you please confirm? > I tried removing the whole if (probe) then return thing but still get a hang > when it calls pci_bus_lock(). > >>> What is the right way to do an SBR while our driver is being probed? The >>> reason we are doing all of this is because we need to do a bump up to gen3 >>> speed when the driver loads and the last thing we need to do is issue an SBR. >> >> I don't think you can achieve what you want without doing an SBR. Another >> possibility is retrain if your hardware supports. > > We do want to do an SBR. I'm just wondering if we are going about it the right > way. Because things worked before this was hidden from the drivers and we called > right into the bus reset. That's where the problem is. It is responsibility of the PCI core to check that you are allowed to reset. > > -Denny > ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: bug in pci_try_reset_bus 2018-08-27 19:08 ` bug in pci_try_reset_bus Sinan Kaya @ 2018-08-27 19:52 ` Dennis Dalessandro 2018-08-27 20:18 ` Sinan Kaya 0 siblings, 1 reply; 5+ messages in thread From: Dennis Dalessandro @ 2018-08-27 19:52 UTC (permalink / raw) To: Sinan Kaya, bhelgaas, linux-pci On 8/27/2018 3:08 PM, Sinan Kaya wrote: >> The call stack from the driver is: >> >> pci_reset_bus(pdev) >> __pci_reset_bus(pdev->bus) <-- after fixing first issue >> pci_bus_reset(bus, 1) >> Checks ptr and if resetable. >> if (probe) >> return 0 <-- here is where it returns >> > > Probe is there to query if this particular device supports reset or not. > Which > is the very first thing this code does to ensure we are allowed to touch it > and it is correct. > > If probe is returning 0, this code won't return from here. It will continue > to trylock piece. The probe arg is hard coded to 1 in this path. Actually both calls in pci.c it is hard coded to 1. Maybe we just need to get rid of that arg all together? > > rc = pci_bus_reset(bus, 1); > if (rc) > return rc; > > My guess is you are failing below with -EAGAIN because lock fails: > > if (pci_bus_trylock(bus)) { > > } else { > rc = -EAGAIN; > } > > can you please confirm? Ah yes, silly me. pci_bus_reset() returns 0 and it does go on but doesn't make it to the trylock, it gets hung calling pci_bus_save_and_disable(). -Denny ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: bug in pci_try_reset_bus 2018-08-27 19:52 ` Dennis Dalessandro @ 2018-08-27 20:18 ` Sinan Kaya 2018-08-27 20:48 ` Dennis Dalessandro 0 siblings, 1 reply; 5+ messages in thread From: Sinan Kaya @ 2018-08-27 20:18 UTC (permalink / raw) To: Dennis Dalessandro, bhelgaas, linux-pci On 8/27/2018 3:52 PM, Dennis Dalessandro wrote: >> >> can you please confirm? > > Ah yes, silly me. pci_bus_reset() returns 0 and it does go on but doesn't make > it to the trylock, it gets hung calling pci_bus_save_and_disable(). > OK. That makes sense now. pci_bus_save_and_disable() is also trying to obtain a device lock via pci_dev_lock(). Since you are calling this from probe time, you are getting dead lock because device is locked. Is it possible to defer this secondary bus reset operation to post probe? Possible solutions are: 1. introduce a locked reset API 2. skip lock during probe 3. bring back raw reset API even though it is undesirable. Other opinions? BTW, please file a bugzilla and capture your email details there so that we can have record of what we are doing? > -Denny ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: bug in pci_try_reset_bus 2018-08-27 20:18 ` Sinan Kaya @ 2018-08-27 20:48 ` Dennis Dalessandro 2018-08-28 13:25 ` Sinan Kaya 0 siblings, 1 reply; 5+ messages in thread From: Dennis Dalessandro @ 2018-08-27 20:48 UTC (permalink / raw) To: Sinan Kaya, bhelgaas, linux-pci On 8/27/2018 4:18 PM, Sinan Kaya wrote: > On 8/27/2018 3:52 PM, Dennis Dalessandro wrote: >>> >>> can you please confirm? >> >> Ah yes, silly me. pci_bus_reset() returns 0 and it does go on but >> doesn't make it to the trylock, it gets hung calling >> pci_bus_save_and_disable(). >> > > OK. That makes sense now. pci_bus_save_and_disable() is also trying to > obtain a device lock via pci_dev_lock(). > > Since you are calling this from probe time, you are getting dead lock > because device is locked. > > Is it possible to defer this secondary bus reset operation to post probe? I don't think so. We need to do a gen3 bump at probe. I don't know that there is any other hardware that does is so probably why it hasn't been noticed. > Possible solutions are: > 1. introduce a locked reset API > 2. skip lock during probe > 3. bring back raw reset API even though it is undesirable. I think the first option is the cleanest. I can put together a patch and post it soon. > Other opinions? > > BTW, please file a bugzilla and capture your email details there so that we > can have record of what we are doing? I can certainly do that. Thanks for your help on this! -Denny ^ permalink raw reply [flat|nested] 5+ messages in thread
* Re: bug in pci_try_reset_bus 2018-08-27 20:48 ` Dennis Dalessandro @ 2018-08-28 13:25 ` Sinan Kaya 0 siblings, 0 replies; 5+ messages in thread From: Sinan Kaya @ 2018-08-28 13:25 UTC (permalink / raw) To: Dennis Dalessandro, bhelgaas, linux-pci On 8/27/2018 4:48 PM, Dennis Dalessandro wrote: >> Possible solutions are: >> 1. introduce a locked reset API > 2. skip lock during probe >> 3. bring back raw reset API even though it is undesirable. > > I think the first option is the cleanest. I can put together a patch and > post it soon. I was thinking about this. These locked API are getting out of control. It could be a simpler solution to change the mutex type of device_lock such that it can be locked twice. I think I have seen some examples of how to do this in the NVMe driver. ^ permalink raw reply [flat|nested] 5+ messages in thread
end of thread, other threads:[~2018-08-28 17:17 UTC | newest] Thread overview: 5+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- [not found] <3d3d78fb-f1f2-ec4f-38f9-0b0d14564f53@intel.com> [not found] ` <323b3bbe-d255-8a5d-2a9c-d64e1f86cb64@codeaurora.org> [not found] ` <bd7dbd5a-eebc-9142-f9a2-3b5ff8de9b8b@intel.com> [not found] ` <d437400c-7eed-19d3-9cf2-b8a42217136b@codeaurora.org> [not found] ` <2f0fb394-f1c9-d782-ffec-dac04b1ac0a9@intel.com> 2018-08-27 19:08 ` bug in pci_try_reset_bus Sinan Kaya 2018-08-27 19:52 ` Dennis Dalessandro 2018-08-27 20:18 ` Sinan Kaya 2018-08-27 20:48 ` Dennis Dalessandro 2018-08-28 13:25 ` Sinan Kaya
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).