* PCI hotplug v.s. suspend @ 2009-06-15 10:55 Alan Jenkins 2009-06-15 13:58 ` [linux-pm] " Alan Stern 0 siblings, 1 reply; 13+ messages in thread From: Alan Jenkins @ 2009-06-15 10:55 UTC (permalink / raw) To: linux-pm; +Cc: ath5k-devel, linux-wireless@vger.kernel.org, linux-pci Hi, I triggered a WARNING while hacking on eeepc-laptop's rfkill-by-pci-hotplug. I saw "Trying to free already-free IRQ" in ath5k_pci_remove(), because the IRQ was already freed by ath5k_pci_suspend(). My changes to eeepc-laptop had allowed the PCI device to be removed while suspended. Are PCI drivers supposed to handle remove() while suspended? Thanks Alan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [linux-pm] PCI hotplug v.s. suspend 2009-06-15 10:55 PCI hotplug v.s. suspend Alan Jenkins @ 2009-06-15 13:58 ` Alan Stern 2009-06-15 14:03 ` Alan Jenkins 0 siblings, 1 reply; 13+ messages in thread From: Alan Stern @ 2009-06-15 13:58 UTC (permalink / raw) To: Alan Jenkins Cc: linux-pm, linux-pci, linux-wireless@vger.kernel.org, ath5k-devel On Mon, 15 Jun 2009, Alan Jenkins wrote: > Hi, > > I triggered a WARNING while hacking on eeepc-laptop's > rfkill-by-pci-hotplug. I saw "Trying to free already-free IRQ" in > ath5k_pci_remove(), because the IRQ was already freed by > ath5k_pci_suspend(). My changes to eeepc-laptop had allowed the PCI > device to be removed while suspended. > > Are PCI drivers supposed to handle remove() while suspended? Yes. You found a bug in the driver. However it's not clear (to me at least) whether the bug is that the IRQ is freed in the suspend method or that there's no check for already-freed in the remove method. My guess is the latter. Alan Stern ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [linux-pm] PCI hotplug v.s. suspend 2009-06-15 13:58 ` [linux-pm] " Alan Stern @ 2009-06-15 14:03 ` Alan Jenkins 2009-06-15 14:37 ` Alan Stern 0 siblings, 1 reply; 13+ messages in thread From: Alan Jenkins @ 2009-06-15 14:03 UTC (permalink / raw) To: Alan Stern Cc: linux-pm, linux-pci, linux-wireless@vger.kernel.org, ath5k-devel Alan Stern wrote: > On Mon, 15 Jun 2009, Alan Jenkins wrote: > > >> Hi, >> >> I triggered a WARNING while hacking on eeepc-laptop's >> rfkill-by-pci-hotplug. I saw "Trying to free already-free IRQ" in >> ath5k_pci_remove(), because the IRQ was already freed by >> ath5k_pci_suspend(). My changes to eeepc-laptop had allowed the PCI >> device to be removed while suspended. >> >> Are PCI drivers supposed to handle remove() while suspended? >> > > Yes. You found a bug in the driver. However it's not clear (to me at > least) whether the bug is that the IRQ is freed in the suspend method > or that there's no check for already-freed in the remove method. My > guess is the latter. > > Alan Stern > The example in Documentation/power/pci.txt shows free_irq() being called in suspend (and request_irq() in resume). So the problem is in remove(). Perhaps I will try hacking fakephp to simulate this case, and see if I can find bugs in any other drivers :-). Thanks! Alan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [linux-pm] PCI hotplug v.s. suspend 2009-06-15 14:03 ` Alan Jenkins @ 2009-06-15 14:37 ` Alan Stern 2009-06-15 19:58 ` Rafael J. Wysocki 0 siblings, 1 reply; 13+ messages in thread From: Alan Stern @ 2009-06-15 14:37 UTC (permalink / raw) To: Alan Jenkins Cc: linux-pm, linux-pci, linux-wireless@vger.kernel.org, ath5k-devel On Mon, 15 Jun 2009, Alan Jenkins wrote: > Alan Stern wrote: > > On Mon, 15 Jun 2009, Alan Jenkins wrote: > > > > > >> Hi, > >> > >> I triggered a WARNING while hacking on eeepc-laptop's > >> rfkill-by-pci-hotplug. I saw "Trying to free already-free IRQ" in > >> ath5k_pci_remove(), because the IRQ was already freed by > >> ath5k_pci_suspend(). My changes to eeepc-laptop had allowed the PCI > >> device to be removed while suspended. > >> > >> Are PCI drivers supposed to handle remove() while suspended? > >> > > > > Yes. You found a bug in the driver. However it's not clear (to me at > > least) whether the bug is that the IRQ is freed in the suspend method > > or that there's no check for already-freed in the remove method. My > > guess is the latter. > > > > Alan Stern > > > > The example in Documentation/power/pci.txt shows free_irq() being called > in suspend (and request_irq() in resume). So the problem is in remove(). I'm not sure whether the example is trustworthy. There was some discussion quite a while ago regarding whether drivers should free their IRQs during suspend, but I don't remember what the outcome was. Alan Stern ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [linux-pm] PCI hotplug v.s. suspend 2009-06-15 14:37 ` Alan Stern @ 2009-06-15 19:58 ` Rafael J. Wysocki 2009-06-15 20:34 ` Alan Stern 2009-06-16 1:01 ` [ath5k-devel] " Bob Copeland 0 siblings, 2 replies; 13+ messages in thread From: Rafael J. Wysocki @ 2009-06-15 19:58 UTC (permalink / raw) To: linux-pm Cc: Alan Stern, Alan Jenkins, linux-pci, linux-wireless@vger.kernel.org, ath5k-devel On Monday 15 June 2009, Alan Stern wrote: > On Mon, 15 Jun 2009, Alan Jenkins wrote: > > > Alan Stern wrote: > > > On Mon, 15 Jun 2009, Alan Jenkins wrote: > > > > > > > > >> Hi, > > >> > > >> I triggered a WARNING while hacking on eeepc-laptop's > > >> rfkill-by-pci-hotplug. I saw "Trying to free already-free IRQ" in > > >> ath5k_pci_remove(), because the IRQ was already freed by > > >> ath5k_pci_suspend(). My changes to eeepc-laptop had allowed the PCI > > >> device to be removed while suspended. > > >> > > >> Are PCI drivers supposed to handle remove() while suspended? > > >> > > > > > > Yes. You found a bug in the driver. However it's not clear (to me at > > > least) whether the bug is that the IRQ is freed in the suspend method > > > or that there's no check for already-freed in the remove method. My > > > guess is the latter. > > > > > > Alan Stern > > > > > > > The example in Documentation/power/pci.txt shows free_irq() being called > > in suspend (and request_irq() in resume). So the problem is in remove(). > > I'm not sure whether the example is trustworthy. There was some > discussion quite a while ago regarding whether drivers should free > their IRQs during suspend, but I don't remember what the outcome was. No, they shouldn't. That's why we do the entire suspend_device_irqs() thing etc. Best, Rafael ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [linux-pm] PCI hotplug v.s. suspend 2009-06-15 19:58 ` Rafael J. Wysocki @ 2009-06-15 20:34 ` Alan Stern 2009-06-15 20:51 ` Rafael J. Wysocki 2009-06-16 1:01 ` [ath5k-devel] " Bob Copeland 1 sibling, 1 reply; 13+ messages in thread From: Alan Stern @ 2009-06-15 20:34 UTC (permalink / raw) To: Rafael J. Wysocki Cc: linux-pm, Alan Jenkins, linux-pci, linux-wireless@vger.kernel.org, ath5k-devel On Mon, 15 Jun 2009, Rafael J. Wysocki wrote: > > I'm not sure whether the example is trustworthy. There was some > > discussion quite a while ago regarding whether drivers should free > > their IRQs during suspend, but I don't remember what the outcome was. > > No, they shouldn't. > > That's why we do the entire suspend_device_irqs() thing etc. Looks like the Documentation/power/pci.txt file needs to be updated. No surprise, it hasn't been touched in well over a year and there haven't been any meaningful updates in four years. Alan Stern ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [linux-pm] PCI hotplug v.s. suspend 2009-06-15 20:34 ` Alan Stern @ 2009-06-15 20:51 ` Rafael J. Wysocki 0 siblings, 0 replies; 13+ messages in thread From: Rafael J. Wysocki @ 2009-06-15 20:51 UTC (permalink / raw) To: Alan Stern Cc: linux-pm, Alan Jenkins, linux-pci, linux-wireless@vger.kernel.org, ath5k-devel On Monday 15 June 2009, Alan Stern wrote: > On Mon, 15 Jun 2009, Rafael J. Wysocki wrote: > > > > I'm not sure whether the example is trustworthy. There was some > > > discussion quite a while ago regarding whether drivers should free > > > their IRQs during suspend, but I don't remember what the outcome was. > > > > No, they shouldn't. > > > > That's why we do the entire suspend_device_irqs() thing etc. > > Looks like the Documentation/power/pci.txt file needs to be updated. > No surprise, it hasn't been touched in well over a year and there > haven't been any meaningful updates in four years. Sure it does, but PCI PM has been a moving target for quite some time. I hope it's going to settle down now. Best, Rafael ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [ath5k-devel] [linux-pm] PCI hotplug v.s. suspend 2009-06-15 19:58 ` Rafael J. Wysocki 2009-06-15 20:34 ` Alan Stern @ 2009-06-16 1:01 ` Bob Copeland 2009-06-18 11:51 ` Alan Jenkins 1 sibling, 1 reply; 13+ messages in thread From: Bob Copeland @ 2009-06-16 1:01 UTC (permalink / raw) To: Rafael J. Wysocki Cc: linux-pm, linux-wireless@vger.kernel.org, Alan Jenkins, ath5k-devel, Alan Stern, linux-pci On Mon, Jun 15, 2009 at 09:58:30PM +0200, Rafael J. Wysocki wrote: > On Monday 15 June 2009, Alan Stern wrote: > > I'm not sure whether the example is trustworthy. There was some > > discussion quite a while ago regarding whether drivers should free > > their IRQs during suspend, but I don't remember what the outcome was. > > No, they shouldn't. > > That's why we do the entire suspend_device_irqs() thing etc. So, ath5k needs something like the following? diff --git a/drivers/net/wireless/ath/ath5k/base.c b/drivers/net/wireless/ath/ath5k/base.c index 55f7de0..0107cd6 100644 --- a/drivers/net/wireless/ath/ath5k/base.c +++ b/drivers/net/wireless/ath/ath5k/base.c @@ -665,7 +665,6 @@ ath5k_pci_suspend(struct pci_dev *pdev, pm_message_t state) ath5k_led_off(sc); - free_irq(pdev->irq, sc); pci_save_state(pdev); pci_disable_device(pdev); pci_set_power_state(pdev, PCI_D3hot); @@ -686,18 +685,8 @@ ath5k_pci_resume(struct pci_dev *pdev) if (err) return err; - err = request_irq(pdev->irq, ath5k_intr, IRQF_SHARED, "ath", sc); - if (err) { - ATH5K_ERR(sc, "request_irq failed\n"); - goto err_no_irq; - } - ath5k_led_enable(sc); return 0; - -err_no_irq: - pci_disable_device(pdev); - return err; } #endif /* CONFIG_PM */ -- Bob Copeland %% www.bobcopeland.com ^ permalink raw reply related [flat|nested] 13+ messages in thread
* Re: [ath5k-devel] [linux-pm] PCI hotplug v.s. suspend 2009-06-16 1:01 ` [ath5k-devel] " Bob Copeland @ 2009-06-18 11:51 ` Alan Jenkins 2009-06-18 18:45 ` Alan Stern 0 siblings, 1 reply; 13+ messages in thread From: Alan Jenkins @ 2009-06-18 11:51 UTC (permalink / raw) To: Bob Copeland Cc: Rafael J. Wysocki, linux-pm, linux-wireless@vger.kernel.org, ath5k-devel, Alan Stern, linux-pci Bob Copeland wrote: > On Mon, Jun 15, 2009 at 09:58:30PM +0200, Rafael J. Wysocki wrote: > >> On Monday 15 June 2009, Alan Stern wrote: >> >>> I'm not sure whether the example is trustworthy. There was some >>> discussion quite a while ago regarding whether drivers should free >>> their IRQs during suspend, but I don't remember what the outcome was. >>> >> No, they shouldn't. >> >> That's why we do the entire suspend_device_irqs() thing etc. >> > > So, ath5k needs something like the following? > > diff --git a/drivers/net/wireless/ath/ath5k/base.c b/drivers/net/wireless/ath/ath5k/base.c > index 55f7de0..0107cd6 100644 > --- a/drivers/net/wireless/ath/ath5k/base.c > +++ b/drivers/net/wireless/ath/ath5k/base.c > @@ -665,7 +665,6 @@ ath5k_pci_suspend(struct pci_dev *pdev, pm_message_t state) > > ath5k_led_off(sc); > > - free_irq(pdev->irq, sc); > pci_save_state(pdev); > pci_disable_device(pdev); > pci_set_power_state(pdev, PCI_D3hot); > @@ -686,18 +685,8 @@ ath5k_pci_resume(struct pci_dev *pdev) > if (err) > return err; > > - err = request_irq(pdev->irq, ath5k_intr, IRQF_SHARED, "ath", sc); > - if (err) { > - ATH5K_ERR(sc, "request_irq failed\n"); > - goto err_no_irq; > - } > - > ath5k_led_enable(sc); > return 0; > - > -err_no_irq: > - pci_disable_device(pdev); > - return err; > } > #endif /* CONFIG_PM */ > > Unfortunately this makes it worse. My eeepc-laptop hacks are now in wireless-testing. If I apply the above patch to wireless-testing and "remove" the device while suspended, I get a soft hang on resume. Suspending without removal works fine. I can see a BUG if I boot with no_console_suspend Unable to handle kernel NULL pointer dereference IP: klist_put Tainted: G W Process s2disk Call trace: ? klist_del ? device_del ? device_unregister ? pci_stop_dev ? pci_stop_bus ? pci_remove_device ? eeepc_rfkill_hotplug [eeepc_laptop] ? eeepc_hotk_resume [eeepc_laptop] ? acpi_device_resume ? device_resume ? hibernation_snapshot ? release_console_sem ? snapshot_ioctl ? snapshot_ioctl ? vfs_ioctl ? do_vfs_ioctl ? n_tty_write ? vfs_write ? sys_ioctl ? syscall_call ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [ath5k-devel] [linux-pm] PCI hotplug v.s. suspend 2009-06-18 11:51 ` Alan Jenkins @ 2009-06-18 18:45 ` Alan Stern 2009-06-18 21:18 ` Alan Jenkins 0 siblings, 1 reply; 13+ messages in thread From: Alan Stern @ 2009-06-18 18:45 UTC (permalink / raw) To: Alan Jenkins Cc: Bob Copeland, Rafael J. Wysocki, linux-pm, linux-wireless@vger.kernel.org, ath5k-devel, linux-pci On Thu, 18 Jun 2009, Alan Jenkins wrote: > Unfortunately this makes it worse. My eeepc-laptop hacks are now in > wireless-testing. If I apply the above patch to wireless-testing and > "remove" the device while suspended, I get a soft hang on resume. Is this different from the behavior without the patch? (I don't see how it could be.) > Suspending without removal works fine. > > I can see a BUG if I boot with no_console_suspend > > Unable to handle kernel NULL pointer dereference > IP: klist_put > Tainted: G W > Process s2disk > > Call trace: > ? klist_del > ? device_del > ? device_unregister > ? pci_stop_dev > ? pci_stop_bus > ? pci_remove_device > ? eeepc_rfkill_hotplug [eeepc_laptop] > ? eeepc_hotk_resume [eeepc_laptop] > ? acpi_device_resume > ? device_resume > ? hibernation_snapshot This should be doing more or less the same thing as if you removed the device while the system was running. Or is it not hot-unpluggable? Alan Stern ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [ath5k-devel] [linux-pm] PCI hotplug v.s. suspend 2009-06-18 18:45 ` Alan Stern @ 2009-06-18 21:18 ` Alan Jenkins 2009-06-19 8:34 ` Alan Jenkins 0 siblings, 1 reply; 13+ messages in thread From: Alan Jenkins @ 2009-06-18 21:18 UTC (permalink / raw) To: Alan Stern Cc: Bob Copeland, Rafael J. Wysocki, linux-pm, linux-wireless@vger.kernel.org, ath5k-devel, linux-pci Alan Stern wrote: > On Thu, 18 Jun 2009, Alan Jenkins wrote: > > >> Unfortunately this makes it worse. My eeepc-laptop hacks are now in >> wireless-testing. If I apply the above patch to wireless-testing and >> "remove" the device while suspended, I get a soft hang on resume. >> > > Is this different from the behavior without the patch? (I don't see > how it could be.) > Ow. No, the free_irq patch doesn't cause this bug. I missed testing this case after adding a workaround for an even more obscure issue :-(. I have a good idea how eeepc-laptop could cause a double-free error. I suspect we originally got away with trying to call pci_remove_device() twice, but it's not actually legal, and my workaround caused two attempts to happen much closer together. I'll fix it tomorrow and re-test. With _both_ of my nasty scenarios this time (and the normal ones) - I should really make a checklist :-). >> Suspending without removal works fine. >> >> I can see a BUG if I boot with no_console_suspend >> >> Unable to handle kernel NULL pointer dereference >> IP: klist_put >> Tainted: G W >> Process s2disk >> >> Call trace: >> ? klist_del >> ? device_del >> ? device_unregister >> ? pci_stop_dev >> ? pci_stop_bus >> ? pci_remove_device >> ? eeepc_rfkill_hotplug [eeepc_laptop] >> ? eeepc_hotk_resume [eeepc_laptop] >> ? acpi_device_resume >> ? device_resume >> ? hibernation_snapshot >> > > This should be doing more or less the same thing as if you removed the > device while the system was running. Or is it not hot-unpluggable? > > Alan Stern > Outside of suspend, I can hot-unplug the device alright. I'm blaming my hotplug driver resume handler. Thanks Alan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [ath5k-devel] [linux-pm] PCI hotplug v.s. suspend 2009-06-18 21:18 ` Alan Jenkins @ 2009-06-19 8:34 ` Alan Jenkins 2009-06-19 11:08 ` Bob Copeland 0 siblings, 1 reply; 13+ messages in thread From: Alan Jenkins @ 2009-06-19 8:34 UTC (permalink / raw) To: Alan Stern Cc: Bob Copeland, Rafael J. Wysocki, linux-pm, linux-wireless@vger.kernel.org, ath5k-devel, linux-pci Alan Jenkins wrote: > Alan Stern wrote: > >> On Thu, 18 Jun 2009, Alan Jenkins wrote: >> >> >> >>> Unfortunately this makes it worse. My eeepc-laptop hacks are now in >>> wireless-testing. If I apply the above patch to wireless-testing and >>> "remove" the device while suspended, I get a soft hang on resume. >>> >>> >> Is this different from the behavior without the patch? (I don't see >> how it could be.) >> >> > > Ow. No, the free_irq patch doesn't cause this bug. I missed testing > this case after adding a workaround for an even more obscure issue :-(. > I tried an earlier version without the problem workaround. Bob's suggested patch to remove the free_irq() on suspend works fine. It fixes the WARNING when the device is removed during suspend, and it still works if the device is not removed during suspend. Alan ^ permalink raw reply [flat|nested] 13+ messages in thread
* Re: [ath5k-devel] [linux-pm] PCI hotplug v.s. suspend 2009-06-19 8:34 ` Alan Jenkins @ 2009-06-19 11:08 ` Bob Copeland 0 siblings, 0 replies; 13+ messages in thread From: Bob Copeland @ 2009-06-19 11:08 UTC (permalink / raw) To: Alan Jenkins Cc: Alan Stern, Rafael J. Wysocki, linux-pm, linux-wireless@vger.kernel.org, ath5k-devel, linux-pci On Fri, Jun 19, 2009 at 09:34:20AM +0100, Alan Jenkins wrote: > I tried an earlier version without the problem workaround. Bob's > suggested patch to remove the free_irq() on suspend works fine. It > fixes the WARNING when the device is removed during suspend, and it > still works if the device is not removed during suspend. Okay thanks for finding it and testing. I'll send a version with my s-o-b later today. -- Bob Copeland %% www.bobcopeland.com ^ permalink raw reply [flat|nested] 13+ messages in thread
end of thread, other threads:[~2009-06-19 11:10 UTC | newest] Thread overview: 13+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2009-06-15 10:55 PCI hotplug v.s. suspend Alan Jenkins 2009-06-15 13:58 ` [linux-pm] " Alan Stern 2009-06-15 14:03 ` Alan Jenkins 2009-06-15 14:37 ` Alan Stern 2009-06-15 19:58 ` Rafael J. Wysocki 2009-06-15 20:34 ` Alan Stern 2009-06-15 20:51 ` Rafael J. Wysocki 2009-06-16 1:01 ` [ath5k-devel] " Bob Copeland 2009-06-18 11:51 ` Alan Jenkins 2009-06-18 18:45 ` Alan Stern 2009-06-18 21:18 ` Alan Jenkins 2009-06-19 8:34 ` Alan Jenkins 2009-06-19 11:08 ` Bob Copeland
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).