* [PATCH] drm/nouveau: Remove interrupt handler around suspend/resume
From: Alex Williamson @ 2011-04-28 5:20 UTC (permalink / raw)
To: airlied, dri-devel; +Cc: alex.williamson, linux-kernel
We're often using a shared interrupt line for nouveau, so we have
to be prepared for the handler to be called at any point in time. If
we've suspended the device via vga switcheroo and get a stray
interrupt on the line from another device, we'll read back -1 from
the device and head down all sorts of strange paths, most of which
eventually lock the system.

On my system (Asus UL30VT) the interrupt line is shared with USB.
Attempting to disable the USB bluetooth device seems to trigger
a stray interrupt that ends up in nv04_fifo_isr(), where we
eventually hit the "PFIFO still angry after 100 spins, halt",
which kills the system.

Using free_irq/request_irq around the suspend seems to be a
reliable fix. Attempting to flag the device state in
nouveau_irq_handler(), similar to the intel_lid_notify() fix,
is too racy since we can power off the device as an interrupt
is being processed.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
---
drivers/gpu/drm/nouveau/nouveau_drv.c | 22 ++++++++++++++++++++++
1 files changed, 22 insertions(+), 0 deletions(-)
diff --git a/drivers/gpu/drm/nouveau/nouveau_drv.c b/drivers/gpu/drm/nouveau/nouveau_drv.c
index 155ebdc..91f2aca 100644
--- a/drivers/gpu/drm/nouveau/nouveau_drv.c
+++ b/drivers/gpu/drm/nouveau/nouveau_drv.c
@@ -229,6 +229,10 @@ nouveau_pci_suspend(struct pci_dev *pdev, pm_message_t pm_state)
NV_INFO(dev, "And we're gone!\n");
pci_save_state(pdev);
+
+ pci_intx(pdev, 0);
+ free_irq(drm_dev_to_irq(dev), dev);
+
if (pm_state.event == PM_EVENT_SUSPEND) {
pci_disable_device(pdev);
pci_set_power_state(pdev, PCI_D3hot);
@@ -255,6 +259,8 @@ nouveau_pci_resume(struct pci_dev *pdev)
struct drm_nouveau_private *dev_priv = dev->dev_private;
struct nouveau_engine *engine = &dev_priv->engine;
struct drm_crtc *crtc;
+ char *irqname;
+ unsigned long sh_flags = 0;
int ret, i;
if (dev->switch_power_state == DRM_SWITCH_POWER_OFF)
@@ -265,6 +271,22 @@ nouveau_pci_resume(struct pci_dev *pdev)
NV_INFO(dev, "We're back, enabling device...\n");
pci_set_power_state(pdev, PCI_D0);
pci_restore_state(pdev);
+
+ if (drm_core_check_feature(dev, DRIVER_IRQ_SHARED))
+ sh_flags = IRQF_SHARED;
+
+ if (dev->devname)
+ irqname = dev->devname;
+ else
+ irqname = dev->driver->name;
+
+ ret = request_irq(drm_dev_to_irq(dev), dev->driver->irq_handler,
+ sh_flags, irqname, dev);
+ if (ret < 0) {
+ NV_ERROR(dev, "error re-requesting irq: %d\n", ret);
+ return ret;
+ }
+
if (pci_enable_device(pdev))
return -1;
pci_set_master(dev->pdev);
* Re: [PATCH] drm/nouveau: Remove interrupt handler around suspend/resume
From: Dave Airlie @ 2011-04-28 5:54 UTC (permalink / raw)
To: Alex Williamson; +Cc: dri-devel, linux-kernel
On Wed, 2011-04-27 at 23:20 -0600, Alex Williamson wrote:
> We're often using a shared interrupt line for nouveau, so we have
> to be prepared that it could be called at any point in time. If
> we've suspended the device via vga switcheroo and get a stray
> interrupt on the line from another device, we'll read back -1 from
> the device and head down all sorts of strange paths, most of which
> eventually lock the system.
>
> On my system (Asus UL30VT) the interrupt line is shared with USB.
> Attempting to disable the USB bluetooth device seems to trigger
> a stray interrupt that ends up in nv04_fifo_isr() where we
> eventually hit the "PFIFO still angry after 100 spins, halt",
> which kills the system.
>
> Using free_irq/request_irq around the suspend seems to be a
> reliable fix. Attempting to flag the device state in
> nouveau_irq_handler(), similar to the intel_lid_notify() fix
> is too racy since we can power off the device as an interrupt
> is being processed.
The actual solution is to check if we read back all Fs and return from
the irq handler. Robust irq handlers are generally considered a good
idea, especially around race conditions at suspend/resume time.
Dave.
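
A minimal userspace sketch of the check suggested above, assuming (as the thread describes) that reads from a powered-down device come back as all ones. The names `fake_gpu`, `fake_rd32`, `fake_irq_handler` and the `irq_result` enum are hypothetical stand-ins for illustration, not the nouveau or DRM API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the kernel's IRQ_NONE / IRQ_HANDLED. */
enum irq_result { IRQ_RESULT_NONE = 0, IRQ_RESULT_HANDLED = 1 };

/* Hypothetical model of the device; not a real nouveau structure. */
struct fake_gpu {
    int powered;           /* 0 models a device switched off via switcheroo */
    uint32_t intr_status;  /* pending interrupt bits while powered */
    unsigned int serviced; /* how many interrupts were actually serviced */
};

/* Assumption: a read from a powered-down device returns all Fs. */
static uint32_t fake_rd32(const struct fake_gpu *gpu)
{
    return gpu->powered ? gpu->intr_status : 0xffffffffu;
}

/* Shared-line handler: bail out early if the status is empty or all Fs. */
static enum irq_result fake_irq_handler(struct fake_gpu *gpu)
{
    uint32_t status;

    assert(gpu != NULL);
    status = fake_rd32(gpu);
    if (status == 0 || status == 0xffffffffu)
        return IRQ_RESULT_NONE; /* not ours, or the device is gone */

    gpu->serviced++;            /* service and acknowledge the interrupt */
    gpu->intr_status = 0;
    return IRQ_RESULT_HANDLED;
}
```

With `powered` set to 0 the handler reports that the interrupt was not its own and touches no further state, which is the behaviour wanted when another device on the shared line fires.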
* Re: [PATCH] drm/nouveau: Remove interrupt handler around suspend/resume
From: Alex Williamson @ 2011-04-28 12:48 UTC (permalink / raw)
To: Dave Airlie; +Cc: dri-devel, linux-kernel
On Thu, 2011-04-28 at 15:54 +1000, Dave Airlie wrote:
> On Wed, 2011-04-27 at 23:20 -0600, Alex Williamson wrote:
> > We're often using a shared interrupt line for nouveau, so we have
> > to be prepared that it could be called at any point in time. If
> > we've suspended the device via vga switcheroo and get a stray
> > interrupt on the line from another device, we'll read back -1 from
> > the device and head down all sorts of strange paths, most of which
> > eventually lock the system.
> >
> > On my system (Asus UL30VT) the interrupt line is shared with USB.
> > Attempting to disable the USB bluetooth device seems to trigger
> > a stray interrupt that ends up in nv04_fifo_isr() where we
> > eventually hit the "PFIFO still angry after 100 spins, halt",
> > which kills the system.
> >
> > Using free_irq/request_irq around the suspend seems to be a
> > reliable fix. Attempting to flag the device state in
> > nouveau_irq_handler(), similar to the intel_lid_notify() fix
> > is too racy since we can power off the device as an interrupt
> > is being processed.
>
> The actual solution is to check if we read back all Fs and return from
> the irq handler. Robust irq handlers are generally considered a good
> idea esp around race conditions at suspend/resume time.
The trouble I found in trying to do that is that we can still race,
with the device being disabled while an interrupt is still being
processed. It seems impractical to check every device read through the
interrupt path for -1 and back out. Adding a spinlock to the interrupt
handler seemed expensive, while this has no additional runtime interrupt
overhead. Thanks,
Alex
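
The race described above can be modelled in userspace. This is only a sketch under stated assumptions: `fake_gpu`, `fake_rd32`, `fake_irq_handler` and the `power_off_after_reads` knob are hypothetical, and powering off "between reads" is simulated deterministically rather than with real concurrency.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device model; not a real nouveau structure. */
struct fake_gpu {
    int powered;
    uint32_t intr_status;
    uint32_t fifo_status;
    int power_off_after_reads; /* power off after this many reads, -1 = never */
    int reads;                 /* register reads performed so far */
};

/* Assumption: reads from a powered-down device return all Fs. */
static uint32_t fake_rd32(struct fake_gpu *gpu, const uint32_t *reg)
{
    uint32_t val = gpu->powered ? *reg : 0xffffffffu;

    gpu->reads++;
    if (gpu->reads == gpu->power_off_after_reads)
        gpu->powered = 0; /* models suspend racing with the handler */
    return val;
}

/*
 * Returns 0 if the interrupt is not ours, 1 if it was handled cleanly,
 * and -1 if a later read returned all Fs despite the entry check passing.
 */
static int fake_irq_handler(struct fake_gpu *gpu)
{
    uint32_t status, fifo;

    assert(gpu != NULL);
    status = fake_rd32(gpu, &gpu->intr_status);
    if (status == 0 || status == 0xffffffffu)
        return 0; /* the entry check catches a device that is already off */

    fifo = fake_rd32(gpu, &gpu->fifo_status);
    if (fifo == 0xffffffffu)
        return -1; /* powered off mid-handler: the entry check did not help */
    return 1;
}
```

Setting `power_off_after_reads` to 1 makes the first read succeed and the second return all Fs, so a handler that only checks on entry would need the same guard on every subsequent read, which is the cost being weighed against unregistering the handler.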