linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
* [PATCH] Fix USB suspend/resume crasher
@ 2005-11-23  3:08 Benjamin Herrenschmidt
  2005-11-23  3:09 ` Benjamin Herrenschmidt
                   ` (4 more replies)
  0 siblings, 5 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2005-11-23  3:08 UTC (permalink / raw)
  To: David Brownell
  Cc: Andrew Morton, Greg KH, Linux Kernel list, linuxppc-dev list,
	Alan Stern

This is my latest patch against current linus -git, it closes the IRQ
race and makes various other OHCI & EHCI code path safer vs.
suspend/resume. I've been able to (finally !) successfully suspend and
resume various Mac models, with or without USB mouse plugged, or
plugging while asleep, or unplugging while asleep etc... all without a
crash. There are still some races here or there in the USB code, but at
least the main cause of crash is now fixes by this patch (access to a
controller that has been suspended, due to either shared interrupts or
other code path).

I haven't fixed UHCI as I don't have any HW to test, though I hope I
haven't broken it neither. Alan, I would appreciate if you could have a
look.

This patch applies on top of the patch that moves the PowerMac specific
code out of ohci-pci.c to hcd-pci.c where it belongs. This patch isn't
upstream yet for reasons I don't fully understand (why does USB stuffs
has such a high latency for going upstream ?), I'm sending it as a reply
to this email for completeness.

Without this patch, you cannot reliably sleep/wakeup any recent Mac, and
I suspect PCs have some more sneaky issues too (they don't frankly crash
with machine checks because x86 tend to silently swallow PCI errors but
that won't last afaik, at least PCI Express will blow up in those
situations, but the USB code may still misbehave).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
---

Note that I REALLY want that in 2.6.15. 2.6.14 is already broken, though I
have a similar patch for it that some users have been successfully testing
and I don't want 2.6.15 to be broken too. So unless you have a major issue
with the patch as it is and it breaks something, I think it should be applied,
and I don't want to get into yet another 10000 email exchange discussion on
the merit of doing proper locking and why the current situation is broken...

Index: linux-serialfix/drivers/usb/core/hcd-pci.c
===================================================================
--- linux-serialfix.orig/drivers/usb/core/hcd-pci.c	2005-11-23 13:52:23.000000000 +1100
+++ linux-serialfix/drivers/usb/core/hcd-pci.c	2005-11-23 13:52:32.000000000 +1100
@@ -218,6 +218,7 @@
 			goto done;
 		}
 	}
+	synchronize_irq(dev->irq);
 
 	/* FIXME until the generic PM interfaces change a lot more, this
 	 * can't use PCI D1 and D2 states.  For example, the confusion
@@ -386,7 +387,7 @@
 
 	dev->dev.power.power_state = PMSG_ON;
 
-	hcd->saw_irq = 0;
+	clear_bit(HCD_FLAG_SAW_IRQ, &hcd->flags);
 
 	if (hcd->driver->resume) {
 		retval = hcd->driver->resume(hcd);
Index: linux-serialfix/drivers/usb/core/hcd.c
===================================================================
--- linux-serialfix.orig/drivers/usb/core/hcd.c	2005-11-23 13:47:45.000000000 +1100
+++ linux-serialfix/drivers/usb/core/hcd.c	2005-11-23 13:52:32.000000000 +1100
@@ -1315,11 +1315,12 @@
 	 * finish unlinking the initial failed usb_set_address()
 	 * or device descriptor fetch.
 	 */
-	if (!hcd->saw_irq && hcd->self.root_hub != urb->dev) {
+	if (!test_bit(HCD_FLAG_SAW_IRQ, &hcd->flags)
+	    && hcd->self.root_hub != urb->dev) {
 		dev_warn (hcd->self.controller, "Unlink after no-IRQ?  "
 			"Controller is probably using the wrong IRQ."
 			"\n");
-		hcd->saw_irq = 1;
+		set_bit(HCD_FLAG_SAW_IRQ, &hcd->flags);
 	}
 
 	urb->status = status;
@@ -1649,13 +1650,15 @@
 	struct usb_hcd		*hcd = __hcd;
 	int			start = hcd->state;
 
-	if (start == HC_STATE_HALT)
+	if (unlikely(start == HC_STATE_HALT ||
+	    !test_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags)))
 		return IRQ_NONE;
 	if (hcd->driver->irq (hcd, r) == IRQ_NONE)
 		return IRQ_NONE;
 
-	hcd->saw_irq = 1;
-	if (hcd->state == HC_STATE_HALT)
+	set_bit(HCD_FLAG_SAW_IRQ, &hcd->flags);
+
+	if (unlikely(hcd->state == HC_STATE_HALT))
 		usb_hc_died (hcd);
 	return IRQ_HANDLED;
 }
@@ -1768,6 +1771,8 @@
 
 	dev_info(hcd->self.controller, "%s\n", hcd->product_desc);
 
+	set_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags);
+
 	/* till now HC has been in an indeterminate state ... */
 	if (hcd->driver->reset && (retval = hcd->driver->reset(hcd)) < 0) {
 		dev_err(hcd->self.controller, "can't reset\n");
Index: linux-serialfix/drivers/usb/core/hcd.h
===================================================================
--- linux-serialfix.orig/drivers/usb/core/hcd.h	2005-11-23 13:47:45.000000000 +1100
+++ linux-serialfix/drivers/usb/core/hcd.h	2005-11-23 13:52:32.000000000 +1100
@@ -72,7 +72,12 @@
 	 * hardware info/state
 	 */
 	const struct hc_driver	*driver;	/* hw-specific hooks */
-	unsigned		saw_irq : 1;
+
+	/* Flags that need to be manipulated atomically */
+	unsigned long		flags;
+#define HCD_FLAG_HW_ACCESSIBLE	0x00000001
+#define HCD_FLAG_SAW_IRQ	0x00000002
+
 	unsigned		can_wakeup:1;	/* hw supports wakeup? */
 	unsigned		remote_wakeup:1;/* sw should use wakeup? */
 	unsigned		rh_registered:1;/* is root hub registered? */
Index: linux-serialfix/drivers/usb/host/ehci-pci.c
===================================================================
--- linux-serialfix.orig/drivers/usb/host/ehci-pci.c	2005-11-23 13:47:45.000000000 +1100
+++ linux-serialfix/drivers/usb/host/ehci-pci.c	2005-11-23 13:52:32.000000000 +1100
@@ -244,22 +244,34 @@
 static int ehci_pci_suspend (struct usb_hcd *hcd, pm_message_t message)
 {
 	struct ehci_hcd		*ehci = hcd_to_ehci (hcd);
+	unsigned long		flags;
+	int			rc = 0;
 
 	if (time_before (jiffies, ehci->next_statechange))
 		msleep (100);
 
-#ifdef	CONFIG_USB_SUSPEND
-	(void) usb_suspend_device (hcd->self.root_hub);
-#else
-	usb_lock_device (hcd->self.root_hub);
-	(void) ehci_bus_suspend (hcd);
-	usb_unlock_device (hcd->self.root_hub);
-#endif
+	/* Root hub was already suspended. Disable irq emission and
+	 * mark HW unaccessible, bail out if RH has been resumed. Use
+	 * the spinlock to properly synchronize with possible pending
+	 * RH suspend or resume activity.
+	 *
+	 * This is still racy as hcd->state is manipulated outside of
+	 * any locks =P But that will be a different fix.
+	 */
+	spin_lock_irqsave (&ehci->lock, flags);
+	if (hcd->state != HC_STATE_SUSPENDED) {
+		rc = -EINVAL;
+		goto bail;
+	}
+	writel (0, &ehci->regs->intr_enable);
+	clear_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags);
+ bail:
+	spin_unlock_irqrestore (&ehci->lock, flags);
 
 	// save (PCI) FLADJ in case of Vaux power loss
 	// ... we'd only use it to handle clock skew
 
-	return 0;
+	return rc;
 }
 
 static int ehci_pci_resume (struct usb_hcd *hcd)
@@ -274,6 +286,8 @@
 	if (time_before (jiffies, ehci->next_statechange))
 		msleep (100);
 
+	set_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags);
+
 	/* If any port is suspended (or owned by the companion),
 	 * we know we can/must resume the HC (and mustn't reset it).
 	 */
Index: linux-serialfix/drivers/usb/host/ehci-q.c
===================================================================
--- linux-serialfix.orig/drivers/usb/host/ehci-q.c	2005-11-23 13:47:45.000000000 +1100
+++ linux-serialfix/drivers/usb/host/ehci-q.c	2005-11-23 13:52:32.000000000 +1100
@@ -912,6 +912,7 @@
 	int			epnum;
 	unsigned long		flags;
 	struct ehci_qh		*qh = NULL;
+	int			rc = 0;
 
 	qtd = list_entry (qtd_list->next, struct ehci_qtd, qtd_list);
 	epnum = ep->desc.bEndpointAddress;
@@ -926,21 +927,28 @@
 #endif
 
 	spin_lock_irqsave (&ehci->lock, flags);
+	if (unlikely(!test_bit(HCD_FLAG_HW_ACCESSIBLE,
+			       &ehci_to_hcd(ehci)->flags))) {
+		rc = -ESHUTDOWN;
+		goto done;
+	}
+
 	qh = qh_append_tds (ehci, urb, qtd_list, epnum, &ep->hcpriv);
+	if (unlikely(qh == NULL)) {
+		rc = -ENOMEM;
+		goto done;
+	}
 
 	/* Control/bulk operations through TTs don't need scheduling,
 	 * the HC and TT handle it when the TT has a buffer ready.
 	 */
-	if (likely (qh != NULL)) {
-		if (likely (qh->qh_state == QH_STATE_IDLE))
-			qh_link_async (ehci, qh_get (qh));
-	}
+	if (likely (qh->qh_state == QH_STATE_IDLE))
+		qh_link_async (ehci, qh_get (qh));
+ done:
 	spin_unlock_irqrestore (&ehci->lock, flags);
-	if (unlikely (qh == NULL)) {
+	if (unlikely (qh == NULL))
 		qtd_list_free (ehci, urb, qtd_list);
-		return -ENOMEM;
-	}
-	return 0;
+	return rc;
 }
 
 /*-------------------------------------------------------------------------*/
Index: linux-serialfix/drivers/usb/host/ehci-sched.c
===================================================================
--- linux-serialfix.orig/drivers/usb/host/ehci-sched.c	2005-11-23 13:47:45.000000000 +1100
+++ linux-serialfix/drivers/usb/host/ehci-sched.c	2005-11-23 13:52:32.000000000 +1100
@@ -602,6 +602,12 @@
 
 	spin_lock_irqsave (&ehci->lock, flags);
 
+	if (unlikely(!test_bit(HCD_FLAG_HW_ACCESSIBLE,
+			       &ehci_to_hcd(ehci)->flags))) {
+		status = -ESHUTDOWN;
+		goto done;
+	}
+
 	/* get qh and force any scheduling errors */
 	INIT_LIST_HEAD (&empty);
 	qh = qh_append_tds (ehci, urb, &empty, epnum, &ep->hcpriv);
@@ -1456,7 +1462,11 @@
 
 	/* schedule ... need to lock */
 	spin_lock_irqsave (&ehci->lock, flags);
-	status = iso_stream_schedule (ehci, urb, stream);
+	if (unlikely(!test_bit(HCD_FLAG_HW_ACCESSIBLE,
+			       &ehci_to_hcd(ehci)->flags)))
+		status = -ESHUTDOWN;
+	else
+		status = iso_stream_schedule (ehci, urb, stream);
  	if (likely (status == 0))
 		itd_link_urb (ehci, urb, ehci->periodic_size << 3, stream);
 	spin_unlock_irqrestore (&ehci->lock, flags);
@@ -1815,7 +1825,11 @@
 
 	/* schedule ... need to lock */
 	spin_lock_irqsave (&ehci->lock, flags);
-	status = iso_stream_schedule (ehci, urb, stream);
+	if (unlikely(!test_bit(HCD_FLAG_HW_ACCESSIBLE,
+			       &ehci_to_hcd(ehci)->flags)))
+		status = -ESHUTDOWN;
+	else
+		status = iso_stream_schedule (ehci, urb, stream);
  	if (status == 0)
 		sitd_link_urb (ehci, urb, ehci->periodic_size << 3, stream);
 	spin_unlock_irqrestore (&ehci->lock, flags);
Index: linux-serialfix/drivers/usb/host/ohci-hcd.c
===================================================================
--- linux-serialfix.orig/drivers/usb/host/ohci-hcd.c	2005-11-23 13:47:45.000000000 +1100
+++ linux-serialfix/drivers/usb/host/ohci-hcd.c	2005-11-23 13:57:06.000000000 +1100
@@ -115,7 +115,7 @@
 
 /*-------------------------------------------------------------------------*/
 
-// #define OHCI_VERBOSE_DEBUG	/* not always helpful */
+#undef OHCI_VERBOSE_DEBUG	/* not always helpful */
 
 /* For initializing controller (mask in an HCFS mode too) */
 #define	OHCI_CONTROL_INIT 	OHCI_CTRL_CBSR
@@ -253,6 +253,10 @@
 	spin_lock_irqsave (&ohci->lock, flags);
 
 	/* don't submit to a dead HC */
+	if (!test_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags)) {
+		retval = -ENODEV;
+		goto fail;
+	}
 	if (!HC_IS_RUNNING(hcd->state)) {
 		retval = -ENODEV;
 		goto fail;
Index: linux-serialfix/drivers/usb/host/ohci-hub.c
===================================================================
--- linux-serialfix.orig/drivers/usb/host/ohci-hub.c	2005-11-23 13:47:45.000000000 +1100
+++ linux-serialfix/drivers/usb/host/ohci-hub.c	2005-11-23 13:54:04.000000000 +1100
@@ -53,6 +53,11 @@
 
 	spin_lock_irqsave (&ohci->lock, flags);
 
+	if (unlikely(!test_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags))) {
+		spin_unlock_irqrestore (&ohci->lock, flags);
+		return -ESHUTDOWN;
+	}
+
 	ohci->hc_control = ohci_readl (ohci, &ohci->regs->control);
 	switch (ohci->hc_control & OHCI_CTRL_HCFS) {
 	case OHCI_USB_RESUME:
@@ -140,11 +145,19 @@
 	struct ohci_hcd		*ohci = hcd_to_ohci (hcd);
 	u32			temp, enables;
 	int			status = -EINPROGRESS;
+	unsigned long		flags;
 
 	if (time_before (jiffies, ohci->next_statechange))
 		msleep(5);
 
-	spin_lock_irq (&ohci->lock);
+	spin_lock_irqsave (&ohci->lock, flags);
+
+	if (unlikely(!test_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags))) {
+		spin_unlock_irqrestore (&ohci->lock, flags);
+		return -ESHUTDOWN;
+	}
+
+
 	ohci->hc_control = ohci_readl (ohci, &ohci->regs->control);
 
 	if (ohci->hc_control & (OHCI_CTRL_IR | OHCI_SCHED_ENABLES)) {
@@ -179,7 +192,7 @@
 		ohci_dbg (ohci, "lost power\n");
 		status = -EBUSY;
 	}
-	spin_unlock_irq (&ohci->lock);
+	spin_unlock_irqrestore (&ohci->lock, flags);
 	if (status == -EBUSY) {
 		(void) ohci_init (ohci);
 		return ohci_restart (ohci);
@@ -297,8 +310,8 @@
 	/* handle autosuspended root:  finish resuming before
 	 * letting khubd or root hub timer see state changes.
 	 */
-	if ((ohci->hc_control & OHCI_CTRL_HCFS) != OHCI_USB_OPER
-			|| !HC_IS_RUNNING(hcd->state)) {
+	if (unlikely((ohci->hc_control & OHCI_CTRL_HCFS) != OHCI_USB_OPER
+		     || !HC_IS_RUNNING(hcd->state))) {
 		can_suspend = 0;
 		goto done;
 	}
@@ -508,6 +521,9 @@
 	u32		temp;
 	int		retval = 0;
 
+	if (unlikely(!test_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags)))
+		return -ESHUTDOWN;
+
 	switch (typeReq) {
 	case ClearHubFeature:
 		switch (wValue) {
Index: linux-serialfix/drivers/usb/host/ohci-pci.c
===================================================================
--- linux-serialfix.orig/drivers/usb/host/ohci-pci.c	2005-11-23 13:52:23.000000000 +1100
+++ linux-serialfix/drivers/usb/host/ohci-pci.c	2005-11-23 13:52:32.000000000 +1100
@@ -114,12 +114,35 @@
 
 static int ohci_pci_suspend (struct usb_hcd *hcd, pm_message_t message)
 {
-	return 0;
+	struct ohci_hcd	*ohci = hcd_to_ohci (hcd);
+	unsigned long	flags;
+	int		rc = 0;
+
+	/* Root hub was already suspended. Disable irq emission and
+	 * mark HW unaccessible, bail out if RH has been resumed. Use
+	 * the spinlock to properly synchronize with possible pending
+	 * RH suspend or resume activity.
+	 *
+	 * This is still racy as hcd->state is manipulated outside of
+	 * any locks =P But that will be a different fix.
+	 */
+	spin_lock_irqsave (&ohci->lock, flags);
+	if (hcd->state != HC_STATE_SUSPENDED) {
+		rc = -EINVAL;
+		goto bail;
+	}
+	ohci_writel(ohci, OHCI_INTR_MIE, &ohci->regs->intrdisable);
+	clear_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags);
+ bail:
+	spin_unlock_irqrestore (&ohci->lock, flags);
+
+	return rc;
 }
 
 
 static int ohci_pci_resume (struct usb_hcd *hcd)
 {
+	set_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags);
 	usb_hcd_resume_root_hub(hcd);
 	return 0;
 }

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-23  3:08 [PATCH] Fix USB suspend/resume crasher Benjamin Herrenschmidt
@ 2005-11-23  3:09 ` Benjamin Herrenschmidt
  2005-11-23  3:14 ` Nigel Cunningham
                   ` (3 subsequent siblings)
  4 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2005-11-23  3:09 UTC (permalink / raw)
  To: David Brownell
  Cc: Andrew Morton, Greg KH, Linux Kernel list, linuxppc-dev list,
	Alan Stern


> This patch applies on top of the patch that moves the PowerMac specific
> code out of ohci-pci.c to hcd-pci.c where it belongs. This patch isn't
> upstream yet for reasons I don't fully understand (why does USB stuffs
> has such a high latency for going upstream ?), I'm sending it as a reply
> to this email for completeness.

Here it is. it's my understand that it's already been queued somewhere
for submission by David for some time though. 

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>

Index: linux-serialfix/drivers/usb/core/hcd-pci.c
===================================================================
--- linux-serialfix.orig/drivers/usb/core/hcd-pci.c	2005-11-21 11:53:15.000000000 +1100
+++ linux-serialfix/drivers/usb/core/hcd-pci.c	2005-11-23 11:23:24.000000000 +1100
@@ -27,6 +27,13 @@
 #include "usb.h"
 #include "hcd.h"
 
+#ifdef CONFIG_PPC_PMAC
+#include <asm/machdep.h>
+#include <asm/pmac_feature.h>
+#include <asm/pci-bridge.h>
+#include <asm/prom.h>
+#endif
+
 
 /* PCI-based HCs are common, but plenty of non-PCI HCs are used too */
 
@@ -277,6 +284,16 @@
 	}
 
 done:
+#ifdef CONFIG_PPC_PMAC
+	if (retval == 0 && _machine == _MACH_Pmac) {
+	   	struct device_node	*of_node;
+
+		/* Disable USB PAD & cell clock */
+		of_node = pci_device_to_OF_node(dev);
+		if (of_node)
+			pmac_call_feature(PMAC_FTR_USB_ENABLE, of_node, 0, 0);
+	}
+#endif /* CONFIG_PPC_PMAC */
 	if (retval == 0)
 		dev->dev.power.power_state = PMSG_SUSPEND;
 	return retval;
@@ -301,6 +318,17 @@
 		return 0;
 	}
 
+#ifdef CONFIG_PPC_PMAC
+	if (_machine == _MACH_Pmac) {
+		struct device_node *of_node;
+
+		/* Re-enable USB PAD & cell clock */
+		of_node = pci_device_to_OF_node(dev);
+		if (of_node)
+			pmac_call_feature(PMAC_FTR_USB_ENABLE, of_node, 0, 1);
+	}
+#endif /* CONFIG_PPC_PMAC */
+
 	/* NOTE:  chip docs cover clean "real suspend" cases (what Linux
 	 * calls "standby", "suspend to RAM", and so on).  There are also
 	 * dirty cases when swsusp fakes a suspend in "shutdown" mode.
Index: linux-serialfix/drivers/usb/host/ohci-pci.c
===================================================================
--- linux-serialfix.orig/drivers/usb/host/ohci-pci.c	2005-11-14 10:42:00.000000000 +1100
+++ linux-serialfix/drivers/usb/host/ohci-pci.c	2005-11-23 11:25:26.000000000 +1100
@@ -114,40 +114,12 @@
 
 static int ohci_pci_suspend (struct usb_hcd *hcd, pm_message_t message)
 {
-	/* root hub was already suspended */
-
-	/* FIXME these PMAC things get called in the wrong places.  ASIC
-	 * clocks should be turned off AFTER entering D3, and on BEFORE
-	 * trying to enter D0.  Evidently the PCI layer doesn't currently
-	 * provide the right sort of platform hooks for this ...
-	 */
-#ifdef CONFIG_PPC_PMAC
-	if (_machine == _MACH_Pmac) {
-	   	struct device_node	*of_node;
- 
-		/* Disable USB PAD & cell clock */
-		of_node = pci_device_to_OF_node (to_pci_dev(hcd->self.controller));
-		if (of_node)
-			pmac_call_feature(PMAC_FTR_USB_ENABLE, of_node, 0, 0);
-	}
-#endif /* CONFIG_PPC_PMAC */
 	return 0;
 }
 
 
 static int ohci_pci_resume (struct usb_hcd *hcd)
 {
-#ifdef CONFIG_PPC_PMAC
-	if (_machine == _MACH_Pmac) {
-		struct device_node *of_node;
-
-		/* Re-enable USB PAD & cell clock */
-		of_node = pci_device_to_OF_node (to_pci_dev(hcd->self.controller));
-		if (of_node)
-			pmac_call_feature (PMAC_FTR_USB_ENABLE, of_node, 0, 1);
-	}
-#endif /* CONFIG_PPC_PMAC */
-
 	usb_hcd_resume_root_hub(hcd);
 	return 0;
 }

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-23  3:08 [PATCH] Fix USB suspend/resume crasher Benjamin Herrenschmidt
  2005-11-23  3:09 ` Benjamin Herrenschmidt
@ 2005-11-23  3:14 ` Nigel Cunningham
  2005-11-23  5:13   ` Benjamin Herrenschmidt
  2005-11-23 17:10 ` Greg KH
                   ` (2 subsequent siblings)
  4 siblings, 1 reply; 13+ messages in thread
From: Nigel Cunningham @ 2005-11-23  3:14 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Andrew Morton, Greg KH, Linux Kernel Mailing List, David Brownell,
	linuxppc-dev list, Alan Stern

Hi.

On Wed, 2005-11-23 at 14:08, Benjamin Herrenschmidt wrote:
> This is my latest patch against current linus -git, it closes the IRQ
> race and makes various other OHCI & EHCI code path safer vs.
> suspend/resume. I've been able to (finally !) successfully suspend and
> resume various Mac models, with or without USB mouse plugged, or
> plugging while asleep, or unplugging while asleep etc... all without a
> crash. There are still some races here or there in the USB code, but at
> least the main cause of crash is now fixes by this patch (access to a
> controller that has been suspended, due to either shared interrupts or
> other code path).
> 
> I haven't fixed UHCI as I don't have any HW to test, though I hope I
> haven't broken it neither. Alan, I would appreciate if you could have a
> look.
> 
> This patch applies on top of the patch that moves the PowerMac specific
> code out of ohci-pci.c to hcd-pci.c where it belongs. This patch isn't
> upstream yet for reasons I don't fully understand (why does USB stuffs
> has such a high latency for going upstream ?), I'm sending it as a reply
> to this email for completeness.
> 
> Without this patch, you cannot reliably sleep/wakeup any recent Mac, and
> I suspect PCs have some more sneaky issues too (they don't frankly crash
> with machine checks because x86 tend to silently swallow PCI errors but
> that won't last afaik, at least PCI Express will blow up in those
> situations, but the USB code may still misbehave).

Sounds great. Maybe I'll finally be able to change my first question to
people with suspend problems from: "Do you have USB built as modules and
unloaded while suspending."

Regards,

Nigel

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-23  3:14 ` Nigel Cunningham
@ 2005-11-23  5:13   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2005-11-23  5:13 UTC (permalink / raw)
  To: ncunningham
  Cc: Andrew Morton, Greg KH, Linux Kernel Mailing List, David Brownell,
	linuxppc-dev list, Alan Stern


> Sounds great. Maybe I'll finally be able to change my first question to
> people with suspend problems from: "Do you have USB built as modules and
> unloaded while suspending."

Heh, I don't know :) I haven't done anything to UHCI at this point, and
there are still other possible issues.

For example, we should probably do the "handoff" trick on resume as well
as on boot. In fact, I suspect that most PCI quirks should be re-run on
resume, with IRQs off if possible, so that stuffs can be put back into
some sane state when coming back from the BIOS before they get a chance
to spam the kernel with bogus IRQs left enabled by that same BIOS (does
this happen ?) or other niceties of that sort... I don't have any of
these problems on powermacs, but x86 might not be yet at the end of the
tunnel...

Ben.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-23  3:08 [PATCH] Fix USB suspend/resume crasher Benjamin Herrenschmidt
  2005-11-23  3:09 ` Benjamin Herrenschmidt
  2005-11-23  3:14 ` Nigel Cunningham
@ 2005-11-23 17:10 ` Greg KH
  2005-11-23 18:58   ` David Brownell
  2005-11-24  0:22 ` Rafael J. Wysocki
  2005-11-24 16:52 ` Arkadiusz Miskiewicz
  4 siblings, 1 reply; 13+ messages in thread
From: Greg KH @ 2005-11-23 17:10 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Andrew Morton, Linux Kernel list, David Brownell,
	linuxppc-dev list, Alan Stern

On Wed, Nov 23, 2005 at 02:08:07PM +1100, Benjamin Herrenschmidt wrote:
> This patch applies on top of the patch that moves the PowerMac specific
> code out of ohci-pci.c to hcd-pci.c where it belongs. This patch isn't
> upstream yet for reasons I don't fully understand (why does USB stuffs
> has such a high latency for going upstream ?), I'm sending it as a reply
> to this email for completeness.

Sorry, I hadn't seen it, otherwise I would have sent it on.

David, are you ok with the patch Ben sent on as a followup?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-23 17:10 ` Greg KH
@ 2005-11-23 18:58   ` David Brownell
  0 siblings, 0 replies; 13+ messages in thread
From: David Brownell @ 2005-11-23 18:58 UTC (permalink / raw)
  To: Greg KH; +Cc: Andrew Morton, Linux Kernel list, linuxppc-dev list, Alan Stern

On Wednesday 23 November 2005 9:10 am, Greg KH wrote:
> On Wed, Nov 23, 2005 at 02:08:07PM +1100, Benjamin Herrenschmidt wrote:
> > This patch applies on top of 

Ben, I'll look at this patch today.


> > the patch that moves the PowerMac specific 
> > code out of ohci-pci.c to hcd-pci.c where it belongs. This patch isn't
> > upstream yet for reasons I don't fully understand (why does USB stuffs
> > has such a high latency for going upstream ?), I'm sending it as a reply
> > to this email for completeness.
> 
> Sorry, I hadn't seen it, otherwise I would have sent it on.
> 
> David, are you ok with the patch Ben sent on as a followup?

It should be the same as 

http://www.kernel.org/pub/linux/kernel/people/gregkh/gregkh-2.6/gregkh-04-usb/usb-ohci-move-ppc-asic-tweaks-nearer-pci.patch

which is already in your queue, just it's not yet upstream.
I posted it before RC1 came out, but evidently not before
your RC1-merge-to-Linus window had closed.

- Dave

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-23  3:08 [PATCH] Fix USB suspend/resume crasher Benjamin Herrenschmidt
                   ` (2 preceding siblings ...)
  2005-11-23 17:10 ` Greg KH
@ 2005-11-24  0:22 ` Rafael J. Wysocki
  2005-11-24  1:23   ` Benjamin Herrenschmidt
  2005-11-24 16:52 ` Arkadiusz Miskiewicz
  4 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2005-11-24  0:22 UTC (permalink / raw)
  To: linux-kernel
  Cc: Andrew Morton, David Brownell, linuxppc-dev list, Greg KH,
	Alan Stern

[-- Attachment #1: Type: text/plain, Size: 5195 bytes --]

Hi,

On Wednesday, 23 of November 2005 04:08, Benjamin Herrenschmidt wrote:
> This is my latest patch against current linus -git, it closes the IRQ
> race and makes various other OHCI & EHCI code path safer vs.
> suspend/resume. I've been able to (finally !) successfully suspend and
> resume various Mac models, with or without USB mouse plugged, or
> plugging while asleep, or unplugging while asleep etc... all without a
> crash. There are still some races here or there in the USB code, but at
> least the main cause of crash is now fixes by this patch (access to a
> controller that has been suspended, due to either shared interrupts or
> other code path).
> 
> I haven't fixed UHCI as I don't have any HW to test, though I hope I
> haven't broken it neither. Alan, I would appreciate if you could have a
> look.
> 
> This patch applies on top of the patch that moves the PowerMac specific
> code out of ohci-pci.c to hcd-pci.c where it belongs. This patch isn't
> upstream yet for reasons I don't fully understand (why does USB stuffs
> has such a high latency for going upstream ?), I'm sending it as a reply
> to this email for completeness.
> 
> Without this patch, you cannot reliably sleep/wakeup any recent Mac, and
> I suspect PCs have some more sneaky issues too (they don't frankly crash
> with machine checks because x86 tend to silently swallow PCI errors but
> that won't last afaik, at least PCI Express will blow up in those
> situations, but the USB code may still misbehave).

Unfortunately with this patch the EHCI controller in my box (Asus L5D,
x86-64 kernel) does not resume from suspend.  Appended is the relevant
snippet from the serial console log (EHCI is the only device using IRQ #5).

Greetings,
Rafael


PM: Image restored successfully.
ohci_hcd 0000:00:02.0: PCI D0, from previous PCI D3
ACPI: PCI Interrupt 0000:00:02.0[A] -> Link [LUS0] -> GSI 11 (level, low) -> IRQ 11
PCI: Setting latency timer of device 0000:00:02.0 to 64
ohci_hcd 0000:00:02.1: PCI D0, from previous PCI D3
ACPI: PCI Interrupt 0000:00:02.1[B] -> Link [LUS1] -> GSI 11 (level, low) -> IRQ 11
PCI: Setting latency timer of device 0000:00:02.1 to 64
ehci_hcd 0000:00:02.2: PCI D0, from previous PCI D3
ACPI: PCI Interrupt 0000:00:02.2[C] -> Link [LUS2] -> GSI 5 (level, low) -> IRQ 5
PCI: Setting latency timer of device 0000:00:02.2 to 64
ehci_hcd 0000:00:02.2: lost power, restarting
usb usb3: root hub lost power or was reset
ehci_hcd 0000:00:02.2: reset command 080b02 park=3 ithresh=8 period=1024 Reset HALT
ehci_hcd 0000:00:02.2: debug port 1
ehci_hcd 0000:00:02.2: capability 1000001 at a0
PCI: cache line size of 64 is not supported by device 0000:00:02.2
ehci_hcd 0000:00:02.2: reset command 080b02 park=3 ithresh=8 period=1024 Reset HALT
ehci_hcd 0000:00:02.2: init command 010009 (park)=0 ithresh=1 period=256 RUN
ehci_hcd 0000:00:02.2: USB 2.0 started, EHCI 1.00, driver 10 Dec 2004
ACPI: PCI Interrupt 0000:00:06.0[A] -> Link [LAUI] -> GSI 10 (level, low) -> IRQ 10
PCI: Setting latency timer of device 0000:00:06.0 to 64
irq 5: nobody cared (try booting with the "irqpoll" option)

Call Trace: <IRQ> <ffffffff80250c3e>{add_preempt_count+94} <ffffffff8015a878>{__report_bad_irq+56}
       <ffffffff8015aab4>{note_interrupt+484} <ffffffff8015a317>{__do_IRQ+199}
       <ffffffff801110f7>{do_IRQ+55} <ffffffff8010f100>{ret_from_intr+0}
       <ffffffff8010fb02>{call_softirq+30} <ffffffff80138714>{__do_softirq+68}
       <ffffffff801386ff>{__do_softirq+47} <ffffffff8010fb02>{call_softirq+30}
       <ffffffff801110b5>{do_softirq+53} <ffffffff8013849f>{irq_exit+63}
       <ffffffff801110fc>{do_IRQ+60} <ffffffff8010f100>{ret_from_intr+0}
        <EOI> <ffffffff802c183a>{serial8250_console_write+186}
       <ffffffff80132dad>{release_console_sem+333} <ffffffff80133587>{vprintk+775}
       <ffffffff80133682>{printk+162} <ffffffff80133682>{printk+162}
       <ffffffff8036003d>{_spin_unlock_irqrestore+29} <ffffffff80252c31>{pci_bus_read_config_byte+113}
       <ffffffff802ed6fe>{pcibios_set_master+110} <ffffffff80255415>{pci_set_master+85}
       <ffffffff88180eb7>{:snd_intel8x0:intel8x0_resume+39}
       <ffffffff8812a9e7>{:snd:snd_card_pci_resume+55} <ffffffff80256c74>{pci_device_resume+36}
       <ffffffff802d5cdd>{resume_device+157} <ffffffff802d5e23>{dpm_resume+147}
       <ffffffff802d5e90>{device_resume+32} <ffffffff80153508>{pm_suspend_disk+296}
       <ffffffff80150f10>{enter_state+112} <ffffffff80151147>{state_store+119}
       <ffffffff801bfdd4>{subsys_attr_store+36} <ffffffff801c025a>{sysfs_write_file+202}
       <ffffffff8017fb09>{vfs_write+233} <ffffffff8017fcb0>{sys_write+80}
       <ffffffff8010eb5e>{system_call+126}
---------------------------
| preempt count: 00010103 ]
| 3 level deep critical section nesting:
----------------------------------------
.. [<ffffffff801332a4>] .... vprintk+0x24/0x360
.....[<ffffffff80133682>] ..   ( <= printk+0xa2/0xb0)
.. [<ffffffff80360176>] .... _spin_lock+0x16/0x30
.....[<ffffffff8015a2fc>] ..   ( <= __do_IRQ+0xac/0x120)
.. [<ffffffff80360176>] .... _spin_lock+0x16/0x30
.....[<ffffffff8015a2fc>] ..   ( <= __do_IRQ+0xac/0x120)

handlers:
[<ffffffff80260970>] (usb_hcd_irq+0x0/0x70)
Disabling IRQ #5


[-- Attachment #2: lspci-v.log --]
[-- Type: text/x-log, Size: 6384 bytes --]

0000:00:00.0 Host bridge: nVidia Corporation nForce3 Host Bridge (rev a4)
	Subsystem: ASUSTeK Computer Inc.: Unknown device 80c5
	Flags: bus master, 66Mhz, fast devsel, latency 0
	Memory at e8000000 (32-bit, prefetchable) [size=128M]
	Capabilities: <available only to root>

0000:00:01.0 ISA bridge: nVidia Corporation nForce3 LPC Bridge (rev f6)
	Subsystem: ASUSTeK Computer Inc.: Unknown device 80c5
	Flags: bus master, 66Mhz, fast devsel, latency 0

0000:00:01.1 SMBus: nVidia Corporation nForce3 SMBus (rev a4)
	Subsystem: ASUSTeK Computer Inc.: Unknown device 80c5
	Flags: 66Mhz, fast devsel
	I/O ports at 5000 [size=64]
	I/O ports at 5040 [size=64]
	Capabilities: <available only to root>

0000:00:02.0 USB Controller: nVidia Corporation nForce3 USB 1.1 (rev a5) (prog-if 10 [OHCI])
	Subsystem: ASUSTeK Computer Inc.: Unknown device 1858
	Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 11
	Memory at febfb000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: <available only to root>

0000:00:02.1 USB Controller: nVidia Corporation nForce3 USB 1.1 (rev a5) (prog-if 10 [OHCI])
	Subsystem: ASUSTeK Computer Inc.: Unknown device 1858
	Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 11
	Memory at febfc000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: <available only to root>

0000:00:02.2 USB Controller: nVidia Corporation nForce3 USB 2.0 (rev a2) (prog-if 20 [EHCI])
	Subsystem: ASUSTeK Computer Inc.: Unknown device 1859
	Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 5
	Memory at febfdc00 (32-bit, non-prefetchable) [size=256]
	Capabilities: <available only to root>

0000:00:06.0 Multimedia audio controller: nVidia Corporation nForce3 Audio (rev a2)
	Subsystem: ASUSTeK Computer Inc.: Unknown device 1853
	Flags: bus master, 66Mhz, fast devsel, latency 0, IRQ 10
	I/O ports at e800 [size=256]
	I/O ports at ec00 [size=128]
	Memory at febff000 (32-bit, non-prefetchable) [size=4K]
	Capabilities: <available only to root>

0000:00:08.0 IDE interface: nVidia Corporation nForce3 IDE (rev a5) (prog-if 8a [Master SecP PriP])
	Subsystem: ASUSTeK Computer Inc.: Unknown device 185a
	Flags: bus master, 66Mhz, fast devsel, latency 0
	I/O ports at ffa0 [size=16]
	Capabilities: <available only to root>

0000:00:0a.0 PCI bridge: nVidia Corporation nForce3 PCI Bridge (rev a2) (prog-if 00 [Normal decode])
	Flags: bus master, 66Mhz, fast devsel, latency 0
	Bus: primary=00, secondary=02, subordinate=04, sec-latency=128
	I/O behind bridge: 0000b000-0000dfff
	Memory behind bridge: f8a00000-feafffff
	Prefetchable memory behind bridge: 40000000-43ffffff

0000:00:0b.0 PCI bridge: nVidia Corporation nForce3 AGP Bridge (rev a4) (prog-if 00 [Normal decode])
	Flags: bus master, 66Mhz, medium devsel, latency 64
	Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
	Memory behind bridge: f6900000-f89fffff
	Prefetchable memory behind bridge: c6800000-e67fffff

0000:00:18.0 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] HyperTransport Technology Configuration
	Flags: fast devsel
	Capabilities: <available only to root>

0000:00:18.1 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Address Map
	Flags: fast devsel

0000:00:18.2 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] DRAM Controller
	Flags: fast devsel

0000:00:18.3 Host bridge: Advanced Micro Devices [AMD] K8 [Athlon64/Opteron] Miscellaneous Control
	Flags: fast devsel

0000:01:00.0 VGA compatible controller: nVidia Corporation NV31M [GeForce FX Go5650] (rev a1) (prog-if 00 [VGA])
	Subsystem: ASUSTeK Computer Inc.: Unknown device 1852
	Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 11
	Memory at f7000000 (32-bit, non-prefetchable) [size=16M]
	Memory at d0000000 (32-bit, prefetchable) [size=256M]
	Expansion ROM at f89e0000 [disabled] [size=128K]
	Capabilities: <available only to root>

0000:02:00.0 Ethernet controller: Marvell Technology Group Ltd. Gigabit Ethernet Controller (rev 13)
	Subsystem: ASUSTeK Computer Inc. Marvell 88E8001 Gigabit Ethernet Controller (Asus)
	Flags: bus master, 66Mhz, medium devsel, latency 64, IRQ 10
	Memory at feaf8000 (32-bit, non-prefetchable) [size=16K]
	I/O ports at d800 [size=256]
	Expansion ROM at feac0000 [disabled] [size=128K]
	Capabilities: <available only to root>

0000:02:01.0 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev ab)
	Subsystem: ASUSTeK Computer Inc.: Unknown device 1854
	Flags: bus master, medium devsel, latency 168, IRQ 11
	Memory at fd200000 (32-bit, non-prefetchable) [size=4K]
	Bus: primary=02, secondary=03, subordinate=06, sec-latency=176
	Memory window 0: 40000000-41fff000 (prefetchable)
	Memory window 1: fc600000-fd1ff000
	I/O window 0: 0000b000-0000b0ff
	I/O window 1: 0000b400-0000b4ff
	16-bit legacy interface ports at 0001

0000:02:01.1 CardBus bridge: Ricoh Co Ltd RL5c476 II (rev ab)
	Subsystem: ASUSTeK Computer Inc.: Unknown device 1854
	Flags: bus master, medium devsel, latency 168, IRQ 11
	Memory at fa200000 (32-bit, non-prefetchable) [size=4K]
	Bus: primary=02, secondary=07, subordinate=0a, sec-latency=176
	Memory window 0: 42000000-43fff000 (prefetchable)
	Memory window 1: f9600000-fa1ff000
	I/O window 0: 0000b800-0000b8ff
	I/O window 1: 0000bc00-0000bcff
	16-bit legacy interface ports at 0001

0000:02:01.2 FireWire (IEEE 1394): Ricoh Co Ltd R5C552 IEEE 1394 Controller (rev 03) (prog-if 10 [OHCI])
	Subsystem: ASUSTeK Computer Inc.: Unknown device 1857
	Flags: bus master, medium devsel, latency 64, IRQ 10
	Memory at feafd000 (32-bit, non-prefetchable) [size=2K]
	Capabilities: <available only to root>

0000:02:01.3 System peripheral: Ricoh Co Ltd: Unknown device 0576 (rev 01)
	Subsystem: ASUSTeK Computer Inc.: Unknown device 185b
	Flags: medium devsel, IRQ 11
	Memory at feafd800 (32-bit, non-prefetchable) [size=256]
	Capabilities: <available only to root>

0000:02:01.4 System peripheral: Ricoh Co Ltd: Unknown device 0592
	Subsystem: ASUSTeK Computer Inc.: Unknown device 185c
	Flags: medium devsel, IRQ 11
	Memory at feafdc00 (32-bit, non-prefetchable) [size=256]
	Capabilities: <available only to root>

0000:02:02.0 Network controller: Broadcom Corporation BCM4306 802.11b/g Wireless LAN Controller (rev 03)
	Subsystem: ASUSTeK Computer Inc.: Unknown device 120f
	Flags: bus master, fast devsel, latency 64, IRQ 11
	Memory at feafe000 (32-bit, non-prefetchable) [size=8K]
	Capabilities: <available only to root>


^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-24  0:22 ` Rafael J. Wysocki
@ 2005-11-24  1:23   ` Benjamin Herrenschmidt
  2005-11-24 20:50     ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2005-11-24  1:23 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, Greg KH, linux-kernel, David Brownell,
	linuxppc-dev list, Alan Stern


> Unfortunately with this patch the EHCI controller in my box (Asus L5D,
> x86-64 kernel) does not resume from suspend.  Appended is the relevant
> snippet from the serial console log (EHCI is the only device using IRQ #5).

Hrm... let me see... You are getting an interrupt for EHCI after it has
been resumed, so it should work.

 /me double-checks the patch

> ehci_hcd 0000:00:02.2: lost power, restarting

Hrm... I can't find that line in the code...

 /me rechecks with david's other patches

Ah ... I see it. There might have been some screwup between david's
patch and mine.

Make sure that 

       set_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags);

Is still done before anything else in ehci_pci_resume(). It may be worth
following it with a memory barrier actually... just in case (due to the
absence of locks in that area).

Ben.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-23  3:08 [PATCH] Fix USB suspend/resume crasher Benjamin Herrenschmidt
                   ` (3 preceding siblings ...)
  2005-11-24  0:22 ` Rafael J. Wysocki
@ 2005-11-24 16:52 ` Arkadiusz Miskiewicz
  4 siblings, 0 replies; 13+ messages in thread
From: Arkadiusz Miskiewicz @ 2005-11-24 16:52 UTC (permalink / raw)
  To: linuxppc-dev

On Wednesday 23 November 2005 04:08, Benjamin Herrenschmidt wrote:
> This is my latest patch against current linus -git, it closes the IRQ
> race and makes various other OHCI & EHCI code path safer vs.
> suspend/resume. I've been able to (finally !) successfully suspend and
> resume various Mac models, with or without USB mouse plugged, or
> plugging while asleep, or unplugging while asleep etc... all without a
> crash. There are still some races here or there in the USB code, but at
> least the main cause of crash is now fixes by this patch (access to a
> controller that has been suspended, due to either shared interrupts or
> other code path).

btw. what should happen if there are drivers which do not support=20
suspend/resume?

I'm using zd1211 wifi usb driver (which isn't state of art) from=20
http://zd1211.ath.cx/repos/trunk rev 39 and while trying to suspend on my=20
ibook g4 dec 2004 I'm getting this:
http://www.t17.ds.pwr.wroc.pl/~misiek/rozne/failed-sleep.jpg

After that the only thing I can do is turn the thing off and turn it back o=
n -=20
it just hangs. It would be nice if it just refuse to suspend or got things=
=20
back into sane state.

I'm using 2.6.15rc2git3 + both patches posted in this thread.

ps.
My usual suspend is:

/sbin/rmmod therm_adt746x
/sbin/rmmod zd1211
/sbin/rmmod usbmouse
/sbin/rmmod usbhid
/sbin/rmmod zd1211
/sbin/rmmod ehci-hcd
/sbin/rmmod ohci-hcd
/usr/sbin/snooze

(I guess that rmmod zd1211 should be enough now).

=2D-=20
Arkadiusz Mi=B6kiewicz                    PLD/Linux Team
http://www.t17.ds.pwr.wroc.pl/~misiek/  http://ftp.pld-linux.org/

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-24  1:23   ` Benjamin Herrenschmidt
@ 2005-11-24 20:50     ` Rafael J. Wysocki
  2005-11-24 21:01       ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2005-11-24 20:50 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Andrew Morton, Greg KH, linux-kernel, David Brownell,
	linuxppc-dev list, Alan Stern

On Thursday, 24 of November 2005 02:23, Benjamin Herrenschmidt wrote:
> 
> > Unfortunately with this patch the EHCI controller in my box (Asus L5D,
> > x86-64 kernel) does not resume from suspend.  Appended is the relevant
> > snippet from the serial console log (EHCI is the only device using IRQ #5).
> 
> Hrm... let me see... You are getting an interrupt for EHCI after it has
> been resumed, so it should work.
> 
>  /me double-checks the patch
> 
> > ehci_hcd 0000:00:02.2: lost power, restarting
> 
> Hrm... I can't find that line in the code...
> 
>  /me rechecks with david's other patches
> 
> Ah ... I see it. There might have been some screwup between david's
> patch and mine.
> 
> Make sure that 
> 
>        set_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags);
> 
> Is still done before anything else in ehci_pci_resume().

Well, it's there (actually the problem occurs in vanilla 2.6.15-rc2-mm1 that
contains the patch).  Do you mean it should go before the

if (readl(&ehci->regs->configured_flag) != FLAG_CF)
		goto restart;

thing?

> It may be worth following it with a memory barrier actually... just in case
> (due to the absence of locks in that area).

wmb()?

Rafael

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-24 20:50     ` Rafael J. Wysocki
@ 2005-11-24 21:01       ` Benjamin Herrenschmidt
  2005-11-24 21:14         ` Rafael J. Wysocki
  0 siblings, 1 reply; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2005-11-24 21:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, Greg KH, linux-kernel, David Brownell,
	linuxppc-dev list, Alan Stern


> 
> Well, it's there (actually the problem occurs in vanilla 2.6.15-rc2-mm1 that
> contains the patch).  Do you mean it should go before the
> 
> if (readl(&ehci->regs->configured_flag) != FLAG_CF)
> 		goto restart;
> 
> thing?

Yes.

> > It may be worth following it with a memory barrier actually... just in case
> > (due to the absence of locks in that area).
> 
> wmb()?

Yup.

I wrote that patch against a tree that had different things in that
function, Greg merged it by hand but he got that little bit wrong
unfortunately. I'll send a new patch later today.

Ben.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-24 21:01       ` Benjamin Herrenschmidt
@ 2005-11-24 21:14         ` Rafael J. Wysocki
  2005-11-24 21:22           ` Benjamin Herrenschmidt
  0 siblings, 1 reply; 13+ messages in thread
From: Rafael J. Wysocki @ 2005-11-24 21:14 UTC (permalink / raw)
  To: Benjamin Herrenschmidt
  Cc: Andrew Morton, Greg KH, linux-kernel, David Brownell,
	linuxppc-dev list, Alan Stern

On Thursday, 24 of November 2005 22:01, Benjamin Herrenschmidt wrote:
> 
> > 
> > Well, it's there (actually the problem occurs in vanilla 2.6.15-rc2-mm1 that
> > contains the patch).  Do you mean it should go before the
> > 
> > if (readl(&ehci->regs->configured_flag) != FLAG_CF)
> > 		goto restart;
> > 
> > thing?
> 
> Yes.
> 
> > > It may be worth following it with a memory barrier actually... just in case
> > > (due to the absence of locks in that area).
> > 
> > wmb()?
> 
> Yup.
> 
> I wrote that patch against a tree that had different things in that
> function, Greg merged it by hand but he got that little bit wrong
> unfortunately. I'll send a new patch later today.

Thanks.

FWIW, does the appended change look reasonable to you?  (It apparently
helps. ;-))

Rafael


Index: linux-2.6.15-rc2-mm1/drivers/usb/host/ehci-pci.c
===================================================================
--- linux-2.6.15-rc2-mm1.orig/drivers/usb/host/ehci-pci.c	2005-11-24 21:42:34.000000000 +0100
+++ linux-2.6.15-rc2-mm1/drivers/usb/host/ehci-pci.c	2005-11-24 21:50:38.000000000 +0100
@@ -281,12 +281,13 @@
 	if (time_before(jiffies, ehci->next_statechange))
 		msleep(100);
 
+	set_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags);
+	wmb();
+
 	/* If CF is clear, we lost PCI Vaux power and need to restart.  */
 	if (readl(&ehci->regs->configured_flag) != FLAG_CF)
 		goto restart;
 
-	set_bit(HCD_FLAG_HW_ACCESSIBLE, &hcd->flags);
-
 	/* If any port is suspended (or owned by the companion),
 	 * we know we can/must resume the HC (and mustn't reset it).
 	 * We just defer that to the root hub code.

^ permalink raw reply	[flat|nested] 13+ messages in thread

* Re: [PATCH] Fix USB suspend/resume crasher
  2005-11-24 21:14         ` Rafael J. Wysocki
@ 2005-11-24 21:22           ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 13+ messages in thread
From: Benjamin Herrenschmidt @ 2005-11-24 21:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Andrew Morton, Greg KH, linux-kernel, David Brownell,
	linuxppc-dev list, Alan Stern


> 
> FWIW, does the appended change look reasonable to you?  (It apparently
> helps. ;-))

Yes. I was about to do a new patch after I finish my breakfast, but
yours applied on top of Greg's merged one works too.

Ben.

^ permalink raw reply	[flat|nested] 13+ messages in thread

end of thread, other threads:[~2005-11-24 21:26 UTC | newest]

Thread overview: 13+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2005-11-23  3:08 [PATCH] Fix USB suspend/resume crasher Benjamin Herrenschmidt
2005-11-23  3:09 ` Benjamin Herrenschmidt
2005-11-23  3:14 ` Nigel Cunningham
2005-11-23  5:13   ` Benjamin Herrenschmidt
2005-11-23 17:10 ` Greg KH
2005-11-23 18:58   ` David Brownell
2005-11-24  0:22 ` Rafael J. Wysocki
2005-11-24  1:23   ` Benjamin Herrenschmidt
2005-11-24 20:50     ` Rafael J. Wysocki
2005-11-24 21:01       ` Benjamin Herrenschmidt
2005-11-24 21:14         ` Rafael J. Wysocki
2005-11-24 21:22           ` Benjamin Herrenschmidt
2005-11-24 16:52 ` Arkadiusz Miskiewicz

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).