* [PATCH v2] r8152: fix lockup when runtime PM is enabled
@ 2015-12-08 11:17 Peter Wu
2015-12-08 12:39 ` Hayes Wang
2015-12-09 3:48 ` David Miller
0 siblings, 2 replies; 17+ messages in thread
From: Peter Wu @ 2015-12-08 11:17 UTC (permalink / raw)
To: Hayes Wang, David S . Miller, linux-usb, netdev, linux-kernel; +Cc: Lu Baolu
When an interface is brought up which was previously suspended (via
runtime PM), it would hang. This happens because napi_disable is called
before napi_enable.
Solve this by avoiding napi_enable in the resume during open function
(netif_running is true when open is called, IFF_UP is set after a
successful open; netif_running is false when close is called, but IFF_UP
is then still set).
While at it, remove WORK_ENABLE check from rtl8152_open (introduced with
the original change) because it cannot happen:
- After this patch, runtime resume will not set it during rtl8152_open.
- When link is up, rtl8152_open is not called.
- When link is down during system/auto suspend/resume, it is not set.
Fixes: 41cec84cf285 ("r8152: don't enable napi before rx ready")
Link: https://lkml.kernel.org/r/20151205105912.GA1766@al
Signed-off-by: Peter Wu <peter@lekensteyn.nl>
---
v2: moved rtl_runtime_suspend_enable from close to rtl8152_suspend
Tested with the scenario from
https://lkml.kernel.org/r/20151208111008.GA18728@al
(added debug prints to verify the suspend/resume moments)
---
drivers/net/usb/r8152.c | 21 +++------------------
1 file changed, 3 insertions(+), 18 deletions(-)
diff --git a/drivers/net/usb/r8152.c b/drivers/net/usb/r8152.c
index d9427ca..2e32c41 100644
--- a/drivers/net/usb/r8152.c
+++ b/drivers/net/usb/r8152.c
@@ -3067,17 +3067,6 @@ static int rtl8152_open(struct net_device *netdev)
mutex_lock(&tp->control);
- /* The WORK_ENABLE may be set when autoresume occurs */
- if (test_bit(WORK_ENABLE, &tp->flags)) {
- clear_bit(WORK_ENABLE, &tp->flags);
- usb_kill_urb(tp->intr_urb);
- cancel_delayed_work_sync(&tp->schedule);
-
- /* disable the tx/rx, if the workqueue has enabled them. */
- if (netif_carrier_ok(netdev))
- tp->rtl_ops.disable(tp);
- }
-
tp->rtl_ops.up(tp);
rtl8152_set_speed(tp, AUTONEG_ENABLE,
@@ -3124,12 +3113,6 @@ static int rtl8152_close(struct net_device *netdev)
} else {
mutex_lock(&tp->control);
- /* The autosuspend may have been enabled and wouldn't
- * be disable when autoresume occurs, because the
- * netif_running() would be false.
- */
- rtl_runtime_suspend_enable(tp, false);
-
tp->rtl_ops.down(tp);
mutex_unlock(&tp->control);
@@ -3512,7 +3495,7 @@ static int rtl8152_resume(struct usb_interface *intf)
netif_device_attach(tp->netdev);
}
- if (netif_running(tp->netdev)) {
+ if (netif_running(tp->netdev) && tp->netdev->flags & IFF_UP) {
if (test_bit(SELECTIVE_SUSPEND, &tp->flags)) {
rtl_runtime_suspend_enable(tp, false);
clear_bit(SELECTIVE_SUSPEND, &tp->flags);
@@ -3532,6 +3515,8 @@ static int rtl8152_resume(struct usb_interface *intf)
}
usb_submit_urb(tp->intr_urb, GFP_KERNEL);
} else if (test_bit(SELECTIVE_SUSPEND, &tp->flags)) {
+ if (tp->netdev->flags & IFF_UP)
+ rtl_runtime_suspend_enable(tp, false);
clear_bit(SELECTIVE_SUSPEND, &tp->flags);
}
--
2.6.3
^ permalink raw reply related [flat|nested] 17+ messages in thread* RE: [PATCH v2] r8152: fix lockup when runtime PM is enabled
2015-12-08 11:17 [PATCH v2] r8152: fix lockup when runtime PM is enabled Peter Wu
@ 2015-12-08 12:39 ` Hayes Wang
2015-12-08 14:33 ` Peter Wu
2015-12-09 3:48 ` David Miller
1 sibling, 1 reply; 17+ messages in thread
From: Hayes Wang @ 2015-12-08 12:39 UTC (permalink / raw)
To: Peter Wu, David S . Miller, linux-usb@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: Lu Baolu
Peter Wu [mailto:peter@lekensteyn.nl]
> Sent: Tuesday, December 08, 2015 7:18 PM
>
> When an interface is brought up which was previously suspended (via
> runtime PM), it would hang. This happens because napi_disable is called
> before napi_enable.
>
> Solve this by avoiding napi_enable in the resume during open function
> (netif_running is true when open is called, IFF_UP is set after a
> successful open; netif_running is false when close is called, but IFF_UP
> is then still set).
>
> While at it, remove WORK_ENABLE check from rtl8152_open (introduced with
> the original change) because it cannot happen:
>
> - After this patch, runtime resume will not set it during rtl8152_open.
> - When link is up, rtl8152_open is not called.
> - When link is down during system/auto suspend/resume, it is not set.
>
> Fixes: 41cec84cf285 ("r8152: don't enable napi before rx ready")
> Link: https://lkml.kernel.org/r/20151205105912.GA1766@al
> Signed-off-by: Peter Wu <peter@lekensteyn.nl>
> ---
> v2: moved rtl_runtime_suspend_enable from close to rtl8152_suspend
Acked-by: Hayes Wang <hayeswang@realtek.com>
Best Regards,
Hayes
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH v2] r8152: fix lockup when runtime PM is enabled
2015-12-08 12:39 ` Hayes Wang
@ 2015-12-08 14:33 ` Peter Wu
2015-12-15 11:32 ` Oliver Neukum
2015-12-22 9:48 ` Hayes Wang
0 siblings, 2 replies; 17+ messages in thread
From: Peter Wu @ 2015-12-08 14:33 UTC (permalink / raw)
To: Hayes Wang
Cc: David S . Miller, linux-usb@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Lu Baolu
On Tue, Dec 08, 2015 at 12:39:12PM +0000, Hayes Wang wrote:
> Peter Wu [mailto:peter@lekensteyn.nl]
> > Sent: Tuesday, December 08, 2015 7:18 PM
> >
> > When an interface is brought up which was previously suspended (via
> > runtime PM), it would hang. This happens because napi_disable is called
> > before napi_enable.
> >
> > Solve this by avoiding napi_enable in the resume during open function
> > (netif_running is true when open is called, IFF_UP is set after a
> > successful open; netif_running is false when close is called, but IFF_UP
> > is then still set).
> >
> > While at it, remove WORK_ENABLE check from rtl8152_open (introduced with
> > the original change) because it cannot happen:
> >
> > - After this patch, runtime resume will not set it during rtl8152_open.
> > - When link is up, rtl8152_open is not called.
> > - When link is down during system/auto suspend/resume, it is not set.
> >
> > Fixes: 41cec84cf285 ("r8152: don't enable napi before rx ready")
> > Link: https://lkml.kernel.org/r/20151205105912.GA1766@al
> > Signed-off-by: Peter Wu <peter@lekensteyn.nl>
> > ---
> > v2: moved rtl_runtime_suspend_enable from close to rtl8152_suspend
>
> Acked-by: Hayes Wang <hayeswang@realtek.com>
>
> Best Regards,
> Hayes
I found another problem with runtime PM. When a device is suspended via
autosuspend and a system suspend takes place, there is no network I/O
after resume. Triggering a renegotiation (ethtool -r eth1) brings back
network activity.
Can you have a look? I guess that reset_resume needs different
treatment, but don't know how to do it properly as suspend is not called
before system reset (because the device is apparently already in
suspended state).
I think that this new issue can be addressed in a different patch, this
patch solves a more severe lockup. Opinions?
--
Kind regards,
Peter Wu
https://lekensteyn.nl
^ permalink raw reply [flat|nested] 17+ messages in thread* Re: [PATCH v2] r8152: fix lockup when runtime PM is enabled
2015-12-08 14:33 ` Peter Wu
@ 2015-12-15 11:32 ` Oliver Neukum
2015-12-22 9:48 ` Hayes Wang
1 sibling, 0 replies; 17+ messages in thread
From: Oliver Neukum @ 2015-12-15 11:32 UTC (permalink / raw)
To: Peter Wu
Cc: Hayes Wang, David S . Miller, linux-usb@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Lu Baolu
On Tue, 2015-12-08 at 15:33 +0100, Peter Wu wrote:
> Can you have a look? I guess that reset_resume needs different
> treatment, but don't know how to do it properly as suspend is not
> called
> before system reset (because the device is apparently already in
> suspended state).
>
> I think that this new issue can be addressed in a different patch,
> this
> patch solves a more severe lockup. Opinions?
We do not know in advance whether resume() or reset_resume()
will be called after S3. A driver must be ready for both.
Regards
Oliver
^ permalink raw reply [flat|nested] 17+ messages in thread
* RE: [PATCH v2] r8152: fix lockup when runtime PM is enabled
2015-12-08 14:33 ` Peter Wu
2015-12-15 11:32 ` Oliver Neukum
@ 2015-12-22 9:48 ` Hayes Wang
2015-12-22 11:01 ` Oliver Neukum
1 sibling, 1 reply; 17+ messages in thread
From: Hayes Wang @ 2015-12-22 9:48 UTC (permalink / raw)
To: Peter Wu
Cc: David S . Miller, linux-usb@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Lu Baolu
Peter Wu [mailto:peter@lekensteyn.nl]
> Sent: Tuesday, December 08, 2015 10:33 PM
[...]
> I found another problem with runtime PM. When a device is suspended via
> autosuspend and a system suspend takes place, there is no network I/O
> after resume. Triggering a renegotiation (ethtool -r eth1) brings back
> network activity.
I think it is relative to the firmware. Could you try the driver from Realtek website?
Best Regards,
Hayes
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] r8152: fix lockup when runtime PM is enabled
2015-12-22 9:48 ` Hayes Wang
@ 2015-12-22 11:01 ` Oliver Neukum
2015-12-23 3:31 ` Hayes Wang
0 siblings, 1 reply; 17+ messages in thread
From: Oliver Neukum @ 2015-12-22 11:01 UTC (permalink / raw)
To: Hayes Wang
Cc: Peter Wu, David S . Miller, linux-usb@vger.kernel.org,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org, Lu Baolu
On Tue, 2015-12-22 at 09:48 +0000, Hayes Wang wrote:
> Peter Wu [mailto:peter@lekensteyn.nl]
> > Sent: Tuesday, December 08, 2015 10:33 PM
> [...]
> > I found another problem with runtime PM. When a device is suspended via
> > autosuspend and a system suspend takes place, there is no network I/O
> > after resume. Triggering a renegotiation (ethtool -r eth1) brings back
> > network activity.
>
> I think it is relative to the firmware. Could you try the driver from Realtek website?
Hi,
at the risk of repeating myself I must say that there is a logic flaw
in the driver. If you look at this code:
static int rtl8152_resume(struct usb_interface *intf)
{
struct r8152 *tp = usb_get_intfdata(intf);
mutex_lock(&tp->control);
if (!test_bit(SELECTIVE_SUSPEND, &tp->flags)) {
tp->rtl_ops.init(tp);
netif_device_attach(tp->netdev);
}
if (netif_running(tp->netdev) && tp->netdev->flags & IFF_UP) {
if (test_bit(SELECTIVE_SUSPEND, &tp->flags)) {
rtl_runtime_suspend_enable(tp, false);
clear_bit(SELECTIVE_SUSPEND, &tp->flags);
napi_disable(&tp->napi);
set_bit(WORK_ENABLE, &tp->flags);
if (netif_carrier_ok(tp->netdev))
rtl_start_rx(tp);
napi_enable(&tp->napi);
} else {
tp->rtl_ops.up(tp);
rtl8152_set_speed(tp, AUTONEG_ENABLE,
tp->mii.supports_gmii ?
SPEED_1000 : SPEED_100,
DUPLEX_FULL);
netif_carrier_off(tp->netdev);
set_bit(WORK_ENABLE, &tp->flags);
}
You need to understand that its use of the flag SELECTIVE_SUSPEND
is invalid. SELECTIVE_SUSPEND is used at two places in the driver.
Once in rtl8152_start_xmit(), where it is working but misnamed.
At that time you need to know whether the device is suspended.
To the device it does not matter whether the suspension is selective
or for the whole bus. It cannot tell. The driver just needs to know
whether it should resume the device if a packet to be transmitted
through it is given to the driver. So far all is well.
But at the time rtl8152_resume() is called the flag has become
meaningless. It tells you whether the device was selectively suspended
(we call that autosuspended, but the concept is the same), but you
take it to mean that it was _only_ selectively suspended. It does not
tell you that.
That matters a lot because the behavior of the host regarding powering
the bus during S3 and S4 is not defined. As I mentioned the device
can't tell whether it is selectively suspended. But it does notice
if its power supply is cut. In that case the driver must reinitialize
the device.
The current code does that conditional on
!test_bit(SELECTIVE_SUSPEND ... )
That is wrong because you need to do this if power was lost. The test
only tells you whether the device was selectively suspend before
power was lost (if power was lost). That is not the same thing at all.
The way the USB subsystem is designed is that it tells you whether power
had been cut by calling reset_resume() if power was cut or resume() if
power was kept.
If reset_resume() is called you must always execute this code:
tp->rtl_ops.init(tp);
and
tp->rtl_ops.up(tp);
rtl8152_set_speed(tp, AUTONEG_ENABLE,
tp->mii.supports_gmii ?
SPEED_1000 : SPEED_100,
DUPLEX_FULL);
The conditions used in rtl8152_resume() are wrong.
That is the reason "ethtool -r eth1" is reported to restore the device.
It triggers equivalent operations.
It is clear to me that you cannot get away with using the same operation
for resume() and reset_resume() in your driver. It is fundamentally
impossible. Firmware cannot fix it.
Sorry for the length of the explanation.
HTH
Oliver
^ permalink raw reply [flat|nested] 17+ messages in thread* RE: [PATCH v2] r8152: fix lockup when runtime PM is enabled
2015-12-22 11:01 ` Oliver Neukum
@ 2015-12-23 3:31 ` Hayes Wang
2015-12-23 8:20 ` Oliver Neukum
0 siblings, 1 reply; 17+ messages in thread
From: Hayes Wang @ 2015-12-23 3:31 UTC (permalink / raw)
To: Oliver Neukum
Cc: linux-usb@vger.kernel.org, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org
Oliver Neukum [mailto:oneukum@suse.com]
[...]
> It is clear to me that you cannot get away with using the same operation
> for resume() and reset_resume() in your driver. It is fundamentally
> impossible. Firmware cannot fix it.
I would think how to fix it.
> Sorry for the length of the explanation.
Thanks for your response. I have some questions. What are the flows when
the system resume follows a system suspend which follows a autosuspend?
Are they as following?
1. suspend() with PMSG_IS_AUTO for autosuspned.
2. suspend() for system suspend.
3. resume() for system resume.
And, should the device exist autosuspend before (2)?
Best Regards,
Hayes
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] r8152: fix lockup when runtime PM is enabled
2015-12-23 3:31 ` Hayes Wang
@ 2015-12-23 8:20 ` Oliver Neukum
[not found] ` <1450858813.6437.2.camel-l3A5Bk7waGM@public.gmane.org>
0 siblings, 1 reply; 17+ messages in thread
From: Oliver Neukum @ 2015-12-23 8:20 UTC (permalink / raw)
To: Hayes Wang
Cc: linux-kernel@vger.kernel.org, linux-usb@vger.kernel.org,
netdev@vger.kernel.org
On Wed, 2015-12-23 at 03:31 +0000, Hayes Wang wrote:
> Oliver Neukum [mailto:oneukum@suse.com]
> [...]
> > It is clear to me that you cannot get away with using the same operation
> > for resume() and reset_resume() in your driver. It is fundamentally
> > impossible. Firmware cannot fix it.
>
> I would think how to fix it.
>
> > Sorry for the length of the explanation.
>
> Thanks for your response. I have some questions. What are the flows when
> the system resume follows a system suspend which follows a autosuspend?
> Are they as following?
>
> 1. suspend() with PMSG_IS_AUTO for autosuspned.
> 2. suspend() for system suspend.
> 3. resume() for system resume.
>
> And, should the device exist autosuspend before (2)?
No, step (2) does not exist. Calls to suspend() and [reset_]resume()
always balance. Usually a driver shouldn't care about system suspend.
The way the driver is currently coded will also fail for Port-Power Off.
Regards
Oliver
^ permalink raw reply [flat|nested] 17+ messages in thread
* Re: [PATCH v2] r8152: fix lockup when runtime PM is enabled
2015-12-08 11:17 [PATCH v2] r8152: fix lockup when runtime PM is enabled Peter Wu
2015-12-08 12:39 ` Hayes Wang
@ 2015-12-09 3:48 ` David Miller
1 sibling, 0 replies; 17+ messages in thread
From: David Miller @ 2015-12-09 3:48 UTC (permalink / raw)
To: peter; +Cc: hayeswang, linux-usb, netdev, linux-kernel, baolu.lu
From: Peter Wu <peter@lekensteyn.nl>
Date: Tue, 8 Dec 2015 12:17:42 +0100
> When an interface is brought up which was previously suspended (via
> runtime PM), it would hang. This happens because napi_disable is called
> before napi_enable.
>
> Solve this by avoiding napi_enable in the resume during open function
> (netif_running is true when open is called, IFF_UP is set after a
> successful open; netif_running is false when close is called, but IFF_UP
> is then still set).
>
> While at it, remove WORK_ENABLE check from rtl8152_open (introduced with
> the original change) because it cannot happen:
>
> - After this patch, runtime resume will not set it during rtl8152_open.
> - When link is up, rtl8152_open is not called.
> - When link is down during system/auto suspend/resume, it is not set.
>
> Fixes: 41cec84cf285 ("r8152: don't enable napi before rx ready")
> Link: https://lkml.kernel.org/r/20151205105912.GA1766@al
> Signed-off-by: Peter Wu <peter@lekensteyn.nl>
> ---
> v2: moved rtl_runtime_suspend_enable from close to rtl8152_suspend
Applied, and queued up for -stable, thanks.
^ permalink raw reply [flat|nested] 17+ messages in thread
end of thread, other threads:[~2015-12-24 16:08 UTC | newest]
Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-12-08 11:17 [PATCH v2] r8152: fix lockup when runtime PM is enabled Peter Wu
2015-12-08 12:39 ` Hayes Wang
2015-12-08 14:33 ` Peter Wu
2015-12-15 11:32 ` Oliver Neukum
2015-12-22 9:48 ` Hayes Wang
2015-12-22 11:01 ` Oliver Neukum
2015-12-23 3:31 ` Hayes Wang
2015-12-23 8:20 ` Oliver Neukum
[not found] ` <1450858813.6437.2.camel-l3A5Bk7waGM@public.gmane.org>
2015-12-23 9:20 ` Hayes Wang
2015-12-23 10:45 ` Oliver Neukum
2015-12-23 11:15 ` Hayes Wang
2015-12-24 1:32 ` Alan Stern
2015-12-24 7:14 ` Oliver Neukum
2015-12-24 15:14 ` Alan Stern
2015-12-24 15:47 ` Oliver Neukum
2015-12-24 16:08 ` Alan Stern
2015-12-09 3:48 ` David Miller
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).