* open sockets preventing unregister_netdevice from completing in linux-next (next-20120724)
@ 2012-07-25 13:45 Bjørn Mork
2012-07-25 14:38 ` Eric Dumazet
0 siblings, 1 reply; 5+ messages in thread
From: Bjørn Mork @ 2012-07-25 13:45 UTC (permalink / raw)
To: netdev
I am currently researching several power management regressions in
linux-next as of next-20120724, spread over the PCI, USB and net
subsystems. I believe this one belongs to the net subsystem, although I
may well be wrong, since I could be mixing these together.
My test case is:
- open an ssh connection over a USB network device (qmi_wwan - which is why
I am looking at this, but I really don't think it's the driver this time)
- suspend laptop with netdev and ssh connection up
- attempt to resume
The USB device will be gone on resume because it is power cycled, so the
drivers need to clean up, to let the device be rediscovered and bound
again. But this does not happen anymore in linux-next as long as some
socket is open. Instead we have these messages:
Jul 25 15:13:11 nemi kernel: [ 7704.560306] unregister_netdevice: waiting for wwan0 to become free. Usage count = 1
Jul 25 15:13:21 nemi kernel: [ 7714.800308] unregister_netdevice: waiting for wwan0 to become free. Usage count = 1
Jul 25 15:13:31 nemi kernel: [ 7725.040316] unregister_netdevice: waiting for wwan0 to become free. Usage count = 1
There are quite a few problems with the system in this state. Any write
to the power/control associated with that USB device will hang,
presumably because the USB device does not exist anymore. This will
also make new attempts to suspend fail. And the USB device is of course
not functional. The driver has not yet had a chance to clean up any of
the other devices associated with the dead USB device (wwan1,
/dev/cdc-wdm0 and /dev/cdc-wdm1), so these ghost devices will appear as
non-functional. And new devices cannot be registered until the
previous USB device is deleted and a new one created.
Killing the ssh session lets unregister_netdevice continue, and
everything is then cleaned up and goes back to normal.
This is a regression compared to 3.5, where unregister_netdevice would
succeed regardless of any open sockets. Or maybe the sockets were
auto-reaped? I don't know the inner details - just observing the
results.
The test case above is a quite normal mode of operation for me. I often
leave open sessions while suspending (because I intend to continue using
them after resuming). And I always forget that this won't work for the
USB modem case. I don't really care either. I expect the netdev to be
removed, routes deleted and any sockets referencing either should just
die or live on as zombies or whatever. The important part is that they
should not prevent deletion of a netdev when e.g. the physical device is
gone. That's the way things used to work.
Bjørn
* Re: open sockets preventing unregister_netdevice from completing in linux-next (next-20120724)
2012-07-25 13:45 open sockets preventing unregister_netdevice from completing in linux-next (next-20120724) Bjørn Mork
@ 2012-07-25 14:38 ` Eric Dumazet
2012-07-25 22:17 ` David Miller
0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2012-07-25 14:38 UTC (permalink / raw)
To: Bjørn Mork; +Cc: netdev
On Wed, 2012-07-25 at 15:45 +0200, Bjørn Mork wrote:
> I am currently researching several power management regressions in
> linux-next as of next-20120724, spread over the PCI, USB and net
> subsystems. I believe this one belongs to the net subsystem, although I
> may well be wrong, since I could be mixing these together.
>
> My test case is:
>
> - open an ssh connection over a USB network device (qmi_wwan - which is why
> I am looking at this, but I really don't think it's the driver this time)
> - suspend laptop with netdev and ssh connection up
> - attempt to resume
>
> The USB device will be gone on resume because it is power cycled, so the
> drivers need to clean up, to let the device be rediscovered and bound
> again. But this does not happen anymore in linux-next as long as some
> socket is open. Instead we have these messages:
>
> Jul 25 15:13:11 nemi kernel: [ 7704.560306] unregister_netdevice: waiting for wwan0 to become free. Usage count = 1
> Jul 25 15:13:21 nemi kernel: [ 7714.800308] unregister_netdevice: waiting for wwan0 to become free. Usage count = 1
> Jul 25 15:13:31 nemi kernel: [ 7725.040316] unregister_netdevice: waiting for wwan0 to become free. Usage count = 1
>
>
> There are quite a few problems with the system in this state. Any write
> to the power/control associated with that USB device will hang,
> presumably because the USB device does not exist anymore. This will
> also make new attempts to suspend fail. And the USB device is of course
> not functional. The driver has not yet had a chance to clean up any of
> the other devices associated with the dead USB device (wwan1,
> /dev/cdc-wdm0 and /dev/cdc-wdm1), so these ghost devices will appear as
> non-functional. And new devices cannot be registered until the
> previous USB device is deleted and a new one created.
>
> Killing the ssh session lets unregister_netdevice continue, and
> everything is then cleaned up and goes back to normal.
>
> This is a regression compared to 3.5, where unregister_netdevice would
> succeed regardless of any open sockets. Or maybe the sockets were
> auto-reaped? I don't know the inner details - just observing the
> results.
>
> The test case above is a quite normal mode of operation for me. I often
> leave open sessions while suspending (because I intend to continue using
> them after resuming). And I always forget that this won't work for the
> USB modem case. I don't really care either. I expect the netdev to be
> removed, routes deleted and any sockets referencing either should just
> die or live on as zombies or whatever. The important part is that they
> should not prevent deletion of a netdev when e.g. the physical device is
> gone. That's the way things used to work.
>
>
Yes, we miss what was done with rt_cache_flush() : find all cached
routes and release all dev references...
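
As an illustration only, here is a minimal userspace sketch (not kernel
code) of why a cached route that still holds a device reference keeps the
usage count above zero, and how a flush that finds those routes and drops
their references lets the unregister proceed. All toy_* names are
hypothetical and exist only for this example; the real reference counting
and route cache are considerably more involved.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a network device; only the usage count matters here. */
struct toy_dev {
    const char *name;
    int refcnt;
};

/* Toy stand-in for a cached route that pins the device it points at. */
struct toy_route {
    struct toy_dev *dev;   /* NULL when the route holds no device reference */
};

static void toy_dev_hold(struct toy_dev *dev)
{
    dev->refcnt++;
}

static void toy_dev_put(struct toy_dev *dev)
{
    assert(dev->refcnt > 0);   /* dropping more references than taken is a bug */
    dev->refcnt--;
}

/* Point a route at a device, taking one reference on that device. */
void toy_route_attach(struct toy_route *rt, struct toy_dev *dev)
{
    toy_dev_hold(dev);
    rt->dev = dev;
}

/* Unregistration can only complete once nobody references the device. */
int toy_dev_can_unregister(const struct toy_dev *dev)
{
    return dev->refcnt == 0;
}

/*
 * Flush: visit every route in the cache and release each reference to
 * 'dev'. Returns the number of references released.
 */
int toy_cache_flush(struct toy_route *cache, size_t n, struct toy_dev *dev)
{
    int released = 0;

    for (size_t i = 0; i < n; i++) {
        if (cache[i].dev == dev) {
            toy_dev_put(dev);
            cache[i].dev = NULL;
            released++;
        }
    }
    return released;
}
```

In this model, attaching one route to a device makes
toy_dev_can_unregister() return 0 until toy_cache_flush() has been run for
that device, which mirrors the "Usage count = 1" wait in the report.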
* Re: open sockets preventing unregister_netdevice from completing in linux-next (next-20120724)
2012-07-25 14:38 ` Eric Dumazet
@ 2012-07-25 22:17 ` David Miller
2012-07-26 17:15 ` Eric Dumazet
0 siblings, 1 reply; 5+ messages in thread
From: David Miller @ 2012-07-25 22:17 UTC (permalink / raw)
To: eric.dumazet; +Cc: bjorn, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Wed, 25 Jul 2012 16:38:48 +0200
> Yes, we miss what was done with rt_cache_flush() : find all cached
> routes and release all dev references...
We can fix this with a two-pronged approach:
1) Walk the FIB info nexthops and invalidate.
2) Entries not cached in the FIB info nexthops go into a
per-netns list which is scanned as well.
I'll try to work on this if nobody beats me to it.
* Re: open sockets preventing unregister_netdevice from completing in linux-next (next-20120724)
2012-07-25 22:17 ` David Miller
@ 2012-07-26 17:15 ` Eric Dumazet
2012-07-26 21:10 ` David Miller
0 siblings, 1 reply; 5+ messages in thread
From: Eric Dumazet @ 2012-07-26 17:15 UTC (permalink / raw)
To: David Miller; +Cc: bjorn, netdev
On Wed, 2012-07-25 at 15:17 -0700, David Miller wrote:
> From: Eric Dumazet <eric.dumazet@gmail.com>
> Date: Wed, 25 Jul 2012 16:38:48 +0200
>
> > Yes, we miss what was done with rt_cache_flush() : find all cached
> > routes and release all dev references...
>
> We can fix this with a two-pronged approach:
>
> 1) Walk the FIB info nexthops and invalidate.
>
> 2) Entries not cached in the FIB info nexthops go into a
> per-netns list which is scanned as well.
>
> I'll try to work on this if nobody beats me to it.
With your latest patch, I can "rmmod tg3" while sockets are active.
Not sure we need all this now ?
(the trick is probably in fib_semantics.c, when you changed
dst_release() to dst_free())
* Re: open sockets preventing unregister_netdevice from completing in linux-next (next-20120724)
2012-07-26 17:15 ` Eric Dumazet
@ 2012-07-26 21:10 ` David Miller
0 siblings, 0 replies; 5+ messages in thread
From: David Miller @ 2012-07-26 21:10 UTC (permalink / raw)
To: eric.dumazet; +Cc: bjorn, netdev
From: Eric Dumazet <eric.dumazet@gmail.com>
Date: Thu, 26 Jul 2012 19:15:28 +0200
> On Wed, 2012-07-25 at 15:17 -0700, David Miller wrote:
>> From: Eric Dumazet <eric.dumazet@gmail.com>
>> Date: Wed, 25 Jul 2012 16:38:48 +0200
>>
>> > Yes, we miss what was done with rt_cache_flush() : find all cached
>> > routes and release all dev references...
>>
>> We can fix this with a two-pronged approach:
>>
>> 1) Walk the FIB info nexthops and invalidate.
>>
>> 2) Entries not cached in the FIB info nexthops go into a
>> per-netns list which is scanned as well.
>>
>> I'll try to work on this if nobody beats me to it.
>
> With your latest patch, I can "rmmod tg3" while sockets are active.
>
> Not sure we need all this now ?
>
> (the trick is probably in fib_semantics.c, when you changed
> dst_release() to dst_free())
That's not the problem. Not all routes are cached in the FIB nexthop.
Any route that doesn't resolve to a FIB info (255.255.255.255, etc.)
or uses special features (tclassid, etc.) doesn't get cached.
Therefore if a socket gets that kind of route, and then becomes
inactive, we can hold onto the device references forever. The entity
with the route has to call dst_ops->check() to see that the route has
become invalid, but if the application is inactive, that will never
happen.
We have to put such non-cached routes into a special list or table of
some kind, so that we can zap a netdevice if it wants to go down or
unload before the final dst_release() happens.
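
The idea of a special list for non-cached routes can be sketched roughly
as follows. This is a userspace toy under stated assumptions, not the
actual patch: every demo_* name is hypothetical, a single list stands in
for whatever list or table is eventually used, and re-pointing a zapped
route at a placeholder device is just one assumed way to keep the route
object valid until its owner finally releases it.

```c
#include <assert.h>
#include <stddef.h>

/* Toy device: only the reference count is modelled. */
struct demo_dev {
    const char *name;
    int refcnt;
};

/* A route that is not cached in a FIB nexthop, linked on a tracking list. */
struct demo_uncached_route {
    struct demo_dev *dev;
    struct demo_uncached_route *next;
};

/* Hypothetical list head that tracks all such routes. */
struct demo_uncached_list {
    struct demo_uncached_route *head;
};

/* Create an uncached route: take a device reference and track the route. */
void demo_route_init(struct demo_uncached_list *list,
                     struct demo_uncached_route *rt, struct demo_dev *dev)
{
    dev->refcnt++;
    rt->dev = dev;
    rt->next = list->head;
    list->head = rt;
}

/*
 * The device is going away: move every tracked route that references it
 * over to 'placeholder', dropping the reference on 'dev'. The owning
 * socket may be idle and never revalidate the route, so the list is the
 * only way to reach it. Returns the number of routes zapped.
 */
int demo_zap_dev(struct demo_uncached_list *list, struct demo_dev *dev,
                 struct demo_dev *placeholder)
{
    int zapped = 0;

    for (struct demo_uncached_route *rt = list->head; rt; rt = rt->next) {
        if (rt->dev != dev)
            continue;
        placeholder->refcnt++;
        rt->dev = placeholder;
        assert(dev->refcnt > 0);
        dev->refcnt--;
        zapped++;
    }
    return zapped;
}

/* Final release by the owner: unlink the route and drop its reference. */
void demo_route_release(struct demo_uncached_list *list,
                        struct demo_uncached_route *rt)
{
    for (struct demo_uncached_route **pp = &list->head; *pp; pp = &(*pp)->next) {
        if (*pp == rt) {
            *pp = rt->next;
            break;
        }
    }
    rt->next = NULL;
    assert(rt->dev->refcnt > 0);
    rt->dev->refcnt--;
    rt->dev = NULL;
}
```

With this shape, a device's count can reach zero at teardown time even
though the final demo_route_release() for an idle socket's route happens
much later, or never.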