netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* Generic PHY lib vs. locking
@ 2006-12-22  4:07 Benjamin Herrenschmidt
  2006-12-22 15:24 ` David Hollis
  2006-12-28 18:55 ` Andy Fleming
  0 siblings, 2 replies; 5+ messages in thread
From: Benjamin Herrenschmidt @ 2006-12-22  4:07 UTC (permalink / raw)
  To: Andy Fleming; +Cc: Netdev

Hi Andy !

I've been looking at porting various drivers (EMAC, sungem,
spider_net, ...) to the generic PHY stuff. However, I have one
significant problem here.

One of the things I've been trying to do lately with EMAC and that I
plan to do with others, is to have the PHY polling entirely operate at
task level (along with other "slow" parts of the network driver like
timeout handling etc...).

This makes a lot of locking easier, allowing to use mutexes instead of
locks (less latencies), allowing to sleep waiting for MDIO operations to
complete, etc... it's generall all benefit.

It's especially useful in a case like EMAC where several EMACs can share
MDIO lines, so we need exclusive access, and that might involve even a
second layer of exclusion for access to the RGMII or ZMII layer. mutexes
are really the best approach for that sort of non-speed critical
activities.

However, the generic PHY layer defeats that by it's heavy usage of
spin_lock_bh and timer.

One solution would be to change it to use a mutex instead of a lock as
well, though that would change the requirements of where phy_start/stop
can be called, and use a delayed work queue instead of a timer.

I could do all of these changes provided everybody agrees, though I
suppose all existing network drivers using that PHY layer might need to
be adapted. How many do we have nowadays ?

Also, I see your comments about not calling flush_scheduled_work() in
phy_stop() because of rtnl_lock()... What is the problem here ?
dev_close() ?

Ben.



^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Generic PHY lib vs. locking
  2006-12-22  4:07 Generic PHY lib vs. locking Benjamin Herrenschmidt
@ 2006-12-22 15:24 ` David Hollis
  2006-12-22 20:53   ` Benjamin Herrenschmidt
  2006-12-28 18:55 ` Andy Fleming
  1 sibling, 1 reply; 5+ messages in thread
From: David Hollis @ 2006-12-22 15:24 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Andy Fleming, Netdev

On Fri, 2006-12-22 at 15:07 +1100, Benjamin Herrenschmidt wrote:
> Hi Andy !
> 
> I've been looking at porting various drivers (EMAC, sungem,
> spider_net, ...) to the generic PHY stuff. However, I have one
> significant problem here.
> 

> One solution would be to change it to use a mutex instead of a lock as
> well, though that would change the requirements of where phy_start/stop
> can be called, and use a delayed work queue instead of a timer.

Wouldn't this change also allow it to be used with USB Ethernet devices?
I was looking at porting the asix.c drive to use the PHY layer but the
locking killed that effort since USB devices can't do spinlocks without
hosing things up.

-- 
David Hollis <dhollis@davehollis.com>


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Generic PHY lib vs. locking
  2006-12-22 15:24 ` David Hollis
@ 2006-12-22 20:53   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 5+ messages in thread
From: Benjamin Herrenschmidt @ 2006-12-22 20:53 UTC (permalink / raw)
  To: David Hollis; +Cc: Andy Fleming, Netdev

On Fri, 2006-12-22 at 10:24 -0500, David Hollis wrote:
> On Fri, 2006-12-22 at 15:07 +1100, Benjamin Herrenschmidt wrote:
> > Hi Andy !
> > 
> > I've been looking at porting various drivers (EMAC, sungem,
> > spider_net, ...) to the generic PHY stuff. However, I have one
> > significant problem here.
> > 
> 
> > One solution would be to change it to use a mutex instead of a lock as
> > well, though that would change the requirements of where phy_start/stop
> > can be called, and use a delayed work queue instead of a timer.
> 
> Wouldn't this change also allow it to be used with USB Ethernet devices?
> I was looking at porting the asix.c drive to use the PHY layer but the
> locking killed that effort since USB devices can't do spinlocks without
> hosing things up.

Yes, possibly. At the end of the day, I get the locking in the driver
down to something like:

 - All rx/tx operatations are done between the NAPI poll and the
hard_start_xmit() routine. Locking here is done with the netif_tx_lock
if needed (depends on the driver). Those can't sleep.

 - Everything else is task level and locks against the fast path above
with stopping NAPI poll and stopping tx. I have my
driver_netif_stop/start routines doing that and a bit more (see below).

 - There is at least one issue with set_multicast which is called with
the lock held. What I do here is that my driver_netif_stop above also
take the lock, set a flag tellign set-mcast to say away, release the
lock. start() on the other hand take the lock, clears that flag, checks
if another flag was set by set-mcast telling it was here, and does the
mcast setting if that was the case.

I much prefer that than having most of the driver operations locked with
the tx lock or some other or run at softirq. Some of the reset/recovery
operations can be slow, PHY operations too, and thus it's all better run
at task level. In addition, MDIO access might be able to sleep waiting
on an interrupt signaling the completion of the access.

Ben.
 


^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Generic PHY lib vs. locking
  2006-12-22  4:07 Generic PHY lib vs. locking Benjamin Herrenschmidt
  2006-12-22 15:24 ` David Hollis
@ 2006-12-28 18:55 ` Andy Fleming
  2006-12-28 20:22   ` Benjamin Herrenschmidt
  1 sibling, 1 reply; 5+ messages in thread
From: Andy Fleming @ 2006-12-28 18:55 UTC (permalink / raw)
  To: Benjamin Herrenschmidt; +Cc: Netdev


On Dec 21, 2006, at 22:07, Benjamin Herrenschmidt wrote:

> Hi Andy !
>
> I've been looking at porting various drivers (EMAC, sungem,
> spider_net, ...) to the generic PHY stuff. However, I have one
> significant problem here.
>
> One of the things I've been trying to do lately with EMAC and that I
> plan to do with others, is to have the PHY polling entirely operate at
> task level (along with other "slow" parts of the network driver like
> timeout handling etc...).
>
> This makes a lot of locking easier, allowing to use mutexes instead of
> locks (less latencies), allowing to sleep waiting for MDIO  
> operations to
> complete, etc... it's generall all benefit.
>
> It's especially useful in a case like EMAC where several EMACs can  
> share
> MDIO lines, so we need exclusive access, and that might involve even a
> second layer of exclusion for access to the RGMII or ZMII layer.  
> mutexes
> are really the best approach for that sort of non-speed critical
> activities.


This sounds good to me.  It was an eventual goal, but I wasn't  
familiar enough with the non-spin-lock locking rules to confidently  
implement it.


>
> However, the generic PHY layer defeats that by it's heavy usage of
> spin_lock_bh and timer.
>
> One solution would be to change it to use a mutex instead of a lock as
> well, though that would change the requirements of where phy_start/ 
> stop
> can be called, and use a delayed work queue instead of a timer.
>
> I could do all of these changes provided everybody agrees, though I
> suppose all existing network drivers using that PHY layer might  
> need to
> be adapted. How many do we have nowadays ?


Great!  At last glance, only gianfar, fs_enet, and au1000_eth.  There  
are one or two others that haven't gone in, yet.  My hope is that  
your changes will not require any changes to the drivers, but I'll  
leave that to your discretion.


>
> Also, I see your comments about not calling flush_scheduled_work() in
> phy_stop() because of rtnl_lock()... What is the problem here ?
> dev_close() ?


Yup.  However, I think a reasonable solution was proposed.  The  
problem is that flush_scheduled_work() actually does all the  
scheduled work.  And if it happens with rtnl_lock() held, and some of  
the scheduled work grabs rtnl_lock(), we deadlock.  But another  
function was proposed, and I believe committed to the tree, which  
only deletes or does the work you own, and therefore lets you avoid  
that problem (assuming you know that your code doesn't grab such  
locks), and also lets you free memory.

Andy

^ permalink raw reply	[flat|nested] 5+ messages in thread

* Re: Generic PHY lib vs. locking
  2006-12-28 18:55 ` Andy Fleming
@ 2006-12-28 20:22   ` Benjamin Herrenschmidt
  0 siblings, 0 replies; 5+ messages in thread
From: Benjamin Herrenschmidt @ 2006-12-28 20:22 UTC (permalink / raw)
  To: Andy Fleming; +Cc: Netdev


> Great!  At last glance, only gianfar, fs_enet, and au1000_eth.  There  
> are one or two others that haven't gone in, yet.  My hope is that  
> your changes will not require any changes to the drivers, but I'll  
> leave that to your discretion.

Unfortunately, it will probably have an impact on them. I'll have a
look.

Ben.



^ permalink raw reply	[flat|nested] 5+ messages in thread

end of thread, other threads:[~2006-12-29  1:32 UTC | newest]

Thread overview: 5+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2006-12-22  4:07 Generic PHY lib vs. locking Benjamin Herrenschmidt
2006-12-22 15:24 ` David Hollis
2006-12-22 20:53   ` Benjamin Herrenschmidt
2006-12-28 18:55 ` Andy Fleming
2006-12-28 20:22   ` Benjamin Herrenschmidt

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).