netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* question lan9303: DSA concurrency and locking,
@ 2017-11-15 12:08 Egil Hjelmeland
  2017-11-15 14:08 ` Andrew Lunn
  0 siblings, 1 reply; 4+ messages in thread
From: Egil Hjelmeland @ 2017-11-15 12:08 UTC (permalink / raw)
  To: Andrew Lunn, Vivien Didelot, Florian Fainelli; +Cc: netdev

Hi experts

I am hoping for some guidance.

Does DSA offer any protection against concurrent calls of dsa_switch_ops?

I did not include any locking in the code I contributed to lan9303. 
First I felt bad locking is worse than no locking. Second I assumed that 
reviewers would point out if locking is needed.

The most "interesting" part of the lan9303 driver that has no locking is 
the ALR (=fdb/mdb). ALR access is a sequence of register operations. 
Anyway it is very unlikely that mdb related calls are reentered. But if 
it can happen, it would mean that IGMP snooping can go wrong. (Which is 
actually very bad in our applications.)

Is this something to worry about?

If yes, what is the best strategy to deal with it?

Looking in drivers/net/dsa/lan9303-core.c :

lan9303_indirect_phy_read/lan9303_indirect_phy_write + 
lan9303_write_switch_reg/lan9303_read_switch_reg are already protected 
by chip->indirect_mutex

ALR access uses lan9303_write_switch_reg/lan9303_read_switch_reg.

Adding a chip->alr_mutex on top of chip->indirect_mutex does not feel 
right to me. Would it be better to move the locking up to the relevant 
dsa_switch_ops functions?

Thanks

Egil Hjelmeland

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: question lan9303: DSA concurrency and locking,
  2017-11-15 12:08 question lan9303: DSA concurrency and locking, Egil Hjelmeland
@ 2017-11-15 14:08 ` Andrew Lunn
  2017-11-15 16:56   ` Vivien Didelot
  2017-11-16 10:51   ` Egil Hjelmeland
  0 siblings, 2 replies; 4+ messages in thread
From: Andrew Lunn @ 2017-11-15 14:08 UTC (permalink / raw)
  To: Egil Hjelmeland; +Cc: Vivien Didelot, Florian Fainelli, netdev

On Wed, Nov 15, 2017 at 01:08:22PM +0100, Egil Hjelmeland wrote:
> Hi experts
> 
> I am hoping for some guidance.
> 
> Does DSA offer any protection against concurrent calls of dsa_switch_ops?

Hi Egil

DSA itself does not.

There are various upper locks, which protect some calls, in some ways.
e.g. phy ops are protected by the mdio lock. stats calls are protected
by the rtnl lock, as well as some other calls. And other locks protect
other things.

But nothing gives you protection across them all.

For the mv88e6xxx driver, we took the simple approach. We generally
take a lock at the beginning of each of the dsa_swtich_ops functions,
and release it at the end. Since all accesses to the chip go through
two read/write functions, we also have code in them to detect when
they are called without holding the lock.

Some driver writers worry about performance in some situations, and
want finer grain locking. So they have multiple locks. When reviewing
drivers i will look for obvious locking issues, but don't look too
deeply. Without knowing the chip, it has hard for me to know if
something is safe or not. So i would not be surprised if there are
locking issues in some drivers.

> The most "interesting" part of the lan9303 driver that has no locking is the
> ALR (=fdb/mdb). ALR access is a sequence of register operations. Anyway it
> is very unlikely that mdb related calls are reentered. But if it can happen,
> it would mean that IGMP snooping can go wrong. (Which is actually very bad
> in our applications.)
> 
> Is this something to worry about?

I would suggest looking a bit higher in the stack. fdb/mdb operations
come via switchdev, and have a notification mechanism between slave.c
and port.c. Check if that notification mechanism enforces
serialisation. Also, check that everything actually does go though
this notification mechanism. Maybe the dump operations do not?

And then check the lower levels of the driver. If say statistics
operations are performed at the same time as fdb/mdb, can the register
accesses get interleaved? If they can, is that actually a problem for
the hardware?

	 Andrew

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: question lan9303: DSA concurrency and locking,
  2017-11-15 14:08 ` Andrew Lunn
@ 2017-11-15 16:56   ` Vivien Didelot
  2017-11-16 10:51   ` Egil Hjelmeland
  1 sibling, 0 replies; 4+ messages in thread
From: Vivien Didelot @ 2017-11-15 16:56 UTC (permalink / raw)
  To: Andrew Lunn, Egil Hjelmeland; +Cc: Florian Fainelli, netdev

Hi,

>> Does DSA offer any protection against concurrent calls of
>> dsa_switch_ops?

This is something I thought about for a while. Since DSA offers an
abstraction of different net stack entry points to its drivers, like
netlink (bridge, etc.) or ioctl (ethtool), it would make sense to add a
mutex at the DSA layer.

It would naturally go in the dsa_switch structure, and be (un)locked by
DSA core. But a switch fabric might be composed of multiple devices, so
this locking must happen at the dsa_switch_tree level. The entry points
to the DSA core are the dsa_port structure, either accessed via a master
interface's dsa_ptr pointer, or via a slave interface's private data.

So ideally the locking of the control path must occur when notifying an
operation to every device of the tree, i.e. in dsa_port_notify:

    mutex_lock(&dst->lock);
    err = raw_notifier_call_chain(&dst->nh, e, v);
    mutex_lock(&dst->lock);

Unfortunately the code is not ready for that yet, because not all calls
to ds->ops->foo are centralized yet. But we are slowly going that way.
In the meantime, DSA drivers handle locking themselves when necessary.


Thanks,

        Vivien

^ permalink raw reply	[flat|nested] 4+ messages in thread

* Re: question lan9303: DSA concurrency and locking,
  2017-11-15 14:08 ` Andrew Lunn
  2017-11-15 16:56   ` Vivien Didelot
@ 2017-11-16 10:51   ` Egil Hjelmeland
  1 sibling, 0 replies; 4+ messages in thread
From: Egil Hjelmeland @ 2017-11-16 10:51 UTC (permalink / raw)
  To: Andrew Lunn; +Cc: Vivien Didelot, Florian Fainelli, netdev

On 15. nov. 2017 15:08, Andrew Lunn wrote:
> On Wed, Nov 15, 2017 at 01:08:22PM +0100, Egil Hjelmeland wrote:
>> Hi experts
>>

Hi, thanks for response, both Andrew and Vivien!

>> I am hoping for some guidance.
>>
>> Does DSA offer any protection against concurrent calls of dsa_switch_ops?
> 
> Hi Egil
> 
> DSA itself does not.
> 
> There are various upper locks, which protect some calls, in some ways.
> e.g. phy ops are protected by the mdio lock. stats calls are protected
> by the rtnl lock, as well as some other calls. And other locks protect
> other things.
> 
> But nothing gives you protection across them all.
> 
> For the mv88e6xxx driver, we took the simple approach. We generally
> take a lock at the beginning of each of the dsa_swtich_ops functions,
> and release it at the end. Since all accesses to the chip go through
> two read/write functions, we also have code in them to detect when
> they are called without holding the lock.
> 
> Some driver writers worry about performance in some situations, and
> want finer grain locking. So they have multiple locks. When reviewing
> drivers i will look for obvious locking issues, but don't look too
> deeply. Without knowing the chip, it has hard for me to know if
> something is safe or not. So i would not be surprised if there are
> locking issues in some drivers.
> 
>> The most "interesting" part of the lan9303 driver that has no locking is the
>> ALR (=fdb/mdb). ALR access is a sequence of register operations. Anyway it
>> is very unlikely that mdb related calls are reentered. But if it can happen,
>> it would mean that IGMP snooping can go wrong. (Which is actually very bad
>> in our applications.)
>>
>> Is this something to worry about?
> 
> I would suggest looking a bit higher in the stack. fdb/mdb operations
> come via switchdev, and have a notification mechanism between slave.c
> and port.c. Check if that notification mechanism enforces
> serialisation. Also, check that everything actually does go though
> this notification mechanism. Maybe the dump operations do not?
> 

OK, for my education I took a look in upper layers. Bridge layer specify 
SWITCHDEV_F_DEFER option to switchdev operations. Which means switchdev 
hand the work over to a workqueue. Which is executed by a kworker kernel 
thread. In DSA, operations go through raw_notifier_call_chain. 
raw_notifier_call_chain has no locking, and I assume it executes in same 
context. A dump_stack() in the driver confirm my theory.

So the (most?) dsa operations execute in switchdev_deferred_process_work 
queue. If a operation sleep, other dsa operations will run in the mean 
time. So there is no serialization. Just as indicated by Vivien.

So if I still have time at hands when net-next opens again, I will do 
something about it for lan9303.


> And then check the lower levels of the driver. If say statistics
> operations are performed at the same time as fdb/mdb, can the register
> accesses get interleaved? If they can, is that actually a problem for
> the hardware?
> 

I have not seen anything in the datasheet about simultaneous access to 
different registers. Until proven otherwise, I assume protecting 
functions that require a sequence of related read/write operations will do.

At the moment I have changed my mind, I think it is better to add a new 
alr_mutex to protect the ALR (fdb/mdb) operations. And not touch the 
existing mutex. alr_mutex need to be locked in lan9303_alr_add_port, 
lan9303_alr_del_port and lan9303_alr_loop, all of them simple functions.

> 	 Andrew
> 

Egil

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2017-11-16 10:51 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-11-15 12:08 question lan9303: DSA concurrency and locking, Egil Hjelmeland
2017-11-15 14:08 ` Andrew Lunn
2017-11-15 16:56   ` Vivien Didelot
2017-11-16 10:51   ` Egil Hjelmeland

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).