netdev.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
@ 2015-05-21 10:42 Nikolay Aleksandrov
  2015-05-25  2:59 ` David Miller
                   ` (2 more replies)
  0 siblings, 3 replies; 15+ messages in thread
From: Nikolay Aleksandrov @ 2015-05-21 10:42 UTC (permalink / raw)
  To: netdev; +Cc: wkok, davem, stephen, bridge, Nikolay Aleksandrov

From: Wilson Kok <wkok@cumulusnetworks.com>

Check in fdb_add_entry() if the source port should learn, similar
check is used in br_fdb_update.
Note that new fdb entries which are added manually or
as local ones are still permitted.
This patch has been tested by running traffic via a bridge port and
switching the port's state, also by manually adding/removing entries
from the bridge's fdb.

Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
Nik: Maybe it'd be better if we returned an error even though it
     doesn't look necessary. I'm open to suggestions.

 net/bridge/br_fdb.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index e0670d7054f9..27de0b7bd76b 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -736,6 +736,12 @@ static int fdb_add_entry(struct net_bridge_port *source, const __u8 *addr,
 	struct net_bridge_fdb_entry *fdb;
 	bool modified = false;
 
+	/* If the port cannot learn allow only local and static entries */
+	if (!(state & NUD_PERMANENT) && !(state & NUD_NOARP) &&
+	    !(source->state == BR_STATE_LEARNING ||
+	      source->state == BR_STATE_FORWARDING))
+		return 0;
+
 	fdb = fdb_find(head, addr, vid);
 	if (fdb == NULL) {
 		if (!(flags & NLM_F_CREATE))
-- 
1.9.3

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-21 10:42 [PATCH net-next] bridge: skip fdb add if the port shouldn't learn Nikolay Aleksandrov
@ 2015-05-25  2:59 ` David Miller
  2015-05-25 11:35   ` Nikolay Aleksandrov
  2015-05-25 11:41   ` Nikolay Aleksandrov
  2015-05-25 13:39 ` [PATCH net-next v2] " Nikolay Aleksandrov
  2015-05-26 17:28 ` [PATCH net-next] " Stephen Hemminger
  2 siblings, 2 replies; 15+ messages in thread
From: David Miller @ 2015-05-25  2:59 UTC (permalink / raw)
  To: nikolay; +Cc: netdev, bridge, wkok

From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
Date: Thu, 21 May 2015 03:42:57 -0700

> From: Wilson Kok <wkok@cumulusnetworks.com>
> 
> Check in fdb_add_entry() if the source port should learn, similar
> check is used in br_fdb_update.
> Note that new fdb entries which are added manually or
> as local ones are still permitted.
> This patch has been tested by running traffic via a bridge port and
> switching the port's state, also by manually adding/removing entries
> from the bridge's fdb.
> 
> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> ---
> Nik: Maybe it'd be better if we returned an error even though it
>      doesn't look necessary. I'm open to suggestions.

If you don't return an error, then rtnetlink.c is going to emit a
NEWNEIGH netlink message.  I seriously doubt we want that to happen.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-25  2:59 ` David Miller
@ 2015-05-25 11:35   ` Nikolay Aleksandrov
  2015-05-25 11:41   ` Nikolay Aleksandrov
  1 sibling, 0 replies; 15+ messages in thread
From: Nikolay Aleksandrov @ 2015-05-25 11:35 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, bridge, Wilson Kok

[-- Attachment #1: Type: text/plain, Size: 1110 bytes --]

On Mon, May 25, 2015 at 4:59 AM, David Miller <davem@davemloft.net> wrote:

> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> Date: Thu, 21 May 2015 03:42:57 -0700
>
> > From: Wilson Kok <wkok@cumulusnetworks.com>
> >
> > Check in fdb_add_entry() if the source port should learn, similar
> > check is used in br_fdb_update.
> > Note that new fdb entries which are added manually or
> > as local ones are still permitted.
> > This patch has been tested by running traffic via a bridge port and
> > switching the port's state, also by manually adding/removing entries
> > from the bridge's fdb.
> >
> > Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
> > Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> > ---
> > Nik: Maybe it'd be better if we returned an error even though it
> >      doesn't look necessary. I'm open to suggestions.
>
> If you don't return an error, then rtnetlink.c is going to emit a
> NEWNEIGH netlink message.  I seriously doubt we want that to happen.
>

Thanks Dave, I was afraid I've missed something like that. I'll re-spin,
test
and post a v2.

Nik

[-- Attachment #2: Type: text/html, Size: 1886 bytes --]

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-25  2:59 ` David Miller
  2015-05-25 11:35   ` Nikolay Aleksandrov
@ 2015-05-25 11:41   ` Nikolay Aleksandrov
  1 sibling, 0 replies; 15+ messages in thread
From: Nikolay Aleksandrov @ 2015-05-25 11:41 UTC (permalink / raw)
  To: David Miller; +Cc: netdev, Wilson Kok, stephen, bridge

On Mon, May 25, 2015 at 4:59 AM, David Miller <davem@davemloft.net> wrote:
> From: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
> Date: Thu, 21 May 2015 03:42:57 -0700
>
>> From: Wilson Kok <wkok@cumulusnetworks.com>
>>
>> Check in fdb_add_entry() if the source port should learn, similar
>> check is used in br_fdb_update.
>> Note that new fdb entries which are added manually or
>> as local ones are still permitted.
>> This patch has been tested by running traffic via a bridge port and
>> switching the port's state, also by manually adding/removing entries
>> from the bridge's fdb.
>>
>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>> ---
>> Nik: Maybe it'd be better if we returned an error even though it
>>      doesn't look necessary. I'm open to suggestions.
>
> If you don't return an error, then rtnetlink.c is going to emit a
> NEWNEIGH netlink message.  I seriously doubt we want that to happen.

Thanks Dave, I was afraid I've missed something like that. I'll re-spin, test
and post a v2.

Nik

^ permalink raw reply	[flat|nested] 15+ messages in thread

* [PATCH net-next v2] bridge: skip fdb add if the port shouldn't learn
  2015-05-21 10:42 [PATCH net-next] bridge: skip fdb add if the port shouldn't learn Nikolay Aleksandrov
  2015-05-25  2:59 ` David Miller
@ 2015-05-25 13:39 ` Nikolay Aleksandrov
  2015-05-26 17:28 ` [PATCH net-next] " Stephen Hemminger
  2 siblings, 0 replies; 15+ messages in thread
From: Nikolay Aleksandrov @ 2015-05-25 13:39 UTC (permalink / raw)
  To: netdev; +Cc: wkok, davem, stephen, bridge, Nikolay Aleksandrov

From: Wilson Kok <wkok@cumulusnetworks.com>

Check in fdb_add_entry() if the source port should learn, similar
check is used in br_fdb_update.
Note that new fdb entries which are added manually or
as local ones are still permitted.
This patch has been tested by running traffic via a bridge port and
switching the port's state, also by manually adding/removing entries
from the bridge's fdb.

Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
---
v2: return an error instead of silently failing.

 net/bridge/br_fdb.c |    6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/bridge/br_fdb.c b/net/bridge/br_fdb.c
index e0670d7054f9..7896cf143045 100644
--- a/net/bridge/br_fdb.c
+++ b/net/bridge/br_fdb.c
@@ -736,6 +736,12 @@ static int fdb_add_entry(struct net_bridge_port *source, const __u8 *addr,
 	struct net_bridge_fdb_entry *fdb;
 	bool modified = false;
 
+	/* If the port cannot learn allow only local and static entries */
+	if (!(state & NUD_PERMANENT) && !(state & NUD_NOARP) &&
+	    !(source->state == BR_STATE_LEARNING ||
+	      source->state == BR_STATE_FORWARDING))
+		return -EPERM;
+
 	fdb = fdb_find(head, addr, vid);
 	if (fdb == NULL) {
 		if (!(flags & NLM_F_CREATE))
-- 
1.7.10.4

^ permalink raw reply related	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-21 10:42 [PATCH net-next] bridge: skip fdb add if the port shouldn't learn Nikolay Aleksandrov
  2015-05-25  2:59 ` David Miller
  2015-05-25 13:39 ` [PATCH net-next v2] " Nikolay Aleksandrov
@ 2015-05-26 17:28 ` Stephen Hemminger
  2015-05-27  7:05   ` Nikolay Aleksandrov
  2 siblings, 1 reply; 15+ messages in thread
From: Stephen Hemminger @ 2015-05-26 17:28 UTC (permalink / raw)
  To: Nikolay Aleksandrov; +Cc: netdev, wkok, davem, bridge

On Thu, 21 May 2015 03:42:57 -0700
Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:

> From: Wilson Kok <wkok@cumulusnetworks.com>
> 
> Check in fdb_add_entry() if the source port should learn, similar
> check is used in br_fdb_update.
> Note that new fdb entries which are added manually or
> as local ones are still permitted.
> This patch has been tested by running traffic via a bridge port and
> switching the port's state, also by manually adding/removing entries
> from the bridge's fdb.
> 
> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

What is the problem this is trying to solve?

I think user should be allowed to manually add any entry
even if learning.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-26 17:28 ` [PATCH net-next] " Stephen Hemminger
@ 2015-05-27  7:05   ` Nikolay Aleksandrov
  2015-05-27  7:59     ` Scott Feldman
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Aleksandrov @ 2015-05-27  7:05 UTC (permalink / raw)
  To: Stephen Hemminger; +Cc: netdev, Wilson Kok, David Miller, bridge

On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger
<stephen@networkplumber.org> wrote:
> On Thu, 21 May 2015 03:42:57 -0700
> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>
>> From: Wilson Kok <wkok@cumulusnetworks.com>
>>
>> Check in fdb_add_entry() if the source port should learn, similar
>> check is used in br_fdb_update.
>> Note that new fdb entries which are added manually or
>> as local ones are still permitted.
>> This patch has been tested by running traffic via a bridge port and
>> switching the port's state, also by manually adding/removing entries
>> from the bridge's fdb.
>>
>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>
> What is the problem this is trying to solve?
>
> I think user should be allowed to manually add any entry
> even if learning.

Hi Stephen,
I have been thinking about the use case and have discussed it
internally with colleagues and the patch
author, the main problem is when there's an external software that
adds dynamic entries (learning) and
it could experience a race condition, here's a possible situation:
* external software learns a mac from hw, sends an add to kernel
 * right before that, port goes blocking (or down) and kernel flushes
   mac, sends notification about the state change and mac flush
 * right after that, kernel gets the previous add from external software, it's
   allowed to add, and then sends an add notification
 * mean while, external software processes the link block/down and mac flush,
   followed by the mac add from kernel.  At this point, external software can't
   really know whether it's a user adding the mac intentionally or it's
   a race.

This issue can't really be avoided in user-space.
As I've noted local and static entries are still allowed, and iproute2
bridge utility always
marks the entries as static (NUD_NOARP), this only affects external
dynamic entries which
are usually sent by something that does the learning externally.
I'll keep digging to see if there's another way to go about this since
I'd like to give the user
full freedom. Personally I don't have strong feeling for this patch
and if it's not preferred then
I'll post a revert.

Thanks,
 Nik

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-27  7:05   ` Nikolay Aleksandrov
@ 2015-05-27  7:59     ` Scott Feldman
  2015-05-27  8:35       ` Nikolay Aleksandrov
  0 siblings, 1 reply; 15+ messages in thread
From: Scott Feldman @ 2015-05-27  7:59 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Stephen Hemminger, Netdev, Wilson Kok, David Miller, bridge,
	Jiří Pírko

On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov
<nikolay@cumulusnetworks.com> wrote:
> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger
> <stephen@networkplumber.org> wrote:
>> On Thu, 21 May 2015 03:42:57 -0700
>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>
>>> From: Wilson Kok <wkok@cumulusnetworks.com>
>>>
>>> Check in fdb_add_entry() if the source port should learn, similar
>>> check is used in br_fdb_update.
>>> Note that new fdb entries which are added manually or
>>> as local ones are still permitted.
>>> This patch has been tested by running traffic via a bridge port and
>>> switching the port's state, also by manually adding/removing entries
>>> from the bridge's fdb.
>>>
>>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>
>> What is the problem this is trying to solve?
>>
>> I think user should be allowed to manually add any entry
>> even if learning.
>
> Hi Stephen,
> I have been thinking about the use case and have discussed it
> internally with colleagues and the patch
> author, the main problem is when there's an external software that
> adds dynamic entries (learning) and
> it could experience a race condition, here's a possible situation:
> * external software learns a mac from hw, sends an add to kernel
>  * right before that, port goes blocking (or down) and kernel flushes
>    mac, sends notification about the state change and mac flush
>  * right after that, kernel gets the previous add from external software, it's
>    allowed to add, and then sends an add notification
>  * mean while, external software processes the link block/down and mac flush,
>    followed by the mac add from kernel.  At this point, external software can't
>    really know whether it's a user adding the mac intentionally or it's
>    a race.
>
> This issue can't really be avoided in user-space.
> As I've noted local and static entries are still allowed, and iproute2
> bridge utility always
> marks the entries as static (NUD_NOARP), this only affects external
> dynamic entries which
> are usually sent by something that does the learning externally.
> I'll keep digging to see if there's another way to go about this since
> I'd like to give the user
> full freedom. Personally I don't have strong feeling for this patch
> and if it's not preferred then
> I'll post a revert.

So there is already a switchdev API to add/del an externally learned
FDB entry which holds rtnl_lock and avoids these races.  I would
suggest using that and revert this patch.

See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and
the handler in br.c:br_switchdev_event().

-scott

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-27  7:59     ` Scott Feldman
@ 2015-05-27  8:35       ` Nikolay Aleksandrov
  2015-05-27 16:01         ` Scott Feldman
  0 siblings, 1 reply; 15+ messages in thread
From: Nikolay Aleksandrov @ 2015-05-27  8:35 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Stephen Hemminger, Netdev, Wilson Kok, David Miller, bridge,
	Jiří Pírko

On Wed, May 27, 2015 at 9:59 AM, Scott Feldman <sfeldma@gmail.com> wrote:
> On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov
> <nikolay@cumulusnetworks.com> wrote:
>> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger
>> <stephen@networkplumber.org> wrote:
>>> On Thu, 21 May 2015 03:42:57 -0700
>>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>>
>>>> From: Wilson Kok <wkok@cumulusnetworks.com>
>>>>
>>>> Check in fdb_add_entry() if the source port should learn, similar
>>>> check is used in br_fdb_update.
>>>> Note that new fdb entries which are added manually or
>>>> as local ones are still permitted.
>>>> This patch has been tested by running traffic via a bridge port and
>>>> switching the port's state, also by manually adding/removing entries
>>>> from the bridge's fdb.
>>>>
>>>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>
>>> What is the problem this is trying to solve?
>>>
>>> I think user should be allowed to manually add any entry
>>> even if learning.
>>
>> Hi Stephen,
>> I have been thinking about the use case and have discussed it
>> internally with colleagues and the patch
>> author, the main problem is when there's an external software that
>> adds dynamic entries (learning) and
>> it could experience a race condition, here's a possible situation:
>> * external software learns a mac from hw, sends an add to kernel
>>  * right before that, port goes blocking (or down) and kernel flushes
>>    mac, sends notification about the state change and mac flush
>>  * right after that, kernel gets the previous add from external software, it's
>>    allowed to add, and then sends an add notification
>>  * mean while, external software processes the link block/down and mac flush,
>>    followed by the mac add from kernel.  At this point, external software can't
>>    really know whether it's a user adding the mac intentionally or it's
>>    a race.
>>
>> This issue can't really be avoided in user-space.
>> As I've noted local and static entries are still allowed, and iproute2
>> bridge utility always
>> marks the entries as static (NUD_NOARP), this only affects external
>> dynamic entries which
>> are usually sent by something that does the learning externally.
>> I'll keep digging to see if there's another way to go about this since
>> I'd like to give the user
>> full freedom. Personally I don't have strong feeling for this patch
>> and if it's not preferred then
>> I'll post a revert.
>
> So there is already a switchdev API to add/del an externally learned
> FDB entry which holds rtnl_lock and avoids these races.  I would
> suggest using that and revert this patch.
>
> See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and
> the handler in br.c:br_switchdev_event().
>
> -scott

Hmm, I'm new to the switchdev API and am possibly missing something,
but how do you suggest to use it here ?
How can we differentiate between user-added entry and an externally
learned one ?
Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding
an entry from user-space so
the API can get called in br_fdb_add ?
Minor note: br_fdb_add (ndo_fdb_add) is already called with rtnl held,
so the API will have to be used directly.

Thanks,
 Nik

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-27  8:35       ` Nikolay Aleksandrov
@ 2015-05-27 16:01         ` Scott Feldman
  2015-05-27 16:14           ` Nikolay Aleksandrov
  2015-06-02 17:14           ` roopa
  0 siblings, 2 replies; 15+ messages in thread
From: Scott Feldman @ 2015-05-27 16:01 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Stephen Hemminger, Netdev, Wilson Kok, David Miller, bridge,
	Jiří Pírko

On Wed, May 27, 2015 at 1:35 AM, Nikolay Aleksandrov
<nikolay@cumulusnetworks.com> wrote:
> On Wed, May 27, 2015 at 9:59 AM, Scott Feldman <sfeldma@gmail.com> wrote:
>> On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov
>> <nikolay@cumulusnetworks.com> wrote:
>>> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger
>>> <stephen@networkplumber.org> wrote:
>>>> On Thu, 21 May 2015 03:42:57 -0700
>>>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>>>
>>>>> From: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>
>>>>> Check in fdb_add_entry() if the source port should learn, similar
>>>>> check is used in br_fdb_update.
>>>>> Note that new fdb entries which are added manually or
>>>>> as local ones are still permitted.
>>>>> This patch has been tested by running traffic via a bridge port and
>>>>> switching the port's state, also by manually adding/removing entries
>>>>> from the bridge's fdb.
>>>>>
>>>>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>>
>>>> What is the problem this is trying to solve?
>>>>
>>>> I think user should be allowed to manually add any entry
>>>> even if learning.
>>>
>>> Hi Stephen,
>>> I have been thinking about the use case and have discussed it
>>> internally with colleagues and the patch
>>> author, the main problem is when there's an external software that
>>> adds dynamic entries (learning) and
>>> it could experience a race condition, here's a possible situation:
>>> * external software learns a mac from hw, sends an add to kernel
>>>  * right before that, port goes blocking (or down) and kernel flushes
>>>    mac, sends notification about the state change and mac flush
>>>  * right after that, kernel gets the previous add from external software, it's
>>>    allowed to add, and then sends an add notification
>>>  * mean while, external software processes the link block/down and mac flush,
>>>    followed by the mac add from kernel.  At this point, external software can't
>>>    really know whether it's a user adding the mac intentionally or it's
>>>    a race.
>>>
>>> This issue can't really be avoided in user-space.
>>> As I've noted local and static entries are still allowed, and iproute2
>>> bridge utility always
>>> marks the entries as static (NUD_NOARP), this only affects external
>>> dynamic entries which
>>> are usually sent by something that does the learning externally.
>>> I'll keep digging to see if there's another way to go about this since
>>> I'd like to give the user
>>> full freedom. Personally I don't have strong feeling for this patch
>>> and if it's not preferred then
>>> I'll post a revert.
>>
>> So there is already a switchdev API to add/del an externally learned
>> FDB entry which holds rtnl_lock and avoids these races.  I would
>> suggest using that and revert this patch.
>>
>> See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and
>> the handler in br.c:br_switchdev_event().
>>
>> -scott
>
> Hmm, I'm new to the switchdev API and am possibly missing something,
> but how do you suggest to use it here ?

You need to call
call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) when the
device learns a new mac/vlan on the port interface.

> How can we differentiate between user-added entry and an externally
> learned one ?

Externally added ones will be marked with NTF_EXT_LEARNED set in
ndm->ndm_flags in the netlink echo.  Manually added ones from the user
will have ndm->ndm_state set to NUD_NOARP in the netlink echo.

> Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding
> an entry from user-space so
> the API can get called in br_fdb_add ?

No.  br_fdb_add is the bridge's .ndo_fdb_add handler called when user
manually adds an FDB entry using netlink RTM_NEWNEIGH.  For externally
learned entries, use the internal
call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL).

-scott

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-27 16:01         ` Scott Feldman
@ 2015-05-27 16:14           ` Nikolay Aleksandrov
  2015-05-27 20:41             ` Scott Feldman
  2015-06-02 17:14           ` roopa
  1 sibling, 1 reply; 15+ messages in thread
From: Nikolay Aleksandrov @ 2015-05-27 16:14 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Stephen Hemminger, Netdev, Wilson Kok, David Miller, bridge,
	Jiří Pírko

On Wed, May 27, 2015 at 6:01 PM, Scott Feldman <sfeldma@gmail.com> wrote:
> On Wed, May 27, 2015 at 1:35 AM, Nikolay Aleksandrov
> <nikolay@cumulusnetworks.com> wrote:
>> On Wed, May 27, 2015 at 9:59 AM, Scott Feldman <sfeldma@gmail.com> wrote:
>>> On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov
>>> <nikolay@cumulusnetworks.com> wrote:
>>>> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger
>>>> <stephen@networkplumber.org> wrote:
>>>>> On Thu, 21 May 2015 03:42:57 -0700
>>>>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>>>>
>>>>>> From: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>>
>>>>>> Check in fdb_add_entry() if the source port should learn, similar
>>>>>> check is used in br_fdb_update.
>>>>>> Note that new fdb entries which are added manually or
>>>>>> as local ones are still permitted.
>>>>>> This patch has been tested by running traffic via a bridge port and
>>>>>> switching the port's state, also by manually adding/removing entries
>>>>>> from the bridge's fdb.
>>>>>>
>>>>>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>>>
>>>>> What is the problem this is trying to solve?
>>>>>
>>>>> I think user should be allowed to manually add any entry
>>>>> even if learning.
>>>>
>>>> Hi Stephen,
>>>> I have been thinking about the use case and have discussed it
>>>> internally with colleagues and the patch
>>>> author, the main problem is when there's an external software that
>>>> adds dynamic entries (learning) and
>>>> it could experience a race condition, here's a possible situation:
>>>> * external software learns a mac from hw, sends an add to kernel
>>>>  * right before that, port goes blocking (or down) and kernel flushes
>>>>    mac, sends notification about the state change and mac flush
>>>>  * right after that, kernel gets the previous add from external software, it's
>>>>    allowed to add, and then sends an add notification
>>>>  * mean while, external software processes the link block/down and mac flush,
>>>>    followed by the mac add from kernel.  At this point, external software can't
>>>>    really know whether it's a user adding the mac intentionally or it's
>>>>    a race.
>>>>
>>>> This issue can't really be avoided in user-space.
>>>> As I've noted local and static entries are still allowed, and iproute2
>>>> bridge utility always
>>>> marks the entries as static (NUD_NOARP), this only affects external
>>>> dynamic entries which
>>>> are usually sent by something that does the learning externally.
>>>> I'll keep digging to see if there's another way to go about this since
>>>> I'd like to give the user
>>>> full freedom. Personally I don't have strong feeling for this patch
>>>> and if it's not preferred then
>>>> I'll post a revert.
>>>
>>> So there is already a switchdev API to add/del an externally learned
>>> FDB entry which holds rtnl_lock and avoids these races.  I would
>>> suggest using that and revert this patch.
>>>
>>> See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and
>>> the handler in br.c:br_switchdev_event().
>>>
>>> -scott
>>
>> Hmm, I'm new to the switchdev API and am possibly missing something,
>> but how do you suggest to use it here ?
>
> You need to call
> call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) when the
> device learns a new mac/vlan on the port interface.
>
>> How can we differentiate between user-added entry and an externally
>> learned one ?
>
> Externally added ones will be marked with NTF_EXT_LEARNED set in
> ndm->ndm_flags in the netlink echo.  Manually added ones from the user
> will have ndm->ndm_state set to NUD_NOARP in the netlink echo.
>

I meant between externally learned entries from a user-space daemon and manually
added by the user.

>> Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding
>> an entry from user-space so
>> the API can get called in br_fdb_add ?
>
> No.  br_fdb_add is the bridge's .ndo_fdb_add handler called when user
> manually adds an FDB entry using netlink RTM_NEWNEIGH.  For externally
> learned entries, use the internal
> call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL).
>
> -scott

I got the API, but it doesn't help in the situation I explained
because it's a user-space
software that adds the externally learned entry, so I'm talking about
differentiating between externally learned dynamic entry from a device
which doesn't have
a kernel driver and can't call these notifiers, thus if we disallow
such dynamic entries when
the port is not in the proper state helps to both close the race and
fix the problem.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-27 16:14           ` Nikolay Aleksandrov
@ 2015-05-27 20:41             ` Scott Feldman
  0 siblings, 0 replies; 15+ messages in thread
From: Scott Feldman @ 2015-05-27 20:41 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Stephen Hemminger, Netdev, Wilson Kok, David Miller, bridge,
	Jiří Pírko

On Wed, May 27, 2015 at 9:14 AM, Nikolay Aleksandrov
<nikolay@cumulusnetworks.com> wrote:
> On Wed, May 27, 2015 at 6:01 PM, Scott Feldman <sfeldma@gmail.com> wrote:
>> On Wed, May 27, 2015 at 1:35 AM, Nikolay Aleksandrov
>> <nikolay@cumulusnetworks.com> wrote:
>>> On Wed, May 27, 2015 at 9:59 AM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>> On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov
>>>> <nikolay@cumulusnetworks.com> wrote:
>>>>> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger
>>>>> <stephen@networkplumber.org> wrote:
>>>>>> On Thu, 21 May 2015 03:42:57 -0700
>>>>>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>>>>>
>>>>>>> From: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>>>
>>>>>>> Check in fdb_add_entry() if the source port should learn, similar
>>>>>>> check is used in br_fdb_update.
>>>>>>> Note that new fdb entries which are added manually or
>>>>>>> as local ones are still permitted.
>>>>>>> This patch has been tested by running traffic via a bridge port and
>>>>>>> switching the port's state, also by manually adding/removing entries
>>>>>>> from the bridge's fdb.
>>>>>>>
>>>>>>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>>>>
>>>>>> What is the problem this is trying to solve?
>>>>>>
>>>>>> I think user should be allowed to manually add any entry
>>>>>> even if learning.
>>>>>
>>>>> Hi Stephen,
>>>>> I have been thinking about the use case and have discussed it
>>>>> internally with colleagues and the patch
>>>>> author, the main problem is when there's an external software that
>>>>> adds dynamic entries (learning) and
>>>>> it could experience a race condition, here's a possible situation:
>>>>> * external software learns a mac from hw, sends an add to kernel
>>>>>  * right before that, port goes blocking (or down) and kernel flushes
>>>>>    mac, sends notification about the state change and mac flush
>>>>>  * right after that, kernel gets the previous add from external software, it's
>>>>>    allowed to add, and then sends an add notification
>>>>>  * mean while, external software processes the link block/down and mac flush,
>>>>>    followed by the mac add from kernel.  At this point, external software can't
>>>>>    really know whether it's a user adding the mac intentionally or it's
>>>>>    a race.
>>>>>
>>>>> This issue can't really be avoided in user-space.
>>>>> As I've noted local and static entries are still allowed, and iproute2
>>>>> bridge utility always
>>>>> marks the entries as static (NUD_NOARP), this only affects external
>>>>> dynamic entries which
>>>>> are usually sent by something that does the learning externally.
>>>>> I'll keep digging to see if there's another way to go about this since
>>>>> I'd like to give the user
>>>>> full freedom. Personally I don't have strong feeling for this patch
>>>>> and if it's not preferred then
>>>>> I'll post a revert.
>>>>
>>>> So there is already a switchdev API to add/del an externally learned
>>>> FDB entry which holds rtnl_lock and avoids these races.  I would
>>>> suggest using that and revert this patch.
>>>>
>>>> See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and
>>>> the handler in br.c:br_switchdev_event().
>>>>
>>>> -scott
>>>
>>> Hmm, I'm new to the switchdev API and am possibly missing something,
>>> but how do you suggest to use it here ?
>>
>> You need to call
>> call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) when the
>> device learns a new mac/vlan on the port interface.
>>
>>> How can we differentiate between user-added entry and an externally
>>> learned one ?
>>
>> Externally added ones will be marked with NTF_EXT_LEARNED set in
>> ndm->ndm_flags in the netlink echo.  Manually added ones from the user
>> will have ndm->ndm_state set to NUD_NOARP in the netlink echo.
>>
>
> I meant between externally learned entries from a user-space daemon and manually
> added by the user.
>
>>> Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding
>>> an entry from user-space so
>>> the API can get called in br_fdb_add ?
>>
>> No.  br_fdb_add is the bridge's .ndo_fdb_add handler called when user
>> manually adds an FDB entry using netlink RTM_NEWNEIGH.  For externally
>> learned entries, use the internal
>> call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL).
>>
>> -scott
>
> I got the API, but it doesn't help in the situation I explained
> because it's a user-space
> software that adds the externally learned entry, so I'm talking about
> differentiating between externally learned dynamic entry from a device
> which doesn't have
> a kernel driver and can't call these notifiers, thus if we disallow
> such dynamic entries when
> the port is not in the proper state helps to both close the race and
> fix the problem.

IMO, we should not be adding weird patches like this to the kernel to
support the out-of-kernel user-space switch drivers (SDK).  This patch
is trying to workaround a serialization issue with netlink messages
created by you because you're using a closed-source user-space driver.
It took us a couple of email replies to draw out your use-case, and if
someone down the road tries to figure out what this patch is doing,
it's not going to be obvious from the kernel code.

I feel this patch should be reverted unless someone can show how it
can be useful in another context.

-scotta

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-05-27 16:01         ` Scott Feldman
  2015-05-27 16:14           ` Nikolay Aleksandrov
@ 2015-06-02 17:14           ` roopa
  2015-06-03  5:57             ` Scott Feldman
  1 sibling, 1 reply; 15+ messages in thread
From: roopa @ 2015-06-02 17:14 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Jiří Pírko, Nikolay Aleksandrov, Netdev, bridge,
	Wilson Kok, David Miller

On 5/27/15, 9:01 AM, Scott Feldman wrote:
> On Wed, May 27, 2015 at 1:35 AM, Nikolay Aleksandrov
> <nikolay@cumulusnetworks.com> wrote:
>> On Wed, May 27, 2015 at 9:59 AM, Scott Feldman <sfeldma@gmail.com> wrote:
>>> On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov
>>> <nikolay@cumulusnetworks.com> wrote:
>>>> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger
>>>> <stephen@networkplumber.org> wrote:
>>>>> On Thu, 21 May 2015 03:42:57 -0700
>>>>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>>>>
>>>>>> From: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>>
>>>>>> Check in fdb_add_entry() if the source port should learn, similar
>>>>>> check is used in br_fdb_update.
>>>>>> Note that new fdb entries which are added manually or
>>>>>> as local ones are still permitted.
>>>>>> This patch has been tested by running traffic via a bridge port and
>>>>>> switching the port's state, also by manually adding/removing entries
>>>>>> from the bridge's fdb.
>>>>>>
>>>>>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>>> What is the problem this is trying to solve?
>>>>>
>>>>> I think user should be allowed to manually add any entry
>>>>> even if learning.
>>>> Hi Stephen,
>>>> I have been thinking about the use case and have discussed it
>>>> internally with colleagues and the patch
>>>> author, the main problem is when there's an external software that
>>>> adds dynamic entries (learning) and
>>>> it could experience a race condition, here's a possible situation:
>>>> * external software learns a mac from hw, sends an add to kernel
>>>>   * right before that, port goes blocking (or down) and kernel flushes
>>>>     mac, sends notification about the state change and mac flush
>>>>   * right after that, kernel gets the previous add from external software, it's
>>>>     allowed to add, and then sends an add notification
>>>>   * mean while, external software processes the link block/down and mac flush,
>>>>     followed by the mac add from kernel.  At this point, external software can't
>>>>     really know whether it's a user adding the mac intentionally or it's
>>>>     a race.
>>>>
>>>> This issue can't really be avoided in user-space.
>>>> As I've noted local and static entries are still allowed, and iproute2
>>>> bridge utility always
>>>> marks the entries as static (NUD_NOARP), this only affects external
>>>> dynamic entries which
>>>> are usually sent by something that does the learning externally.
>>>> I'll keep digging to see if there's another way to go about this since
>>>> I'd like to give the user
>>>> full freedom. Personally I don't have strong feeling for this patch
>>>> and if it's not preferred then
>>>> I'll post a revert.
>>> So there is already a switchdev API to add/del an externally learned
>>> FDB entry which holds rtnl_lock and avoids these races.  I would
>>> suggest using that and revert this patch.
>>>
>>> See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and
>>> the handler in br.c:br_switchdev_event().
>>>
>>> -scott
>> Hmm, I'm new to the switchdev API and am possibly missing something,
>> but how do you suggest to use it here ?
> You need to call
> call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) when the
> device learns a new mac/vlan on the port interface.
>
>> How can we differentiate between user-added entry and an externally
>> learned one ?
> Externally added ones will be marked with NTF_EXT_LEARNED set in
> ndm->ndm_flags in the netlink echo.  Manually added ones from the user
> will have ndm->ndm_state set to NUD_NOARP in the netlink echo.
>
>> Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding
>> an entry from user-space so
>> the API can get called in br_fdb_add ?
> No.  br_fdb_add is the bridge's .ndo_fdb_add handler called when user
> manually adds an FDB entry using netlink RTM_NEWNEIGH.  For externally
> learned entries, use the internal
> call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL).

scott, I am assuming you are ok with an external learning entity (user 
space driver or a controller) pushing entries
with the NTF_EXT_LEARNED correct ?. Because NTF_EXT_LEARNED is in uapi 
(and analogous to RTNH_F_OFFLOAD in the fib offload world IMO)

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-06-02 17:14           ` roopa
@ 2015-06-03  5:57             ` Scott Feldman
  2015-06-04  8:14               ` Nikolay Aleksandrov
  0 siblings, 1 reply; 15+ messages in thread
From: Scott Feldman @ 2015-06-03  5:57 UTC (permalink / raw)
  To: roopa
  Cc: Nikolay Aleksandrov, Stephen Hemminger, Netdev, Wilson Kok,
	David Miller, bridge, Jiří Pírko

On Tue, Jun 2, 2015 at 10:14 AM, roopa <roopa@cumulusnetworks.com> wrote:
> On 5/27/15, 9:01 AM, Scott Feldman wrote:
>>
>> On Wed, May 27, 2015 at 1:35 AM, Nikolay Aleksandrov
>> <nikolay@cumulusnetworks.com> wrote:
>>>
>>> On Wed, May 27, 2015 at 9:59 AM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>>
>>>> On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov
>>>> <nikolay@cumulusnetworks.com> wrote:
>>>>>
>>>>> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger
>>>>> <stephen@networkplumber.org> wrote:
>>>>>>
>>>>>> On Thu, 21 May 2015 03:42:57 -0700
>>>>>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>>>>>
>>>>>>> From: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>>>
>>>>>>> Check in fdb_add_entry() if the source port should learn, similar
>>>>>>> check is used in br_fdb_update.
>>>>>>> Note that new fdb entries which are added manually or
>>>>>>> as local ones are still permitted.
>>>>>>> This patch has been tested by running traffic via a bridge port and
>>>>>>> switching the port's state, also by manually adding/removing entries
>>>>>>> from the bridge's fdb.
>>>>>>>
>>>>>>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>>>>
>>>>>> What is the problem this is trying to solve?
>>>>>>
>>>>>> I think user should be allowed to manually add any entry
>>>>>> even if learning.
>>>>>
>>>>> Hi Stephen,
>>>>> I have been thinking about the use case and have discussed it
>>>>> internally with colleagues and the patch
>>>>> author, the main problem is when there's an external software that
>>>>> adds dynamic entries (learning) and
>>>>> it could experience a race condition, here's a possible situation:
>>>>> * external software learns a mac from hw, sends an add to kernel
>>>>>   * right before that, port goes blocking (or down) and kernel flushes
>>>>>     mac, sends notification about the state change and mac flush
>>>>>   * right after that, kernel gets the previous add from external
>>>>> software, it's
>>>>>     allowed to add, and then sends an add notification
>>>>>   * mean while, external software processes the link block/down and mac
>>>>> flush,
>>>>>     followed by the mac add from kernel.  At this point, external
>>>>> software can't
>>>>>     really know whether it's a user adding the mac intentionally or
>>>>> it's
>>>>>     a race.
>>>>>
>>>>> This issue can't really be avoided in user-space.
>>>>> As I've noted local and static entries are still allowed, and iproute2
>>>>> bridge utility always
>>>>> marks the entries as static (NUD_NOARP), this only affects external
>>>>> dynamic entries which
>>>>> are usually sent by something that does the learning externally.
>>>>> I'll keep digging to see if there's another way to go about this since
>>>>> I'd like to give the user
>>>>> full freedom. Personally I don't have strong feeling for this patch
>>>>> and if it's not preferred then
>>>>> I'll post a revert.
>>>>
>>>> So there is already a switchdev API to add/del an externally learned
>>>> FDB entry which holds rtnl_lock and avoids these races.  I would
>>>> suggest using that and revert this patch.
>>>>
>>>> See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and
>>>> the handler in br.c:br_switchdev_event().
>>>>
>>>> -scott
>>>
>>> Hmm, I'm new to the switchdev API and am possibly missing something,
>>> but how do you suggest to use it here ?
>>
>> You need to call
>> call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) when the
>> device learns a new mac/vlan on the port interface.
>>
>>> How can we differentiate between user-added entry and an externally
>>> learned one ?
>>
>> Externally added ones will be marked with NTF_EXT_LEARNED set in
>> ndm->ndm_flags in the netlink echo.  Manually added ones from the user
>> will have ndm->ndm_state set to NUD_NOARP in the netlink echo.
>>
>>> Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding
>>> an entry from user-space so
>>> the API can get called in br_fdb_add ?
>>
>> No.  br_fdb_add is the bridge's .ndo_fdb_add handler called when user
>> manually adds an FDB entry using netlink RTM_NEWNEIGH.  For externally
>> learned entries, use the internal
>> call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL).
>
>
> scott, I am assuming you are ok with an external learning entity (user space
> driver or a controller) pushing entries
> with the NTF_EXT_LEARNED correct ?. Because NTF_EXT_LEARNED is in uapi (and
> analogous to RTNH_F_OFFLOAD in the fib offload world IMO)

That seems OK.  I can see how that would remove the need for this
patch, but still give you the control from user space daemon/listener
to figure out what happened.

^ permalink raw reply	[flat|nested] 15+ messages in thread

* Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn
  2015-06-03  5:57             ` Scott Feldman
@ 2015-06-04  8:14               ` Nikolay Aleksandrov
  0 siblings, 0 replies; 15+ messages in thread
From: Nikolay Aleksandrov @ 2015-06-04  8:14 UTC (permalink / raw)
  To: Scott Feldman
  Cc: Jiří Pírko, Netdev, roopa, bridge, Wilson Kok,
	David Miller

On Wed, Jun 3, 2015 at 7:57 AM, Scott Feldman <sfeldma@gmail.com> wrote:
> On Tue, Jun 2, 2015 at 10:14 AM, roopa <roopa@cumulusnetworks.com> wrote:
>> On 5/27/15, 9:01 AM, Scott Feldman wrote:
>>>
>>> On Wed, May 27, 2015 at 1:35 AM, Nikolay Aleksandrov
>>> <nikolay@cumulusnetworks.com> wrote:
>>>>
>>>> On Wed, May 27, 2015 at 9:59 AM, Scott Feldman <sfeldma@gmail.com> wrote:
>>>>>
>>>>> On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov
>>>>> <nikolay@cumulusnetworks.com> wrote:
>>>>>>
>>>>>> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger
>>>>>> <stephen@networkplumber.org> wrote:
>>>>>>>
>>>>>>> On Thu, 21 May 2015 03:42:57 -0700
>>>>>>> Nikolay Aleksandrov <nikolay@cumulusnetworks.com> wrote:
>>>>>>>
>>>>>>>> From: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>>>>
>>>>>>>> Check in fdb_add_entry() if the source port should learn, similar
>>>>>>>> check is used in br_fdb_update.
>>>>>>>> Note that new fdb entries which are added manually or
>>>>>>>> as local ones are still permitted.
>>>>>>>> This patch has been tested by running traffic via a bridge port and
>>>>>>>> switching the port's state, also by manually adding/removing entries
>>>>>>>> from the bridge's fdb.
>>>>>>>>
>>>>>>>> Signed-off-by: Wilson Kok <wkok@cumulusnetworks.com>
>>>>>>>> Signed-off-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>
>>>>>>>
>>>>>>> What is the problem this is trying to solve?
>>>>>>>
>>>>>>> I think user should be allowed to manually add any entry
>>>>>>> even if learning.
>>>>>>
>>>>>> Hi Stephen,
>>>>>> I have been thinking about the use case and have discussed it
>>>>>> internally with colleagues and the patch
>>>>>> author, the main problem is when there's an external software that
>>>>>> adds dynamic entries (learning) and
>>>>>> it could experience a race condition, here's a possible situation:
>>>>>> * external software learns a mac from hw, sends an add to kernel
>>>>>>   * right before that, port goes blocking (or down) and kernel flushes
>>>>>>     mac, sends notification about the state change and mac flush
>>>>>>   * right after that, kernel gets the previous add from external
>>>>>> software, it's
>>>>>>     allowed to add, and then sends an add notification
>>>>>>   * mean while, external software processes the link block/down and mac
>>>>>> flush,
>>>>>>     followed by the mac add from kernel.  At this point, external
>>>>>> software can't
>>>>>>     really know whether it's a user adding the mac intentionally or
>>>>>> it's
>>>>>>     a race.
>>>>>>
>>>>>> This issue can't really be avoided in user-space.
>>>>>> As I've noted local and static entries are still allowed, and iproute2
>>>>>> bridge utility always
>>>>>> marks the entries as static (NUD_NOARP), this only affects external
>>>>>> dynamic entries which
>>>>>> are usually sent by something that does the learning externally.
>>>>>> I'll keep digging to see if there's another way to go about this since
>>>>>> I'd like to give the user
>>>>>> full freedom. Personally I don't have strong feeling for this patch
>>>>>> and if it's not preferred then
>>>>>> I'll post a revert.
>>>>>
>>>>> So there is already a switchdev API to add/del an externally learned
>>>>> FDB entry which holds rtnl_lock and avoids these races.  I would
>>>>> suggest using that and revert this patch.
>>>>>
>>>>> See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and
>>>>> the handler in br.c:br_switchdev_event().
>>>>>
>>>>> -scott
>>>>
>>>> Hmm, I'm new to the switchdev API and am possibly missing something,
>>>> but how do you suggest to use it here ?
>>>
>>> You need to call
>>> call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) when the
>>> device learns a new mac/vlan on the port interface.
>>>
>>>> How can we differentiate between user-added entry and an externally
>>>> learned one ?
>>>
>>> Externally added ones will be marked with NTF_EXT_LEARNED set in
>>> ndm->ndm_flags in the netlink echo.  Manually added ones from the user
>>> will have ndm->ndm_state set to NUD_NOARP in the netlink echo.
>>>
>>>> Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding
>>>> an entry from user-space so
>>>> the API can get called in br_fdb_add ?
>>>
>>> No.  br_fdb_add is the bridge's .ndo_fdb_add handler called when user
>>> manually adds an FDB entry using netlink RTM_NEWNEIGH.  For externally
>>> learned entries, use the internal
>>> call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL).
>>
>>
>> scott, I am assuming you are ok with an external learning entity (user space
>> driver or a controller) pushing entries
>> with the NTF_EXT_LEARNED correct ?. Because NTF_EXT_LEARNED is in uapi (and
>> analogous to RTNH_F_OFFLOAD in the fib offload world IMO)
>
> That seems OK.  I can see how that would remove the need for this
> patch, but still give you the control from user space daemon/listener
> to figure out what happened.

Hi,
I've been working on that but it doesn't really solve the problem
entirely because we still can get the
same race condition for the replace/modify case.
The reason is we have 2 choices:
1. Keep the flag permanent when an entry is created(learned) with it
 - This seems like the proper way since the entry was learned
externally somehow and that won't change

2. Modify the flag upon user change
 - I don't like this because it breaks the meaning and the consistency.

Thus we still cannot distinguish between user-generated request for
such entry and an external learning
process modifying it in the situation I gave in the beginning.
The more we discuss this patch internally, the more I'm actually
convinced it's correct because any
external learning entity (be it SDK, or some other software that adds
entries) should get an error when
trying to add/modify a dynamic (learned) entry for a port which
shouldn't learn. The same applies for
the entries that are learned via br_fdb_update, and that's why a
similar check is present there.
I think we should keep the patch, the new behaviour is justified.

Cheers,
 Nik

^ permalink raw reply	[flat|nested] 15+ messages in thread

end of thread, other threads:[~2015-06-04  8:14 UTC | newest]

Thread overview: 15+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2015-05-21 10:42 [PATCH net-next] bridge: skip fdb add if the port shouldn't learn Nikolay Aleksandrov
2015-05-25  2:59 ` David Miller
2015-05-25 11:35   ` Nikolay Aleksandrov
2015-05-25 11:41   ` Nikolay Aleksandrov
2015-05-25 13:39 ` [PATCH net-next v2] " Nikolay Aleksandrov
2015-05-26 17:28 ` [PATCH net-next] " Stephen Hemminger
2015-05-27  7:05   ` Nikolay Aleksandrov
2015-05-27  7:59     ` Scott Feldman
2015-05-27  8:35       ` Nikolay Aleksandrov
2015-05-27 16:01         ` Scott Feldman
2015-05-27 16:14           ` Nikolay Aleksandrov
2015-05-27 20:41             ` Scott Feldman
2015-06-02 17:14           ` roopa
2015-06-03  5:57             ` Scott Feldman
2015-06-04  8:14               ` Nikolay Aleksandrov

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).