From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=cumulusnetworks.com; s=google; h=message-id:date:from:user-agent:mime-version:to:cc:subject :references:in-reply-to:content-type:content-transfer-encoding; bh=+2g7D43fbQHVc6oq7TS8QjIuzVoabc2fdj4mME1e6qs=; b=I2dOa86sApLaVgorFa2x3KVPFP1eicVA9oPwZ7u2PRcClIb7pDc4XFQML/GZIGiHqo n2R6uT3yXB6aO0NHHH/21mLIJTGMBFuOlxTh6B6PzIdkj8A0SooJDKzdhyQvZggdqBwW KtbmNpPAAs9lmEcf7v2a3B9C9rozgtxnx594Q= Message-ID: <556DE481.5030201@cumulusnetworks.com> Date: Tue, 02 Jun 2015 10:14:41 -0700 From: roopa MIME-Version: 1.0 References: <1432204977-4293-1-git-send-email-nikolay@cumulusnetworks.com> <20150526102809.5c3e8abc@urahara> In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Subject: Re: [Bridge] [PATCH net-next] bridge: skip fdb add if the port shouldn't learn List-Id: Linux Ethernet Bridging List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , To: Scott Feldman Cc: =?UTF-8?B?SmnFmcOtIFDDrXJrbw==?= , Nikolay Aleksandrov , Netdev , bridge@lists.linux-foundation.org, Wilson Kok , David Miller On 5/27/15, 9:01 AM, Scott Feldman wrote: > On Wed, May 27, 2015 at 1:35 AM, Nikolay Aleksandrov > wrote: >> On Wed, May 27, 2015 at 9:59 AM, Scott Feldman wrote: >>> On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov >>> wrote: >>>> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger >>>> wrote: >>>>> On Thu, 21 May 2015 03:42:57 -0700 >>>>> Nikolay Aleksandrov wrote: >>>>> >>>>>> From: Wilson Kok >>>>>> >>>>>> Check in fdb_add_entry() if the source port should learn, similar >>>>>> check is used in br_fdb_update. >>>>>> Note that new fdb entries which are added manually or >>>>>> as local ones are still permitted. >>>>>> This patch has been tested by running traffic via a bridge port and >>>>>> switching the port's state, also by manually adding/removing entries >>>>>> from the bridge's fdb. >>>>>> >>>>>> Signed-off-by: Wilson Kok >>>>>> Signed-off-by: Nikolay Aleksandrov >>>>> What is the problem this is trying to solve? >>>>> >>>>> I think user should be allowed to manually add any entry >>>>> even if learning. >>>> Hi Stephen, >>>> I have been thinking about the use case and have discussed it >>>> internally with colleagues and the patch >>>> author, the main problem is when there's an external software that >>>> adds dynamic entries (learning) and >>>> it could experience a race condition, here's a possible situation: >>>> * external software learns a mac from hw, sends an add to kernel >>>> * right before that, port goes blocking (or down) and kernel flushes >>>> mac, sends notification about the state change and mac flush >>>> * right after that, kernel gets the previous add from external software, it's >>>> allowed to add, and then sends an add notification >>>> * mean while, external software processes the link block/down and mac flush, >>>> followed by the mac add from kernel. At this point, external software can't >>>> really know whether it's a user adding the mac intentionally or it's >>>> a race. >>>> >>>> This issue can't really be avoided in user-space. >>>> As I've noted local and static entries are still allowed, and iproute2 >>>> bridge utility always >>>> marks the entries as static (NUD_NOARP), this only affects external >>>> dynamic entries which >>>> are usually sent by something that does the learning externally. >>>> I'll keep digging to see if there's another way to go about this since >>>> I'd like to give the user >>>> full freedom. Personally I don't have strong feeling for this patch >>>> and if it's not preferred then >>>> I'll post a revert. >>> So there is already a switchdev API to add/del an externally learned >>> FDB entry which holds rtnl_lock and avoids these races. I would >>> suggest using that and revert this patch. >>> >>> See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and >>> the handler in br.c:br_switchdev_event(). >>> >>> -scott >> Hmm, I'm new to the switchdev API and am possibly missing something, >> but how do you suggest to use it here ? > You need to call > call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) when the > device learns a new mac/vlan on the port interface. > >> How can we differentiate between user-added entry and an externally >> learned one ? > Externally added ones will be marked with NTF_EXT_LEARNED set in > ndm->ndm_flags in the netlink echo. Manually added ones from the user > will have ndm->ndm_state set to NUD_NOARP in the netlink echo. > >> Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding >> an entry from user-space so >> the API can get called in br_fdb_add ? > No. br_fdb_add is the bridge's .ndo_fdb_add handler called when user > manually adds an FDB entry using netlink RTM_NEWNEIGH. For externally > learned entries, use the internal > call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL). scott, I am assuming you are ok with an external learning entity (user space driver or a controller) pushing entries with the NTF_EXT_LEARNED correct ?. Because NTF_EXT_LEARNED is in uapi (and analogous to RTNH_F_OFFLOAD in the fib offload world IMO) From mboxrd@z Thu Jan 1 00:00:00 1970 From: roopa Subject: Re: [PATCH net-next] bridge: skip fdb add if the port shouldn't learn Date: Tue, 02 Jun 2015 10:14:41 -0700 Message-ID: <556DE481.5030201@cumulusnetworks.com> References: <1432204977-4293-1-git-send-email-nikolay@cumulusnetworks.com> <20150526102809.5c3e8abc@urahara> Mime-Version: 1.0 Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit Cc: =?UTF-8?B?SmnFmcOtIFDDrXJrbw==?= , Nikolay Aleksandrov , Netdev , bridge@lists.linux-foundation.org, Wilson Kok , David Miller To: Scott Feldman Return-path: In-Reply-To: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: bridge-bounces@lists.linux-foundation.org Errors-To: bridge-bounces@lists.linux-foundation.org List-Id: netdev.vger.kernel.org On 5/27/15, 9:01 AM, Scott Feldman wrote: > On Wed, May 27, 2015 at 1:35 AM, Nikolay Aleksandrov > wrote: >> On Wed, May 27, 2015 at 9:59 AM, Scott Feldman wrote: >>> On Wed, May 27, 2015 at 12:05 AM, Nikolay Aleksandrov >>> wrote: >>>> On Tue, May 26, 2015 at 7:28 PM, Stephen Hemminger >>>> wrote: >>>>> On Thu, 21 May 2015 03:42:57 -0700 >>>>> Nikolay Aleksandrov wrote: >>>>> >>>>>> From: Wilson Kok >>>>>> >>>>>> Check in fdb_add_entry() if the source port should learn, similar >>>>>> check is used in br_fdb_update. >>>>>> Note that new fdb entries which are added manually or >>>>>> as local ones are still permitted. >>>>>> This patch has been tested by running traffic via a bridge port and >>>>>> switching the port's state, also by manually adding/removing entries >>>>>> from the bridge's fdb. >>>>>> >>>>>> Signed-off-by: Wilson Kok >>>>>> Signed-off-by: Nikolay Aleksandrov >>>>> What is the problem this is trying to solve? >>>>> >>>>> I think user should be allowed to manually add any entry >>>>> even if learning. >>>> Hi Stephen, >>>> I have been thinking about the use case and have discussed it >>>> internally with colleagues and the patch >>>> author, the main problem is when there's an external software that >>>> adds dynamic entries (learning) and >>>> it could experience a race condition, here's a possible situation: >>>> * external software learns a mac from hw, sends an add to kernel >>>> * right before that, port goes blocking (or down) and kernel flushes >>>> mac, sends notification about the state change and mac flush >>>> * right after that, kernel gets the previous add from external software, it's >>>> allowed to add, and then sends an add notification >>>> * mean while, external software processes the link block/down and mac flush, >>>> followed by the mac add from kernel. At this point, external software can't >>>> really know whether it's a user adding the mac intentionally or it's >>>> a race. >>>> >>>> This issue can't really be avoided in user-space. >>>> As I've noted local and static entries are still allowed, and iproute2 >>>> bridge utility always >>>> marks the entries as static (NUD_NOARP), this only affects external >>>> dynamic entries which >>>> are usually sent by something that does the learning externally. >>>> I'll keep digging to see if there's another way to go about this since >>>> I'd like to give the user >>>> full freedom. Personally I don't have strong feeling for this patch >>>> and if it's not preferred then >>>> I'll post a revert. >>> So there is already a switchdev API to add/del an externally learned >>> FDB entry which holds rtnl_lock and avoids these races. I would >>> suggest using that and revert this patch. >>> >>> See call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) and >>> the handler in br.c:br_switchdev_event(). >>> >>> -scott >> Hmm, I'm new to the switchdev API and am possibly missing something, >> but how do you suggest to use it here ? > You need to call > call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL) when the > device learns a new mac/vlan on the port interface. > >> How can we differentiate between user-added entry and an externally >> learned one ? > Externally added ones will be marked with NTF_EXT_LEARNED set in > ndm->ndm_flags in the netlink echo. Manually added ones from the user > will have ndm->ndm_state set to NUD_NOARP in the netlink echo. > >> Do you mean to use (for example) the NTF_EXT_LEARNED flag when adding >> an entry from user-space so >> the API can get called in br_fdb_add ? > No. br_fdb_add is the bridge's .ndo_fdb_add handler called when user > manually adds an FDB entry using netlink RTM_NEWNEIGH. For externally > learned entries, use the internal > call_switchdev_notifiers(SWITCHDEV_FDB_ADD|SWITCHDEV_FDB_DEL). scott, I am assuming you are ok with an external learning entity (user space driver or a controller) pushing entries with the NTF_EXT_LEARNED correct ?. Because NTF_EXT_LEARNED is in uapi (and analogous to RTNH_F_OFFLOAD in the fib offload world IMO)