* Re: [PATCH v2] rhashtable: make walk safe from softirq context
From: Bob Copeland @ 2019-02-12 21:54 UTC (permalink / raw)
To: Johannes Berg; +Cc: David Miller, linux-wireless, netdev, j, tgraf, herbert
In-Reply-To: <ea6479e089dffc46bb9b09e28d364db4c071998f.camel@sipsolutions.net>
On Tue, Feb 12, 2019 at 08:03:17PM +0100, Johannes Berg wrote:
> commit 60854fd94573f0d3b80b55b40cf0140a0430f3ab
> Author: Bob Copeland <me@bobcopeland.com>
> Date: Wed Mar 2 10:09:20 2016 -0500
>
> mac80211: mesh: convert path table to rhashtable
>
> which is kinda old. Not sure why this didn't surface before, because the
> spinlock was introduced *before*, otherwise certainly the mutex would've
> caused us to not be able to do this code to start with (commit
> c6ff5268293 - rhashtable: Fix walker list corruption).
>
> That commit also just converted an existing hashtable walk to
> rhashtable, so not sure that counts as having introduced the problem :-)
Yeah, I didn't change any of the contexts in which the iterations were
called, except making it possible to replace the previous mesh-specific
RCU hashtable with rhashtable by introducing this. That said, it would
certainly be nice to not have to walk the table to find paths that
use a specific station as nexthop.
> But I guess we should also ask Bob first:
> 1) do you think it'd be easy to maintain a separate list or avoid the
> iteration in some other way, and make that a small enough patch to be
> applicable for stable?
I'm not sure, it's been a while. I guess most of the difficulties would
be around station removal?
> 2) or do you think maybe the mesh_plink_broken() call could just be
> lifted into a workqueue instead?
As far as I know, no reason this can't wait until later; we might send
a few frames to the wrong destination but that can happen anyway. But
there are a couple more walks in there like mesh_path_flush_by_nexthop().
--
Bob Copeland %% https://bobcopeland.com/
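The "separate list" option raised in question 1 can be sketched in userspace C. This is only an illustration under assumptions, not mac80211 code: struct station, struct mesh_path and all helper names are hypothetical, and locking/RCU is left out entirely. Each path is linked into a list owned by its next-hop station, so handling a broken link touches only that station's paths instead of walking the whole table.

```c
#include <assert.h>
#include <stddef.h>

struct station;

/* Hypothetical path entry; the nh_* fields link it into its next hop's list. */
struct mesh_path {
	int dst_id;
	struct station *next_hop;
	struct mesh_path *nh_next;	/* next path using the same next hop */
	struct mesh_path **nh_pprev;	/* address of the pointer pointing at us */
	int active;
};

/* Hypothetical station; paths is the head of the list of paths routed via it. */
struct station {
	struct mesh_path *paths;
};

/* Remove a path from its current next hop's list, if it is on one. */
static void path_unlink(struct mesh_path *p)
{
	assert(p != NULL);
	if (!p->next_hop)
		return;
	*p->nh_pprev = p->nh_next;
	if (p->nh_next)
		p->nh_next->nh_pprev = p->nh_pprev;
	p->next_hop = NULL;
	p->nh_next = NULL;
	p->nh_pprev = NULL;
}

/* Point a path at a (new) next hop and link it at the front of that list. */
static void path_assign_next_hop(struct mesh_path *p, struct station *sta)
{
	path_unlink(p);
	p->next_hop = sta;
	p->nh_next = sta->paths;
	p->nh_pprev = &sta->paths;
	if (sta->paths)
		sta->paths->nh_pprev = &p->nh_next;
	sta->paths = p;
	p->active = 1;
}

/* Deactivate every path routed via sta; returns how many were touched. */
static int station_link_broken(struct station *sta)
{
	int n = 0;

	while (sta->paths) {
		struct mesh_path *p = sta->paths;

		path_unlink(p);
		p->active = 0;
		n++;
	}
	return n;
}
```

Station removal, the case Bob expects to be difficult, maps onto station_link_broken() here; in real code the list would also need protection against concurrent path updates.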
* Re: [PATCH net-next] rocker: Remove port_attr_bridge_flags_get assignment
From: Florian Fainelli @ 2019-02-12 21:56 UTC (permalink / raw)
To: David Miller; +Cc: netdev, jiri, linux-kernel
In-Reply-To: <20190212.135401.857430040745833995.davem@davemloft.net>
On 2/12/2019 1:54 PM, David Miller wrote:
> From: Florian Fainelli <f.fainelli@gmail.com>
> Date: Tue, 12 Feb 2019 13:39:05 -0800
>
>> After 610d2b601bba ("rocker: Remove getting PORT_BRIDGE_FLAGS") we no
>> longer have a port_attr_bridge_flags_get member in the rocker_world_ops
>> structure, fix that.
>>
>> Fixes: 610d2b601bba ("rocker: Remove getting PORT_BRIDGE_FLAGS")
>> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
>
> Applied, I'm also getting this:
>
> drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c: In function ‘mlxsw_sp_port_attr_get’:
> drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c:438:19: warning: unused variable ‘mlxsw_sp’ [-Wunused-variable]
> struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
> ^~~~~~~~
>
> Related?
Unfortunately yes, that whole patch series was used as a test for my build
scripts, which clearly failed miserably here. Follow-up patches will come in
the next few hours.
--
Florian
* Re: [RFC 12/19] RDMA/irdma: Implement device supported verb APIs
From: Jason Gunthorpe @ 2019-02-12 22:27 UTC (permalink / raw)
To: Shiraz Saleem
Cc: dledford, davem, linux-rdma, netdev, mustafa.ismail,
jeffrey.t.kirsher
In-Reply-To: <20190212214402.23284-13-shiraz.saleem@intel.com>
On Tue, Feb 12, 2019 at 03:43:55PM -0600, Shiraz Saleem wrote:
> +/**
> + * irdma_disassociate_ucontext - Disassociate user context
> + * @context: ib user context
> + */
> +static void irdma_disassociate_ucontext(struct ib_ucontext *context)
> +{
> + struct irdma_ucontext *ucontext = to_ucontext(context);
> +
> + struct irdma_vma_data *vma_data, *n;
> + struct vm_area_struct *vma;
> +
> + irdma_dev_info(&ucontext->iwdev->rf->sc_dev, "called\n");
> + mutex_lock(&ucontext->vma_list_mutex);
> + list_for_each_entry_safe(vma_data, n, &ucontext->vma_list, list) {
> + vma = vma_data->vma;
> + zap_vma_ptes(vma, vma->vm_start, PAGE_SIZE);
> +
> + vma->vm_flags &= ~(VM_SHARED | VM_MAYSHARE);
> + vma->vm_ops = NULL;
> + list_del(&vma_data->list);
> + kfree(vma_data);
> + }
> + mutex_unlock(&ucontext->vma_list_mutex);
> +}
You need to study all the changes that have been made in the core code
and make sure this driver is using all the latest stuff. I do not want
to review a driver and find it is full of obsolete APIs like the one
above.
Jason
* Fw: [Bug 202561] BUG: Null pointer dereference in __skb_unlink()
From: Stephen Hemminger @ 2019-02-12 22:45 UTC (permalink / raw)
To: netdev
https://bugzilla.kernel.org/show_bug.cgi?id=202561
Backtrace:
[ 3.589241] BUG: unable to handle kernel NULL pointer dereference at 0000000000000008
[ 3.598006] IP: __skb_try_recv_from_queue+0x4e/0x1b0
[ 3.606376] Oops: 0002 [#1] PREEMPT SMP NOPTI
[ 3.720520] RIP: 0010:__skb_try_recv_from_queue+0x4e/0x1b0
[ 3.726645] RSP: 0018:ffffb03400e53ac8 EFLAGS: 00010046
[ 3.732470] RAX: ffff9040a78988c8 RBX: ffff904096bbef00 RCX: 000000000000000
[ 3.740441] RDX: 0000000000000000 RSI: ffff9040a78988c8 RDI: fff9040a7898800
[ 3.748411] RBP: ffffb03400e53ae8 R08: ffffb03400e53bf8 R09: fffb03400e53bfc
[ 3.756382] R10: ffff904096989d00 R11: 0000000000000000 R12: fff9040a78988dc
[ 3.764345] R13: ffff9040a78988c8 R14: 0000000000000202 R15: fffb03400e53ba8
[ 3.772305] FS: 00007feb1ef2f740(0000) GS:ffff9040bfd00000(0000) knlGS:0000000000000000
[ 3.781342] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 3.787757] CR2: 0000000000000008 CR3: 00000001e87fa000 CR4: 00000000003406a0
[ 3.795725] Call Trace:
[ 3.798456] ? preempt_count_add+0x22/0x30
[ 3.803027] __skb_try_recv_datagram+0xe2/0x160
[ 3.808086] __skb_recv_datagram+0x8a/0xc0
[ 3.812657] skb_recv_datagram+0x3a/0x50
[ 3.817035] netlink_recvmsg+0x4e/0x3c0
[ 3.821315] ? copy_msghdr_from_user+0xcf/0x150
[ 3.826372] sock_recvmsg+0x3b/0x50
[ 3.830264] ___sys_recvmsg+0xd4/0x180
[ 3.834441] ? poll_select_copy_remaining+0x140/0x140
[ 3.840080] ? poll_select_copy_remaining+0x140/0x140
[ 3.845721] ? __switch_to_asm+0x34/0x70
[ 3.850098] ? __switch_to_asm+0x40/0x70
[ 3.854473] ? __switch_to_asm+0x34/0x70
[ 3.858849] ? __switch_to_asm+0x40/0x70
[ 3.863228] ? __switch_to_asm+0x34/0x70
[ 3.867604] ? __fget+0x71/0xa0
[ 3.871108] __sys_recvmsg+0x4c/0x90
[ 3.875097] ? __sys_recvmsg+0x4c/0x90
[ 3.879280] SyS_recvmsg+0x9/0x10
[ 3.882979] do_syscall_64+0x7e/0x350
[ 3.887064] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 3.892315] entry_SYSCALL_64_after_hwframe+0x42/0xb7
Original report from sharathkernel@gmail.com:
NULL POINTER DEREFERENCE DURING __skb_unlink()
In __skb_try_recv_from_queue() (net/core/datagram.c),
skb_queue_walk() walks the queue without checking whether the next member in the queue has a valid next pointer. When a socket buffer has to be unlinked, __skb_unlink() is called.
Inside __skb_unlink(), there is no check that skb->next holds a valid address; skb->next is read and used without verifying its value.
What would be a probable solution in this scenario? Should we check that skb->next is not NULL before calling __skb_unlink()?
--------------------------------------------------------------------------
net/core/datagram.c
struct sk_buff *__skb_try_recv_from_queue(struct sock *sk,
					  struct sk_buff_head *queue,
					  unsigned int flags,
					  void (*destructor)(struct sock *sk,
							     struct sk_buff *skb),
					  int *peeked, int *off, int *err,
					  struct sk_buff **last)
{
...
...
...
skb_queue_walk(queue, skb) {
...
...
...
} else {
==> __skb_unlink(skb, queue);
if(destructor){
....
...
...
}
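A userspace model of a circular doubly linked queue with a sentinel head may help frame the question. It is a sketch, not the kernel's sk_buff code: struct node, struct queue and the function names are hypothetical. In such a list the next and prev pointers of a linked element are never NULL, so the walk and the unlink need no NULL check as long as every modification is serialized. Assuming prev is the second pointer of the element, a write fault at address 8 (CR2 above, error code 0002) appears consistent with the store to next->prev when next is NULL, which would point to an element that was already unlinked or corrupted rather than to a missing check. That reading is an interpretation, not an established root cause.

```c
#include <assert.h>
#include <stddef.h>

struct node {
	struct node *next, *prev;
	int val;
};

/* The head is a sentinel node: an empty queue points at itself. */
struct queue {
	struct node head;
	unsigned int qlen;
};

static void queue_init(struct queue *q)
{
	q->head.next = q->head.prev = &q->head;
	q->qlen = 0;
}

static void queue_add_tail(struct queue *q, struct node *n)
{
	n->next = &q->head;
	n->prev = q->head.prev;
	q->head.prev->next = n;
	q->head.prev = n;
	q->qlen++;
}

/* Unlink n; a linked element always has non-NULL neighbours. */
static void queue_unlink(struct queue *q, struct node *n)
{
	struct node *next = n->next, *prev = n->prev;

	assert(next != NULL && prev != NULL);	/* NULL here means n was not linked */
	n->next = n->prev = NULL;
	next->prev = prev;
	prev->next = next;
	q->qlen--;
}

/* Walk until we are back at the sentinel; unlink and return the first match. */
static struct node *queue_take_first_match(struct queue *q, int val)
{
	for (struct node *n = q->head.next; n != &q->head; n = n->next) {
		if (n->val == val) {
			queue_unlink(q, n);
			return n;
		}
	}
	return NULL;
}
```

If two contexts ran queue_unlink() on the same element without a common lock, the second call would see the NULL pointers written by the first, which is the failure shape a NULL check would only paper over.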
* Re: [PATCH net] dsa: mv88e6xxx: Ensure all pending interrupts are handled prior to exit
From: Russell King - ARM Linux admin @ 2019-02-12 22:55 UTC (permalink / raw)
To: Heiner Kallweit
Cc: Andrew Lunn, John David Anglin, Vivien Didelot, Florian Fainelli,
netdev
In-Reply-To: <5ba5b654-ca61-253f-042a-2a178ff86b36@gmail.com>
On Tue, Feb 12, 2019 at 09:54:55PM +0100, Heiner Kallweit wrote:
> On 12.02.2019 17:30, Russell King - ARM Linux admin wrote:
> > On Tue, Feb 12, 2019 at 07:51:05AM +0100, Heiner Kallweit wrote:
> >> On 12.02.2019 04:58, Andrew Lunn wrote:
> >>> That change means we don't check the PHY device if it caused an
> >>> interrupt when its state is less than UP.
> >>>
> >>> What i'm seeing is that the PHY is interrupting pretty early on after
> >>> a reboot when the previous boot had the interface up.
> >>>
> >> So this means that when going down for reboot the interrupts are not
> >> properly masked / disabled? Because (at least for net-next) we enable
> >> interrupts in phy_start() only.
> >
> [..]
> > In looking at this, I came across this chunk of code:
> >
> > static inline bool __phy_is_started(struct phy_device *phydev)
> > {
> > WARN_ON(!mutex_is_locked(&phydev->lock));
> >
> > return phydev->state >= PHY_UP;
> > }
> >
> > /**
> > * phy_is_started - Convenience function to check whether PHY is started
> > * @phydev: The phy_device struct
> > */
> > static inline bool phy_is_started(struct phy_device *phydev)
> > {
> > bool started;
> >
> > mutex_lock(&phydev->lock);
> > started = __phy_is_started(phydev);
> > mutex_unlock(&phydev->lock);
> >
> > return started;
> > }
> >
> > which looks to me like over-complication. The mutex locking there is
> > completely pointless - what are you trying to achieve with it?
> >
> > Let's go through this. The above is exactly equivalent to:
> >
> > bool phy_is_started(phydev)
> > {
> > int state;
> >
> > mutex_lock(&phydev->lock);
> > state = phydev->state;
> > mutex_unlock(&phydev->lock);
> >
> > return state >= PHY_UP;
> > }
> >
> > since when we do the test is irrelevant. Architectures that Linux
> > runs on are single-copy atomic, which means that reading phydev->state
> > itself is an atomic operation. So, the mutex locking around that
> > doesn't add to the atomicity of the entire operation.
> >
> > Now, whether the entire operation is safe depends on what you do with
> > the rest of this function. For example, let's take
> > this code at the end of phy_state_machine():
> >
> > if (phy_polling_mode(phydev) && phy_is_started(phydev))
> > phy_queue_state_machine(phydev, PHY_STATE_TIME);
> >
> > state = PHY_UP
> > thread 0 thread 1
> > phy_disconnect()
> > +-phy_is_started()
> > phy_is_started() |
> > `-phy_stop()
> > +-phydev->state = PHY_HALTED
> > `-phy_stop_machine()
> > `-cancel_delayed_work_sync()
> > phy_queue_state_machine()
> > `-mod_delayed_work()
> >
> > At this point, the phydev->state_queue() has been added back onto the
> > system workqueue despite phy_stop_machine() having been called and
> > cancel_delayed_work_sync() called on it.
> >
> > The original code in 4.20 did not have this race condition.
> >
> > Basically, the lock inside phy_is_started() does nothing useful, and
> > I'd say is dangerously misleading.
> >
> The idea then would be to first remove the locking from phy_is_started()
> and in a second step do the following to prevent the described race
> (phy_stop() takes phydev->lock too).
>
> diff --git a/drivers/net/phy/phy.c b/drivers/net/phy/phy.c
> index c1ed03800..69dc64a4d 100644
> --- a/drivers/net/phy/phy.c
> +++ b/drivers/net/phy/phy.c
> @@ -957,8 +957,10 @@ void phy_state_machine(struct work_struct *work)
> * state machine would be pointless and possibly error prone when
> * called from phy_disconnect() synchronously.
> */
> + mutex_lock(&phydev->lock);
> if (phy_polling_mode(phydev) && phy_is_started(phydev))
> phy_queue_state_machine(phydev, PHY_STATE_TIME);
> + mutex_unlock(&phydev->lock);
> }
Yep, that approach would certainly be better. I didn't exhaustively
audit the 5.0-rc code though.
--
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line in suburbia: sync at 12.1Mbps down 622kbps up
According to speedtest.net: 11.9Mbps down 500kbps up
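The check-then-act point can be modeled in userspace with a pthread mutex. This is a sketch under assumptions, not phylib code: struct phy_model and the model_* functions are hypothetical, and a boolean stands in for the delayed work item. The lock is held across both the state test and the requeue, which is the shape of the proposed diff, so a stop either runs before the test (nothing is queued) or after the requeue (the cancel step removes it).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

enum model_state { MODEL_HALTED = 0, MODEL_UP = 1 };

struct phy_model {
	pthread_mutex_t lock;
	enum model_state state;
	bool work_queued;	/* stands in for the delayed work item */
};

static void model_init(struct phy_model *p)
{
	assert(p != NULL);
	pthread_mutex_init(&p->lock, NULL);
	p->state = MODEL_UP;
	p->work_queued = false;
}

/* Test the state and requeue under one lock hold, so stop cannot slip in between. */
static bool model_requeue_if_started(struct phy_model *p)
{
	bool queued = false;

	pthread_mutex_lock(&p->lock);
	if (p->state >= MODEL_UP) {
		p->work_queued = true;
		queued = true;
	}
	pthread_mutex_unlock(&p->lock);
	return queued;
}

/* Change state under the lock, then cancel any queued work. */
static void model_stop(struct phy_model *p)
{
	pthread_mutex_lock(&p->lock);
	p->state = MODEL_HALTED;
	pthread_mutex_unlock(&p->lock);

	/* Cancel step, performed after the state change as in the diagram. */
	pthread_mutex_lock(&p->lock);
	p->work_queued = false;
	pthread_mutex_unlock(&p->lock);
}
```

With the lock taken only around the read of the state, as in the quoted phy_is_started(), the requeue could still run after model_stop() has finished, which is the race shown in the diagram.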
* Re: [PATCH] inet_diag: fix reporting cgroup classid and fallback to priority
From: Eric Dumazet @ 2019-02-12 23:03 UTC (permalink / raw)
To: Konstantin Khlebnikov, netdev, linux-kernel, David S. Miller
Cc: Sasha Levin, linux-sctp
In-Reply-To: <154970855279.305165.13649851988934332761.stgit@buzz>
On 02/09/2019 02:35 AM, Konstantin Khlebnikov wrote:
> Field idiag_ext in struct inet_diag_req_v2 used as bitmap of requested
> extensions has only 8 bits. Thus extensions starting from DCTCPINFO
> cannot be requested directly. Some of them are included in the response
> unconditionally or hook into some of the lower 8 bits.
>
> Extension INET_DIAG_CLASS_ID has had no way to be requested from the
> beginning.
>
> This patch bundles it with INET_DIAG_TCLASS (ipv6 tos), fixes space
> reservation, and documents behavior for other extensions.
>
> Also this patch adds a fallback to reporting socket priority. This field
> is more widely used for traffic classification because ipv4 sockets
> automatically map TOS to priority and the default qdisc pfifo_fast knows
> about that. But priority could be changed via setsockopt SO_PRIORITY so
> INET_DIAG_TOS isn't enough for predicting class.
>
> Also cgroup2 obsoletes net_cls classid (it is always zero), but we cannot
> reuse this field for reporting cgroup2 id because it is 64-bit (ino+gen).
>
> So, after this patch INET_DIAG_CLASS_ID will report socket priority
> for the most common setup, when net_cls isn't set and/or cgroup2 is in use.
>
Nice catch Konstantin
Are you planning to send an iproute2/ss patch to output INET_DIAG_CLASS_ID values, then?
Thanks.
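The bundling described in the quoted commit message can be sketched as follows. The numeric values are assumed to mirror the INET_DIAG_* enum in the uapi header and should be checked against it; ext_mask() and class_id_wanted() are hypothetical helpers, not kernel functions. Extension number n is requested through bit n - 1 of the 8-bit idiag_ext field, so anything above 8 has no bit of its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed to mirror the uapi INET_DIAG_* numbering; verify before relying on it. */
enum {
	EXT_TOS = 5,
	EXT_TCLASS = 6,
	EXT_SHUTDOWN = 8,
	EXT_DCTCPINFO = 9,
	EXT_CLASS_ID = 17,
};

/* Only extensions 1..8 fit into an 8-bit request bitmap. */
static bool ext_requestable(unsigned int ext)
{
	return ext >= 1 && ext <= 8;
}

/* Bit for an extension in idiag_ext, or 0 when it cannot be expressed. */
static uint8_t ext_mask(unsigned int ext)
{
	assert(ext != 0);	/* 0 is the "none" value, not a real extension */
	return ext_requestable(ext) ? (uint8_t)(1u << (ext - 1)) : 0;
}

/* Model of the bundling: CLASS_ID rides on the TCLASS request bit. */
static bool class_id_wanted(uint8_t idiag_ext)
{
	return (idiag_ext & ext_mask(EXT_TCLASS)) != 0;
}
```

Under these assumptions a requester that sets the TCLASS bit (0x20) would also receive the class ID attribute, while setting only the TOS bit (0x10) would not.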
* Re: [RFC 17/19] RDMA/irdma: Add ABI definitions
From: Jason Gunthorpe @ 2019-02-12 23:05 UTC (permalink / raw)
To: Shiraz Saleem
Cc: dledford, davem, linux-rdma, netdev, mustafa.ismail,
jeffrey.t.kirsher
In-Reply-To: <20190212214402.23284-18-shiraz.saleem@intel.com>
On Tue, Feb 12, 2019 at 03:44:00PM -0600, Shiraz Saleem wrote:
> +struct irdma_create_cq_req {
> + __aligned_u64 user_cq_buf;
> + __aligned_u64 user_shadow_area;
> +};
This is the right way to use u64 in uapi headers
> +
> +struct irdma_create_qp_req {
> + __u64 user_wqe_bufs;
> + __u64 user_compl_ctx;
> +};
and this is wrong. See my prior email.
Jason
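The usual concern behind this distinction is that a plain 64-bit integer has 4-byte alignment on some 32-bit ABIs and 8-byte alignment on 64-bit ones, so the same uapi struct can have two layouts. The sketch below emulates both on one machine with the GCC/Clang aligned attribute on a typedef. The type names and the extra flags member are hypothetical and only there to make the padding visible; they are not part of the irdma ABI.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Emulates a 64-bit integer with 4-byte alignment, as on some 32-bit ABIs. */
typedef uint64_t u64_align4 __attribute__((aligned(4)));
/* Emulates an explicitly 8-byte aligned 64-bit integer. */
typedef uint64_t u64_align8 __attribute__((aligned(8)));

struct req_plain {
	uint32_t flags;		/* hypothetical member, for illustration only */
	u64_align4 buf;
};

struct req_aligned {
	uint32_t flags;		/* hypothetical member, for illustration only */
	u64_align8 buf;
};

/* With explicit 8-byte alignment the layout is the same on every ABI. */
static void check_aligned_layout(void)
{
	assert(offsetof(struct req_aligned, buf) == 8);
	assert(sizeof(struct req_aligned) == 16);
}
```

With a compiler that honours the reduced alignment, struct req_plain places buf at offset 4 and is 12 bytes long, so a 32-bit userspace and a 64-bit kernel would disagree about where the field lives.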
* Is it safe to modify the data directly instead of bpf helper function in tc direct-action mode?
From: IMBRIUS AGER @ 2019-02-12 23:05 UTC (permalink / raw)
To: netdev
I tried this 3-line code:
__builtin_memcpy(&ip->saddr, &src, sizeof(src));
__builtin_memcpy(&ip->daddr, &dst, sizeof(dst));
ip->check = 0x12;
To my surprise, tc did not reject the code in direct-action mode.
IIRC, since the skb must stay uncloned, the verifier will detect all the
writes, which means we need to do the 'data + X > data_end' check
after each write.
Did I miss anything?
Best wishes
* Re: [PATCHv5 iproute2 net-next 2/2] lib/libnetlink: re malloc buff if size is not enough
From: Eric Dumazet @ 2019-02-12 23:32 UTC (permalink / raw)
To: Hangbin Liu, netdev; +Cc: Stephen Hemminger, Michal Kubecek, Phil Sutter
In-Reply-To: <1508982107-28474-2-git-send-email-liuhangbin@gmail.com>
On 10/25/2017 06:41 PM, Hangbin Liu wrote:
> With commit 72b365e8e0fd ("libnetlink: Double the dump buffer size")
> we doubled the buffer size to support more VFs. But the VFs number is
> increasing all the time. Some customers even use more than 200 VFs now.
>
> We could not double it every time when the buffer is not enough. Let's just
> not hard code the buffer size and malloc the correct number when running.
>
> Introduce function rtnl_recvmsg() to always return a newly allocated buffer.
> The caller needs to free it after using.
>
> Signed-off-by: Hangbin Liu <liuhangbin@gmail.com>
> Signed-off-by: Phil Sutter <phil@nwl.cc>
> ---
> lib/libnetlink.c | 114 ++++++++++++++++++++++++++++++++++++++-----------------
> 1 file changed, 80 insertions(+), 34 deletions(-)
>
> diff --git a/lib/libnetlink.c b/lib/libnetlink.c
> index be7ac86..1847c0b 100644
> --- a/lib/libnetlink.c
> +++ b/lib/libnetlink.c
> @@ -402,6 +402,64 @@ static void rtnl_dump_error(const struct rtnl_handle *rth,
> }
> }
>
> +static int __rtnl_recvmsg(int fd, struct msghdr *msg, int flags)
> +{
> + int len;
> +
> + do {
> + len = recvmsg(fd, msg, flags);
> + } while (len < 0 && (errno == EINTR || errno == EAGAIN));
> +
> + if (len < 0) {
> + fprintf(stderr, "netlink receive error %s (%d)\n",
> + strerror(errno), errno);
> + return -errno;
> + }
> +
> + if (len == 0) {
> + fprintf(stderr, "EOF on netlink\n");
> + return -ENODATA;
> + }
> +
> + return len;
> +}
> +
> +static int rtnl_recvmsg(int fd, struct msghdr *msg, char **answer)
> +{
> + struct iovec *iov = msg->msg_iov;
> + char *buf;
> + int len;
> +
> + iov->iov_base = NULL;
> + iov->iov_len = 0;
> +
> + len = __rtnl_recvmsg(fd, msg, MSG_PEEK | MSG_TRUNC);
> + if (len < 0)
> + return len;
> +
> + buf = malloc(len);
> + if (!buf) {
> + fprintf(stderr, "malloc error: not enough buffer\n");
> + return -ENOMEM;
> + }
> +
> + iov->iov_base = buf;
> + iov->iov_len = len;
> +
> + len = __rtnl_recvmsg(fd, msg, 0);
> + if (len < 0) {
> + free(buf);
> + return len;
> + }
> +
> + if (answer)
> + *answer = buf;
> + else
> + free(buf);
> +
> + return len;
> +}
> +
This patch brings a serious performance penalty.
The ss command now uses two system calls per ~4 KB of data:
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{NULL, 0}], msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 3328 <0.000120>
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"h\0\0\0\24\0\2\0@\342\1\0\322\0\6\0\n\1\1\0\250\253\276@&\7\370\260\200\231\16\6"..., 3328}], msg_controllen=0, msg_flags=0}, 0) = 3328 <0.000108>
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{NULL, 0}], msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 3328 <0.000086>
recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"h\0\0\0\24\0\2\0@\342\1\0\322\0\6\0\n\10\2\0002A\266S&\7\370\260\200\231\16\6"..., 3328}], msg_controllen=0, msg_flags=0}, 0) = 3328 <0.000121>
So we are back to a very pessimistic situation.
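The trace shows each message costing one MSG_PEEK | MSG_TRUNC call to learn the size plus one call to read it. Below is a minimal sketch of that two-step pattern, written against a generic datagram socket so that it is self-contained. recv_alloc() and its min_len floor are hypothetical names, not libnetlink API. The floor only illustrates allocating a buffer larger than the peeked size; whether offering a larger buffer changes how much data the kernel returns per netlink dump message is an assumption this sketch does not test.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Receive one datagram into a freshly allocated buffer.
 * Returns the datagram length and stores the buffer in *out (caller frees),
 * or returns -1 on error. Always costs two system calls per datagram.
 */
static ssize_t recv_alloc(int fd, size_t min_len, char **out)
{
	ssize_t len;
	size_t cap;
	char *buf;

	assert(out != NULL);

	/* Call 1: peek with a zero-length buffer to learn the real size. */
	len = recv(fd, NULL, 0, MSG_PEEK | MSG_TRUNC);
	if (len < 0)
		return -1;

	/* Never allocate less than the caller's floor. */
	cap = (size_t)len < min_len ? min_len : (size_t)len;
	if (cap == 0)
		cap = 1;

	buf = malloc(cap);
	if (!buf)
		return -1;

	/* Call 2: actually dequeue the datagram. */
	len = recv(fd, buf, cap, 0);
	if (len < 0) {
		free(buf);
		return -1;
	}

	*out = buf;
	return len;
}
```

A caller that wants to avoid the second system call in the common case would instead keep one reusable buffer and receive into it directly, at the cost of having to size it generously up front.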
* [PATCH net-next 0/2] Remove unused variables
From: Florian Fainelli @ 2019-02-12 23:39 UTC (permalink / raw)
To: netdev; +Cc: davem, Florian Fainelli
Hi David,
This removes unused variables from mlxsw and ethsw after the recent
removal of SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS, build scripts are now
fixed to take care of those warnings :).
Thank you!
Florian Fainelli (2):
mlxsw: spectrum_switchdev: Remove unused variables
staging: fsl-dpaa2: ethsw: Remove unused port_priv variable
drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 3 ---
drivers/staging/fsl-dpaa2/ethsw/ethsw.c | 2 --
2 files changed, 5 deletions(-)
--
2.17.1
* [PATCH net-next 1/2] mlxsw: spectrum_switchdev: Remove unused variables
From: Florian Fainelli @ 2019-02-12 23:39 UTC (permalink / raw)
To: netdev; +Cc: davem, Florian Fainelli
In-Reply-To: <20190212234000.15796-1-f.fainelli@gmail.com>
After 1ecb195753a1 ("mlxsw: spectrum_switchdev: Remove getting
PORT_BRIDGE_FLAGS") we are not accessing any driver private data
structure, so the mlxsw_sp_port and mlxsw_sp variables are unused.
Fixes: 1ecb195753a1 ("mlxsw: spectrum_switchdev: Remove getting PORT_BRIDGE_FLAGS")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 3 ---
1 file changed, 3 deletions(-)
diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 4c5780f8f4b2..1f492b7dbea8 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -434,9 +434,6 @@ static void mlxsw_sp_bridge_vlan_put(struct mlxsw_sp_bridge_vlan *bridge_vlan)
static int mlxsw_sp_port_attr_get(struct net_device *dev,
struct switchdev_attr *attr)
{
- struct mlxsw_sp_port *mlxsw_sp_port = netdev_priv(dev);
- struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
-
switch (attr->id) {
case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS_SUPPORT:
attr->u.brport_flags_support = BR_LEARNING | BR_FLOOD |
--
2.17.1
* [PATCH net-next 2/2] staging: fsl-dpaa2: ethsw: Remove unused port_priv variable
From: Florian Fainelli @ 2019-02-12 23:39 UTC (permalink / raw)
To: netdev; +Cc: davem, Florian Fainelli
In-Reply-To: <20190212234000.15796-1-f.fainelli@gmail.com>
After 1b8b589d9103 ("staging: fsl-dpaa2: ethsw: Remove getting
PORT_BRIDGE_FLAGS") we are not accessing any driver private data
anymore, remove port_priv which is unused.
Fixes: 1b8b589d9103 ("staging: fsl-dpaa2: ethsw: Remove getting PORT_BRIDGE_FLAGS")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/staging/fsl-dpaa2/ethsw/ethsw.c | 2 --
1 file changed, 2 deletions(-)
diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index e5a14c7c777f..1b3943b71254 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -643,8 +643,6 @@ static void ethsw_teardown_irqs(struct fsl_mc_device *sw_dev)
static int swdev_port_attr_get(struct net_device *netdev,
struct switchdev_attr *attr)
{
- struct ethsw_port_priv *port_priv = netdev_priv(netdev);
-
switch (attr->id) {
case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS_SUPPORT:
attr->u.brport_flags_support = BR_LEARNING | BR_FLOOD;
--
2.17.1
* [PATCH 01/13] net: bridge: Allow denying multicast snooping toggling
From: Florian Fainelli @ 2019-02-12 23:40 UTC (permalink / raw)
To: netdev; +Cc: davem, Florian Fainelli
In-Reply-To: <20190212234000.15796-1-f.fainelli@gmail.com>
Some Ethernet switches might not be able to support disabling multicast
flooding globally when e.g: several bridges span the same physical
device.
Introduce a new switchdev attribute:
SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED_SUPPORT which is first queried as a
blocking call using switchdev_port_attr_get() and then allow
br_mc_disabled_update() to propagate that return value all the way down
to user-space.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
include/net/switchdev.h | 2 ++
net/bridge/br_multicast.c | 36 +++++++++++++++++++++++++++---------
2 files changed, 29 insertions(+), 9 deletions(-)
diff --git a/include/net/switchdev.h b/include/net/switchdev.h
index 5e87b54c5dc5..4531209e6a6a 100644
--- a/include/net/switchdev.h
+++ b/include/net/switchdev.h
@@ -50,6 +50,7 @@ enum switchdev_attr_id {
SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME,
SWITCHDEV_ATTR_ID_BRIDGE_VLAN_FILTERING,
SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED,
+ SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED_SUPPORT,
SWITCHDEV_ATTR_ID_BRIDGE_MROUTER,
};
@@ -67,6 +68,7 @@ struct switchdev_attr {
clock_t ageing_time; /* BRIDGE_AGEING_TIME */
bool vlan_filtering; /* BRIDGE_VLAN_FILTERING */
bool mc_disabled; /* MC_DISABLED */
+ bool mc_disabled_support; /* MC_DISABLED_SUPPORT */
} u;
};
diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
index 4a048fd1cbea..97cc5f3ad20a 100644
--- a/net/bridge/br_multicast.c
+++ b/net/bridge/br_multicast.c
@@ -815,20 +815,31 @@ static void br_ip6_multicast_port_query_expired(struct timer_list *t)
}
#endif
-static void br_mc_disabled_update(struct net_device *dev, bool value)
+static int br_mc_disabled_update(struct net_device *dev, bool value)
{
struct switchdev_attr attr = {
.orig_dev = dev,
- .id = SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED,
- .flags = SWITCHDEV_F_DEFER,
- .u.mc_disabled = !value,
+ .id = SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED_SUPPORT,
+ .flags = SWITCHDEV_F_SKIP_EOPNOTSUPP,
+ .u.mc_disabled_support = !value,
};
+ int err;
+
+ err = switchdev_port_attr_get(dev, &attr);
+ if (err)
+ return err;
- switchdev_port_attr_set(dev, &attr);
+ attr.id = SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED;
+ attr.flags |= SWITCHDEV_F_DEFER;
+ attr.u.mc_disabled = !value;
+
+ return switchdev_port_attr_set(dev, &attr);
}
int br_multicast_add_port(struct net_bridge_port *port)
{
+ int ret;
+
port->multicast_router = MDB_RTR_TYPE_TEMP_QUERY;
timer_setup(&port->multicast_router_timer,
@@ -839,8 +850,11 @@ int br_multicast_add_port(struct net_bridge_port *port)
timer_setup(&port->ip6_own_query.timer,
br_ip6_multicast_port_query_expired, 0);
#endif
- br_mc_disabled_update(port->dev,
- br_opt_get(port->br, BROPT_MULTICAST_ENABLED));
+ ret = br_mc_disabled_update(port->dev,
+ br_opt_get(port->br,
+ BROPT_MULTICAST_ENABLED));
+ if (ret)
+ return ret;
port->mcast_stats = netdev_alloc_pcpu_stats(struct bridge_mcast_stats);
if (!port->mcast_stats)
@@ -2057,12 +2071,16 @@ static void br_multicast_start_querier(struct net_bridge *br,
int br_multicast_toggle(struct net_bridge *br, unsigned long val)
{
struct net_bridge_port *port;
+ int err = 0;
spin_lock_bh(&br->multicast_lock);
if (!!br_opt_get(br, BROPT_MULTICAST_ENABLED) == !!val)
goto unlock;
- br_mc_disabled_update(br->dev, val);
+ err = br_mc_disabled_update(br->dev, val);
+ if (err)
+ goto unlock;
+
br_opt_toggle(br, BROPT_MULTICAST_ENABLED, !!val);
if (!br_opt_get(br, BROPT_MULTICAST_ENABLED)) {
br_multicast_leave_snoopers(br);
@@ -2079,7 +2097,7 @@ int br_multicast_toggle(struct net_bridge *br, unsigned long val)
unlock:
spin_unlock_bh(&br->multicast_lock);
- return 0;
+ return err;
}
bool br_multicast_enabled(const struct net_device *dev)
--
2.17.1
From 4de219f5de35f83f6048e2045244cfab6dd7f58a Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Mon, 10 Dec 2018 16:11:56 -0800
Subject: [PATCH 02/13] net: dsa: b53: Fix default VLAN ID
We were not consistent in how the default VID of a given port was
defined, b53_br_leave() would make sure the VLAN ID would be either 0/1
depending on the switch generation, but b53_configure_vlan(), which is
the default configuration would unconditionally set it to 1. The correct
value is 1 for 5325/5365 series and 0 otherwise. To avoid repeating that
mistake ever again, introduce a helper function: b53_default_pvid() to
factor that out.
Fixes: 967dd82ffc52 ("net: dsa: b53: Add support for Broadcom RoboSwitch")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/dsa/b53/b53_common.c | 29 ++++++++++++++++-------------
1 file changed, 16 insertions(+), 13 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 0e4bbdcc614f..964a9ec4652a 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -632,15 +632,25 @@ static void b53_enable_mib(struct b53_device *dev)
b53_write8(dev, B53_MGMT_PAGE, B53_GLOBAL_CONFIG, gc);
}
+static u16 b53_default_pvid(struct b53_device *dev)
+{
+ if (is5325(dev) || is5365(dev))
+ return 1;
+ else
+ return 0;
+}
+
int b53_configure_vlan(struct dsa_switch *ds)
{
struct b53_device *dev = ds->priv;
struct b53_vlan vl = { 0 };
- int i;
+ int i, def_vid;
+
+ def_vid = b53_default_pvid(dev);
/* clear all vlan entries */
if (is5325(dev) || is5365(dev)) {
- for (i = 1; i < dev->num_vlans; i++)
+ for (i = def_vid; i < dev->num_vlans; i++)
b53_set_vlan_entry(dev, i, &vl);
} else {
b53_do_vlan_op(dev, VTA_CMD_CLEAR);
@@ -650,7 +660,7 @@ int b53_configure_vlan(struct dsa_switch *ds)
b53_for_each_port(dev, i)
b53_write16(dev, B53_VLAN_PAGE,
- B53_VLAN_PORT_DEF_TAG(i), 1);
+ B53_VLAN_PORT_DEF_TAG(i), def_vid);
if (!is5325(dev) && !is5365(dev))
b53_set_jumbo(dev, dev->enable_jumbo, false);
@@ -1326,12 +1336,8 @@ int b53_vlan_del(struct dsa_switch *ds, int port,
vl->members &= ~BIT(port);
- if (pvid == vid) {
- if (is5325(dev) || is5365(dev))
- pvid = 1;
- else
- pvid = 0;
- }
+ if (pvid == vid)
+ pvid = b53_default_pvid(dev);
if (untagged && !dsa_is_cpu_port(ds, port))
vl->untag &= ~(BIT(port));
@@ -1644,10 +1650,7 @@ void b53_br_leave(struct dsa_switch *ds, int port, struct net_device *br)
b53_write16(dev, B53_PVLAN_PAGE, B53_PVLAN_PORT_MASK(port), pvlan);
dev->ports[port].vlan_ctl_mask = pvlan;
- if (is5325(dev) || is5365(dev))
- pvid = 1;
- else
- pvid = 0;
+ pvid = b53_default_pvid(dev);
/* Make this port join all VLANs without VLAN entries */
if (is58xx(dev)) {
--
2.17.1
From f383f69ad8a0da53f4f2d634d4dd4f34babf6d52 Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Mon, 10 Dec 2018 13:55:55 -0800
Subject: [PATCH 03/13] net: dsa: b53: Properly account for VLAN filtering
VLAN filtering can be built into the kernel, and also dynamically turned
on/off through the bridge master device. Allow re-configuring the switch
appropriately to account for that by deciding whether VLAN table
(v_table) misses should lead to a drop or forward.
Fixes: a2482d2ce349 ("net: dsa: b53: Plug in VLAN support")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/dsa/b53/b53_common.c | 59 +++++++++++++++++++++++++++++---
drivers/net/dsa/b53/b53_priv.h | 3 ++
2 files changed, 57 insertions(+), 5 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 964a9ec4652a..2fef4c564420 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -344,7 +344,8 @@ static void b53_set_forwarding(struct b53_device *dev, int enable)
b53_write8(dev, B53_CTRL_PAGE, B53_SWITCH_CTRL, mgmt);
}
-static void b53_enable_vlan(struct b53_device *dev, bool enable)
+static void b53_enable_vlan(struct b53_device *dev, bool enable,
+ bool enable_filtering)
{
u8 mgmt, vc0, vc1, vc4 = 0, vc5;
@@ -369,8 +370,13 @@ static void b53_enable_vlan(struct b53_device *dev, bool enable)
vc0 |= VC0_VLAN_EN | VC0_VID_CHK_EN | VC0_VID_HASH_VID;
vc1 |= VC1_RX_MCST_UNTAG_EN | VC1_RX_MCST_FWD_EN;
vc4 &= ~VC4_ING_VID_CHECK_MASK;
- vc4 |= VC4_ING_VID_VIO_DROP << VC4_ING_VID_CHECK_S;
- vc5 |= VC5_DROP_VTABLE_MISS;
+ if (enable_filtering) {
+ vc4 |= VC4_ING_VID_VIO_DROP << VC4_ING_VID_CHECK_S;
+ vc5 |= VC5_DROP_VTABLE_MISS;
+ } else {
+ vc4 |= VC4_ING_VID_VIO_FWD << VC4_ING_VID_CHECK_S;
+ vc5 &= ~VC5_DROP_VTABLE_MISS;
+ }
if (is5325(dev))
vc0 &= ~VC0_RESERVED_1;
@@ -420,6 +426,9 @@ static void b53_enable_vlan(struct b53_device *dev, bool enable)
}
b53_write8(dev, B53_CTRL_PAGE, B53_SWITCH_MODE, mgmt);
+
+ dev->vlan_enabled = enable;
+ dev->vlan_filtering_enabled = enable_filtering;
}
static int b53_set_jumbo(struct b53_device *dev, bool enable, bool allow_10_100)
@@ -656,7 +665,7 @@ int b53_configure_vlan(struct dsa_switch *ds)
b53_do_vlan_op(dev, VTA_CMD_CLEAR);
}
- b53_enable_vlan(dev, false);
+ b53_enable_vlan(dev, false, dev->vlan_filtering_enabled);
b53_for_each_port(dev, i)
b53_write16(dev, B53_VLAN_PAGE,
@@ -1265,6 +1274,46 @@ EXPORT_SYMBOL(b53_phylink_mac_link_up);
int b53_vlan_filtering(struct dsa_switch *ds, int port, bool vlan_filtering)
{
+ struct b53_device *dev = ds->priv;
+ struct net_device *bridge_dev;
+ unsigned int i;
+ u16 pvid, new_pvid;
+
+ /* Handle the case where multiple bridges span the same switch device
+ * and one of them has a different setting than what is being requested
+ * which would be breaking filtering semantics for any of the other
+ * bridge devices.
+ */
+ b53_for_each_port(dev, i) {
+ bridge_dev = dsa_to_port(ds, i)->bridge_dev;
+ if (bridge_dev &&
+ bridge_dev != dsa_to_port(ds, port)->bridge_dev &&
+ br_vlan_enabled(bridge_dev) != vlan_filtering) {
+ netdev_err(bridge_dev,
+ "VLAN filtering is global to the switch!\n");
+ return -EINVAL;
+ }
+ }
+
+ b53_read16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port), &pvid);
+ new_pvid = pvid;
+ if (dev->vlan_filtering_enabled && !vlan_filtering) {
+ /* Filtering is currently enabled, use the default PVID since
+ * the bridge does not expect tagging anymore
+ */
+ dev->ports[port].pvid = pvid;
+ new_pvid = b53_default_pvid(dev);
+ } else if (!dev->vlan_filtering_enabled && vlan_filtering) {
+ /* Filtering is currently disabled, restore the previous PVID */
+ new_pvid = dev->ports[port].pvid;
+ }
+
+ if (pvid != new_pvid)
+ b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port),
+ new_pvid);
+
+ b53_enable_vlan(dev, dev->vlan_enabled, vlan_filtering);
+
return 0;
}
EXPORT_SYMBOL(b53_vlan_filtering);
@@ -1280,7 +1329,7 @@ int b53_vlan_prepare(struct dsa_switch *ds, int port,
if (vlan->vid_end > dev->num_vlans)
return -ERANGE;
- b53_enable_vlan(dev, true);
+ b53_enable_vlan(dev, true, dev->vlan_filtering_enabled);
return 0;
}
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index ec796482792d..4dc7ee38b258 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -91,6 +91,7 @@ enum {
struct b53_port {
u16 vlan_ctl_mask;
struct ethtool_eee eee;
+ u16 pvid;
};
struct b53_vlan {
@@ -137,6 +138,8 @@ struct b53_device {
unsigned int num_vlans;
struct b53_vlan *vlans;
+ bool vlan_enabled;
+ bool vlan_filtering_enabled;
unsigned int num_ports;
struct b53_port *ports;
};
--
2.17.1
From adf163ed43eed53888acd5f76e84e0d0db344b4c Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Wed, 12 Dec 2018 12:23:59 -0800
Subject: [PATCH 04/13] net: systemport: Fix reception of BPDUs
SYSTEMPORT has an RXCHK parser block that attempts to validate packet
structures. Unfortunately, setting the L2 header check bit causes
Bridge PDUs (BPDUs) to be incorrectly rejected because they look
like LLC/SNAP packets with a non-IPv4 or non-IPv6 Ethernet Type.
Fixes: 4e8aedfe78c7 ("net: systemport: Turn on offloads by default")
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/ethernet/broadcom/bcmsysport.c | 4 ++++
1 file changed, 4 insertions(+)
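For illustration only, the register update described above can be modelled
in plain C. This is a sketch under assumptions: the bit positions are
placeholders picked for the example (not taken from the SYSTEMPORT register
definitions) and rxchk_control_update() is a hypothetical helper, not a
function in the driver.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit positions, for this sketch only. */
#define SKETCH_RXCHK_EN         (1u << 0)
#define SKETCH_RXCHK_L2_HDR_DIS (1u << 13)

/* Return the new RXCHK_CONTROL value: the L2 header check bit is always
 * cleared so that BPDUs are not rejected, and the enable bit follows the
 * requested RX checksum offload state.
 */
uint32_t rxchk_control_update(uint32_t reg, bool rx_chk_en)
{
	reg &= ~SKETCH_RXCHK_L2_HDR_DIS;
	if (rx_chk_en)
		reg |= SKETCH_RXCHK_EN;
	else
		reg &= ~SKETCH_RXCHK_EN;
	return reg;
}
```

The point of the sketch is that the L2 header bit is cleared regardless of
whether RX checksum offload ends up enabled or disabled.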
diff --git a/drivers/net/ethernet/broadcom/bcmsysport.c b/drivers/net/ethernet/broadcom/bcmsysport.c
index 28c9b0bdf2f6..bc3ac369cbe3 100644
--- a/drivers/net/ethernet/broadcom/bcmsysport.c
+++ b/drivers/net/ethernet/broadcom/bcmsysport.c
@@ -134,6 +134,10 @@ static void bcm_sysport_set_rx_csum(struct net_device *dev,
priv->rx_chk_en = !!(wanted & NETIF_F_RXCSUM);
reg = rxchk_readl(priv, RXCHK_CONTROL);
+ /* Clear L2 header checks, which would prevent BPDUs
+ * from being received.
+ */
+ reg &= ~RXCHK_L2_HDR_DIS;
if (priv->rx_chk_en)
reg |= RXCHK_EN;
else
--
2.17.1
From 149369d20cd22fe66f8b9a499df32c20e466a9c8 Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Wed, 5 Dec 2018 17:16:51 -0800
Subject: [PATCH 05/13] net: dsa: b53: Define registers for IGMP snooping
Define the registers needed to implement IGMP snooping later on. These
are mostly the High-level Protocol Control register bit definitions,
plus the learning control registers.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/dsa/b53/b53_regs.h | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)
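As a hedged illustration of how these bits might be combined later, the
sketch below builds a control word that enables handling of IGMP reports
and queries, and optionally their MLD counterparts. The bit values are
copied from the definitions in this patch; which bits a driver would
actually enable, and the helper name hl_prtc_snooping_mask(), are
assumptions and not part of this series.

```c
#include <stdint.h>

/* Bit values as defined in this patch (unsigned here for the sketch). */
#define HL_PRTC_IGMP_RPTLVE_EN  (1u << 9)
#define HL_PRTC_IGMP_QRY_EN     (1u << 11)
#define HL_PRTC_MLD_RPTDONE_EN  (1u << 15)
#define HL_PRTC_MLD_QRY_EN      (1u << 17)

/* Hypothetical helper: compose the enable bits for IGMP snooping and,
 * when want_mld is non-zero, for MLD snooping as well.
 */
uint32_t hl_prtc_snooping_mask(int want_mld)
{
	uint32_t mask = HL_PRTC_IGMP_RPTLVE_EN | HL_PRTC_IGMP_QRY_EN;

	if (want_mld)
		mask |= HL_PRTC_MLD_RPTDONE_EN | HL_PRTC_MLD_QRY_EN;
	return mask;
}
```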
diff --git a/drivers/net/dsa/b53/b53_regs.h b/drivers/net/dsa/b53/b53_regs.h
index 2a9f421680aa..b4aecd4552b6 100644
--- a/drivers/net/dsa/b53/b53_regs.h
+++ b/drivers/net/dsa/b53/b53_regs.h
@@ -115,6 +115,8 @@
#define B53_UC_FLOOD_MASK 0x32
#define B53_MC_FLOOD_MASK 0x34
#define B53_IPMC_FLOOD_MASK 0x36
+#define B53_DIS_LEARN 0x3c
+#define B53_SFT_LRN_CTRL 0x3e
/*
* Override Ports 0-7 State on devices with xMII interfaces (8 bit)
@@ -253,6 +255,26 @@
/* Revision ID register (8 bit) */
#define B53_REV_ID 0x40
+/* High-level Protocol Control Register (32 bit) */
+#define B53_HL_PRTC_CTRL 0x50
+#define HL_PRTC_ARP_EN (1 << 0)
+#define HL_PRTC_RARP_EN (1 << 1)
+#define HL_PRTC_DHCP_EN (1 << 2)
+#define HL_PRTC_ICMPV4_EN (1 << 3)
+#define HL_PRTC_ICMPV6_EN (1 << 4)
+#define HL_PRTC_ICMPV6_FWD_MODE (1 << 5)
+#define HL_PRTC_IGMP_DIP_EN (1 << 8)
+#define HL_PRTC_IGMP_RPTLVE_EN (1 << 9)
+#define HL_PRTC_IGMP_RPTLVE_FWD_MODE (1 << 10)
+#define HL_PRTC_IGMP_QRY_EN (1 << 11)
+#define HL_PRTC_IGMP_QRY_FWD_MODE (1 << 12)
+#define HL_PRTC_IGMP_UKN_EN (1 << 13)
+#define HL_PRTC_IGMP_UKN_FWD_MODE (1 << 14)
+#define HL_PRTC_MLD_RPTDONE_EN (1 << 15)
+#define HL_PRTC_MLD_RPTDONE_FWD_MODE (1 << 16)
+#define HL_PRTC_MLD_QRY_EN (1 << 17)
+#define HL_PRTC_MLD_QRY_FWD_MODE (1 << 18)
+
/* Broadcom header RX control (16 bit) */
#define B53_BRCM_HDR_RX_DIS 0x60
--
2.17.1
From b9676fa661680ce69216f59afc3f90c5c5250302 Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu, 6 Dec 2018 12:13:40 -0800
Subject: [PATCH 06/13] net: dsa: b53: Add support for MDB
In preparation for supporting IGMP snooping with or without the use of
a bridge, add support within b53_common.c to program the ARL entries for
multicast operations. The key difference is that a multicast ARL entry
holds a bitmask of enabled ports instead of a single port number.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/dsa/b53/b53_common.c | 62 ++++++++++++++++++++++++++++++--
drivers/net/dsa/b53/b53_priv.h | 8 ++++-
2 files changed, 67 insertions(+), 3 deletions(-)
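The bitmask semantics can be sketched outside of the driver as follows.
This is a minimal model, not the driver code: mc_arl_port_mask() and
mc_arl_is_valid() are hypothetical helpers, and the 16-bit mask width
simply mirrors the u16 port field introduced by this patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: port bitmask of a multicast ARL entry after one
 * port is added to or removed from it.
 */
uint16_t mc_arl_port_mask(uint16_t mask, unsigned int port, bool add)
{
	if (add)
		return mask | (uint16_t)(1u << port);
	return mask & (uint16_t)~(1u << port);
}

/* A multicast entry stays valid as long as one member port remains. */
bool mc_arl_is_valid(uint16_t mask)
{
	return mask != 0;
}
```

In this model validity is derived only from the remaining mask, so
removing one of two member ports keeps the entry valid. An unconditional
assignment of is_valid after this step would defeat that.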
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 2fef4c564420..6c894ad4768a 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1503,11 +1503,25 @@ static int b53_arl_op(struct b53_device *dev, int op, int port,
idx = 1;
}
- memset(&ent, 0, sizeof(ent));
- ent.port = port;
+ /* For multicast addresses, the port is a bitmask and the validity
+ * is determined by at least one port still being active
+ */
+ if (!is_multicast_ether_addr(addr)) {
+ ent.port = port;
+ ent.is_valid = is_valid;
+ } else {
+ if (is_valid)
+ ent.port |= BIT(port);
+ else
+ ent.port &= ~BIT(port);
+
+ ent.is_valid = !!(ent.port);
+ }
+
ent.is_valid = is_valid;
ent.vid = vid;
ent.is_static = true;
+ ent.is_age = false;
memcpy(ent.mac, addr, ETH_ALEN);
b53_arl_from_entry(&mac_vid, &fwd_entry, &ent);
@@ -1626,6 +1640,47 @@ int b53_fdb_dump(struct dsa_switch *ds, int port,
}
EXPORT_SYMBOL(b53_fdb_dump);
+int b53_mdb_prepare(struct dsa_switch *ds, int port,
+ const struct switchdev_obj_port_mdb *mdb)
+{
+ struct b53_device *priv = ds->priv;
+
+ /* 5325 and 5365 require some more massaging, but could
+ * be supported eventually
+ */
+ if (is5325(priv) || is5365(priv))
+ return -EOPNOTSUPP;
+
+ return 0;
+}
+EXPORT_SYMBOL(b53_mdb_prepare);
+
+void b53_mdb_add(struct dsa_switch *ds, int port,
+ const struct switchdev_obj_port_mdb *mdb)
+{
+ struct b53_device *priv = ds->priv;
+ int ret;
+
+ ret = b53_arl_op(priv, 0, port, mdb->addr, mdb->vid, true);
+ if (ret)
+ dev_err(ds->dev, "failed to add MDB entry\n");
+}
+EXPORT_SYMBOL(b53_mdb_add);
+
+int b53_mdb_del(struct dsa_switch *ds, int port,
+ const struct switchdev_obj_port_mdb *mdb)
+{
+ struct b53_device *priv = ds->priv;
+ int ret;
+
+ ret = b53_arl_op(priv, 0, port, mdb->addr, mdb->vid, false);
+ if (ret)
+ dev_err(ds->dev, "failed to delete MDB entry\n");
+
+ return ret;
+}
+EXPORT_SYMBOL(b53_mdb_del);
+
int b53_br_join(struct dsa_switch *ds, int port, struct net_device *br)
{
struct b53_device *dev = ds->priv;
@@ -1969,6 +2024,9 @@ static const struct dsa_switch_ops b53_switch_ops = {
.port_fdb_del = b53_fdb_del,
.port_mirror_add = b53_mirror_add,
.port_mirror_del = b53_mirror_del,
+ .port_mdb_prepare = b53_mdb_prepare,
+ .port_mdb_add = b53_mdb_add,
+ .port_mdb_del = b53_mdb_del,
};
struct b53_chip_data {
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 4dc7ee38b258..620638ff9338 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -251,7 +251,7 @@ b53_build_op(write48, u64);
b53_build_op(write64, u64);
struct b53_arl_entry {
- u8 port;
+ u16 port;
u8 mac[ETH_ALEN];
u16 vid;
u8 is_valid:1;
@@ -350,6 +350,12 @@ int b53_fdb_del(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
int b53_fdb_dump(struct dsa_switch *ds, int port,
dsa_fdb_dump_cb_t *cb, void *data);
+int b53_mdb_prepare(struct dsa_switch *ds, int port,
+ const struct switchdev_obj_port_mdb *mdb);
+void b53_mdb_add(struct dsa_switch *ds, int port,
+ const struct switchdev_obj_port_mdb *mdb);
+int b53_mdb_del(struct dsa_switch *ds, int port,
+ const struct switchdev_obj_port_mdb *mdb);
int b53_mirror_add(struct dsa_switch *ds, int port,
struct dsa_mall_mirror_tc_entry *mirror, bool ingress);
enum dsa_tag_protocol b53_get_tag_protocol(struct dsa_switch *ds, int port);
--
2.17.1
From 2bb2f9968f6fa9c1593f338c05e696e48103b2a8 Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu, 6 Dec 2018 12:29:16 -0800
Subject: [PATCH 07/13] net: dsa: Add ability to program multicast filter for
CPU port
When the switch ports operate as individual network devices, the switch
driver might have configured the switch to flood multicast all the way
to the CPU port. This is undesirable as it can lead to receiving
a lot of unwanted traffic that the network stack needs to filter in
software.
For each valid multicast address, program it into the switch's MDB only
when the host is interested in receiving such traffic, e.g. when running
a multicast application.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
net/dsa/slave.c | 51 +++++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 51 insertions(+)
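The condition under which a standalone port programs its multicast list
into the switch can be summarised with a small predicate. This is only a
sketch of the check added to dsa_slave_set_rx_mode(); the function name
host_mdb_sync_needed() and its boolean parameters are hypothetical.

```c
#include <stdbool.h>

/* Hypothetical predicate: the port syncs its multicast addresses to the
 * switch only when it has addresses to sync and is not a bridge member
 * (a bridge sends SWITCHDEV_OBJ_ID_HOST_MDB itself).
 */
bool host_mdb_sync_needed(bool mc_list_empty, bool port_is_bridged)
{
	return !mc_list_empty && !port_is_bridged;
}
```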
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index 2e5e7c04821b..f565b6a595d2 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -68,6 +68,41 @@ static int dsa_slave_get_iflink(const struct net_device *dev)
return dsa_slave_to_master(dev)->ifindex;
}
+static int dsa_slave_sync(struct net_device *dev, const unsigned char *addr)
+{
+ struct switchdev_obj_port_mdb mdb = {
+ .obj.id = SWITCHDEV_OBJ_ID_HOST_MDB,
+ .obj.flags = SWITCHDEV_F_DEFER,
+ };
+
+ ether_addr_copy(mdb.addr, addr);
+
+ return switchdev_port_obj_add(dev, &mdb.obj, NULL);
+}
+
+static int dsa_slave_unsync(struct net_device *dev, const unsigned char *addr)
+{
+ struct switchdev_obj_port_mdb mdb = {
+ .obj.id = SWITCHDEV_OBJ_ID_HOST_MDB,
+ .obj.flags = SWITCHDEV_F_DEFER,
+ };
+
+ ether_addr_copy(mdb.addr, addr);
+
+ return switchdev_port_obj_del(dev, &mdb.obj);
+}
+
+static int dsa_slave_mc_sync(struct net_device *dev)
+{
+ return __hw_addr_sync_dev(&dev->mc, dev, dsa_slave_sync,
+ dsa_slave_unsync);
+}
+
+static void dsa_slave_mc_unsync(struct net_device *dev)
+{
+ __hw_addr_unsync_dev(&dev->mc, dev, dsa_slave_unsync);
+}
+
static int dsa_slave_open(struct net_device *dev)
{
struct net_device *master = dsa_slave_to_master(dev);
@@ -126,6 +161,8 @@ static int dsa_slave_close(struct net_device *dev)
dev_mc_unsync(master, dev);
dev_uc_unsync(master, dev);
+ dsa_slave_mc_unsync(dev);
+
if (dev->flags & IFF_ALLMULTI)
dev_set_allmulti(master, -1);
if (dev->flags & IFF_PROMISC)
@@ -153,7 +190,17 @@ static void dsa_slave_change_rx_flags(struct net_device *dev, int change)
static void dsa_slave_set_rx_mode(struct net_device *dev)
{
struct net_device *master = dsa_slave_to_master(dev);
+ struct dsa_port *dp = dsa_slave_to_port(dev);
+
+ /* If the port is bridged, the bridge takes care of sending
+ * SWITCHDEV_OBJ_ID_HOST_MDB to program the host's MC filter
+ */
+ if (netdev_mc_empty(dev) || dp->bridge_dev)
+ goto out;
+ if (dsa_slave_mc_sync(dev))
+ netdev_err(dev, "failed to synchronize MC addresses\n");
+out:
dev_mc_sync(master, dev);
dev_uc_sync(master, dev);
}
@@ -1405,6 +1452,10 @@ static int dsa_slave_changeupper(struct net_device *dev,
if (netif_is_bridge_master(info->upper_dev)) {
if (info->linking) {
+ /* Remove existing MC addresses that might have been
+ * programmed
+ */
+ dsa_slave_mc_unsync(dev);
err = dsa_port_bridge_join(dp, info->upper_dev);
err = notifier_from_errno(err);
} else {
--
2.17.1
From a18cd91ea44f55b852cd4c771da0abb014ef50ab Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Mon, 10 Dec 2018 15:15:17 -0800
Subject: [PATCH 08/13] net: dsa: Add ndo_vlan_rx_{add,kill}_vid implementation
In order to properly support VLAN filtering being enabled/disabled on a
bridge while other ports of the switch are not bridge members, we need
to support the ndo_vlan_rx_{add,kill}_vid callbacks so that
the non-bridge ports can continue receiving VLAN-tagged traffic, even when
the switch is globally configured to do ingress/egress VID checking.
We don't allow configuring VLAN devices on a bridge port member though,
since the bridge with VLAN awareness should be taking care of that, if
needed.
Since dsa_port_vlan_{add,del} can now be called with a NULL bridge_dev
pointer, we need to check for that in these two functions.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
net/dsa/port.c | 10 ++++++++--
net/dsa/slave.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++-
2 files changed, 60 insertions(+), 3 deletions(-)
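The two checks described above can be modelled as plain predicates. This
is a sketch under assumptions: vlan_rx_vid_check() and
vlan_reaches_switch() are hypothetical names, and the booleans stand in
for dp->bridge_dev being non-NULL and br_vlan_enabled() respectively.

```c
#include <errno.h>
#include <stdbool.h>

/* Model of the early check in the ndo_vlan_rx_{add,kill}_vid callbacks:
 * a port in a VLAN-aware bridge lets the bridge manage VLANs.
 */
int vlan_rx_vid_check(bool bridged, bool bridge_vlan_aware)
{
	if (bridged && bridge_vlan_aware)
		return -EINVAL;
	return 0;
}

/* Model of the test in dsa_port_vlan_{add,del}: the VLAN is forwarded to
 * the switch for standalone ports and for VLAN-aware bridges only.
 */
bool vlan_reaches_switch(bool bridged, bool bridge_vlan_aware)
{
	return !bridged || bridge_vlan_aware;
}
```

Read together, a standalone port has its VID programmed, a port in a
VLAN-unaware bridge accepts the request without touching the switch, and
a port in a VLAN-aware bridge rejects it.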
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 2d7e01b23572..a18d8409acf3 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -252,7 +252,10 @@ int dsa_port_vlan_add(struct dsa_port *dp,
.vlan = vlan,
};
- if (br_vlan_enabled(dp->bridge_dev))
+ /* Can be called from dsa_slave_port_obj_add() or
+ * dsa_slave_vlan_rx_add_vid()
+ */
+ if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev))
return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_ADD, &info);
return 0;
@@ -270,7 +273,10 @@ int dsa_port_vlan_del(struct dsa_port *dp,
if (netif_is_bridge_master(vlan->obj.orig_dev))
return -EOPNOTSUPP;
- if (br_vlan_enabled(dp->bridge_dev))
+ /* Can be called from dsa_slave_port_obj_del() or
+ * dsa_slave_vlan_rx_kill_vid()
+ */
+ if (!dp->bridge_dev || br_vlan_enabled(dp->bridge_dev))
return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info);
return 0;
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index f565b6a595d2..cc4fb8725fac 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -1037,6 +1037,54 @@ static int dsa_slave_get_ts_info(struct net_device *dev,
return ds->ops->get_ts_info(ds, p->dp->index, ts);
}
+static int dsa_slave_vlan_rx_add_vid(struct net_device *dev, __be16 proto,
+ u16 vid)
+{
+ struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct switchdev_obj_port_vlan vlan = { };
+ int ret = 0;
+
+ /* If the port is bridged and the bridge is VLAN aware, let the bridge
+ * manage VLANs
+ */
+ if (dp->bridge_dev && br_vlan_enabled(dp->bridge_dev))
+ return -EINVAL;
+
+ /* This API only allows programming tagged, non-PVID VIDs */
+ vlan.vid_begin = vid;
+ vlan.vid_end = vid;
+
+ ret = dsa_port_vlan_add(dp, &vlan, NULL);
+ if (ret == -EOPNOTSUPP)
+ ret = 0;
+
+ return ret;
+}
+
+static int dsa_slave_vlan_rx_kill_vid(struct net_device *dev, __be16 proto,
+ u16 vid)
+{
+ struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct switchdev_obj_port_vlan vlan = { };
+ int ret = 0;
+
+ /* If the port is bridged and the bridge is VLAN aware, let the bridge
+ * manage VLANs
+ */
+ if (dp->bridge_dev && br_vlan_enabled(dp->bridge_dev))
+ return -EINVAL;
+
+ /* This API only allows programming tagged, non-PVID VIDs */
+ vlan.vid_begin = vid;
+ vlan.vid_end = vid;
+
+ ret = dsa_port_vlan_del(dp, &vlan);
+ if (ret == -EOPNOTSUPP)
+ ret = 0;
+
+ return ret;
+}
+
static const struct ethtool_ops dsa_slave_ethtool_ops = {
.get_drvinfo = dsa_slave_get_drvinfo,
.get_regs_len = dsa_slave_get_regs_len,
@@ -1102,6 +1150,8 @@ static const struct net_device_ops dsa_slave_netdev_ops = {
.ndo_setup_tc = dsa_slave_setup_tc,
.ndo_get_stats64 = dsa_slave_get_stats64,
.ndo_get_port_parent_id = dsa_slave_get_port_parent_id,
+ .ndo_vlan_rx_add_vid = dsa_slave_vlan_rx_add_vid,
+ .ndo_vlan_rx_kill_vid = dsa_slave_vlan_rx_kill_vid,
};
static const struct switchdev_ops dsa_slave_switchdev_ops = {
@@ -1362,7 +1412,8 @@ int dsa_slave_create(struct dsa_port *port)
if (slave_dev == NULL)
return -ENOMEM;
- slave_dev->features = master->vlan_features | NETIF_F_HW_TC;
+ slave_dev->features = master->vlan_features | NETIF_F_HW_TC |
+ NETIF_F_HW_VLAN_CTAG_FILTER;
slave_dev->hw_features |= NETIF_F_HW_TC;
slave_dev->ethtool_ops = &dsa_slave_ethtool_ops;
eth_hw_addr_inherit(slave_dev, master);
--
2.17.1
From 2d4c257f1b9fd0d1dc2af49c50ebc215c2b561ee Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Mon, 7 Jan 2019 15:21:34 -0800
Subject: [PATCH 09/13] net: dsa: Make VLAN filtering use DSA notifiers
In preparation for allowing for global checks that would apply to the
entire switch and not just on a per-port basis, make the VLAN filtering
attribute follow other switchdev attributes/objects and make it use the
DSA notifier infrastructure.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
net/dsa/dsa_priv.h | 11 ++++++++++-
net/dsa/port.c | 17 +++++++----------
net/dsa/switch.c | 29 +++++++++++++++++++++++++++++
3 files changed, 46 insertions(+), 11 deletions(-)
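Every switch in the fabric receives the notifier, so the handler has to
decide which of its ports, if any, are affected. A minimal sketch of that
selection, assuming a 32-bit mask instead of the ds->bitmap used in the
patch and a hypothetical helper name:

```c
#include <stdint.h>

/* Hypothetical helper: mask of local ports a notifier applies to. Only
 * the switch whose index matches the one in the notifier info acts on
 * the targeted port; all other switches get an empty mask.
 */
uint32_t notifier_port_mask(int ds_index, int sw_index, unsigned int port)
{
	if (ds_index != sw_index)
		return 0;
	return UINT32_C(1) << port;
}
```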
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 1f4972dab9f2..1e3db5f2a699 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -26,6 +26,7 @@ enum {
DSA_NOTIFIER_MDB_DEL,
DSA_NOTIFIER_VLAN_ADD,
DSA_NOTIFIER_VLAN_DEL,
+ DSA_NOTIFIER_VLAN_FILTERING,
};
/* DSA_NOTIFIER_AGEING_TIME */
@@ -57,7 +58,7 @@ struct dsa_notifier_mdb_info {
int port;
};
-/* DSA_NOTIFIER_VLAN_* */
+/* DSA_NOTIFIER_VLAN_{ADD,DEL} */
struct dsa_notifier_vlan_info {
const struct switchdev_obj_port_vlan *vlan;
struct switchdev_trans *trans;
@@ -65,6 +66,14 @@ struct dsa_notifier_vlan_info {
int port;
};
+/* DSA_NOTIFIER_VLAN_FILTERING */
+struct dsa_notifier_vlan_filtering_info {
+ bool vlan_filtering;
+ struct switchdev_trans *trans;
+ int sw_index;
+ int port;
+};
+
struct dsa_slave_priv {
/* Copy of CPU port xmit for faster access in slave transmit hot path */
struct sk_buff * (*xmit)(struct sk_buff *skb,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index a18d8409acf3..60c6eca7d964 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -146,17 +146,14 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
struct switchdev_trans *trans)
{
- struct dsa_switch *ds = dp->ds;
-
- /* bridge skips -EOPNOTSUPP, so skip the prepare phase */
- if (switchdev_trans_ph_prepare(trans))
- return 0;
-
- if (ds->ops->port_vlan_filtering)
- return ds->ops->port_vlan_filtering(ds, dp->index,
- vlan_filtering);
+ struct dsa_notifier_vlan_filtering_info info = {
+ .sw_index = dp->ds->index,
+ .port = dp->index,
+ .trans = trans,
+ .vlan_filtering = vlan_filtering,
+ };
- return 0;
+ return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_FILTERING, &info);
}
int dsa_port_ageing_time(struct dsa_port *dp, clock_t ageing_clock,
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 142b294d3446..831334dc5e79 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -235,6 +235,32 @@ static int dsa_switch_vlan_del(struct dsa_switch *ds,
return 0;
}
+static int dsa_switch_vlan_filtering(struct dsa_switch *ds,
+ struct dsa_notifier_vlan_filtering_info *info)
+{
+ struct switchdev_trans *trans = info->trans;
+ bool vlan_filtering = info->vlan_filtering;
+ int port = info->port;
+ int err;
+
+ /* bridge skips -EOPNOTSUPP, so skip the prepare phase */
+ if (switchdev_trans_ph_prepare(trans))
+ return 0;
+
+ /* Build a mask of port members */
+ bitmap_zero(ds->bitmap, ds->num_ports);
+ if (ds->index == info->sw_index)
+ set_bit(port, ds->bitmap);
+
+ for_each_set_bit(port, ds->bitmap, ds->num_ports) {
+ err = ds->ops->port_vlan_filtering(ds, port, vlan_filtering);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
static int dsa_switch_event(struct notifier_block *nb,
unsigned long event, void *info)
{
@@ -269,6 +295,9 @@ static int dsa_switch_event(struct notifier_block *nb,
case DSA_NOTIFIER_VLAN_DEL:
err = dsa_switch_vlan_del(ds, info);
break;
+ case DSA_NOTIFIER_VLAN_FILTERING:
+ err = dsa_switch_vlan_filtering(ds, info);
+ break;
default:
err = -EOPNOTSUPP;
break;
--
2.17.1
From 3f2236e39d2d9b0d7d77d9033becdfeda717c41f Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Wed, 12 Dec 2018 15:14:34 -0800
Subject: [PATCH 10/13] net: dsa: Wire up multicast IGMP snooping attribute
notification
The bridge can be configured at runtime with or without IGMP snooping
enabled, but we were not processing the switchdev attribute that notifies
about that toggle. Do this now.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
include/net/dsa.h | 8 ++++++++
net/dsa/dsa_priv.h | 11 +++++++++++
net/dsa/port.c | 13 +++++++++++++
net/dsa/slave.c | 9 +++++++++
net/dsa/switch.c | 28 ++++++++++++++++++++++++++++
5 files changed, 69 insertions(+)
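The attribute is handled in the usual switchdev two-phase fashion: the
prepare phase only reports whether the driver can handle the request, and
the commit phase calls into the driver. A rough model follows, with
hypothetical names throughout (mc_disabled_op, sample_mc_disabled(),
mc_disabled_event()); unlike the patch, the sketch also guards the commit
phase against a missing callback.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

typedef int (*mc_disabled_op)(int port, bool mc_disabled);

/* Example driver callback that always succeeds (sketch only). */
int sample_mc_disabled(int port, bool mc_disabled)
{
	(void)port;
	(void)mc_disabled;
	return 0;
}

/* Prepare phase: report support without touching hardware.
 * Commit phase: invoke the driver callback for the port.
 */
int mc_disabled_event(bool prepare_phase, mc_disabled_op op,
		      int port, bool mc_disabled)
{
	if (!op)
		return -EOPNOTSUPP;
	if (prepare_phase)
		return 0;
	return op(port, mc_disabled);
}
```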
diff --git a/include/net/dsa.h b/include/net/dsa.h
index 7f2a668ef2cc..220f2b1d06d9 100644
--- a/include/net/dsa.h
+++ b/include/net/dsa.h
@@ -425,6 +425,8 @@ struct dsa_switch_ops {
/*
* Multicast database
*/
+ int (*port_mc_disabled)(struct dsa_switch *ds, int port,
+ bool mc_disabled);
int (*port_mdb_prepare)(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb);
void (*port_mdb_add)(struct dsa_switch *ds, int port,
@@ -467,6 +469,12 @@ struct dsa_switch_ops {
struct sk_buff *clone, unsigned int type);
bool (*port_rxtstamp)(struct dsa_switch *ds, int port,
struct sk_buff *skb, unsigned int type);
+
+ /*
+ * SWITCHDEV integration
+ */
+ int (*port_attr_get)(struct dsa_switch *ds, int port,
+ struct switchdev_attr *attr);
};
struct dsa_switch_driver {
diff --git a/net/dsa/dsa_priv.h b/net/dsa/dsa_priv.h
index 1e3db5f2a699..79dd1dcbcb28 100644
--- a/net/dsa/dsa_priv.h
+++ b/net/dsa/dsa_priv.h
@@ -27,6 +27,7 @@ enum {
DSA_NOTIFIER_VLAN_ADD,
DSA_NOTIFIER_VLAN_DEL,
DSA_NOTIFIER_VLAN_FILTERING,
+ DSA_NOTIFIER_MC_DISABLED,
};
/* DSA_NOTIFIER_AGEING_TIME */
@@ -74,6 +75,14 @@ struct dsa_notifier_vlan_filtering_info {
int port;
};
+/* DSA_NOTIFIER_MC_DISABLED */
+struct dsa_notifier_mc_disabled_info {
+ bool mc_disabled;
+ struct switchdev_trans *trans;
+ int sw_index;
+ int port;
+};
+
struct dsa_slave_priv {
/* Copy of CPU port xmit for faster access in slave transmit hot path */
struct sk_buff * (*xmit)(struct sk_buff *skb,
@@ -155,6 +164,8 @@ int dsa_port_enable(struct dsa_port *dp, struct phy_device *phy);
void dsa_port_disable(struct dsa_port *dp, struct phy_device *phy);
int dsa_port_bridge_join(struct dsa_port *dp, struct net_device *br);
void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br);
+int dsa_port_mc_disabled(struct dsa_port *dp, bool mc_disabled,
+ struct switchdev_trans *trans);
int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
struct switchdev_trans *trans);
int dsa_port_ageing_time(struct dsa_port *dp, clock_t ageing_clock,
diff --git a/net/dsa/port.c b/net/dsa/port.c
index 60c6eca7d964..725824fd88cb 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -143,6 +143,19 @@ void dsa_port_bridge_leave(struct dsa_port *dp, struct net_device *br)
dsa_port_set_state_now(dp, BR_STATE_FORWARDING);
}
+int dsa_port_mc_disabled(struct dsa_port *dp, bool mc_disabled,
+ struct switchdev_trans *trans)
+{
+ struct dsa_notifier_mc_disabled_info info = {
+ .sw_index = dp->ds->index,
+ .port = dp->index,
+ .trans = trans,
+ .mc_disabled = mc_disabled,
+ };
+
+ return dsa_port_notify(dp, DSA_NOTIFIER_MC_DISABLED, &info);
+}
+
int dsa_port_vlan_filtering(struct dsa_port *dp, bool vlan_filtering,
struct switchdev_trans *trans)
{
diff --git a/net/dsa/slave.c b/net/dsa/slave.c
index cc4fb8725fac..128a8e5b3a42 100644
--- a/net/dsa/slave.c
+++ b/net/dsa/slave.c
@@ -342,6 +342,9 @@ static int dsa_slave_port_attr_set(struct net_device *dev,
case SWITCHDEV_ATTR_ID_BRIDGE_AGEING_TIME:
ret = dsa_port_ageing_time(dp, attr->u.ageing_time, trans);
break;
+ case SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED:
+ ret = dsa_port_mc_disabled(dp, attr->u.mc_disabled, trans);
+ break;
default:
ret = -EOPNOTSUPP;
break;
@@ -428,6 +431,12 @@ static int dsa_slave_get_port_parent_id(struct net_device *dev,
static int dsa_slave_port_attr_get(struct net_device *dev,
struct switchdev_attr *attr)
{
+ struct dsa_port *dp = dsa_slave_to_port(dev);
+ struct dsa_switch *ds = dp->ds;
+
+ if (ds->ops->port_attr_get)
+ return ds->ops->port_attr_get(ds, dp->index, attr);
+
switch (attr->id) {
case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS_SUPPORT:
attr->u.brport_flags_support = 0;
diff --git a/net/dsa/switch.c b/net/dsa/switch.c
index 831334dc5e79..9e7ac39a8dc0 100644
--- a/net/dsa/switch.c
+++ b/net/dsa/switch.c
@@ -261,6 +261,31 @@ static int dsa_switch_vlan_filtering(struct dsa_switch *ds,
return 0;
}
+static int dsa_switch_mc_disabled(struct dsa_switch *ds,
+ struct dsa_notifier_mc_disabled_info *info)
+{
+ struct switchdev_trans *trans = info->trans;
+ bool mc_disabled = info->mc_disabled;
+ int port = info->port;
+ int err;
+
+ if (switchdev_trans_ph_prepare(trans))
+ return ds->ops->port_mc_disabled ? 0 : -EOPNOTSUPP;
+
+ /* Build a mask of port members */
+ bitmap_zero(ds->bitmap, ds->num_ports);
+ if (ds->index == info->sw_index)
+ set_bit(port, ds->bitmap);
+
+ for_each_set_bit(port, ds->bitmap, ds->num_ports) {
+ err = ds->ops->port_mc_disabled(ds, port, mc_disabled);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
static int dsa_switch_event(struct notifier_block *nb,
unsigned long event, void *info)
{
@@ -298,6 +323,9 @@ static int dsa_switch_event(struct notifier_block *nb,
case DSA_NOTIFIER_VLAN_FILTERING:
err = dsa_switch_vlan_filtering(ds, info);
break;
+ case DSA_NOTIFIER_MC_DISABLED:
+ err = dsa_switch_mc_disabled(ds, info);
+ break;
default:
err = -EOPNOTSUPP;
break;
--
2.17.1
From 224cfbb5f7920bc53c5ba7e025301646dca839cf Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Wed, 12 Dec 2018 15:35:52 -0800
Subject: [PATCH 11/13] net: dsa: b53: Add support for toggling IGMP snooping
Add the required configuration knobs to honor IGMP snooping being turned
off (typically through the bridge interface). When IGMP snooping is off,
we must flood multicast since we do not get any notifications about IGMP
join/leave through the network stack running on the bridge.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/dsa/b53/b53_common.c | 88 ++++++++++++++++++++++++++++++++
drivers/net/dsa/b53/b53_priv.h | 3 ++
2 files changed, 91 insertions(+)
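The per-port flood mask update can be sketched as a pure function. This
is an illustration only: mc_flood_mask() is a hypothetical helper and the
16-bit width mirrors the b53_read16()/b53_write16() accesses to the flood
mask registers in the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: with snooping disabled the port must flood
 * multicast, so its bit is set; with snooping enabled the bit is cleared
 * and forwarding relies on programmed MDB entries instead.
 */
uint16_t mc_flood_mask(uint16_t mask, unsigned int port,
		       bool snooping_disabled)
{
	if (snooping_disabled)
		return mask | (uint16_t)(1u << port);
	return mask & (uint16_t)~(1u << port);
}
```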
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 6c894ad4768a..22a6f6afff7a 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1640,6 +1640,93 @@ int b53_fdb_dump(struct dsa_switch *ds, int port,
}
EXPORT_SYMBOL(b53_fdb_dump);
+int b53_port_attr_get(struct dsa_switch *ds, int port,
+ struct switchdev_attr *attr)
+{
+ struct b53_device *dev = ds->priv;
+ struct net_device *bridge_dev;
+ const struct dsa_port *dp;
+ bool mc_disabled;
+ unsigned int i;
+
+ if (is5325(dev) || is5365(dev))
+ return -EOPNOTSUPP;
+
+ switch (attr->id) {
+ case SWITCHDEV_ATTR_ID_PORT_BRIDGE_FLAGS_SUPPORT:
+ attr->u.brport_flags_support = 0;
+ return 0;
+ case SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED_SUPPORT:
+ mc_disabled = attr->u.mc_disabled_support;
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ /* Handle the case where multiple bridges span the same switch device
+ * and one of them has a different setting than what is being requested
+ * which would be breaking filtering semantics for any of the other
+ * bridge devices. We must also take care of non-bridged ports which
+ * expect multicast filtering to remain turned on.
+ */
+ for (i = 0; i < ds->num_ports; i++) {
+ if (dsa_is_unused_port(ds, i) || dsa_is_cpu_port(ds, i))
+ continue;
+
+ if (i == port)
+ continue;
+
+ dp = dsa_to_port(ds, i);
+ bridge_dev = dp->bridge_dev;
+ if ((bridge_dev &&
+ bridge_dev != dsa_to_port(ds, port)->bridge_dev &&
+ br_multicast_enabled(bridge_dev) != !mc_disabled) ||
+ (!bridge_dev && mc_disabled)) {
+ netdev_err(dp->slave,
+ "MC filtering is global to the switch!\n");
+ return -EINVAL;
+ }
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL(b53_port_attr_get);
+
+int b53_mc_disabled(struct dsa_switch *ds, int port, bool mc_disabled)
+{
+ unsigned int cpu_port = dsa_to_port(ds, port)->cpu_dp->index;
+ struct b53_device *dev = ds->priv;
+ u8 port_ctrl;
+ u16 mc_ctrl;
+
+ /* Allow CPU port to receive multicast traffic */
+ b53_read8(dev, B53_CTRL_PAGE, B53_PORT_CTRL(cpu_port), &port_ctrl);
+ if (mc_disabled)
+ port_ctrl |= PORT_CTRL_RX_MCST_EN;
+ else
+ port_ctrl &= ~PORT_CTRL_RX_MCST_EN;
+ b53_write8(dev, B53_CTRL_PAGE, B53_PORT_CTRL(cpu_port), port_ctrl);
+
+ /* Allow port to flood multicast */
+ b53_read16(dev, B53_CTRL_PAGE, B53_MC_FLOOD_MASK, &mc_ctrl);
+ if (mc_disabled)
+ mc_ctrl |= BIT(port);
+ else
+ mc_ctrl &= ~BIT(port);
+ b53_write16(dev, B53_CTRL_PAGE, B53_MC_FLOOD_MASK, mc_ctrl);
+
+ /* And flood IP multicast as well */
+ b53_read16(dev, B53_CTRL_PAGE, B53_IPMC_FLOOD_MASK, &mc_ctrl);
+ if (mc_disabled)
+ mc_ctrl |= BIT(port);
+ else
+ mc_ctrl &= ~BIT(port);
+ b53_write16(dev, B53_CTRL_PAGE, B53_IPMC_FLOOD_MASK, mc_ctrl);
+
+ return 0;
+}
+EXPORT_SYMBOL(b53_mc_disabled);
+
int b53_mdb_prepare(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb)
{
@@ -2025,6 +2112,7 @@ static const struct dsa_switch_ops b53_switch_ops = {
.port_mirror_add = b53_mirror_add,
.port_mirror_del = b53_mirror_del,
.port_mdb_prepare = b53_mdb_prepare,
+ .port_mc_disabled = b53_mc_disabled,
.port_mdb_add = b53_mdb_add,
.port_mdb_del = b53_mdb_del,
};
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 620638ff9338..d407512180e3 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -350,6 +350,9 @@ int b53_fdb_del(struct dsa_switch *ds, int port,
const unsigned char *addr, u16 vid);
int b53_fdb_dump(struct dsa_switch *ds, int port,
dsa_fdb_dump_cb_t *cb, void *data);
+int b53_port_attr_get(struct dsa_switch *ds, int port,
+ struct switchdev_attr *attr);
+int b53_mc_disabled(struct dsa_switch *ds, int port, bool mc_disabled);
int b53_mdb_prepare(struct dsa_switch *ds, int port,
const struct switchdev_obj_port_mdb *mdb);
void b53_mdb_add(struct dsa_switch *ds, int port,
--
2.17.1
From 7fa263f108e8148a86762d73c74fde048a6e0eb3 Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu, 6 Dec 2018 12:33:08 -0800
Subject: [PATCH 12/13] net: dsa: bcm_sf2: Enable management mode
Now that we have all the necessary plumbing in place to get notified
when a multicast MAC address must be programmed, configure the switch to
operate in managed mode and let the network stack learn about management
traffic.
During suspend, we need to bring the switch back into unmanaged/dumb
forwarding mode to allow stations to communicate with each other despite
the host CPU being suspended.
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/dsa/b53/b53_common.c | 39 ++++++++++++++++++-
drivers/net/dsa/b53/b53_priv.h | 1 +
drivers/net/dsa/bcm_sf2.c | 65 +++++++++++++++++++++++++-------
drivers/net/dsa/bcm_sf2_regs.h | 5 +++
4 files changed, 95 insertions(+), 15 deletions(-)
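The intended switch mode in each state can be summarised with a short
sketch. The bit positions below are assumptions made for the example, and
switch_mode_value() is a hypothetical helper rather than anything in
bcm_sf2.c.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit layout, for this sketch only. */
#define SKETCH_SW_FWDG_MODE (1u << 0) /* managed mode */
#define SKETCH_SW_FWDG_EN   (1u << 1) /* forwarding enabled */

/* Forwarding stays enabled in both states; managed mode is only used
 * while the host CPU is awake to process management traffic.
 */
uint32_t switch_mode_value(bool host_suspended)
{
	uint32_t mode = SKETCH_SW_FWDG_EN;

	if (!host_suspended)
		mode |= SKETCH_SW_FWDG_MODE;
	return mode;
}
```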
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 22a6f6afff7a..2d56e045c7be 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -364,8 +364,6 @@ static void b53_enable_vlan(struct b53_device *dev, bool enable,
b53_read8(dev, B53_VLAN_PAGE, B53_VLAN_CTRL5, &vc5);
}
- mgmt &= ~SM_SW_FWD_MODE;
-
if (enable) {
vc0 |= VC0_VLAN_EN | VC0_VID_CHK_EN | VC0_VID_HASH_VID;
vc1 |= VC1_RX_MCST_UNTAG_EN | VC1_RX_MCST_FWD_EN;
@@ -490,6 +488,43 @@ static int b53_fast_age_vlan(struct b53_device *dev, u16 vid)
return b53_flush_arl(dev, FAST_AGE_VLAN);
}
+void b53_port_learn_setup(struct dsa_switch *ds, int port)
+{
+ struct b53_device *dev = ds->priv;
+ u16 reg;
+
+ /* Enable learning */
+ b53_read16(dev, B53_CTRL_PAGE, B53_DIS_LEARN, &reg);
+ reg &= ~BIT(port);
+ b53_write16(dev, B53_CTRL_PAGE, B53_DIS_LEARN, reg);
+
+ /* Software learning control disabled */
+ b53_read16(dev, B53_CTRL_PAGE, B53_SFT_LRN_CTRL, &reg);
+ reg &= ~BIT(port);
+ b53_write16(dev, B53_CTRL_PAGE, B53_SFT_LRN_CTRL, reg);
+
+ /* Configure IP multicast, allow Unicast ARL misses to be forwarded */
+ b53_read16(dev, B53_CTRL_PAGE, B53_IP_MULTICAST_CTRL, &reg);
+ reg |= B53_IPMC_FWD_EN | B53_UC_FWD_EN;
+ b53_write16(dev, B53_CTRL_PAGE, B53_IP_MULTICAST_CTRL, reg);
+
+ /* Set port in Unicast lookup forward map */
+ b53_read16(dev, B53_CTRL_PAGE, B53_UC_FLOOD_MASK, &reg);
+ reg |= BIT(port);
+ b53_write16(dev, B53_CTRL_PAGE, B53_UC_FLOOD_MASK, reg);
+
+ /* Do not set port in Multicast lookup forward map, learn */
+ b53_read16(dev, B53_CTRL_PAGE, B53_MC_FLOOD_MASK, &reg);
+ reg &= ~BIT(port);
+ b53_write16(dev, B53_CTRL_PAGE, B53_MC_FLOOD_MASK, reg);
+
+ /* Do not set port in IP multicast lookup forward map, learn */
+ b53_read16(dev, B53_CTRL_PAGE, B53_IPMC_FLOOD_MASK, &reg);
+ reg &= ~BIT(port);
+ b53_write16(dev, B53_CTRL_PAGE, B53_IPMC_FLOOD_MASK, reg);
+}
+EXPORT_SYMBOL(b53_port_learn_setup);
+
void b53_imp_vlan_setup(struct dsa_switch *ds, int cpu_port)
{
struct b53_device *dev = ds->priv;
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index d407512180e3..0af394161be7 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -309,6 +309,7 @@ static inline int b53_switch_get_reset_gpio(struct b53_device *dev)
#endif
/* Exported functions towards other drivers */
+void b53_port_learn_setup(struct dsa_switch *ds, int port);
void b53_imp_vlan_setup(struct dsa_switch *ds, int cpu_port);
int b53_configure_vlan(struct dsa_switch *ds);
void b53_get_strings(struct dsa_switch *ds, int port, u32 stringset,
diff --git a/drivers/net/dsa/bcm_sf2.c b/drivers/net/dsa/bcm_sf2.c
index 5193da67dcdc..05bcfbf26495 100644
--- a/drivers/net/dsa/bcm_sf2.c
+++ b/drivers/net/dsa/bcm_sf2.c
@@ -51,20 +51,25 @@ static void bcm_sf2_imp_setup(struct dsa_switch *ds, int port)
reg &= ~P_TXQ_PSM_VDD(port);
core_writel(priv, reg, CORE_MEM_PSM_VDD_CTRL);
- /* Enable Broadcast, Multicast, Unicast forwarding to IMP port */
- reg = core_readl(priv, CORE_IMP_CTL);
- reg |= (RX_BCST_EN | RX_MCST_EN | RX_UCST_EN);
- reg &= ~(RX_DIS | TX_DIS);
- core_writel(priv, reg, CORE_IMP_CTL);
+ /* Enable forwarding and managed mode */
+ core_writel(priv, SW_FWDG_EN | SW_FWDG_MODE, CORE_SWMODE);
- /* Enable forwarding */
- core_writel(priv, SW_FWDG_EN, CORE_SWMODE);
+ /* Configure port for learning */
+ b53_port_learn_setup(ds, port);
- /* Enable IMP port in dumb mode */
+ /* Disable dumb forwarding mode for IMP port */
reg = core_readl(priv, CORE_SWITCH_CTRL);
- reg |= MII_DUMB_FWDG_EN;
+ reg &= ~MII_DUMB_FWDG_EN;
core_writel(priv, reg, CORE_SWITCH_CTRL);
+ /* Enable IGMP and MLD high-level protocol snooping support */
+ reg = HL_PRTC_IGMP_RPTLVE_EN | HL_PRTC_IGMP_RPTVLE_FWD_MODE |
+ HL_PRTC_IGMP_QRY_EN | HL_PRTC_IGMP_QRY_FWD_MODE |
+ HL_PRTC_IGMP_UKN_EN | HL_PRTC_IGMP_UKN_FWD_MODE |
+ HL_PRTC_MLD_RPTDONE_EN | HL_PRTC_MLD_RPTDONE_FWD_MODE |
+ HL_PRTC_MLD_QRY_EN | HL_PRTC_MLD_QRY_FWD_MODE;
+ b53_write32(priv->dev, B53_MGMT_PAGE, B53_HL_PRTC_CTRL, reg);
+
/* Configure Traffic Class to QoS mapping, allow each priority to map
* to a different queue number
*/
@@ -75,10 +80,26 @@ static void bcm_sf2_imp_setup(struct dsa_switch *ds, int port)
b53_brcm_hdr_setup(ds, port);
+ /* Set IMP0 or IMP1 port to be managed port, enable BPDU */
+ reg = core_readl(priv, CORE_GMNCFGCFG);
+ reg &= ~(FRM_MGNP_MASK << FRM_MGNP_SHIFT);
+ if (port == core_readl(priv, CORE_IMP0_PRT_ID))
+ reg |= FRM_MNGP_IMP0 << FRM_MGNP_SHIFT;
+ if (port == core_readl(priv, CORE_IMP1_PRT_ID))
+ reg |= FRM_MGNP_IMP_DUAL << FRM_MGNP_SHIFT;
+ reg |= RXBPDU_EN;
+ core_writel(priv, reg, CORE_GMNCFGCFG);
+
/* Force link status for IMP port */
reg = core_readl(priv, offset);
reg |= (MII_SW_OR | LINK_STS);
core_writel(priv, reg, offset);
+
+ /* Enable Broadcast, Unicast forwarding to IMP port */
+ reg = core_readl(priv, CORE_IMP_CTL);
+ reg |= (RX_BCST_EN | RX_UCST_EN);
+ reg &= ~(RX_DIS | TX_DIS);
+ core_writel(priv, reg, CORE_IMP_CTL);
}
static void bcm_sf2_gphy_enable_set(struct dsa_switch *ds, bool enable)
@@ -166,10 +187,8 @@ static int bcm_sf2_port_setup(struct dsa_switch *ds, int port,
reg &= ~P_TXQ_PSM_VDD(port);
core_writel(priv, reg, CORE_MEM_PSM_VDD_CTRL);
- /* Enable learning */
- reg = core_readl(priv, CORE_DIS_LEARN);
- reg &= ~BIT(port);
- core_writel(priv, reg, CORE_DIS_LEARN);
+ /* Configure port for learning */
+ b53_port_learn_setup(ds, port);
/* Enable Broadcom tags for that port if requested */
if (priv->brcm_tag_mask & BIT(port))
@@ -683,6 +702,7 @@ static int bcm_sf2_sw_suspend(struct dsa_switch *ds)
{
struct bcm_sf2_priv *priv = bcm_sf2_to_priv(ds);
unsigned int port;
+ u32 reg;
bcm_sf2_intr_disable(priv);
@@ -695,6 +715,20 @@ static int bcm_sf2_sw_suspend(struct dsa_switch *ds)
bcm_sf2_port_disable(ds, port, NULL);
}
+ /* Disable management mode since we won't be able to
+ * perform any tasks while being suspended.
+ */
+ reg = core_readl(priv, CORE_SWMODE);
+ reg &= ~SW_FWDG_MODE;
+ core_writel(priv, reg, CORE_SWMODE);
+
+ /* Enable IMP port in dumb switching mode to allow
+ * Magic Packets to reach IMP port.
+ */
+ reg = core_readl(priv, CORE_SWITCH_CTRL);
+ reg |= MII_DUMB_FWDG_EN;
+ core_writel(priv, reg, CORE_SWITCH_CTRL);
+
return 0;
}
@@ -962,6 +996,11 @@ static const struct dsa_switch_ops bcm_sf2_ops = {
.set_rxnfc = bcm_sf2_set_rxnfc,
.port_mirror_add = b53_mirror_add,
.port_mirror_del = b53_mirror_del,
+ .port_mc_disabled = b53_mc_disabled,
+ .port_mdb_prepare = b53_mdb_prepare,
+ .port_mdb_add = b53_mdb_add,
+ .port_mdb_del = b53_mdb_del,
+ .port_attr_get = b53_port_attr_get,
};
struct bcm_sf2_of_data {
diff --git a/drivers/net/dsa/bcm_sf2_regs.h b/drivers/net/dsa/bcm_sf2_regs.h
index 67f056206f37..d95572a175bc 100644
--- a/drivers/net/dsa/bcm_sf2_regs.h
+++ b/drivers/net/dsa/bcm_sf2_regs.h
@@ -222,8 +222,13 @@ enum bcm_sf2_reg_offs {
#define CORE_GMNCFGCFG 0x0800
#define RST_MIB_CNT (1 << 0)
#define RXBPDU_EN (1 << 1)
+#define FRM_MGNP_SHIFT 6
+#define FRM_MGNP_MASK 0x3
+#define FRM_MNGP_IMP0 2
+#define FRM_MGNP_IMP_DUAL 3
#define CORE_IMP0_PRT_ID 0x0804
+#define CORE_IMP1_PRT_ID 0x0808
#define CORE_RST_MIB_CNT_EN 0x0950
--
2.17.1
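For readers skimming the patch above: b53_port_learn_setup() repeats one pattern six times, namely read a 16-bit per-port bitmap register, set or clear the bit for the port, and write it back. A minimal userspace sketch of that pattern follows. It assumes a fake in-memory register; struct fake_reg16, fake_read16(), fake_write16() and port_bit_update() are hypothetical names for illustration and are not part of the b53 driver.

```c
#include <stdint.h>

/* Hypothetical stand-in for one 16-bit per-port bitmap register. */
struct fake_reg16 {
	uint16_t val;
};

static uint16_t fake_read16(const struct fake_reg16 *r)
{
	return r->val;
}

static void fake_write16(struct fake_reg16 *r, uint16_t v)
{
	r->val = v;
}

/*
 * Read-modify-write: set or clear the bit for `port` (0..15) and leave
 * the bits of all other ports untouched.
 */
static void port_bit_update(struct fake_reg16 *r, unsigned int port, int set)
{
	uint16_t reg = fake_read16(r);

	if (set)
		reg |= (uint16_t)(1u << port);
	else
		reg &= (uint16_t)~(1u << port);
	fake_write16(r, reg);
}
```

In the patch, clearing the bit in B53_DIS_LEARN enables learning on the port, while setting the bit in B53_UC_FLOOD_MASK adds the port to the unicast flood map, so the same helper shape covers both directions.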
From 3fd8225c5a60005141d7c5f0954aaa37d49a51b5 Mon Sep 17 00:00:00 2001
From: Florian Fainelli <f.fainelli@gmail.com>
Date: Thu, 31 Jan 2019 17:04:33 -0800
Subject: [PATCH 13/13] DNP: b53: Debugging
Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
---
drivers/net/dsa/b53/b53_common.c | 19 ++++++++++++++++---
1 file changed, 16 insertions(+), 3 deletions(-)
diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 2d56e045c7be..2a05ff3fce27 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -281,7 +281,7 @@ static void b53_set_vlan_entry(struct b53_device *dev, u16 vid,
b53_do_vlan_op(dev, VTA_CMD_WRITE);
}
- dev_dbg(dev->ds->dev, "VID: %d, members: 0x%04x, untag: 0x%04x\n",
+ dev_info(dev->ds->dev, "VID: %d, members: 0x%04x, untag: 0x%04x\n",
vid, vlan->members, vlan->untag);
}
@@ -427,6 +427,9 @@ static void b53_enable_vlan(struct b53_device *dev, bool enable,
dev->vlan_enabled = enable;
dev->vlan_filtering_enabled = enable_filtering;
+
+ dev_info(dev->dev, "VLAN: %d, filtering: %d\n",
+ dev->vlan_enabled, dev->vlan_filtering_enabled);
}
static int b53_set_jumbo(struct b53_device *dev, bool enable, bool allow_10_100)
@@ -1343,6 +1346,8 @@ int b53_vlan_filtering(struct dsa_switch *ds, int port, bool vlan_filtering)
new_pvid = dev->ports[port].pvid;
}
+ dev_info(ds->dev, "PVID: %d, new PVID: %d\n", pvid, new_pvid);
+
if (pvid != new_pvid)
b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port),
new_pvid);
@@ -1385,7 +1390,7 @@ void b53_vlan_add(struct dsa_switch *ds, int port,
b53_get_vlan_entry(dev, vid, vl);
vl->members |= BIT(port);
- if (untagged && !dsa_is_cpu_port(ds, port))
+ if (untagged) //!dsa_is_cpu_port(ds, port))
vl->untag |= BIT(port);
else
vl->untag &= ~BIT(port);
@@ -1394,7 +1399,7 @@ void b53_vlan_add(struct dsa_switch *ds, int port,
b53_fast_age_vlan(dev, vid);
}
- if (pvid) {
+ if (pvid && !dsa_is_cpu_port(ds, port)) {
b53_write16(dev, B53_VLAN_PAGE, B53_VLAN_PORT_DEF_TAG(port),
vlan->vid_end);
b53_fast_age_vlan(dev, vid);
@@ -1758,6 +1763,8 @@ int b53_mc_disabled(struct dsa_switch *ds, int port, bool mc_disabled)
mc_ctrl &= ~BIT(port);
b53_write16(dev, B53_CTRL_PAGE, B53_IPMC_FLOOD_MASK, mc_ctrl);
+ dev_info(dev->dev, "MC: %d, port: %d\n", mc_disabled, port);
+
return 0;
}
EXPORT_SYMBOL(b53_mc_disabled);
@@ -1783,6 +1790,9 @@ void b53_mdb_add(struct dsa_switch *ds, int port,
struct b53_device *priv = ds->priv;
int ret;
+ dev_info(ds->dev, "Adding, port: %d MAC: %pM, VID: %d\n",
+ port, mdb->addr, mdb->vid);
+
ret = b53_arl_op(priv, 0, port, mdb->addr, mdb->vid, true);
if (ret)
dev_err(ds->dev, "failed to add MDB entry\n");
@@ -1795,6 +1805,9 @@ int b53_mdb_del(struct dsa_switch *ds, int port,
struct b53_device *priv = ds->priv;
int ret;
+ dev_info(ds->dev, "Removing, port: %d MAC: %pM, VID: %d\n",
+ port, mdb->addr, mdb->vid);
+
ret = b53_arl_op(priv, 0, port, mdb->addr, mdb->vid, false);
if (ret)
dev_err(ds->dev, "failed to delete MDB entry\n");
--
2.17.1
^ permalink raw reply related
* [PATCH bpf 1/2] bpf/test_run: fix unkillable BPF_PROG_TEST_RUN
From: Stanislav Fomichev @ 2019-02-12 23:42 UTC (permalink / raw)
To: netdev; +Cc: davem, ast, daniel, Stanislav Fomichev, syzbot
Syzbot found out that running BPF_PROG_TEST_RUN with repeat=0xffffffff
makes process unkillable. The problem is that when CONFIG_PREEMPT is
enabled, we never see need_resched() return true. This is due to the
fact that preempt_enable() (which we do in bpf_test_run_one on each
iteration) now handles resched if it's needed.
Let's disable preemption for the whole run, not per test. In this case
we can properly see whether resched is needed.
Let's also properly return -EINTR to the userspace in case of a signal
interrupt.
See recent discussion:
http://lore.kernel.org/netdev/CAH3MdRWHr4N8jei8jxDppXjmw-Nw=puNDLbu1dQOFQHxfU2onA@mail.gmail.com
I'll follow up with the same fix bpf_prog_test_run_flow_dissector in
bpf-next.
Reported-by: syzbot <syzkaller@googlegroups.com>
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
net/bpf/test_run.c | 45 ++++++++++++++++++++++++---------------------
1 file changed, 24 insertions(+), 21 deletions(-)
diff --git a/net/bpf/test_run.c b/net/bpf/test_run.c
index fa2644d276ef..e31e1b20f7f4 100644
--- a/net/bpf/test_run.c
+++ b/net/bpf/test_run.c
@@ -13,27 +13,13 @@
#include <net/sock.h>
#include <net/tcp.h>
-static __always_inline u32 bpf_test_run_one(struct bpf_prog *prog, void *ctx,
- struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE])
-{
- u32 ret;
-
- preempt_disable();
- rcu_read_lock();
- bpf_cgroup_storage_set(storage);
- ret = BPF_PROG_RUN(prog, ctx);
- rcu_read_unlock();
- preempt_enable();
-
- return ret;
-}
-
-static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, u32 *ret,
- u32 *time)
+static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat,
+ u32 *retval, u32 *time)
{
struct bpf_cgroup_storage *storage[MAX_BPF_CGROUP_STORAGE_TYPE] = { 0 };
enum bpf_cgroup_storage_type stype;
u64 time_start, time_spent = 0;
+ int ret = 0;
u32 i;
for_each_cgroup_storage_type(stype) {
@@ -48,25 +34,42 @@ static int bpf_test_run(struct bpf_prog *prog, void *ctx, u32 repeat, u32 *ret,
if (!repeat)
repeat = 1;
+
+ rcu_read_lock();
+ preempt_disable();
time_start = ktime_get_ns();
for (i = 0; i < repeat; i++) {
- *ret = bpf_test_run_one(prog, ctx, storage);
+ bpf_cgroup_storage_set(storage);
+ *retval = BPF_PROG_RUN(prog, ctx);
+
+ if (signal_pending(current)) {
+ ret = -EINTR;
+ break;
+ }
+
if (need_resched()) {
- if (signal_pending(current))
- break;
time_spent += ktime_get_ns() - time_start;
+ preempt_enable();
+ rcu_read_unlock();
+
cond_resched();
+
+ rcu_read_lock();
+ preempt_disable();
time_start = ktime_get_ns();
}
}
time_spent += ktime_get_ns() - time_start;
+ preempt_enable();
+ rcu_read_unlock();
+
do_div(time_spent, repeat);
*time = time_spent > U32_MAX ? U32_MAX : (u32)time_spent;
for_each_cgroup_storage_type(stype)
bpf_cgroup_storage_free(storage[stype]);
- return 0;
+ return ret;
}
static int bpf_test_finish(const union bpf_attr *kattr,
--
2.20.1.791.gb4d0f1c61a-goog
^ permalink raw reply related
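The control-flow change in the patch above (check for a pending signal on every iteration, and report -EINTR instead of silently returning 0) can be sketched in userspace roughly as follows. This is only an analogy under stated assumptions: stop_requested is a hypothetical flag standing in for signal_pending(current), and run_repeated(), nop() and request_stop() are made-up names. The preemption and RCU handling from the real patch has no userspace equivalent and is left out.

```c
#include <errno.h>
#include <signal.h>
#include <stdint.h>

/* Hypothetical stand-in for signal_pending(current). */
static volatile sig_atomic_t stop_requested;

/* Two trivial workloads for demonstration purposes. */
static void nop(void)
{
}

static void request_stop(void)
{
	stop_requested = 1;
}

/*
 * Run `fn` up to `repeat` times. The stop flag is checked after every
 * iteration, so a huge `repeat` cannot keep the caller stuck.
 * Returns 0 on completion or -EINTR if interrupted; *done receives the
 * number of iterations that actually ran.
 */
static int run_repeated(void (*fn)(void), uint32_t repeat, uint32_t *done)
{
	uint32_t i;
	int ret = 0;

	if (!repeat)
		repeat = 1;

	for (i = 0; i < repeat; i++) {
		fn();
		if (stop_requested) {
			ret = -EINTR;
			i++;	/* count the iteration that just ran */
			break;
		}
	}
	*done = i;
	return ret;
}
```

Calling run_repeated(request_stop, 0xffffffff, &done) returns -EINTR after a single iteration, which mirrors the behaviour the patch is after for repeat=0xffffffff.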
* [PATCH bpf 2/2] selftests/bpf: make sure signal interrupts BPF_PROG_TEST_RUN
From: Stanislav Fomichev @ 2019-02-12 23:42 UTC (permalink / raw)
To: netdev; +Cc: davem, ast, daniel, Stanislav Fomichev
In-Reply-To: <20190212234239.174386-1-sdf@google.com>
Simple test that I used to reproduce the issue in the previous commit:
Do BPF_PROG_TEST_RUN with max iterations, each program is 4096 simple
move instructions. Fire an alarm after 0.1 second and check that
bpf_prog_test_run is interrupted (i.e. test doesn't hang).
Feel free to ignore it if you feel like that's just a one-off fix and it
doesn't require a test going forward.
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
tools/testing/selftests/bpf/test_progs.c | 44 ++++++++++++++++++++++++
1 file changed, 44 insertions(+)
diff --git a/tools/testing/selftests/bpf/test_progs.c b/tools/testing/selftests/bpf/test_progs.c
index 25f0083a9b2e..7842e3749b19 100644
--- a/tools/testing/selftests/bpf/test_progs.c
+++ b/tools/testing/selftests/bpf/test_progs.c
@@ -11,6 +11,7 @@
#include <assert.h>
#include <stdlib.h>
#include <time.h>
+#include <signal.h>
#include <linux/types.h>
typedef __u16 __sum16;
@@ -27,6 +28,7 @@ typedef __u16 __sum16;
#include <sys/ioctl.h>
#include <sys/wait.h>
#include <sys/types.h>
+#include <sys/time.h>
#include <fcntl.h>
#include <linux/bpf.h>
@@ -1912,6 +1914,47 @@ static void test_queue_stack_map(int type)
bpf_object__close(obj);
}
+static void sigalrm_handler(int s) {}
+static struct sigaction sigalrm_action = {
+ .sa_handler = sigalrm_handler,
+};
+
+static void test_signal_pending(void)
+{
+ struct bpf_insn prog[4096];
+ struct itimerval timeo = {
+ .it_value.tv_usec = 100000, /* 100ms */
+ };
+ __u32 duration, retval;
+ int prog_fd;
+ int err;
+ int i;
+
+ for (i = 0; i < ARRAY_SIZE(prog); i++)
+ prog[i] = BPF_ALU64_IMM(BPF_MOV, BPF_REG_0, 0);
+ prog[ARRAY_SIZE(prog) - 1] = BPF_EXIT_INSN();
+
+ prog_fd = bpf_load_program(BPF_PROG_TYPE_SOCKET_FILTER,
+ prog, ARRAY_SIZE(prog),
+ "GPL", 0, NULL, 0);
+ CHECK(prog_fd < 0, "test-run", "errno %d\n", errno);
+
+ err = sigaction(SIGALRM, &sigalrm_action, NULL);
+ CHECK(err, "test-run-signal-sigaction", "errno %d\n", errno);
+
+ err = setitimer(ITIMER_REAL, &timeo, NULL);
+ CHECK(err, "test-run-signal-timer", "errno %d\n", errno);
+
+ err = bpf_prog_test_run(prog_fd, 0xffffffff, &pkt_v4, sizeof(pkt_v4),
+ NULL, NULL, &retval, &duration);
+ CHECK(err != -1 || errno != EINTR || duration > 1000000000,
+ "test-run-signal-run",
+ "err %d errno %d retval %d\n",
+ err, errno, retval);
+
+ signal(SIGALRM, SIG_DFL);
+}
+
int main(void)
{
srand(time(NULL));
@@ -1939,6 +1982,7 @@ int main(void)
test_reference_tracking();
test_queue_stack_map(QUEUE);
test_queue_stack_map(STACK);
+ test_signal_pending();
printf("Summary: %d PASSED, %d FAILED\n", pass_cnt, error_cnt);
return error_cnt ? EXIT_FAILURE : EXIT_SUCCESS;
--
2.20.1.791.gb4d0f1c61a-goog
^ permalink raw reply related
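The selftest above relies on a SIGALRM handler installed without SA_RESTART plus a one-shot ITIMER_REAL timer to interrupt a long-running syscall. A standalone sketch of the same mechanism, with pause() standing in for bpf_prog_test_run() so no BPF setup is needed, could look like this. wait_for_alarm() and on_alarm() are hypothetical names, and the sketch assumes the timer does not fire before pause() is entered.

```c
#define _XOPEN_SOURCE 700
#include <errno.h>
#include <signal.h>
#include <string.h>
#include <sys/time.h>
#include <unistd.h>

/* Empty handler: its only job is to make the blocked syscall return. */
static void on_alarm(int sig)
{
	(void)sig;
}

/*
 * Arm a one-shot real-time timer of `usec` microseconds and block in
 * pause(). Returns the errno observed after pause() was interrupted
 * (EINTR is expected), or -1 if the setup failed.
 */
static int wait_for_alarm(long usec)
{
	struct sigaction sa, old;
	struct itimerval timeo;
	int saved;

	memset(&sa, 0, sizeof(sa));
	sa.sa_handler = on_alarm;	/* sa_flags == 0, so no SA_RESTART */
	sigemptyset(&sa.sa_mask);
	if (sigaction(SIGALRM, &sa, &old))
		return -1;

	memset(&timeo, 0, sizeof(timeo));
	timeo.it_value.tv_sec = usec / 1000000;
	timeo.it_value.tv_usec = usec % 1000000;
	if (setitimer(ITIMER_REAL, &timeo, NULL)) {
		sigaction(SIGALRM, &old, NULL);
		return -1;
	}

	pause();	/* returns -1 once the handler has run */
	saved = errno;

	sigaction(SIGALRM, &old, NULL);	/* restore the previous handler */
	return saved;
}
```

With the kernel fix from patch 1/2 applied, bpf_prog_test_run() is expected to fail the same way pause() does here, with errno set to EINTR.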
* Re: [PATCHv5 iproute2 net-next 2/2] lib/libnetlink: re malloc buff if size is not enough
From: Eric Dumazet @ 2019-02-12 23:43 UTC (permalink / raw)
To: Hangbin Liu, netdev; +Cc: Stephen Hemminger, Michal Kubecek, Phil Sutter
In-Reply-To: <244174ae-3ab4-68c0-6783-f8c91840a7e1@gmail.com>
On 02/12/2019 03:32 PM, Eric Dumazet wrote:
>
>
> This patch brings a serious performance penalty.
>
> ss command now uses two system calls per ~4KB worth of data
>
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{NULL, 0}], msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 3328 <0.000120>
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"h\0\0\0\24\0\2\0@\342\1\0\322\0\6\0\n\1\1\0\250\253\276@&\7\370\260\200\231\16\6"..., 3328}], msg_controllen=0, msg_flags=0}, 0) = 3328 <0.000108>
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{NULL, 0}], msg_controllen=0, msg_flags=MSG_TRUNC}, MSG_PEEK|MSG_TRUNC) = 3328 <0.000086>
> recvmsg(3, {msg_name(12)={sa_family=AF_NETLINK, pid=0, groups=00000000}, msg_iov(1)=[{"h\0\0\0\24\0\2\0@\342\1\0\322\0\6\0\n\10\2\0002A\266S&\7\370\260\200\231\16\6"..., 3328}], msg_controllen=0, msg_flags=0}, 0) = 3328 <0.000121>
>
>
> So we are back to a very pessimistic situation.
>
I guess this patch will solve the issue :
diff --git a/lib/libnetlink.c b/lib/libnetlink.c
index ced33728777a17e0905e76acb904ac4709707488..309b5b3787e3d8f8c47f035d270ae2b4df01703e 100644
--- a/lib/libnetlink.c
+++ b/lib/libnetlink.c
@@ -442,6 +442,8 @@ static int rtnl_recvmsg(int fd, struct msghdr *msg, char **answer)
if (len < 0)
return len;
+ if (len < 32768)
+ len = 32768;
buf = malloc(len);
if (!buf) {
fprintf(stderr, "malloc error: not enough buffer\n");
^ permalink raw reply related
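The two-line diff above clamps the length reported by the MSG_PEEK | MSG_TRUNC probe to a 32 KB floor before allocating. As a standalone helper, the clamp might be written as below. recv_buf_len() and MIN_RECV_BUF are hypothetical names used only for this sketch; the 32768 value is the one from the diff.

```c
#include <stddef.h>
#include <sys/types.h>

#define MIN_RECV_BUF 32768	/* floor taken from the diff above */

/*
 * Pick an allocation size from the length reported by a
 * MSG_PEEK | MSG_TRUNC probe. Negative values are errors and are
 * passed through unchanged; small values are rounded up to the floor.
 */
static ssize_t recv_buf_len(ssize_t peeked)
{
	if (peeked < 0)
		return peeked;
	return peeked < MIN_RECV_BUF ? MIN_RECV_BUF : peeked;
}
```

For the 3328-byte messages in the strace output quoted above, this would allocate 32768 bytes instead of 3328, while a message larger than the floor still gets a buffer of exactly its own size.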
* linux-next: manual merge of the net-next tree with the net tree
From: Stephen Rothwell @ 2019-02-13 0:13 UTC (permalink / raw)
To: David Miller, Networking
Cc: Linux Next Mailing List, Linux Kernel Mailing List, Cong Wang,
Vlad Buslov
[-- Attachment #1: Type: text/plain, Size: 1009 bytes --]
Hi all,
Today's linux-next merge of the net-next tree got a conflict in:
net/sched/cls_tcindex.c
between commits:
8015d93ebd27 ("net_sched: fix a race condition in tcindex_destroy()")
033b228e7f26 ("net_sched: fix a memory leak in cls_tcindex")
from the net tree and commit:
12db03b65c2b ("net: sched: extend proto ops to support unlocked classifiers")
from the net-next tree.
I fixed it up (see the final resolution when linux-next is published)
and can carry the fix as necessary. This is now fixed as far as
linux-next is concerned, but any non trivial conflicts should be
mentioned to your upstream maintainer when your tree is submitted for
merging. You may also want to consider cooperating with the maintainer
of the conflicting tree to minimise any particularly complex conflicts.
--
Cheers,
Stephen Rothwell
diff --cc net/sched/cls_tcindex.c
index 38bb882bb958,14d6b4058045..000000000000
--- a/net/sched/cls_tcindex.c
+++ b/net/sched/cls_tcindex.c
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* [PATCH net-next 1/1] flow_offload: fix block stats
From: John Hurley @ 2019-02-13 0:23 UTC (permalink / raw)
To: jiri, davem; +Cc: netdev, pablo, oss-drivers, John Hurley
With the introduction of flow_stats_update(), drivers now update the stats
fields of the passed tc_cls_flower_offload struct, rather than call
tcf_exts_stats_update() directly to update the stats of offloaded TC
flower rules. However, if multiple qdiscs are registered to a TC shared
block and a flower rule is applied, then, when getting stats for the rule,
multiple callbacks may be made.
Take this into consideration by modifying flow_stats_update to gather the
stats from all callbacks. Currently, the values in tc_cls_flower_offload
only account for the last stats callback in the list.
Fixes: 3b1903ef97c0 ("flow_offload: add statistics retrieval infrastructure and use it")
Signed-off-by: John Hurley <john.hurley@netronome.com>
Reviewed-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
include/net/flow_offload.h | 6 +++---
1 file changed, 3 insertions(+), 3 deletions(-)
diff --git a/include/net/flow_offload.h b/include/net/flow_offload.h
index a307ccb..d035183 100644
--- a/include/net/flow_offload.h
+++ b/include/net/flow_offload.h
@@ -195,9 +195,9 @@ struct flow_stats {
static inline void flow_stats_update(struct flow_stats *flow_stats,
u64 bytes, u64 pkts, u64 lastused)
{
- flow_stats->pkts = pkts;
- flow_stats->bytes = bytes;
- flow_stats->lastused = lastused;
+ flow_stats->pkts += pkts;
+ flow_stats->bytes += bytes;
+ flow_stats->lastused = max_t(u64, flow_stats->lastused, lastused);
}
#endif /* _NET_FLOW_OFFLOAD_H */
--
2.7.4
^ permalink raw reply related
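To illustrate why the patch above switches from assignment to accumulation, here is a small userspace sketch in which two callbacks report stats for the same rule. struct flow_stats_sketch and flow_stats_accumulate() are hypothetical stand-ins that mirror the patched flow_stats_update(), with the kernel's max_t() written out as a plain comparison.

```c
#include <stdint.h>

/* Userspace stand-in for the kernel's struct flow_stats. */
struct flow_stats_sketch {
	uint64_t pkts;
	uint64_t bytes;
	uint64_t lastused;
};

/*
 * Add one callback's counters to the running totals and keep the most
 * recent lastused timestamp seen so far.
 */
static void flow_stats_accumulate(struct flow_stats_sketch *s,
				  uint64_t bytes, uint64_t pkts,
				  uint64_t lastused)
{
	s->pkts += pkts;
	s->bytes += bytes;
	if (lastused > s->lastused)
		s->lastused = lastused;
}
```

If one callback reports 1 packet / 1500 bytes at time 100 and a second reports 2 packets / 3000 bytes at time 50, the totals end up as 3 packets, 4500 bytes and lastused 100. With the pre-patch assignment, only the second callback's values would have survived.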
* Re: linux-next: manual merge of the net-next tree with the net tree
From: Stephen Rothwell @ 2019-02-13 0:33 UTC (permalink / raw)
To: David Miller, Networking
Cc: Linux Next Mailing List, Linux Kernel Mailing List, Cong Wang,
Vlad Buslov
In-Reply-To: <20190213111325.30cfc931@canb.auug.org.au>
[-- Attachment #1: Type: text/plain, Size: 2344 bytes --]
Hi all,
On Wed, 13 Feb 2019 11:13:25 +1100 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>
> Today's linux-next merge of the net-next tree got a conflict in:
>
> net/sched/cls_tcindex.c
>
> between commits:
>
> 8015d93ebd27 ("net_sched: fix a race condition in tcindex_destroy()")
> 033b228e7f26 ("net_sched: fix a memory leak in cls_tcindex")
>
> from the net tree and commit:
>
> 12db03b65c2b ("net: sched: extend proto ops to support unlocked classifiers")
>
> from the net-next tree.
>
> I fixed it up (see the final resolution when linux-next is published)
> and can carry the fix as necessary. This is now fixed as far as
> linux-next is concerned, but any non trivial conflicts should be
> mentioned to your upstream maintainer when your tree is submitted for
> merging. You may also want to consider cooperating with the maintainer
> of the conflicting tree to minimise any particularly complex conflicts.
Actually, see the below resolution.
--
Cheers,
Stephen Rothwell
diff --cc net/sched/cls_tcindex.c
index 38bb882bb958,14d6b4058045..e6cf20bc8e80
--- a/net/sched/cls_tcindex.c
+++ b/net/sched/cls_tcindex.c
@@@ -559,34 -563,15 +560,34 @@@ static void tcindex_destroy(struct tcf_
struct netlink_ext_ack *extack)
{
struct tcindex_data *p = rtnl_dereference(tp->root);
- struct tcf_walker walker;
+ int i;
pr_debug("tcindex_destroy(tp %p),p %p\n", tp, p);
- walker.count = 0;
- walker.skip = 0;
- walker.fn = tcindex_destroy_element;
- tcindex_walk(tp, &walker, true);
- call_rcu(&p->rcu, __tcindex_destroy);
+ if (p->perfect) {
+ for (i = 0; i < p->hash; i++) {
+ struct tcindex_filter_result *r = p->perfect + i;
+
+ tcf_unbind_filter(tp, &r->res);
+ if (tcf_exts_get_net(&r->exts))
+ tcf_queue_work(&r->rwork,
+ tcindex_destroy_rexts_work);
+ else
+ __tcindex_destroy_rexts(r);
+ }
+ }
+
+ for (i = 0; p->h && i < p->hash; i++) {
+ struct tcindex_filter *f, *next;
+ bool last;
+
+ for (f = rtnl_dereference(p->h[i]); f; f = next) {
+ next = rtnl_dereference(f->next);
- tcindex_delete(tp, &f->result, &last, NULL);
++ tcindex_delete(tp, &f->result, &last, false, NULL);
+ }
+ }
+
+ tcf_queue_work(&p->rwork, tcindex_destroy_work);
}
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
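The inner loop in the resolution above reads f->next into next before deleting f, since f may no longer be valid afterwards. A generic userspace sketch of that idiom on a plain singly linked list follows. struct node, chain_push() and chain_free_all() are hypothetical names, and free() stands in for tcindex_delete(); none of the RTNL or RCU handling is modelled.

```c
#include <stdlib.h>

struct node {
	struct node *next;
};

/* Prepend a freshly allocated node; returns the old head on failure. */
static struct node *chain_push(struct node *head)
{
	struct node *n = malloc(sizeof(*n));

	if (!n)
		return head;
	n->next = head;
	return n;
}

/*
 * Free every node of a chain. The next pointer is saved before the
 * current node is released, because it must not be read afterwards.
 * Returns the number of nodes freed.
 */
static unsigned int chain_free_all(struct node *head)
{
	struct node *f, *next;
	unsigned int n = 0;

	for (f = head; f; f = next) {
		next = f->next;
		free(f);
		n++;
	}
	return n;
}
```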
* Re: [PATCH iproute2 net-next v2 3/4] ss: Buffer raw fields first, then render them as a table
From: Eric Dumazet @ 2019-02-13 0:42 UTC (permalink / raw)
To: Stefano Brivio, Stephen Hemminger; +Cc: netdev, Sabrina Dubroca
In-Reply-To: <fef5eed38c3f47f334136fa2d857b6478d497d03.1513039237.git.sbrivio@redhat.com>
On 12/11/2017 04:46 PM, Stefano Brivio wrote:
> This allows us to measure the maximum field length for each
> column before printing fields and will permit us to apply
> optimal field spacing and distribution. Structure of the output
> buffer with chunked allocation is described in comments.
>
> Output is still unchanged, original spacing is used.
>
> Running over one million sockets with -tul options by simply
> modifying main() to loop 50,000 times over the *_show()
> functions, buffering the whole output and rendering it at the
> end, with 10 UDP sockets, 10 TCP sockets, while throwing
> output away, doesn't show significant changes in execution time
> on my laptop with an Intel i7-6600U CPU:
>
> - before this patch:
> $ time ./ss -tul > /dev/null
> real 0m29.899s
> user 0m2.017s
> sys 0m27.801s
>
> - after this patch:
> $ time ./ss -tul > /dev/null
> real 0m29.827s
> user 0m1.942s
> sys 0m27.812s
>
I do not get it.
"ss -emoi " uses almost 1KB per socket.
10,000,000 sockets -> we need about 10GB of memory ???
This is a serious regression.
^ permalink raw reply
* linux-next: build warning after merge of the net-next tree
From: Stephen Rothwell @ 2019-02-13 0:49 UTC (permalink / raw)
To: David Miller, Networking
Cc: Linux Next Mailing List, Linux Kernel Mailing List,
Florian Fainelli
[-- Attachment #1: Type: text/plain, Size: 562 bytes --]
Hi all,
After merging the net-next tree, today's linux-next build (x86_64
allmodconfig) produced this warning:
drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c: In function 'mlxsw_sp_port_attr_get':
drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c:438:19: warning: unused variable 'mlxsw_sp' [-Wunused-variable]
struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
^~~~~~~~
Introduced by commit
1ecb195753a1 ("mlxsw: spectrum_switchdev: Remove getting PORT_BRIDGE_FLAGS")
--
Cheers,
Stephen Rothwell
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* linux-next: build warning after merge of the net-next tree
From: Stephen Rothwell @ 2019-02-13 0:51 UTC (permalink / raw)
To: David Miller, Networking
Cc: Linux Next Mailing List, Linux Kernel Mailing List,
Florian Fainelli
[-- Attachment #1: Type: text/plain, Size: 538 bytes --]
Hi all,
After merging the net-next tree, today's linux-next build (x86_64
allmodconfig) produced this warning:
drivers/staging/fsl-dpaa2/ethsw/ethsw.c: In function 'swdev_port_attr_get':
drivers/staging/fsl-dpaa2/ethsw/ethsw.c:646:26: warning: unused variable 'port_priv' [-Wunused-variable]
struct ethsw_port_priv *port_priv = netdev_priv(netdev);
^~~~~~~~~
Introduced by commit
1b8b589d9103 ("staging: fsl-dpaa2: ethsw: Remove getting PORT_BRIDGE_FLAGS")
--
Cheers,
Stephen Rothwell
[-- Attachment #2: OpenPGP digital signature --]
[-- Type: application/pgp-signature, Size: 488 bytes --]
^ permalink raw reply
* Re: linux-next: build warning after merge of the net-next tree
From: Florian Fainelli @ 2019-02-13 0:57 UTC (permalink / raw)
To: Stephen Rothwell, David Miller, Networking
Cc: Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20190213114916.1a83a363@canb.auug.org.au>
Le 2/12/19 à 4:49 PM, Stephen Rothwell a écrit :
> Hi all,
>
> After merging the net-next tree, today's linux-next build (x86_64
> allmodconfig) produced this warning:
>
> drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c: In function 'mlxsw_sp_port_attr_get':
> drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c:438:19: warning: unused variable 'mlxsw_sp' [-Wunused-variable]
> struct mlxsw_sp *mlxsw_sp = mlxsw_sp_port->mlxsw_sp;
> ^~~~~~~~
>
> Introduced by commit
>
> 1ecb195753a1 ("mlxsw: spectrum_switchdev: Remove getting PORT_BRIDGE_FLAGS")
Fixed with:
http://patchwork.ozlabs.org/project/netdev/list/?series=91603
--
Florian
^ permalink raw reply
* Re: linux-next: build warning after merge of the net-next tree
From: Florian Fainelli @ 2019-02-13 0:58 UTC (permalink / raw)
To: Stephen Rothwell, David Miller, Networking
Cc: Linux Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20190213115141.376058e8@canb.auug.org.au>
Le 2/12/19 à 4:51 PM, Stephen Rothwell a écrit :
> Hi all,
>
> After merging the net-next tree, today's linux-next build (x86_64
> allmodconfig) produced this warning:
>
> drivers/staging/fsl-dpaa2/ethsw/ethsw.c: In function 'swdev_port_attr_get':
> drivers/staging/fsl-dpaa2/ethsw/ethsw.c:646:26: warning: unused variable 'port_priv' [-Wunused-variable]
> struct ethsw_port_priv *port_priv = netdev_priv(netdev);
> ^~~~~~~~~
>
> Introduced by commit
>
> 1b8b589d9103 ("staging: fsl-dpaa2: ethsw: Remove getting PORT_BRIDGE_FLAGS")
>
Also fixed with:
http://patchwork.ozlabs.org/project/netdev/list/?series=91603
--
Florian
^ permalink raw reply
* [PATCH net-next] net: sched: remove duplicated include from cls_api.c
From: YueHaibing @ 2019-02-13 1:42 UTC (permalink / raw)
To: Jamal Hadi Salim, Cong Wang, Jiri Pirko, David S . Miller
Cc: YueHaibing, netdev, kernel-janitors
Remove duplicated include.
Signed-off-by: YueHaibing <yuehaibing@huawei.com>
---
net/sched/cls_api.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
index 02cf6d2fa0e1..774b663e07f1 100644
--- a/net/sched/cls_api.c
+++ b/net/sched/cls_api.c
@@ -38,7 +38,6 @@
#include <net/tc_act/tc_csum.h>
#include <net/tc_act/tc_gact.h>
#include <net/tc_act/tc_skbedit.h>
-#include <net/tc_act/tc_mirred.h>
extern const struct nla_policy rtm_tca_policy[TCA_MAX + 1];
^ permalink raw reply related