Netdev List
* Re: [PATCH 0/8] pull request for net-next: batman-adv 2018-05-24
From: David Miller @ 2018-05-24 15:03 UTC (permalink / raw)
  To: sw; +Cc: netdev, b.a.t.m.a.n
In-Reply-To: <20180524120300.15829-1-sw@simonwunderlich.de>

From: Simon Wunderlich <sw@simonwunderlich.de>
Date: Thu, 24 May 2018 14:02:52 +0200

> here is our feature/cleanup pull request of batman-adv to go into net-next.
> 
> Please pull or let me know of any problem!

Pulled.

You should really remove the EXPERIMENTAL tag from the V
protocol support if you want it to be on by default.  Maybe
even remove the Kconfig knob entirely.


* 4.16 issue with mbim modem and ping with size > 14552 bytes
From: Daniele Palmas @ 2018-05-24 15:04 UTC (permalink / raw)
  To: netdev, linux-usb

Hello,

I have an issue with a USB MBIM modem when trying to send more than
14552 bytes with ping. It looks to me like a kernel issue, but not at
the cdc_mbim or cdc_ncm level. I am not sure, though, so I'm reporting
the issue.

My kernel is 4.16. The device is the following:

root@L2122:~# ifconfig
wwp0s20u7i2 Link encap:Ethernet  HWaddr be:3d:f2:f4:0d:e9
          inet addr:2.193.7.73  Bcast:0.0.0.0  Mask:255.255.255.252
          inet6 addr: fe80::bc3d:f2ff:fef4:de9/64 Scope:Link
          UP BROADCAST RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:5 errors:0 dropped:0 overruns:0 frame:0
          TX packets:55 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:759 (759.0 B)  TX bytes:6275 (6.2 KB)

Sending a ping with size 14552 works without issue:

root@L2122:~# ping -s 14552 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 14552(14580) bytes of data.
1448 bytes from 8.8.8.8: icmp_seq=1 ttl=52 (truncated)
1448 bytes from 8.8.8.8: icmp_seq=2 ttl=52 (truncated)
1448 bytes from 8.8.8.8: icmp_seq=3 ttl=52 (truncated)
1448 bytes from 8.8.8.8: icmp_seq=4 ttl=52 (truncated)
1448 bytes from 8.8.8.8: icmp_seq=5 ttl=52 (truncated)
1448 bytes from 8.8.8.8: icmp_seq=6 ttl=52 (truncated)
1448 bytes from 8.8.8.8: icmp_seq=7 ttl=52 (truncated)
1448 bytes from 8.8.8.8: icmp_seq=8 ttl=52 (truncated)
^C
--- 8.8.8.8 ping statistics ---
8 packets transmitted, 8 received, 0% packet loss, time 7008ms
rtt min/avg/max/mdev = 54.887/83.154/102.563/18.502 ms

root@L2122:~# ifconfig
wwp0s20u7i2 Link encap:Ethernet  HWaddr be:3d:f2:f4:0d:e9
          inet addr:2.193.7.73  Bcast:0.0.0.0  Mask:255.255.255.252
          inet6 addr: fe80::bc3d:f2ff:fef4:de9/64 Scope:Link
          UP BROADCAST RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:15 errors:0 dropped:0 overruns:0 frame:0
          TX packets:161 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:12583 (12.5 KB)  TX bytes:125999 (125.9 KB)

If I try a ping with size 14554, it does not work:

root@L2122:~# ping -s 14554 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 14554(14582) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
7 packets transmitted, 0 received, 100% packet loss, time 6122ms

and I see tx errors in the network interface

root@L2122:~# ifconfig
wwp0s20u7i2 Link encap:Ethernet  HWaddr be:3d:f2:f4:0d:e9
          inet addr:2.193.7.73  Bcast:0.0.0.0  Mask:255.255.255.252
          inet6 addr: fe80::bc3d:f2ff:fef4:de9/64 Scope:Link
          UP BROADCAST RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:190 errors:5 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:12943 (12.9 KB)  TX bytes:142476 (142.4 KB)

but the real problem is that the network interface seems not to be
working anymore:

root@L2122:~# ping 8.8.8.8
PING 8.8.8.8 (8.8.8.8) 56(84) bytes of data.
^C
--- 8.8.8.8 ping statistics ---
10 packets transmitted, 0 received, 100% packet loss, time 9193ms

root@L2122:~#
root@L2122:~# ifconfig
wwp0s20u7i2 Link encap:Ethernet  HWaddr be:3d:f2:f4:0d:e9
          inet addr:2.193.7.73  Bcast:0.0.0.0  Mask:255.255.255.252
          inet6 addr: fe80::bc3d:f2ff:fef4:de9/64 Scope:Link
          UP BROADCAST RUNNING NOARP MULTICAST  MTU:1500  Metric:1
          RX packets:20 errors:0 dropped:0 overruns:0 frame:0
          TX packets:190 errors:20 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:12943 (12.9 KB)  TX bytes:142476 (142.4 KB)

Nothing relevant in the kernel log.

Can anyone suggest how to debug this further?

Thanks in advance,
Daniele


* Re: [PATCH 0/4] RFC CPSW switchdev mode
From: Ilias Apalodimas @ 2018-05-24 15:07 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko,
	ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton
In-Reply-To: <20180524145441.GE5128@lunn.ch>

On Thu, May 24, 2018 at 04:54:41PM +0200, Andrew Lunn wrote:
> If you cannot get an IP address, it is plain broken. The whole idea is
> that switch port interfaces are just linux interfaces. A linux
> interface which cannot get an IP address is broken.
The switch interfaces can get ip addresses just like every linux interface. The
cpu port can't (sw0p0)
> 
> > Similar cases exist for customers on adding MDBs as far as i know. So they want
> > the "customer facing ports" to have the MDBs present but not the cpu port.
> 
> That i can understand. And it should actually work now with
> switchdev. It performs IGMP snooping, and if there is nothing joining
> the group on the CPU, it won't add an MDB entry to forward traffic to
> the CPU.
Yes, but this should be configurable (i.e the customer can deny adding the MDB
on the cpu port)
> 
> > Adding a cpu port that cannot transmit or receive traffic is a bit "weird"
> 
> And how is it supposed to send BPDUs? STP is going to be broken....
Not sure about this, i'll have to check

Regards
Ilias


* Re: [PATCH net] packet: in packet_snd start writing at link layer allocation
From: Tariq Toukan @ 2018-05-24 15:07 UTC (permalink / raw)
  To: David Miller, willemdebruijn.kernel
  Cc: netdev, eric.dumazet, willemb, Maor Gottlieb
In-Reply-To: <20180513.202055.2059612987939748570.davem@davemloft.net>



On 14/05/2018 3:20 AM, David Miller wrote:
> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
> Date: Fri, 11 May 2018 13:24:25 -0400
> 
>> From: Willem de Bruijn <willemb@google.com>
>>
>> Packet sockets allow construction of packets shorter than
>> dev->hard_header_len to accommodate protocols with variable length
>> link layer headers. These packets are padded to dev->hard_header_len,
>> because some device drivers interpret that as a minimum packet size.
>>
>> packet_snd reserves dev->hard_header_len bytes on allocation.
>> SOCK_DGRAM sockets call skb_push in dev_hard_header() to ensure that
>> link layer headers are stored in the reserved range. SOCK_RAW sockets
>> do the same in tpacket_snd, but not in packet_snd.
>>
>> Syzbot was able to send a zero byte packet to a device with massive
>> 116B link layer header, causing padding to cross over into skb_shinfo.
>> Fix this by writing from the start of the llheader reserved range also
>> in the case of packet_snd/SOCK_RAW.
>>
>> Update skb_set_network_header to the new offset. This also corrects
>> it for SOCK_DGRAM, where it incorrectly double counted reserve due to
>> the skb_push in dev_hard_header.
>>
>> Fixes: 9ed988cd5915 ("packet: validate variable length ll headers")
>> Reported-by: syzbot+71d74a5406d02057d559@syzkaller.appspotmail.com
>> Signed-off-by: Willem de Bruijn <willemb@google.com>
> 
> Applied and queued up for -stable, thanks Willem.
> 

Hi,

One of our regression tests started failing. Once this patch is
reverted, the test passes.

The test adds flow steering rules on the receiver side, and on the
sender side it sends packets with a RAW socket application. The
receiver side then gets a completion with error.

Our verification team compared the packets between the stable and the
broken versions; in the broken version there are some extra bytes at
the end of the packet.

It looks like a bad push to the SKB. Maybe the conditional reserved
addition should be more strict?

Any idea?

Regards,
Tariq


* Re: [PATCH net-next] cxgb4: Check for kvzalloc allocation failure
From: David Miller @ 2018-05-24 15:07 UTC (permalink / raw)
  To: yuehaibing; +Cc: ganeshgr, linux-kernel, netdev
In-Reply-To: <20180522070718.12864-1-yuehaibing@huawei.com>

From: YueHaibing <yuehaibing@huawei.com>
Date: Tue, 22 May 2018 15:07:18 +0800

> diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> index 130d1ee..019cffe 100644
> --- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> +++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_main.c
> @@ -4135,6 +4135,10 @@ static int adap_init0(struct adapter *adap)
>  		 * card
>  		 */
>  		card_fw = kvzalloc(sizeof(*card_fw), GFP_KERNEL);
> +		if (!card_fw) {
> +			ret = -ENOMEM;
> +			goto bye;
> +		}
>  

On error, this leaks fw_info.


* Fw: [Bug 199819] New: Kernel version 4.15 breaks compatibility with Realtek 8111E
From: Stephen Hemminger @ 2018-05-24 15:08 UTC (permalink / raw)
  To: netdev



Begin forwarded message:

Date: Thu, 24 May 2018 06:23:20 +0000
From: bugzilla-daemon@bugzilla.kernel.org
To: stephen@networkplumber.org
Subject: [Bug 199819] New: Kernel version 4.15 breaks compatibility with Realtek 8111E


https://bugzilla.kernel.org/show_bug.cgi?id=199819

            Bug ID: 199819
           Summary: Kernel version 4.15 breaks compatibility with Realtek
                    8111E
           Product: Networking
           Version: 2.5
    Kernel Version: 4.15.1
          Hardware: x86-64
                OS: Linux
              Tree: Mainline
            Status: NEW
          Severity: normal
          Priority: P1
         Component: Other
          Assignee: stephen@networkplumber.org
          Reporter: frank.oehler@outlook.com
        Regression: No

Starting at Kernel version 4.15, the connection is constantly lost, 
followed by reconnects. Kernel version 4.14.11-041411-generic still 
worked perfectly fine. I also tried the newest release, 4.17-rc6; 
the issue has not been resolved. 

My Mainboard:
MSI 970A-G43

-- 
You are receiving this mail because:
You are the assignee for the bug.


* [PATCH net-next 0/7] net: bridge: Notify about bridge VLANs
From: Petr Machata @ 2018-05-24 15:09 UTC (permalink / raw)
  To: netdev, devel, bridge
  Cc: jiri, idosch, davem, razvan.stefanescu, gregkh, stephen, andrew,
	vivien.didelot, f.fainelli, nikolay

In commit 946a11e7408e ("mlxsw: spectrum_span: Allow bridge for gretap
mirror"), mlxsw got support for offloading mirror-to-gretap such that
the underlay packet path involves a bridge. In that case, the offload is
also influenced by PVID setting of said bridge. However, changes to VLAN
configuration of the bridge itself do not generate switchdev
notifications, so there's no mechanism to prod mlxsw to update the
offload when these settings change.

In this patchset, the problem is resolved by distributing the switchdev
notification SWITCHDEV_OBJ_ID_PORT_VLAN also for configuration changes
on bridge VLANs. Since stacked devices distribute the notification to
lower devices, such an event eventually reaches the driver, which can
determine whether it's a bridge or port VLAN by inspecting orig_dev.

To keep things consistent, the newly-distributed notifications observe
the same protocol as the existing ones: dual prepare/commit, with
-EOPNOTSUPP indicating lack of support, even though there's currently
nothing to prepare for and nothing to support. Correspondingly, all
switchdev drivers have been updated to return -EOPNOTSUPP for bridge
VLAN notifications.

In patch #1, the code to send notifications for adding and deleting is
factored out into two named functions.

In patches #2-#5, respectively for mlxsw, rocker, DSA and DPAA2 ethsw,
the new notifications (which are not enabled yet) are ignored to
maintain the current behavior.

In patch #6, the notification is actually enabled.

In patch #7, mlxsw is changed to update offloads of mirror-to-gre also
for bridge-related notifications.

Petr Machata (7):
  net: bridge: Extract boilerplate around switchdev_port_obj_*()
  mlxsw: spectrum_switchdev: Ignore bridge VLAN events
  rocker: rocker_main: Ignore bridge VLAN events
  dsa: port: Ignore bridge VLAN events
  staging: fsl-dpaa2: ethsw: Ignore bridge VLAN events
  net: bridge: Notify about bridge VLANs
  mlxsw: spectrum_switchdev: Schedule respin during trans prepare

 .../ethernet/mellanox/mlxsw/spectrum_switchdev.c   |  8 ++-
 drivers/net/ethernet/rocker/rocker_main.c          |  6 +++
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c            |  6 +++
 net/bridge/br_vlan.c                               | 58 ++++++++++++++--------
 net/dsa/port.c                                     |  6 +++
 5 files changed, 62 insertions(+), 22 deletions(-)

-- 
2.4.11


* [PATCH net-next 1/7] net: bridge: Extract boilerplate around switchdev_port_obj_*()
From: Petr Machata @ 2018-05-24 15:10 UTC (permalink / raw)
  To: netdev, devel, bridge
  Cc: jiri, idosch, davem, razvan.stefanescu, gregkh, stephen, andrew,
	vivien.didelot, f.fainelli, nikolay
In-Reply-To: <cover.1527173527.git.petrm@mellanox.com>

A call to switchdev_port_obj_add() or switchdev_port_obj_del() involves
initializing a struct switchdev_obj_port_vlan, a piece of code that
repeats on each call site almost verbatim. While in the current codebase
there is just one duplicated add call, the follow-up patches add more of
both add and del calls.

Thus to remove the duplication, extract the repetition into named
functions and reuse.

Signed-off-by: Petr Machata <petrm@mellanox.com>
---
 net/bridge/br_vlan.c | 44 +++++++++++++++++++++++---------------------
 1 file changed, 23 insertions(+), 21 deletions(-)

diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index dc832c09..a75fe930 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -79,8 +79,7 @@ static bool __vlan_add_flags(struct net_bridge_vlan *v, u16 flags)
 	return ret || !!(old_flags ^ v->flags);
 }
 
-static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br,
-			  u16 vid, u16 flags)
+static int br_switchdev_port_obj_add(struct net_device *dev, u16 vid, u16 flags)
 {
 	struct switchdev_obj_port_vlan v = {
 		.obj.orig_dev = dev,
@@ -89,12 +88,29 @@ static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br,
 		.vid_begin = vid,
 		.vid_end = vid,
 	};
-	int err;
 
+	return switchdev_port_obj_add(dev, &v.obj);
+}
+
+static int br_switchdev_port_obj_del(struct net_device *dev, u16 vid)
+{
+	struct switchdev_obj_port_vlan v = {
+		.obj.orig_dev = dev,
+		.obj.id = SWITCHDEV_OBJ_ID_PORT_VLAN,
+		.vid_begin = vid,
+		.vid_end = vid,
+	};
+
+	return switchdev_port_obj_del(dev, &v.obj);
+}
+
+static int __vlan_vid_add(struct net_device *dev, struct net_bridge *br,
+			  u16 vid, u16 flags)
+{
 	/* Try switchdev op first. In case it is not supported, fallback to
 	 * 8021q add.
 	 */
-	err = switchdev_port_obj_add(dev, &v.obj);
+	int err = br_switchdev_port_obj_add(dev, vid, flags);
 	if (err == -EOPNOTSUPP)
 		return vlan_vid_add(dev, br->vlan_proto, vid);
 	return err;
@@ -130,18 +146,11 @@ static void __vlan_del_list(struct net_bridge_vlan *v)
 static int __vlan_vid_del(struct net_device *dev, struct net_bridge *br,
 			  u16 vid)
 {
-	struct switchdev_obj_port_vlan v = {
-		.obj.orig_dev = dev,
-		.obj.id = SWITCHDEV_OBJ_ID_PORT_VLAN,
-		.vid_begin = vid,
-		.vid_end = vid,
-	};
-	int err;
-
 	/* Try switchdev op first. In case it is not supported, fallback to
 	 * 8021q del.
 	 */
-	err = switchdev_port_obj_del(dev, &v.obj);
+	int err = br_switchdev_port_obj_del(dev, vid);
+
 	if (err == -EOPNOTSUPP) {
 		vlan_vid_del(dev, br->vlan_proto, vid);
 		return 0;
@@ -1053,13 +1062,6 @@ int nbp_vlan_init(struct net_bridge_port *p)
 int nbp_vlan_add(struct net_bridge_port *port, u16 vid, u16 flags,
 		 bool *changed)
 {
-	struct switchdev_obj_port_vlan v = {
-		.obj.orig_dev = port->dev,
-		.obj.id = SWITCHDEV_OBJ_ID_PORT_VLAN,
-		.flags = flags,
-		.vid_begin = vid,
-		.vid_end = vid,
-	};
 	struct net_bridge_vlan *vlan;
 	int ret;
 
@@ -1069,7 +1071,7 @@ int nbp_vlan_add(struct net_bridge_port *port, u16 vid, u16 flags,
 	vlan = br_vlan_find(nbp_vlan_group(port), vid);
 	if (vlan) {
 		/* Pass the flags to the hardware bridge */
-		ret = switchdev_port_obj_add(port->dev, &v.obj);
+		ret = br_switchdev_port_obj_add(port->dev, vid, flags);
 		if (ret && ret != -EOPNOTSUPP)
 			return ret;
 		*changed = __vlan_add_flags(vlan, flags);
-- 
2.4.11


* [PATCH net-next 2/7] mlxsw: spectrum_switchdev: Ignore bridge VLAN events
From: Petr Machata @ 2018-05-24 15:10 UTC (permalink / raw)
  To: netdev, devel, bridge
  Cc: jiri, idosch, davem, razvan.stefanescu, gregkh, stephen, andrew,
	vivien.didelot, f.fainelli, nikolay
In-Reply-To: <cover.1527173527.git.petrm@mellanox.com>

Ignore VLAN events where the orig_dev is the bridge device itself.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index 8c9cf8e..cbc8fab 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -1144,6 +1144,9 @@ static int mlxsw_sp_port_vlans_add(struct mlxsw_sp_port *mlxsw_sp_port,
 	struct mlxsw_sp_bridge_port *bridge_port;
 	u16 vid;
 
+	if (netif_is_bridge_master(orig_dev))
+		return -EOPNOTSUPP;
+
 	if (switchdev_trans_ph_prepare(trans))
 		return 0;
 
@@ -1741,6 +1744,9 @@ static int mlxsw_sp_port_vlans_del(struct mlxsw_sp_port *mlxsw_sp_port,
 	struct mlxsw_sp_bridge_port *bridge_port;
 	u16 vid;
 
+	if (netif_is_bridge_master(orig_dev))
+		return -EOPNOTSUPP;
+
 	bridge_port = mlxsw_sp_bridge_port_find(mlxsw_sp->bridge, orig_dev);
 	if (WARN_ON(!bridge_port))
 		return -EINVAL;
-- 
2.4.11


* [PATCH net-next 3/7] rocker: rocker_main: Ignore bridge VLAN events
From: Petr Machata @ 2018-05-24 15:10 UTC (permalink / raw)
  To: netdev, devel, bridge
  Cc: jiri, idosch, davem, razvan.stefanescu, gregkh, stephen, andrew,
	vivien.didelot, f.fainelli, nikolay
In-Reply-To: <cover.1527173527.git.petrm@mellanox.com>

Ignore VLAN events where the orig_dev is the bridge device itself.

Signed-off-by: Petr Machata <petrm@mellanox.com>
---
 drivers/net/ethernet/rocker/rocker_main.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/net/ethernet/rocker/rocker_main.c b/drivers/net/ethernet/rocker/rocker_main.c
index e73e4fe..aeafdb9 100644
--- a/drivers/net/ethernet/rocker/rocker_main.c
+++ b/drivers/net/ethernet/rocker/rocker_main.c
@@ -1632,6 +1632,9 @@ rocker_world_port_obj_vlan_add(struct rocker_port *rocker_port,
 {
 	struct rocker_world_ops *wops = rocker_port->rocker->wops;
 
+	if (netif_is_bridge_master(vlan->obj.orig_dev))
+		return -EOPNOTSUPP;
+
 	if (!wops->port_obj_vlan_add)
 		return -EOPNOTSUPP;
 
@@ -1647,6 +1650,9 @@ rocker_world_port_obj_vlan_del(struct rocker_port *rocker_port,
 {
 	struct rocker_world_ops *wops = rocker_port->rocker->wops;
 
+	if (netif_is_bridge_master(vlan->obj.orig_dev))
+		return -EOPNOTSUPP;
+
 	if (!wops->port_obj_vlan_del)
 		return -EOPNOTSUPP;
 	return wops->port_obj_vlan_del(rocker_port, vlan);
-- 
2.4.11


* [PATCH net-next 4/7] dsa: port: Ignore bridge VLAN events
From: Petr Machata @ 2018-05-24 15:10 UTC (permalink / raw)
  To: netdev, devel, bridge
  Cc: jiri, idosch, davem, razvan.stefanescu, gregkh, stephen, andrew,
	vivien.didelot, f.fainelli, nikolay
In-Reply-To: <cover.1527173527.git.petrm@mellanox.com>

Ignore VLAN events where the orig_dev is the bridge device itself.

Signed-off-by: Petr Machata <petrm@mellanox.com>
---
 net/dsa/port.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/net/dsa/port.c b/net/dsa/port.c
index 2413beb..ed05954 100644
--- a/net/dsa/port.c
+++ b/net/dsa/port.c
@@ -252,6 +252,9 @@ int dsa_port_vlan_add(struct dsa_port *dp,
 		.vlan = vlan,
 	};
 
+	if (netif_is_bridge_master(vlan->obj.orig_dev))
+		return -EOPNOTSUPP;
+
 	if (br_vlan_enabled(dp->bridge_dev))
 		return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_ADD, &info);
 
@@ -267,6 +270,9 @@ int dsa_port_vlan_del(struct dsa_port *dp,
 		.vlan = vlan,
 	};
 
+	if (netif_is_bridge_master(vlan->obj.orig_dev))
+		return -EOPNOTSUPP;
+
 	if (br_vlan_enabled(dp->bridge_dev))
 		return dsa_port_notify(dp, DSA_NOTIFIER_VLAN_DEL, &info);
 
-- 
2.4.11


* [PATCH net-next 5/7] staging: fsl-dpaa2: ethsw: Ignore bridge VLAN events
From: Petr Machata @ 2018-05-24 15:10 UTC (permalink / raw)
  To: netdev, devel, bridge
  Cc: jiri, idosch, davem, razvan.stefanescu, gregkh, stephen, andrew,
	vivien.didelot, f.fainelli, nikolay
In-Reply-To: <cover.1527173527.git.petrm@mellanox.com>

Ignore VLAN events where the orig_dev is the bridge device itself.

Signed-off-by: Petr Machata <petrm@mellanox.com>
---
 drivers/staging/fsl-dpaa2/ethsw/ethsw.c | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
index c723a04..a17dd29 100644
--- a/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
+++ b/drivers/staging/fsl-dpaa2/ethsw/ethsw.c
@@ -719,6 +719,9 @@ static int port_vlans_add(struct net_device *netdev,
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	int vid, err;
 
+	if (netif_is_bridge_master(vlan->obj.orig_dev))
+		return -EOPNOTSUPP;
+
 	if (switchdev_trans_ph_prepare(trans))
 		return 0;
 
@@ -873,6 +876,9 @@ static int port_vlans_del(struct net_device *netdev,
 	struct ethsw_port_priv *port_priv = netdev_priv(netdev);
 	int vid, err;
 
+	if (netif_is_bridge_master(vlan->obj.orig_dev))
+		return -EOPNOTSUPP;
+
 	for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
 		err = ethsw_port_del_vlan(port_priv, vid);
 		if (err)
-- 
2.4.11


* [PATCH net-next 6/7] net: bridge: Notify about bridge VLANs
From: Petr Machata @ 2018-05-24 15:10 UTC (permalink / raw)
  To: netdev, devel, bridge
  Cc: jiri, idosch, davem, razvan.stefanescu, gregkh, stephen, andrew,
	vivien.didelot, f.fainelli, nikolay
In-Reply-To: <cover.1527173527.git.petrm@mellanox.com>

A driver might need to react to changes in settings of brentry VLANs.
Therefore send switchdev port notifications for these as well. Reuse
SWITCHDEV_OBJ_ID_PORT_VLAN for this purpose. Listeners should use
netif_is_bridge_master() on orig_dev to determine whether the
notification is about a bridge port or a bridge.

Signed-off-by: Petr Machata <petrm@mellanox.com>
---
 net/bridge/br_vlan.c | 14 ++++++++++++++
 1 file changed, 14 insertions(+)

diff --git a/net/bridge/br_vlan.c b/net/bridge/br_vlan.c
index a75fe930..14c1b6c 100644
--- a/net/bridge/br_vlan.c
+++ b/net/bridge/br_vlan.c
@@ -268,6 +268,10 @@ static int __vlan_add(struct net_bridge_vlan *v, u16 flags)
 			goto out_filt;
 		v->brvlan = masterv;
 		v->stats = masterv->stats;
+	} else {
+		err = br_switchdev_port_obj_add(dev, v->vid, flags);
+		if (err && err != -EOPNOTSUPP)
+			goto out;
 	}
 
 	/* Add the dev mac and count the vlan only if it's usable */
@@ -303,6 +307,8 @@ static int __vlan_add(struct net_bridge_vlan *v, u16 flags)
 			br_vlan_put_master(masterv);
 			v->brvlan = NULL;
 		}
+	} else {
+		br_switchdev_port_obj_del(dev, v->vid);
 	}
 
 	goto out;
@@ -328,6 +334,11 @@ static int __vlan_del(struct net_bridge_vlan *v)
 		err = __vlan_vid_del(p->dev, p->br, v->vid);
 		if (err)
 			goto out;
+	} else {
+		err = br_switchdev_port_obj_del(v->br->dev, v->vid);
+		if (err && err != -EOPNOTSUPP)
+			goto out;
+		err = 0;
 	}
 
 	if (br_vlan_should_use(v)) {
@@ -605,6 +616,9 @@ int br_vlan_add(struct net_bridge *br, u16 vid, u16 flags, bool *changed)
 			vg->num_vlans++;
 			*changed = true;
 		}
+		ret = br_switchdev_port_obj_add(br->dev, vid, flags);
+		if (ret && ret != -EOPNOTSUPP)
+			return ret;
 		if (__vlan_add_flags(vlan, flags))
 			*changed = true;
 
-- 
2.4.11


* [PATCH net-next 7/7] mlxsw: spectrum_switchdev: Schedule respin during trans prepare
From: Petr Machata @ 2018-05-24 15:10 UTC (permalink / raw)
  To: netdev, devel, bridge
  Cc: jiri, idosch, davem, razvan.stefanescu, gregkh, stephen, andrew,
	vivien.didelot, f.fainelli, nikolay
In-Reply-To: <cover.1527173527.git.petrm@mellanox.com>

Since there's no special support for the bridge events, the driver
returns -EOPNOTSUPP, and thus the commit never happens. Therefore
schedule respin during the prepare stage: there's no real difference one
way or another.

This fixes the problem that mirror-to-gretap offload wouldn't adapt to
changes in bridge vlan configuration right away and another notification
would have to arrive for mlxsw to catch up.

Signed-off-by: Petr Machata <petrm@mellanox.com>
Reviewed-by: Ido Schimmel <idosch@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
index cbc8fab..8a15ac4 100644
--- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
+++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_switchdev.c
@@ -1697,7 +1697,7 @@ static int mlxsw_sp_port_obj_add(struct net_device *dev,
 		vlan = SWITCHDEV_OBJ_PORT_VLAN(obj);
 		err = mlxsw_sp_port_vlans_add(mlxsw_sp_port, vlan, trans);
 
-		if (switchdev_trans_ph_commit(trans)) {
+		if (switchdev_trans_ph_prepare(trans)) {
 			/* The event is emitted before the changes are actually
 			 * applied to the bridge. Therefore schedule the respin
 			 * call for later, so that the respin logic sees the
-- 
2.4.11


* Re: [PATCH net] packet: in packet_snd start writing at link layer allocation
From: Willem de Bruijn @ 2018-05-24 15:17 UTC (permalink / raw)
  To: Tariq Toukan
  Cc: David Miller, Network Development, Eric Dumazet, Willem de Bruijn,
	Maor Gottlieb
In-Reply-To: <bb77c544-fc52-7984-e421-114a9fd1ac4d@mellanox.com>

On Thu, May 24, 2018 at 11:07 AM, Tariq Toukan <tariqt@mellanox.com> wrote:
>
>
> On 14/05/2018 3:20 AM, David Miller wrote:
>>
>> From: Willem de Bruijn <willemdebruijn.kernel@gmail.com>
>> Date: Fri, 11 May 2018 13:24:25 -0400
>>
>>> From: Willem de Bruijn <willemb@google.com>
>>>
>>> Packet sockets allow construction of packets shorter than
>>> dev->hard_header_len to accommodate protocols with variable length
>>> link layer headers. These packets are padded to dev->hard_header_len,
>>> because some device drivers interpret that as a minimum packet size.
>>>
>>> packet_snd reserves dev->hard_header_len bytes on allocation.
>>> SOCK_DGRAM sockets call skb_push in dev_hard_header() to ensure that
>>> link layer headers are stored in the reserved range. SOCK_RAW sockets
>>> do the same in tpacket_snd, but not in packet_snd.
>>>
>>> Syzbot was able to send a zero byte packet to a device with massive
>>> 116B link layer header, causing padding to cross over into skb_shinfo.
>>> Fix this by writing from the start of the llheader reserved range also
>>> in the case of packet_snd/SOCK_RAW.
>>>
>>> Update skb_set_network_header to the new offset. This also corrects
>>> it for SOCK_DGRAM, where it incorrectly double counted reserve due to
>>> the skb_push in dev_hard_header.
>>>
>>> Fixes: 9ed988cd5915 ("packet: validate variable length ll headers")
>>> Reported-by: syzbot+71d74a5406d02057d559@syzkaller.appspotmail.com
>>> Signed-off-by: Willem de Bruijn <willemb@google.com>
>>
>>
>> Applied and queued up for -stable, thanks Willem.
>>
>
> Hi,
>
> One of our regression tests started failing. Once this patch is reverted,
> the test passes.
>
> The test adds flow steering rules on the receiver side, and on the sender
> side it sends packets with a RAW socket application. The receiver side
> then gets a completion with error.
>
> Our verification team compared the packets between the stable and the broken
> versions; in the broken version there are some extra bytes at the end of
> the packet.
>
> It looks like a bad push to the SKB. Maybe the conditional reserved
> addition should be more strict?
>
> Any idea?

Thanks for reporting, sorry for the breakage.

I think I might. This skb_push moves back the start of skb->data in the
same way that tpacket_snd does. But it does not reduce the length
passed to skb_put, so this might double count hard_header_len.

Let me construct a test.


* Re: [PATCH bpf] bpf: properly enforce index mask to prevent out-of-bounds speculation
From: Alexei Starovoitov @ 2018-05-24 15:20 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: netdev
In-Reply-To: <20180524003253.2918-1-daniel@iogearbox.net>

On Thu, May 24, 2018 at 02:32:53AM +0200, Daniel Borkmann wrote:
> While reviewing the verifier code, I recently noticed that the
> following two program variants in relation to tail calls can be
> loaded.
> 
> Fixes: b2157399cc98 ("bpf: prevent out-of-bounds speculation")
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Applied, Thanks.


* Re: [PATCH 0/4] RFC CPSW switchdev mode
From: Andrew Lunn @ 2018-05-24 15:25 UTC (permalink / raw)
  To: Ilias Apalodimas
  Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko,
	ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton
In-Reply-To: <20180524150704.GA20031@apalos>

> > That i can understand. And it should actually work now with
> > switchdev. It performs IGMP snooping, and if there is nothing joining
> > the group on the CPU, it won't add an MDB entry to forward traffic to
> > the CPU.

> Yes, but this should be configurable (i.e the customer can deny adding the MDB
> on the cpu port)

O.K, back to the basic idea. Switch ports are just normal Linux
interfaces.

How would you configure this with two e1000e put in a bridge? I want
multicast to be bridged between the two e1000e, but the host stack
should not see the packets.

	Andrew


* Re: [V9fs-developer] [PATCH] net/9p: fix error path of p9_virtio_probe
From: Greg Kurz @ 2018-05-24 14:22 UTC (permalink / raw)
  To: Jean-Philippe Brucker
  Cc: v9fs-developer, ericvh, rminnich, lucho, netdev, davem
In-Reply-To: <20180524101021.49880-1-jean-philippe.brucker@arm.com>

On Thu, 24 May 2018 11:10:21 +0100
Jean-Philippe Brucker <jean-philippe.brucker@arm.com> wrote:

> Currently when virtio_find_single_vq fails, we go through del_vqs which
> throws a warning (Trying to free already-free IRQ). Skip del_vqs if vq
> allocation failed.
> 
> Signed-off-by: Jean-Philippe Brucker <jean-philippe.brucker@arm.com>
> ---

Reviewed-by: Greg Kurz <groug@kaod.org>

>  net/9p/trans_virtio.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/net/9p/trans_virtio.c b/net/9p/trans_virtio.c
> index 4d0372263e5d..1c87eee522b7 100644
> --- a/net/9p/trans_virtio.c
> +++ b/net/9p/trans_virtio.c
> @@ -562,7 +562,7 @@ static int p9_virtio_probe(struct virtio_device *vdev)
>  	chan->vq = virtio_find_single_vq(vdev, req_done, "requests");
>  	if (IS_ERR(chan->vq)) {
>  		err = PTR_ERR(chan->vq);
> -		goto out_free_vq;
> +		goto out_free_chan;
>  	}
>  	chan->vq->vdev->priv = chan;
>  	spin_lock_init(&chan->lock);
> @@ -615,6 +615,7 @@ static int p9_virtio_probe(struct virtio_device *vdev)
>  	kfree(tag);
>  out_free_vq:
>  	vdev->config->del_vqs(vdev);
> +out_free_chan:
>  	kfree(chan);
>  fail:
>  	return err;


* [PATCH net-next] vrf: add CRC32c offload to device features
From: Davide Caratti @ 2018-05-24 15:49 UTC (permalink / raw)
  To: David Ahern, Vlad Yasevich, Marcelo Ricardo Leitner; +Cc: linux-sctp, netdev

SCTP sockets originated in a VRF can improve their performance if CRC32c
computation is delegated to underlying devices: update device features,
setting NETIF_F_SCTP_CRC. Iterating the following command in the topology
proposed with [1],

 # ip vrf exec vrf-h2 netperf -H 192.0.2.1 -t SCTP_STREAM -- -m 10K

the measured throughput in Mbit/s improved from 2395 ± 1% to 2720 ± 1%.

[1] https://www.spinics.net/lists/netdev/msg486007.html

Signed-off-by: Davide Caratti <dcaratti@redhat.com>
---
 drivers/net/vrf.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/vrf.c b/drivers/net/vrf.c
index 90b5f3900c22..f93547f257fb 100644
--- a/drivers/net/vrf.c
+++ b/drivers/net/vrf.c
@@ -1254,7 +1254,7 @@ static void vrf_setup(struct net_device *dev)
 
 	/* enable offload features */
 	dev->features   |= NETIF_F_GSO_SOFTWARE;
-	dev->features   |= NETIF_F_RXCSUM | NETIF_F_HW_CSUM;
+	dev->features   |= NETIF_F_RXCSUM | NETIF_F_HW_CSUM | NETIF_F_SCTP_CRC;
 	dev->features   |= NETIF_F_SG | NETIF_F_FRAGLIST | NETIF_F_HIGHDMA;
 
 	dev->hw_features = dev->features;
-- 
2.17.0
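
For context, the checksum whose computation is being delegated is CRC32c
(Castagnoli). A bitwise software version, as a rough sketch of the per-byte
work that offloading avoids, might look like the following. The function
name crc32c() is illustrative and is not the kernel helper; the reflected
polynomial 0x82F63B78 with an all-ones initial value and a final complement
is the usual CRC-32C parameterisation.

```c
#include <stddef.h>
#include <stdint.h>

/* Bitwise CRC-32C (Castagnoli), reflected form; illustrative only. */
static uint32_t crc32c(const uint8_t *data, size_t len)
{
	uint32_t crc = 0xFFFFFFFFu;	/* initial value: all ones */
	size_t i;
	int k;

	for (i = 0; i < len; i++) {
		crc ^= data[i];
		/* Process one bit at a time, least significant bit first. */
		for (k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return ~crc;			/* final complement */
}
```

Eight shift/xor steps per payload byte is the cost a software fallback pays
in its simplest form; table-driven or instruction-accelerated variants
reduce it but do not remove it.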

^ permalink raw reply related

* Re: [PATCH] ath10k: transmit queued frames after waking queues
From: Bob Copeland @ 2018-05-24 15:50 UTC (permalink / raw)
  To: Niklas Cassel
  Cc: Adrian Chadd, Kalle Valo, David Miller, ath10k, linux-wireless,
	netdev, Linux Kernel Mailing List
In-Reply-To: <20180521203701.GA7619@localhost.localdomain>

On Mon, May 21, 2018 at 10:37:01PM +0200, Niklas Cassel wrote:
> On Thu, May 17, 2018 at 03:26:25PM -0700, Adrian Chadd wrote:
> > On Thu, 17 May 2018 at 16:16, Niklas Cassel <niklas.cassel@linaro.org>
> > wrote:
> > 
> > > diff --git a/drivers/net/wireless/ath/ath10k/txrx.c
> > b/drivers/net/wireless/ath/ath10k/txrx.c
> > > index cda164f6e9f6..1d3b2d2c3fee 100644
> > > --- a/drivers/net/wireless/ath/ath10k/txrx.c
> > > +++ b/drivers/net/wireless/ath/ath10k/txrx.c
> > > @@ -95,6 +95,9 @@ int ath10k_txrx_tx_unref(struct ath10k_htt *htt,
> > >                  wake_up(&htt->empty_tx_wq);
> > >          spin_unlock_bh(&htt->tx_lock);
> > 
> > > +       if (htt->num_pending_tx <= 3 && !list_empty(&ar->txqs))
> > > +               ath10k_mac_tx_push_pending(ar);
> > > +
> > 
> > Just sanity checking - what's protecting htt->num_pending_tx? or is it
> > serialised some other way?
[...]
> I can't see that any of the examples applies, but let's add READ_ONCE(),
> to make sure that the compiler doesn't try to optimize this.

Couldn't you just move the num_pending_tx read inside tx_lock which is 2 lines
above?  I think all the other manipulations are protected by tx_lock.
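
Roughly, as a userspace sketch with a pthread mutex standing in for tx_lock:
take a snapshot of the counter while the lock is held and act on the
snapshot after unlocking. struct htt here is a hypothetical, simplified
model, PUSH_THRESHOLD is just a name for the "<= 3" in the patch, and the
!list_empty(&ar->txqs) check is left out.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical, simplified model of the htt state. */
struct htt {
	pthread_mutex_t tx_lock;
	int num_pending_tx;
};

#define PUSH_THRESHOLD 3	/* from "<= 3" in the patch */

/* Returns true when the caller should push pending frames. */
static bool tx_unref(struct htt *htt)
{
	int pending;

	pthread_mutex_lock(&htt->tx_lock);
	htt->num_pending_tx--;
	pending = htt->num_pending_tx;	/* snapshot taken under the lock */
	pthread_mutex_unlock(&htt->tx_lock);

	/* Decide on the snapshot, outside the lock. */
	return pending <= PUSH_THRESHOLD;
}
```

The push itself still happens outside the lock; only the read moves.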

-- 
Bob Copeland %% https://bobcopeland.com/

^ permalink raw reply

* Re: 4.16 issue with mbim modem and ping with size > 14552 bytes
From: Greg KH @ 2018-05-24 15:53 UTC (permalink / raw)
  To: Daniele Palmas; +Cc: netdev, linux-usb
In-Reply-To: <CAGRyCJFTqJOjjx0G6fgd8f0AZ2bTo6-89Exx+HKEZCsrJM4nNw@mail.gmail.com>

On Thu, May 24, 2018 at 05:04:49PM +0200, Daniele Palmas wrote:
> Hello,
> 
> I have an issue with an USB mbim modem when trying to send with ping
> more than 14552 bytes: it looks like to me a kernel issue, but not at
> the cdc_mbim or cdc_ncm level, anyway not sure, so I'm reporting the
> issue.
> 
> My kernel is 4.16. The device is the following:

Do older kernels work, or is this something that has always been
there?

I ask, as my mobile provider does horrible things to large packet sizes.
So much so that I have to set the mtu to 1280 just to get things to work
properly when tethering my phone through to my laptop.  So this might be
a network provider issue :)

thanks,

greg k-h

^ permalink raw reply

* Re: [PATCH bpf-next v4 2/7] bpf: introduce bpf subcommand BPF_TASK_FD_QUERY
From: Yonghong Song @ 2018-05-24 16:00 UTC (permalink / raw)
  To: Martin KaFai Lau; +Cc: peterz, ast, daniel, netdev, kernel-team
In-Reply-To: <20180524050738.fgay6jcagx4exrnr@kafai-mbp>



On 5/23/18 10:07 PM, Martin KaFai Lau wrote:
> On Wed, May 23, 2018 at 05:18:42PM -0700, Yonghong Song wrote:
>> Currently, suppose a userspace application has loaded a bpf program
>> and attached it to a tracepoint/kprobe/uprobe, and a bpf
>> introspection tool, e.g., bpftool, wants to show which bpf program
>> is attached to which tracepoint/kprobe/uprobe. Such attachment
>> information will be really useful to understand the overall bpf
>> deployment in the system.
>>
>> There is a name field (16 bytes) for each program, which could
>> be used to encode the attachment point. There are some drawbacks
>> for this approach. First, bpftool user (e.g., an admin) may not
>> really understand the association between the name and the
>> attachment point. Second, if one program is attached to multiple
>> places, encoding a proper name which can imply all these
>> attachments becomes difficult.
>>
>> This patch introduces a new bpf subcommand BPF_TASK_FD_QUERY.
>> Given a pid and fd, if the <pid, fd> is associated with a
>> tracepoint/kprobe/uprobe perf event, BPF_TASK_FD_QUERY will return
>>     . prog_id
>>     . tracepoint name, or
>>     . k[ret]probe funcname + offset or kernel addr, or
>>     . u[ret]probe filename + offset
>> to the userspace.
>> The user can use "bpftool prog" to find more information about
>> bpf program itself with prog_id.
>>
>> Signed-off-by: Yonghong Song <yhs@fb.com>
>> ---
>>   include/linux/trace_events.h |  17 +++++++
>>   include/uapi/linux/bpf.h     |  26 ++++++++++
>>   kernel/bpf/syscall.c         | 115 +++++++++++++++++++++++++++++++++++++++++++
>>   kernel/trace/bpf_trace.c     |  48 ++++++++++++++++++
>>   kernel/trace/trace_kprobe.c  |  29 +++++++++++
>>   kernel/trace/trace_uprobe.c  |  22 +++++++++
>>   6 files changed, 257 insertions(+)
>>
>> diff --git a/include/linux/trace_events.h b/include/linux/trace_events.h
>> index 2bde3ef..d34144a 100644
>> --- a/include/linux/trace_events.h
>> +++ b/include/linux/trace_events.h
>> @@ -473,6 +473,9 @@ int perf_event_query_prog_array(struct perf_event *event, void __user *info);
>>   int bpf_probe_register(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
>>   int bpf_probe_unregister(struct bpf_raw_event_map *btp, struct bpf_prog *prog);
>>   struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name);
>> +int bpf_get_perf_event_info(const struct perf_event *event, u32 *prog_id,
>> +			    u32 *fd_type, const char **buf,
>> +			    u64 *probe_offset, u64 *probe_addr);
>>   #else
>>   static inline unsigned int trace_call_bpf(struct trace_event_call *call, void *ctx)
>>   {
>> @@ -504,6 +507,13 @@ static inline struct bpf_raw_event_map *bpf_find_raw_tracepoint(const char *name
>>   {
>>   	return NULL;
>>   }
>> +static inline int bpf_get_perf_event_info(const struct perf_event *event,
>> +					  u32 *prog_id, u32 *fd_type,
>> +					  const char **buf, u64 *probe_offset,
>> +					  u64 *probe_addr)
>> +{
>> +	return -EOPNOTSUPP;
>> +}
>>   #endif
>>   
>>   enum {
>> @@ -560,10 +570,17 @@ extern void perf_trace_del(struct perf_event *event, int flags);
>>   #ifdef CONFIG_KPROBE_EVENTS
>>   extern int  perf_kprobe_init(struct perf_event *event, bool is_retprobe);
>>   extern void perf_kprobe_destroy(struct perf_event *event);
>> +extern int bpf_get_kprobe_info(const struct perf_event *event,
>> +			       u32 *fd_type, const char **symbol,
>> +			       u64 *probe_offset, u64 *probe_addr,
>> +			       bool perf_type_tracepoint);
>>   #endif
>>   #ifdef CONFIG_UPROBE_EVENTS
>>   extern int  perf_uprobe_init(struct perf_event *event, bool is_retprobe);
>>   extern void perf_uprobe_destroy(struct perf_event *event);
>> +extern int bpf_get_uprobe_info(const struct perf_event *event,
>> +			       u32 *fd_type, const char **filename,
>> +			       u64 *probe_offset, bool perf_type_tracepoint);
>>   #endif
>>   extern int  ftrace_profile_set_filter(struct perf_event *event, int event_id,
>>   				     char *filter_str);
>> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> index c3e502d..0d51946 100644
>> --- a/include/uapi/linux/bpf.h
>> +++ b/include/uapi/linux/bpf.h
>> @@ -97,6 +97,7 @@ enum bpf_cmd {
>>   	BPF_RAW_TRACEPOINT_OPEN,
>>   	BPF_BTF_LOAD,
>>   	BPF_BTF_GET_FD_BY_ID,
>> +	BPF_TASK_FD_QUERY,
>>   };
>>   
>>   enum bpf_map_type {
>> @@ -379,6 +380,22 @@ union bpf_attr {
>>   		__u32		btf_log_size;
>>   		__u32		btf_log_level;
>>   	};
>> +
>> +	struct {
>> +		__u32		pid;		/* input: pid */
>> +		__u32		fd;		/* input: fd */
>> +		__u32		flags;		/* input: flags */
>> +		__u32		buf_len;	/* input/output: buf len */
>> +		__aligned_u64	buf;		/* input/output:
>> +						 *   tp_name for tracepoint
>> +						 *   symbol for kprobe
>> +						 *   filename for uprobe
>> +						 */
>> +		__u32		prog_id;	/* output: prod_id */
>> +		__u32		fd_type;	/* output: BPF_FD_TYPE_* */
>> +		__u64		probe_offset;	/* output: probe_offset */
>> +		__u64		probe_addr;	/* output: probe_addr */
>> +	} task_fd_query;
>>   } __attribute__((aligned(8)));
>>   
>>   /* The description below is an attempt at providing documentation to eBPF
>> @@ -2458,4 +2475,13 @@ struct bpf_fib_lookup {
>>   	__u8	dmac[6];     /* ETH_ALEN */
>>   };
>>   
>> +enum bpf_task_fd_type {
>> +	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
>> +	BPF_FD_TYPE_TRACEPOINT,		/* tp name */
>> +	BPF_FD_TYPE_KPROBE,		/* (symbol + offset) or addr */
>> +	BPF_FD_TYPE_KRETPROBE,		/* (symbol + offset) or addr */
>> +	BPF_FD_TYPE_UPROBE,		/* filename + offset */
>> +	BPF_FD_TYPE_URETPROBE,		/* filename + offset */
>> +};
>> +
>>   #endif /* _UAPI__LINUX_BPF_H__ */
>> diff --git a/kernel/bpf/syscall.c b/kernel/bpf/syscall.c
>> index 0b4c945..7dd8c86 100644
>> --- a/kernel/bpf/syscall.c
>> +++ b/kernel/bpf/syscall.c
>> @@ -18,7 +18,9 @@
>>   #include <linux/vmalloc.h>
>>   #include <linux/mmzone.h>
>>   #include <linux/anon_inodes.h>
>> +#include <linux/fdtable.h>
>>   #include <linux/file.h>
>> +#include <linux/fs.h>
>>   #include <linux/license.h>
>>   #include <linux/filter.h>
>>   #include <linux/version.h>
>> @@ -2102,6 +2104,116 @@ static int bpf_btf_get_fd_by_id(const union bpf_attr *attr)
>>   	return btf_get_fd_by_id(attr->btf_id);
>>   }
>>   
>> +static int bpf_task_fd_query_copy(const union bpf_attr *attr,
>> +				    union bpf_attr __user *uattr,
>> +				    u32 prog_id, u32 fd_type,
>> +				    const char *buf, u64 probe_offset,
>> +				    u64 probe_addr)
>> +{
>> +	void __user *ubuf = u64_to_user_ptr(attr->task_fd_query.buf);
>> +	u32 len = buf ? strlen(buf) + 1 : 0, input_len;
>> +	int err = 0;
>> +
>> +	if (put_user(len, &uattr->task_fd_query.buf_len))
>> +		return -EFAULT;
>> +	input_len = attr->task_fd_query.buf_len;
>> +	if (input_len && len && ubuf) {
> When len is 0 and input_len > 0, ubuf will not be touched (and
> so not null terminated).

This follows what we did for cgroup prog array query, when len (to be 
copied) is 0, nothing will be copied. But see below.

> 
> It may be helpful to note in uapi bpf.h that !output_buf_len has to be
> checked on top of checking the syscall return value.  It is reasonable for
> the userspace to assume that ubuf can be directly used with
> strlen()/printf()... as long as the syscall does not return -1/ENOSPC.
> I think the comment change could be done in a follow up patch.
> 
> or
> 
> always null terminate ubuf as long as input_len > 0
> and the output_buf_len should be strlen(buf) instead of
> strlen(buf) + 1 (i.e. exclude the null char in output_buf_len)
> such that the !buf case will have output_buf_len == 0.
> The user can depend on ENOSPC or input_buf_len <= output_buf_len
> to decide the truncated condition.  This convention should be
> closer to the snprintf() situation.


The second approach is better: user space may have limited room to
print, and we should not force users to massage the buffer further
before printing the string under ENOSPC.

I will respin the patch set. Thanks for the suggestion!
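
As a userspace model of that second convention (a sketch of the proposal as
described in this thread, not the kernel code): the reported length excludes
the NUL, the buffer is always terminated when input_len > 0, and truncation
is signalled with ENOSPC. The name query_copy() and its signature are
hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <string.h>

/*
 * Model of the snprintf()-like convention:
 *  - *out_len is strlen(src), excluding the NUL; 0 when src is NULL
 *  - dst is always NUL-terminated when input_len > 0
 *  - returns -ENOSPC when the string did not fit (input_len <= *out_len)
 */
static int query_copy(const char *src, char *dst, uint32_t input_len,
		      uint32_t *out_len)
{
	uint32_t len = src ? (uint32_t)strlen(src) : 0;

	*out_len = len;
	if (!dst || input_len == 0)
		return 0;	/* caller only asked for the length */

	if (input_len <= len) {
		/* Truncate, keeping room for the terminator. */
		memcpy(dst, src, input_len - 1);
		dst[input_len - 1] = '\0';
		return -ENOSPC;
	}

	if (len)
		memcpy(dst, src, len);
	dst[len] = '\0';	/* also covers the !src case */
	return 0;
}
```

A caller can then pass dst straight to printf() whenever input_len > 0, and
use either the ENOSPC return or input_len <= *out_len to detect truncation.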

> Other than that,
> 
> Acked-by: Martin KaFai Lau <kafai@fb.com>
> 

^ permalink raw reply

* Re: [PATCH 0/4] RFC CPSW switchdev mode
From: Ilias Apalodimas @ 2018-05-24 16:02 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: Ivan Vecera, Jiri Pirko, netdev, grygorii.strashko,
	ivan.khoronzhuk, nsekhar, francois.ozog, yogeshs, spatton
In-Reply-To: <20180524152559.GF5128@lunn.ch>

On Thu, May 24, 2018 at 05:25:59PM +0200, Andrew Lunn wrote:
> O.K, back to the basic idea. Switch ports are just normal Linux
> interfaces.
> 
> How would you configure this with two e1000e put in a bridge? I want
> multicast to be bridged between the two e1000e, but the host stack
> should not see the packets.
I am not sure I am following; I might be missing something. In your case you
have two Ethernet PCI/PCIe interfaces bridged in software. You can filter
those if needed. In the case we are trying to cover, you have hardware that
offers that capability. Since not all switches are PCIe based, shouldn't we be
able to allow this?

Regards
Ilias

^ permalink raw reply

* Re: [PATCH 1/6] ravb: remove custom .nway_reset from ethtool ops
From: Sergei Shtylyov @ 2018-05-24 16:18 UTC (permalink / raw)
  To: Vladimir Zapolskiy, Andrew Lunn, Vladimir Zapolskiy
  Cc: David S. Miller, netdev, linux-renesas-soc
In-Reply-To: <2bf96562-e402-d797-31e6-ba7e262e5637@mleia.com>

Hello!

On 05/24/2018 05:11 PM, Vladimir Zapolskiy wrote:

>>> The change fixes a sleep in atomic context issue, which can be
>>> always triggered by running 'ethtool -r' command, because
>>> phy_start_aneg() protects phydev fields by a mutex.

  You don't say that *not* grabbing the spinlock is safe... 

>>> Another note is that the change implicitly replaces phy_start_aneg()
>>> with a newer phy_restart_aneg().

   Hm, perhaps this could be material for a separate patch?

>>>
>>> Signed-off-by: Vladimir Zapolskiy <vladimir_zapolskiy@mentor.com>
>>> ---
>>>  drivers/net/ethernet/renesas/ravb_main.c | 17 +----------------
>>>  1 file changed, 1 insertion(+), 16 deletions(-)
>>>
>>> diff --git a/drivers/net/ethernet/renesas/ravb_main.c b/drivers/net/ethernet/renesas/ravb_main.c
>>> index 68f122140966..4a043eb0e2aa 100644
>>> --- a/drivers/net/ethernet/renesas/ravb_main.c
>>> +++ b/drivers/net/ethernet/renesas/ravb_main.c
>>> @@ -1150,21 +1150,6 @@ static int ravb_set_link_ksettings(struct net_device *ndev,
>>>  	return error;
>>>  }
>>>  
>>> -static int ravb_nway_reset(struct net_device *ndev)
>>> -{
>>> -	struct ravb_private *priv = netdev_priv(ndev);
>>> -	int error = -ENODEV;
>>> -	unsigned long flags;
>>> -
>>> -	if (ndev->phydev) {
>>> -		spin_lock_irqsave(&priv->lock, flags);
>>> -		error = phy_start_aneg(ndev->phydev);
>>> -		spin_unlock_irqrestore(&priv->lock, flags);
>>> -	}
>>
>> Eck! phylib assumes thread context and takes a mutex while calling
>> into the PHY driver.
>>
>> It would be good to add some sort of fixes: tag. Maybe for the commit
>> that added the generic nway_reset? That would at least cover some
>> stable kernels.
>>
>> Reviewed-by: Andrew Lunn <andrew@lunn.ch>
>>
> 
> Hi Andrew, thank you for review.
> 
> generally it makes sense to add Fixes tag, but as I said in
> the commit message the problem is present before reused phy_ethtool_*()
> functions were added to the kernel, so some kind of juggling with
> the proper kernel version would be required in assumption that
> the fixes are backported as an unmodified changes.

   The -stable fixes can vary from version to version, IIUC. You could be
asked to backport your patch if Greg KH (or somebody else from the -stable
kernel maintainers) has trouble backporting it.

> Hopefully Sergei as the driver maintainer can verify the fixes on

   I'm *not* a maintainer, just a humble reviewer! :-)

> older kernels and suggest the right kernel versions for backporting.

   This would be asking too much from me, I'm afraid...
   Still, Dave, could you please give me a couple of days to spend on
this series?

> --
> With best wishes,
> Vladimir

MBR, Sergei

^ permalink raw reply

* Re: [PATCH net-next] selftests: net: Test headroom handling of ip6_gre devices
From: William Tu @ 2018-05-24 16:19 UTC (permalink / raw)
  To: Petr Machata
  Cc: Linux Kernel Network Developers, linux-kselftest, David Miller,
	Shuah Khan
In-Reply-To: <a78543459df02997bc298c09e9aa56167b22d5a4.1527093523.git.petrm@mellanox.com>

Hi Petr,

I tried to test this patch on the latest net-next but encountered a couple of issues.

On Wed, May 23, 2018 at 9:41 AM, Petr Machata <petrm@mellanox.com> wrote:
> Commit 5691484df961 ("net: ip6_gre: Fix headroom request in
> ip6erspan_tunnel_xmit()") and commit 01b8d064d58b ("net: ip6_gre:
> Request headroom in __gre6_xmit()") fix problems in reserving headroom
> in the packets tunneled through ip6gre/tap and ip6erspan netdevices.
>
> These two patches included snippets that reproduced the issues. This
> patch elevates the snippets to a full-fledged test case.
>
> Suggested-by: David Miller <davem@davemloft.net>
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> ---
>  tools/testing/selftests/net/ip6_gre_headroom.sh | 59 +++++++++++++++++++++++++
>  1 file changed, 59 insertions(+)
>  create mode 100755 tools/testing/selftests/net/ip6_gre_headroom.sh
>
> diff --git a/tools/testing/selftests/net/ip6_gre_headroom.sh b/tools/testing/selftests/net/ip6_gre_headroom.sh
> new file mode 100755
> index 0000000..9aaf63fd
> --- /dev/null
> +++ b/tools/testing/selftests/net/ip6_gre_headroom.sh
> @@ -0,0 +1,59 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0
> +#
> +# Test that enough headroom is reserved for the first packet passing through an
> +# IPv6 GRE-like netdevice.
> +
> +setup_prepare()
> +{
> +       ip link add h1 type veth peer name swp1
> +       ip link add h3 type veth peer name swp3
> +
> +       ip link set dev h1 up
> +       ip address add 192.0.2.1/28 dev h1
> +
> +       ip link add dev vh3 type vrf table 20
> +       ip link set dev h3 master vh3
> +       ip link set dev vh3 up
> +       ip link set dev h3 up
> +
> +       ip link set dev swp3 up
> +       ip address add dev swp3 2001:db8:2::1/64
> +
> +       ip link set dev swp1 up
> +       tc qdisc add dev swp1 clsact
> +}
> +
> +cleanup()
> +{
> +       ip link del dev swp1
> +       ip link del dev swp3
> +       ip link del dev vh3
I think we also need to do:
ip link del dev gt6

> +}
> +
> +test_headroom()
> +{
> +       ip link add name gt6 "$@"
> +       ip link set dev gt6 up
> +
> +       sleep 1
> +
> +       tc filter add dev swp1 ingress pref 1000 matchall skip_hw \
> +               action mirred egress mirror dev gt6
> +       ping -I h1 192.0.2.2 -c 1 -w 2 &> /dev/null

I increased the ping count from 1 to 1000, and after a while the program
hangs when I try to Ctrl+C:
+ cleanup
+ ip link del dev swp1
dmesg shows:
....
[ 1256.002453] unregister_netdevice: waiting for swp1 to become free. Usage count = 9
[ 1266.082571] unregister_netdevice: waiting for swp1 to become free. Usage count = 9
[ 1276.163011] unregister_netdevice: waiting for swp1 to become free. Usage count = 9

Thanks
William

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox