Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH RFC 2/9] veth: Add driver XDP
From: Toshiaki Makita @ 2018-04-26 10:46 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Toshiaki Makita, netdev
In-Reply-To: <20180425223852.0be2fd67@brouer.com>

Hi Jesper,

Thanks for taking a look!

On 2018/04/26 5:39, Jesper Dangaard Brouer wrote:
> On Tue, 24 Apr 2018 23:39:16 +0900
> Toshiaki Makita <toshiaki.makita1@gmail.com> wrote:
> 
>> This is basic implementation of veth driver XDP.
>>
>> Incoming packets are sent from the peer veth device in the form of skb,
>> so this is generally doing the same thing as generic XDP.
> 
> I'm unsure that context you are calling veth_xdp_rcv_skb() from.  The
> XDP (RX side) depend heavily on the protection provided by NAPI context.
> It looks like you are adding NAPI handler later.  

This is called from softirq or bh disabled context.
I can see XDP REDIRECT depends on NAPI since it uses per-cpu temporary
storage which is used in ndo_xdp_flush. I thought DROP and PASS is safe
here. Also this is basically the same context as generic XDP, which is
called from netif_rx_internal.

Anyway this is a temporary state and not needed. It looks like this does
not help review so I'll squash this and patch 4 (napi patch).

-- 
Toshiaki Makita

^ permalink raw reply

* Re: [PATCH net-next 6/6] mlxsw: spectrum_span: Allow bridge for gretap mirror
From: Nikolay Aleksandrov @ 2018-04-26 10:50 UTC (permalink / raw)
  To: Ido Schimmel, netdev, bridge; +Cc: davem, stephen, jiri, petrm, mlxsw
In-Reply-To: <20180426090637.25262-7-idosch@mellanox.com>

On 26/04/18 12:06, Ido Schimmel wrote:
> From: Petr Machata <petrm@mellanox.com>
> 
> When handling mirroring to a gretap or ip6gretap netdevice in mlxsw, the
> underlay address (i.e. the remote address of the tunnel) may be routed
> to a bridge.
> 
> In that case, look up the resolved neighbor Ethernet address in that
> bridge's FDB. Then configure the offload to direct the mirrored traffic
> to that port, possibly with tagging.
> 
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> ---
>   .../net/ethernet/mellanox/mlxsw/spectrum_span.c    | 102 +++++++++++++++++++--
>   .../net/ethernet/mellanox/mlxsw/spectrum_span.h    |   1 +
>   2 files changed, 97 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
> index 65a77708ff61..92fb81839852 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
> @@ -32,6 +32,7 @@
>    * POSSIBILITY OF SUCH DAMAGE.
>    */
>   
> +#include <linux/if_bridge.h>
>   #include <linux/list.h>
>   #include <net/arp.h>
>   #include <net/gre.h>
> @@ -39,8 +40,9 @@
>   #include <net/ip6_tunnel.h>
>   
>   #include "spectrum.h"
> -#include "spectrum_span.h"
>   #include "spectrum_ipip.h"
> +#include "spectrum_span.h"
> +#include "spectrum_switchdev.h"
>   
>   int mlxsw_sp_span_init(struct mlxsw_sp *mlxsw_sp)
>   {
> @@ -167,6 +169,79 @@ mlxsw_sp_span_entry_unoffloadable(struct mlxsw_sp_span_parms *sparmsp)
>   	return 0;
>   }
>   
> +static struct net_device *
> +mlxsw_sp_span_entry_bridge_8021q(const struct net_device *br_dev,
> +				 unsigned char *dmac,
> +				 u16 *p_vid)
> +{
> +	struct net_bridge_vlan_group *vg = br_vlan_group_rtnl(br_dev);
> +	u16 pvid = br_vlan_group_pvid(vg);
> +	struct net_device *edev = NULL;
> +	struct net_bridge_vlan *v;
> +

Hi,
These structures are not really exported anywhere outside of br_private.h
Instead of passing them around and risking someone else actually trying to
dereference an incomplete type, why don't you add just 2 new helpers -
br_vlan_pvid(netdevice) and br_vlan_info(netdevice, vid, *bridge_vlan_info)

br_vlan_info can return the exported and already available type "bridge_vlan_info"
(defined in uapi/linux/if_bridge.h) into the bridge_vlan_info arg
and br_vlan_pvid is obvious - return the current dev pvid if available.

Another bonus you'll avoid dealing with the bridge's vlan internals.

> +	if (pvid)
> +		edev = br_fdb_find_port_hold(br_dev, dmac, pvid);
> +	if (!edev)
> +		return NULL;
> +
> +	/* RTNL prevents edev from being removed. */
> +	dev_put(edev);

Minor point here - since this is always called in rtnl, the dev_hold/put dance isn't needed,
unless you plan to use the fdb_find_port_hold outside of rtnl of course. In case that is
not planned why don't you do br_fdb_find_port_rtnl() instead with RCU inside (thus not blocking
learning) and ASSERT_RTNL() to avoid some complexity ?

Thanks,
  Nik

> +
> +	vg = br_port_vlan_group_rtnl(edev);
> +	v = br_vlan_find(vg, pvid);
> +	if (!v)
> +		return NULL;
> +	if (!(br_vlan_flags(v) & BRIDGE_VLAN_INFO_UNTAGGED))
> +		*p_vid = pvid;
> +	return edev;
> +}
> +
> +static struct net_device *
> +mlxsw_sp_span_entry_bridge_8021d(const struct net_device *br_dev,
> +				 unsigned char *dmac)
> +{
> +	struct net_device *edev = br_fdb_find_port_hold(br_dev, dmac, 0);
> +
> +	/* RTNL prevents edev from being removed. */
> +	dev_put(edev);
> +
> +	return edev;
> +}
> +
> +static struct net_device *
> +mlxsw_sp_span_entry_bridge(const struct net_device *br_dev,
> +			   unsigned char dmac[ETH_ALEN],
> +			   u16 *p_vid)
> +{
> +	struct mlxsw_sp_bridge_port *bridge_port;
> +	enum mlxsw_reg_spms_state spms_state;
> +	struct mlxsw_sp_port *port;
> +	struct net_device *dev;
> +	u8 stp_state;
> +
> +	if (br_vlan_enabled(br_dev))
> +		dev = mlxsw_sp_span_entry_bridge_8021q(br_dev, dmac, p_vid);
> +	else
> +		dev = mlxsw_sp_span_entry_bridge_8021d(br_dev, dmac);
> +	if (!dev)
> +		return NULL;
> +
> +	port = mlxsw_sp_port_dev_lower_find(dev);
> +	if (!port)
> +		return NULL;
> +
> +	bridge_port = mlxsw_sp_bridge_port_find(port->mlxsw_sp->bridge, dev);
> +	if (!bridge_port)
> +		return NULL;
> +
> +	stp_state = mlxsw_sp_bridge_port_stp_state(bridge_port);
> +	spms_state = mlxsw_sp_stp_spms_state(stp_state);
> +	if (spms_state != MLXSW_REG_SPMS_STATE_FORWARDING)
> +		return NULL;
> +
> +	return dev;
> +}
> +
>   static __maybe_unused int
>   mlxsw_sp_span_entry_tunnel_parms_common(struct net_device *l3edev,
>   					union mlxsw_sp_l3addr saddr,
> @@ -177,13 +252,22 @@ mlxsw_sp_span_entry_tunnel_parms_common(struct net_device *l3edev,
>   					struct mlxsw_sp_span_parms *sparmsp)
>   {
>   	unsigned char dmac[ETH_ALEN];
> +	u16 vid = 0;
>   
>   	if (mlxsw_sp_l3addr_is_zero(gw))
>   		gw = daddr;
>   
> -	if (!l3edev || !mlxsw_sp_port_dev_check(l3edev) ||
> -	    mlxsw_sp_span_dmac(tbl, &gw, l3edev, dmac))
> -		return mlxsw_sp_span_entry_unoffloadable(sparmsp);
> +	if (!l3edev || mlxsw_sp_span_dmac(tbl, &gw, l3edev, dmac))
> +		goto unoffloadable;
> +
> +	if (netif_is_bridge_master(l3edev)) {
> +		l3edev = mlxsw_sp_span_entry_bridge(l3edev, dmac, &vid);
> +		if (!l3edev)
> +			goto unoffloadable;
> +	}
> +
> +	if (!mlxsw_sp_port_dev_check(l3edev))
> +		goto unoffloadable;
>   
>   	sparmsp->dest_port = netdev_priv(l3edev);
>   	sparmsp->ttl = ttl;
> @@ -191,7 +275,11 @@ mlxsw_sp_span_entry_tunnel_parms_common(struct net_device *l3edev,
>   	memcpy(sparmsp->smac, l3edev->dev_addr, ETH_ALEN);
>   	sparmsp->saddr = saddr;
>   	sparmsp->daddr = daddr;
> +	sparmsp->vid = vid;
>   	return 0;
> +
> +unoffloadable:
> +	return mlxsw_sp_span_entry_unoffloadable(sparmsp);
>   }
>   
>   #if IS_ENABLED(CONFIG_NET_IPGRE)
> @@ -268,9 +356,10 @@ mlxsw_sp_span_entry_gretap4_configure(struct mlxsw_sp_span_entry *span_entry,
>   	/* Create a new port analayzer entry for local_port. */
>   	mlxsw_reg_mpat_pack(mpat_pl, pa_id, local_port, true,
>   			    MLXSW_REG_MPAT_SPAN_TYPE_REMOTE_ETH_L3);
> +	mlxsw_reg_mpat_eth_rspan_pack(mpat_pl, sparms.vid);
>   	mlxsw_reg_mpat_eth_rspan_l2_pack(mpat_pl,
>   				    MLXSW_REG_MPAT_ETH_RSPAN_VERSION_NO_HEADER,
> -				    sparms.dmac, false);
> +				    sparms.dmac, !!sparms.vid);
>   	mlxsw_reg_mpat_eth_rspan_l3_ipv4_pack(mpat_pl,
>   					      sparms.ttl, sparms.smac,
>   					      be32_to_cpu(sparms.saddr.addr4),
> @@ -368,9 +457,10 @@ mlxsw_sp_span_entry_gretap6_configure(struct mlxsw_sp_span_entry *span_entry,
>   	/* Create a new port analayzer entry for local_port. */
>   	mlxsw_reg_mpat_pack(mpat_pl, pa_id, local_port, true,
>   			    MLXSW_REG_MPAT_SPAN_TYPE_REMOTE_ETH_L3);
> +	mlxsw_reg_mpat_eth_rspan_pack(mpat_pl, sparms.vid);
>   	mlxsw_reg_mpat_eth_rspan_l2_pack(mpat_pl,
>   				    MLXSW_REG_MPAT_ETH_RSPAN_VERSION_NO_HEADER,
> -				    sparms.dmac, false);
> +				    sparms.dmac, !!sparms.vid);
>   	mlxsw_reg_mpat_eth_rspan_l3_ipv6_pack(mpat_pl, sparms.ttl, sparms.smac,
>   					      sparms.saddr.addr6,
>   					      sparms.daddr.addr6);
> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.h b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.h
> index 4b87ec20e658..14a6de904db1 100644
> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.h
> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.h
> @@ -63,6 +63,7 @@ struct mlxsw_sp_span_parms {
>   	unsigned char smac[ETH_ALEN];
>   	union mlxsw_sp_l3addr daddr;
>   	union mlxsw_sp_l3addr saddr;
> +	u16 vid;
>   };
>   
>   struct mlxsw_sp_span_entry_ops;
> 

^ permalink raw reply

* Re: [PATCH RFC 6/9] veth: Add ndo_xdp_xmit
From: Toshiaki Makita @ 2018-04-26 10:52 UTC (permalink / raw)
  To: Jesper Dangaard Brouer; +Cc: Toshiaki Makita, netdev
In-Reply-To: <20180425222452.1ea5c69f@redhat.com>

On 2018/04/26 5:24, Jesper Dangaard Brouer wrote:
> On Tue, 24 Apr 2018 23:39:20 +0900
> Toshiaki Makita <toshiaki.makita1@gmail.com> wrote:
> 
>> +static int veth_xdp_xmit(struct net_device *dev, struct xdp_frame *frame)
>> +{
>> +	struct veth_priv *rcv_priv, *priv = netdev_priv(dev);
>> +	int headroom = frame->data - (void *)frame;
>> +	struct net_device *rcv;
>> +	int err = 0;
>> +
>> +	rcv = rcu_dereference(priv->peer);
>> +	if (unlikely(!rcv))
>> +		return -ENXIO;
>> +
>> +	rcv_priv = netdev_priv(rcv);
>> +	/* xdp_ring is initialized on receive side? */
>> +	if (rcu_access_pointer(rcv_priv->xdp_prog)) {
>> +		err = xdp_ok_fwd_dev(rcv, frame->len);
>> +		if (unlikely(err))
>> +			return err;
>> +
>> +		err = veth_xdp_enqueue(rcv_priv, veth_xdp_to_ptr(frame));
>> +	} else {
>> +		struct sk_buff *skb;
>> +
>> +		skb = veth_build_skb(frame, headroom, frame->len, 0);
>> +		if (unlikely(!skb))
>> +			return -ENOMEM;
>> +
>> +		/* Get page ref in case skb is dropped in netif_rx.
>> +		 * The caller is responsible for freeing the page on error.
>> +		 */
>> +		get_page(virt_to_page(frame->data));
> 
> I'm not sure you can make this assumption, that xdp_frames coming from
> another device driver uses a refcnt based memory model. But maybe I'm
> confused, as this looks like an SKB receive path, but in the
> ndo_xdp_xmit().

I find this path similar to cpumap, which creates skb from redirected
xdp frame. Once it is converted to skb, skb head is freed by
page_frag_free, so anyway I needed to get the refcount here regardless
of memory model.

> 
> 
>> +		if (unlikely(veth_forward_skb(rcv, skb, false) != NET_RX_SUCCESS))
>> +			return -ENXIO;
>> +
>> +		/* Put page ref on success */
>> +		page_frag_free(frame->data);
>> +	}
>> +
>> +	return err;
>> +}
> 
> 

-- 
Toshiaki Makita

^ permalink raw reply

* Re: [PATCH net] sctp: handle two v4 addrs comparison in sctp_inet6_cmp_addr
From: Neil Horman @ 2018-04-26 11:25 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, linux-sctp, davem, Marcelo Ricardo Leitner,
	syzkaller
In-Reply-To: <17bfe46d7b9941f2283043f45ea5644c166c32c3.1524723237.git.lucien.xin@gmail.com>

On Thu, Apr 26, 2018 at 02:13:57PM +0800, Xin Long wrote:
> Since sctp ipv6 socket also supports v4 addrs, it's possible to
> compare two v4 addrs in pf v6 .cmp_addr, sctp_inet6_cmp_addr.
> 
> However after Commit 1071ec9d453a ("sctp: do not check port in
> sctp_inet6_cmp_addr"), it no longer calls af1->cmp_addr, which
> in this case is sctp_v4_cmp_addr, but calls __sctp_v6_cmp_addr
> where it handles them as two v6 addrs. It would cause a out of
> bounds crash.
> 
> syzbot found this crash when trying to bind two v4 addrs to a
> v6 socket.
> 
> This patch fixes it by adding the process for two v4 addrs in
> sctp_inet6_cmp_addr.
> 
> Fixes: 1071ec9d453a ("sctp: do not check port in sctp_inet6_cmp_addr")
> Reported-by: syzbot+cd494c1dd681d4d93ebb@syzkaller.appspotmail.com
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
>  net/sctp/ipv6.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
> index 2e3f7b7..4224711 100644
> --- a/net/sctp/ipv6.c
> +++ b/net/sctp/ipv6.c
> @@ -895,6 +895,9 @@ static int sctp_inet6_cmp_addr(const union sctp_addr *addr1,
>  	if (sctp_is_any(sk, addr1) || sctp_is_any(sk, addr2))
>  		return 1;
>  
> +	if (addr1->sa.sa_family == AF_INET && addr2->sa.sa_family == AF_INET)
> +		return addr1->v4.sin_addr.s_addr == addr2->v4.sin_addr.s_addr;
> +
>  	return __sctp_v6_cmp_addr(addr1, addr2);
>  }
>  
> -- 
> 2.1.0
> 
> 
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply

* Re: [PATCH net] sctp: clear the new asoc's stream outcnt in sctp_stream_update
From: Neil Horman @ 2018-04-26 11:44 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, linux-sctp, davem, Marcelo Ricardo Leitner
In-Reply-To: <7a1180e29789ab0aa339ae8b456a100520ffcdc5.1524727304.git.lucien.xin@gmail.com>

On Thu, Apr 26, 2018 at 03:21:44PM +0800, Xin Long wrote:
> When processing a duplicate cookie-echo chunk, sctp moves the new
> temp asoc's stream out/in into the old asoc, and later frees this
> new temp asoc.
> 
> But now after this move, the new temp asoc's stream->outcnt is not
> cleared while stream->out is set to NULL, which would cause a same
> crash as the one fixed in Commit 79d0895140e9 ("sctp: fix error
> path in sctp_stream_init") when freeing this asoc later.
> 
> This fix is to clear this outcnt in sctp_stream_update.
> 
> Fixes: f952be79cebd ("sctp: introduce struct sctp_stream_out_ext")
> Reported-by: Jianwen Ji <jiji@redhat.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>
> ---
>  net/sctp/stream.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/net/sctp/stream.c b/net/sctp/stream.c
> index f799043..f1f1d1b 100644
> --- a/net/sctp/stream.c
> +++ b/net/sctp/stream.c
> @@ -240,6 +240,8 @@ void sctp_stream_update(struct sctp_stream *stream, struct sctp_stream *new)
>  
>  	new->out = NULL;
>  	new->in  = NULL;
> +	new->outcnt = 0;
> +	new->incnt  = 0;
>  }
>  
>  static int sctp_send_reconf(struct sctp_association *asoc,
> -- 
> 2.1.0
> 
> 
Acked-by: Neil Horman <nhorman@tuxdriver.com>

^ permalink raw reply

* Re: WARNING: kobject bug in br_add_if
From: Nikolay Aleksandrov @ 2018-04-26 11:49 UTC (permalink / raw)
  To: Hangbin Liu, Dmitry Vyukov
  Cc: syzbot, bridge, David Miller, LKML, netdev, stephen hemminger,
	syzkaller-bugs, Greg Kroah-Hartman
In-Reply-To: <20180426103702.GI20683@leo.usersys.redhat.com>

On 26/04/18 13:37, Hangbin Liu wrote:
> On Thu, Apr 26, 2018 at 10:04:16AM +0200, Dmitry Vyukov wrote:
>> On Thu, Apr 26, 2018 at 8:13 AM, Hangbin Liu <liuhangbin@gmail.com> wrote:
>>> On Wed, Apr 11, 2018 at 05:18:23PM +0200, Dmitry Vyukov wrote:
>>>> On Wed, Apr 11, 2018 at 5:15 PM, syzbot
>>>> <syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com> wrote:
>>>>> kobject_add_internal failed for brport (error: -12 parent: bond0)
[snip]
> 
> Re-checked the error. This is a -ENOMEM. So normally we could ignore it.
> 
> But on the other hand, although we could find out the slave iface's
> master in netdev_master_upper_dev_link(). It already go much further
> and allocate some resource and change iface state. e.g.
> 
> [54273.968516] br0: port 1(em1) entered blocking state
> [54273.973979] br0: port 1(em1) entered disabled state
> 
> So I think we'd better return as early as possible. I will post a fix
> for this.
> 
> Thanks
> Hangbin

If I'm not mistaken the bridge allocated resources for the port are
cleaned on kobject_init_and_add() error return. Or are you talking
about some other resources ?

^ permalink raw reply

* Re: ip6-in-ip{4,6} ipsec tunnel issues with 1280 MTU
From: Paolo Abeni @ 2018-04-26 11:51 UTC (permalink / raw)
  To: Ashwanth Goli; +Cc: netdev, maloney, edumazet, David Ahern
In-Reply-To: <c3a1f2fc545b622bbb362e98313d7eba@codeaurora.org>

Hi,

[fixed CC list]

On Wed, 2018-04-25 at 21:43 +0530, Ashwanth Goli wrote:
> Hi Pablo,

Actually I'm Paolo, but yours is a recurring mistake ;)

> I am noticing an issue similar to the one reported by Alexis Perez 
> [Regression for ip6-in-ip4 IPsec tunnel in 4.14.16]
> 
> In my IPsec setup outer MTU is set to 1280, ip6_setup_cork sees an MTU 
> less than IPV6_MIN_MTU because of the tunnel headers. -EINVAL is being 
> returned as a result of the MTU check that got added with below patch.
> 
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git/commit/net/ipv6/ip6_output.c?h=v4.14.34&id=8278804e05f6bcfe3fdfea4a404020752ead15a6
> 
> Can we remove this MTU check since your recent patch [ipv6: the entire 
> IPv6 header chain must fit the first fragment] fixes a similar issue?

AFAICS, RFC 2473 implies we can have MTU below 1280 for tunnel devices
so we can probably relax the MTU check for such devices, but I think we
would still need it in the general case.

Cheers,

Paolo

^ permalink raw reply

* Re: WARNING: kobject bug in br_add_if
From: Nikolay Aleksandrov @ 2018-04-26 11:51 UTC (permalink / raw)
  To: Hangbin Liu, Dmitry Vyukov
  Cc: syzbot, bridge, David Miller, LKML, netdev, stephen hemminger,
	syzkaller-bugs, Greg Kroah-Hartman
In-Reply-To: <2170a49c-be84-1bf4-4c73-bf7a668e5288@cumulusnetworks.com>

On 26/04/18 14:49, Nikolay Aleksandrov wrote:
> On 26/04/18 13:37, Hangbin Liu wrote:
>> On Thu, Apr 26, 2018 at 10:04:16AM +0200, Dmitry Vyukov wrote:
>>> On Thu, Apr 26, 2018 at 8:13 AM, Hangbin Liu <liuhangbin@gmail.com> wrote:
>>>> On Wed, Apr 11, 2018 at 05:18:23PM +0200, Dmitry Vyukov wrote:
>>>>> On Wed, Apr 11, 2018 at 5:15 PM, syzbot
>>>>> <syzbot+de73361ee4971b6e6f75@syzkaller.appspotmail.com> wrote:
>>>>>> kobject_add_internal failed for brport (error: -12 parent: bond0)
> [snip]
>>
>> Re-checked the error. This is a -ENOMEM. So normally we could ignore it.
>>
>> But on the other hand, although we could find out the slave iface's
>> master in netdev_master_upper_dev_link(). It already go much further
>> and allocate some resource and change iface state. e.g.
>>
>> [54273.968516] br0: port 1(em1) entered blocking state
>> [54273.973979] br0: port 1(em1) entered disabled state
>>
>> So I think we'd better return as early as possible. I will post a fix
>> for this.
>>
>> Thanks
>> Hangbin
> 
> If I'm not mistaken the bridge allocated resources for the port are
> cleaned on kobject_init_and_add() error return. Or are you talking
> about some other resources ?
> 

Ah, my bad - you weren't talking about resource freeing.
Nevermind my comment.

^ permalink raw reply

* Re: [PATCH] connector: add parent pid and tgid to coredump and exit events
From: Stefan Strogin @ 2018-04-26 12:04 UTC (permalink / raw)
  To: Jesper Derehag, David Miller, zbr@ioremap.net
  Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	xe-linux-external@cisco.com, matt.helsley@gmail.com
In-Reply-To: <DB6P190MB0389513E240BD8ECEE575DB4DDBB0@DB6P190MB0389.EURP190.PROD.OUTLOOK.COM>

Hi David, Evgeniy,

Sorry to bother you, but could you please comment about the UAPI change and the patch?

Thanks, Jesper.

--
Stefan

On 05/04/18 12:07, Jesper Derehag wrote:
> Unless David comes back with something I have (also) missed regarding uapi breakage, this looks good to me.
> 
> /Jesper
> 
> ________________________________________
> Från: Stefan Strogin <sstrogin@cisco.com>
> Skickat: den 2 april 2018 17:18
> Till: David Miller
> Kopia: zbr@ioremap.net; netdev@vger.kernel.org; linux-kernel@vger.kernel.org; xe-linux-external@cisco.com; jderehag@hotmail.com; matt.helsley@gmail.com; minipli@googlemail.com
> Ämne: Re: [PATCH] connector: add parent pid and tgid to coredump and exit events
> 
> Hi David,
> 
> I don't see how it breaks UAPI. The point is that structures
> coredump_proc_event and exit_proc_event are members of *union*
> event_data, thus position of the existing data in the structure is
> unchanged. Furthermore, this change won't increase size of struct
> proc_event, because comm_proc_event (also a member of event_data) is
> of bigger size than the changed structures.
> 
> If I'm wrong, could you please explain what exactly will the change
> break in UAPI?
> 
> 
> On 30/03/18 19:59, David Miller wrote:
>> From: Stefan Strogin <sstrogin@cisco.com>
>> Date: Thu, 29 Mar 2018 17:12:47 +0300
>>
>>> diff --git a/include/uapi/linux/cn_proc.h b/include/uapi/linux/cn_proc.h
>>> index 68ff25414700..db210625cee8 100644
>>> --- a/include/uapi/linux/cn_proc.h
>>> +++ b/include/uapi/linux/cn_proc.h
>>> @@ -116,12 +116,16 @@ struct proc_event {
>>>              struct coredump_proc_event {
>>>                      __kernel_pid_t process_pid;
>>>                      __kernel_pid_t process_tgid;
>>> +                    __kernel_pid_t parent_pid;
>>> +                    __kernel_pid_t parent_tgid;
>>>              } coredump;
>>>
>>>              struct exit_proc_event {
>>>                      __kernel_pid_t process_pid;
>>>                      __kernel_pid_t process_tgid;
>>>                      __u32 exit_code, exit_signal;
>>> +                    __kernel_pid_t parent_pid;
>>> +                    __kernel_pid_t parent_tgid;
>>>              } exit;
>>>
>>>      } event_data;
>>
>> I don't think you can add these members without breaking UAPI.
>>
> 

^ permalink raw reply

* Re: [Patch nf] ipvs: initialize tbl->entries after allocation
From: Simon Horman @ 2018-04-26 12:14 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: Cong Wang, netdev, lvs-devel, netfilter-devel, Pablo Neira Ayuso
In-Reply-To: <alpine.LFD.2.20.1804240815120.2470@ja.home.ssi.bg>

On Tue, Apr 24, 2018 at 08:16:14AM +0300, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Mon, 23 Apr 2018, Cong Wang wrote:
> 
> > tbl->entries is not initialized after kmalloc(), therefore
> > causes an uninit-value warning in ip_vs_lblc_check_expire()
> > as reported by syzbot.
> > 
> > Reported-by: <syzbot+3dfdea57819073a04f21@syzkaller.appspotmail.com>
> > Cc: Simon Horman <horms@verge.net.au>
> > Cc: Julian Anastasov <ja@ssi.bg>
> > Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> 
> 	Thanks!
> 
> Acked-by: Julian Anastasov <ja@ssi.bg>

Thanks.

Pablo, could you take this into nf?

Acked-by: Simon Horman <horms@verge.net.au>

> 
> > ---
> >  net/netfilter/ipvs/ip_vs_lblcr.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/net/netfilter/ipvs/ip_vs_lblcr.c b/net/netfilter/ipvs/ip_vs_lblcr.c
> > index 92adc04557ed..bc2bc5eebcb8 100644
> > --- a/net/netfilter/ipvs/ip_vs_lblcr.c
> > +++ b/net/netfilter/ipvs/ip_vs_lblcr.c
> > @@ -534,6 +534,7 @@ static int ip_vs_lblcr_init_svc(struct ip_vs_service *svc)
> >  	tbl->counter = 1;
> >  	tbl->dead = false;
> >  	tbl->svc = svc;
> > +	atomic_set(&tbl->entries, 0);
> >  
> >  	/*
> >  	 *    Hook periodic timer for garbage collection
> > -- 
> > 2.13.0
> 
> Regards
> 
> --
> Julian Anastasov <ja@ssi.bg>
> 

^ permalink raw reply

* Re: [Patch nf] ipvs: initialize tbl->entries in ip_vs_lblc_init_svc()
From: Simon Horman @ 2018-04-26 12:14 UTC (permalink / raw)
  To: Julian Anastasov
  Cc: Cong Wang, netdev, lvs-devel, netfilter-devel, Pablo Neira Ayuso
In-Reply-To: <alpine.LFD.2.20.1804240816210.2470@ja.home.ssi.bg>

On Tue, Apr 24, 2018 at 08:17:06AM +0300, Julian Anastasov wrote:
> 
> 	Hello,
> 
> On Mon, 23 Apr 2018, Cong Wang wrote:
> 
> > Similarly, tbl->entries is not initialized after kmalloc(),
> > therefore causes an uninit-value warning in ip_vs_lblc_check_expire(),
> > as reported by syzbot.
> > 
> > Reported-by: <syzbot+3e9695f147fb529aa9bc@syzkaller.appspotmail.com>
> > Cc: Simon Horman <horms@verge.net.au>
> > Cc: Julian Anastasov <ja@ssi.bg>
> > Cc: Pablo Neira Ayuso <pablo@netfilter.org>
> > Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
> 
> 	Thanks!
> 
> Acked-by: Julian Anastasov <ja@ssi.bg>

Thanks. 

Pablo, could you take this into nf?

Acked-by: Simon Horman <horms@verge.net.au>

> 
> > ---
> >  net/netfilter/ipvs/ip_vs_lblc.c | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/net/netfilter/ipvs/ip_vs_lblc.c b/net/netfilter/ipvs/ip_vs_lblc.c
> > index 3057e453bf31..83918119ceb8 100644
> > --- a/net/netfilter/ipvs/ip_vs_lblc.c
> > +++ b/net/netfilter/ipvs/ip_vs_lblc.c
> > @@ -371,6 +371,7 @@ static int ip_vs_lblc_init_svc(struct ip_vs_service *svc)
> >  	tbl->counter = 1;
> >  	tbl->dead = false;
> >  	tbl->svc = svc;
> > +	atomic_set(&tbl->entries, 0);
> >  
> >  	/*
> >  	 *    Hook periodic timer for garbage collection
> > -- 
> > 2.13.0
> 
> Regards
> 
> --
> Julian Anastasov <ja@ssi.bg>
> 

^ permalink raw reply

* [PATCH net-next] net: Fix coccinelle warning
From: Kirill Tkhai @ 2018-04-26 12:18 UTC (permalink / raw)
  To: davem, netdev, lkp, ktkhai

kbuild test robot says:

  >coccinelle warnings: (new ones prefixed by >>)
  >>> net/core/dev.c:1588:2-3: Unneeded semicolon

So, let's remove it.

Reported-by: kbuild test robot <lkp@intel.com>
Signed-off-by: Kirill Tkhai <ktkhai@virtuozzo.com>
---
 net/core/dev.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/dev.c b/net/core/dev.c
index c624a04dad1f..72e9299cd1e6 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -1587,7 +1587,7 @@ const char *netdev_cmd_to_name(enum netdev_cmd cmd)
 	N(UDP_TUNNEL_DROP_INFO) N(CHANGE_TX_QUEUE_LEN)
 	N(CVLAN_FILTER_PUSH_INFO) N(CVLAN_FILTER_DROP_INFO)
 	N(SVLAN_FILTER_PUSH_INFO) N(SVLAN_FILTER_DROP_INFO)
-	};
+	}
 #undef N
 	return "UNKNOWN_NETDEV_EVENT";
 }

^ permalink raw reply related

* Re: [PATCH v2 net-next] lan78xx: Lan7801 Support for Fixed PHY
From: Andrew Lunn @ 2018-04-26 12:20 UTC (permalink / raw)
  To: RaghuramChary.Jallipalli
  Cc: f.fainelli, davem, netdev, UNGLinuxDriver, Woojung.Huh
In-Reply-To: <0573C9D4B793EF43BF95221F2F4CC85153E4BD@CHN-SV-EXMX06.mchp-main.com>

On Thu, Apr 26, 2018 at 06:27:57AM +0000, RaghuramChary.Jallipalli@microchip.com wrote:
> Hi Florian,
> 
> > > v0->v1:
> > >    * Remove driver version #define
> > 
> > This should be a separate patch of its own, more comment below.
> > 
> OK. Patch should be for net, correct?

net-next. As far as i can see, none of this is a fix.

	  Andrew

^ permalink raw reply

* Re: [PATCH net-next] microchipT1phy: Add driver for Microchip LAN87XX T1 PHYs
From: Andrew Lunn @ 2018-04-26 12:27 UTC (permalink / raw)
  To: Nisar.Sayed; +Cc: f.fainelli, davem, UNGLinuxDriver, netdev
In-Reply-To: <CE371C1263339941885964188A0225FA3B0CB2@CHN-SV-EXMX03.mchp-main.com>

> Fine, will change the filename.

> The reason for moving to separate file is that we have a series of
> T1 standard PHYs, which support cable diagnostics, signal quality
> indicator(SQI) and sleep and wakeup (TC10) support etc. we planned
> to keep all T1 standard PHYs separate to support additional features
> supported by these PHYs.

Is there anything shared with the other microchip PHYs? If there is
potential for code sharing, you should do it.

> > > + */
> > > +#ifndef _MICROCHIPT1PHY_H_
> > > +#define _MICROCHIPT1PHY_H_
> > > +
> > > +/* Interrupt Source Register */
> > > +#define LAN87XX_INTERRUPT_SOURCE                (0x18)
> > > +
> > > +/* Interrupt Mask Register */
> > > +#define LAN87XX_INTERRUPT_MASK                  (0x19)
> > > +#define LAN87XX_MASK_LINK_UP                    (0x0004)
> > > +#define LAN87XX_MASK_LINK_DOWN                  (0x0002)
> > 
> > What's the point of that header file if all definitions are consumed by the
> > same driver?
> > 
> 
> We have planned a series of patches where we planned to use this further.

Are you adding multiple files which share the header? If not, just add
the defines to the C code.

    Andrew

^ permalink raw reply

* [PATCH] ath10k: sdio: jump to correct label in error handling path
From: Niklas Cassel @ 2018-04-26 12:35 UTC (permalink / raw)
  To: Kalle Valo; +Cc: Niklas Cassel, ath10k, linux-wireless, netdev, linux-kernel

Jump to the correct label in error handling path.
At this point of execution create_singlethread_workqueue() has succeeded,
so it should be properly destroyed.

Jump label was renamed in commit ec2c64e20257 ("ath10k: sdio: fix memory
leak for probe allocations").
However, the bug was originally introduced in commit d96db25d2025
("ath10k: add initial SDIO support").

Fixes: d96db25d2025 ("ath10k: add initial SDIO support")
Signed-off-by: Niklas Cassel <niklas.cassel@linaro.org>
---
 drivers/net/wireless/ath/ath10k/sdio.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/wireless/ath/ath10k/sdio.c b/drivers/net/wireless/ath/ath10k/sdio.c
index 2d04c54a4153..d612ce8c9cff 100644
--- a/drivers/net/wireless/ath/ath10k/sdio.c
+++ b/drivers/net/wireless/ath/ath10k/sdio.c
@@ -2011,7 +2011,7 @@ static int ath10k_sdio_probe(struct sdio_func *func,
 		ret = -ENODEV;
 		ath10k_err(ar, "unsupported device id %u (0x%x)\n",
 			   dev_id_base, id->device);
-		goto err_core_destroy;
+		goto err_free_wq;
 	}
 
 	ar->id.vendor = id->vendor;
-- 
2.14.3

^ permalink raw reply related

* Re: [PATCH] net: phy: marvell: clear wol event before setting it
From: Andrew Lunn @ 2018-04-26 12:40 UTC (permalink / raw)
  To: Jisheng Zhang
  Cc: Bhadram Varka, Florian Fainelli, David S. Miller, netdev,
	linux-kernel, Jingju Hou
In-Reply-To: <20180426155619.2c5d87d1@xhacker.debian>

> hmm, so you want a "stick" WOL feature, I dunno whether Linux kernel
> requires WOL should be "stick".

I see two different cases:

Suspend/resume: The WoL state in the kernel is probably kept across
such a cycle. If so, you would expect another suspend/resume to also
work, without needs to reconfigure it.

Boot from powered off: If the interrupt just enables the power supply,
it is possible to wake up after a shutdown. There is no state kept, so
WoL will be disabled in the kernel. So WoL should also be disabled in
the hardware.

    Andrew

^ permalink raw reply

* Re: [dm-devel] [PATCH v5] fault-injection: introduce kvmalloc fallback options
From: Michal Hocko @ 2018-04-26 12:58 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: eric.dumazet, mst, netdev, Randy Dunlap, linux-kernel,
	Matthew Wilcox, virtualization, James Bottomley, linux-mm,
	dm-devel, Vlastimil Babka, David Rientjes, Andrew Morton,
	David Miller, edumazet
In-Reply-To: <alpine.LRH.2.02.1804251830540.25124@file01.intranet.prod.int.rdu2.redhat.com>

On Wed 25-04-18 18:42:57, Mikulas Patocka wrote:
> 
> 
> On Wed, 25 Apr 2018, James Bottomley wrote:
[...]
> > Kconfig proliferation, conversely, is a bit of a nightmare from both
> > the user and the tester's point of view, so we're trying to avoid it
> > unless absolutely necessary.
> > 
> > James
> 
> I already offered that we don't need to introduce a new kernel option and 
> we can bind this feature to any other kernel option, that is enabled in 
> the debug kernel, for example CONFIG_DEBUG_SG. Michal said no and he said 
> that he wants a new kernel option instead.

Just for the record. I didn't say I _want_ a config option. Do not
misinterpret my words. I've said that a config option would be
acceptable if there is no way to deliver the functionality via kernel
package automatically. You haven't provided any argument that would
explain why the kernel package cannot add a boot option. Maybe there are
some but I do not see them right now.
-- 
Michal Hocko
SUSE Labs

^ permalink raw reply

* [PATCH] rsi_91x_mac80211: fix structurally dead code
From: Gustavo A. R. Silva @ 2018-04-26 13:01 UTC (permalink / raw)
  To: Prameela Rani Garnepudi, Kalle Valo
  Cc: linux-wireless, netdev, linux-kernel, Gustavo A. R. Silva

Function rsi_hal_key_config returns before reaching code at line
922 if (status), hence this code is structurally dead.

Fix this by storing the value returned by rsi_hal_load_key
into _status_ for its further evaluation and use.

Addresses-Coverity-ID: 1468409 ("Structurally dead code")
Fixes: 4fd6c4762f37 ("rsi: roaming enhancements")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
---
 drivers/net/wireless/rsi/rsi_91x_mac80211.c | 16 ++++++++--------
 1 file changed, 8 insertions(+), 8 deletions(-)

diff --git a/drivers/net/wireless/rsi/rsi_91x_mac80211.c b/drivers/net/wireless/rsi/rsi_91x_mac80211.c
index 766d874..80e7f4f 100644
--- a/drivers/net/wireless/rsi/rsi_91x_mac80211.c
+++ b/drivers/net/wireless/rsi/rsi_91x_mac80211.c
@@ -911,14 +911,14 @@ static int rsi_hal_key_config(struct ieee80211_hw *hw,
 		}
 	}
 
-	return rsi_hal_load_key(adapter->priv,
-				key->key,
-				key->keylen,
-				key_type,
-				key->keyidx,
-				key->cipher,
-				sta_id,
-				vif);
+	status = rsi_hal_load_key(adapter->priv,
+				  key->key,
+				  key->keylen,
+				  key_type,
+				  key->keyidx,
+				  key->cipher,
+				  sta_id,
+				  vif);
 	if (status)
 		return status;
 
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH net-next 6/6] mlxsw: spectrum_span: Allow bridge for gretap mirror
From: Petr Machata @ 2018-04-26 13:03 UTC (permalink / raw)
  To: Nikolay Aleksandrov
  Cc: Ido Schimmel, netdev, bridge, davem, stephen, jiri, mlxsw
In-Reply-To: <1e3e120f-821a-2d99-b720-0585ce9c1803@cumulusnetworks.com>

Nikolay Aleksandrov <nikolay@cumulusnetworks.com> writes:

> On 26/04/18 12:06, Ido Schimmel wrote:
>> From: Petr Machata <petrm@mellanox.com>
>>
>> diff --git a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
>> index 65a77708ff61..92fb81839852 100644
>> --- a/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
>> +++ b/drivers/net/ethernet/mellanox/mlxsw/spectrum_span.c
>> @@ -167,6 +169,79 @@ mlxsw_sp_span_entry_unoffloadable(struct mlxsw_sp_span_parms *sparmsp)
>>   	return 0;
>>   }
>>   +static struct net_device *
>> +mlxsw_sp_span_entry_bridge_8021q(const struct net_device *br_dev,
>> +				 unsigned char *dmac,
>> +				 u16 *p_vid)
>> +{
>> +	struct net_bridge_vlan_group *vg = br_vlan_group_rtnl(br_dev);
>> +	u16 pvid = br_vlan_group_pvid(vg);
>> +	struct net_device *edev = NULL;
>> +	struct net_bridge_vlan *v;
>> +
>
> Hi,
> These structures are not really exported anywhere outside of br_private.h
> Instead of passing them around and risking someone else actually trying to
> dereference an incomplete type, why don't you add just 2 new helpers -
> br_vlan_pvid(netdevice) and br_vlan_info(netdevice, vid, *bridge_vlan_info)
>
> br_vlan_info can return the exported and already available type "bridge_vlan_info"
> (defined in uapi/linux/if_bridge.h) into the bridge_vlan_info arg
> and br_vlan_pvid is obvious - return the current dev pvid if available.
>
> Another bonus you'll avoid dealing with the bridge's vlan internals.

All right, I'll do it like that.

>
>> +	if (pvid)
>> +		edev = br_fdb_find_port_hold(br_dev, dmac, pvid);
>> +	if (!edev)
>> +		return NULL;
>> +
>> +	/* RTNL prevents edev from being removed. */
>> +	dev_put(edev);
>
> Minor point here - since this is always called in rtnl, the dev_hold/put dance isn't needed,
> unless you plan to use the fdb_find_port_hold outside of rtnl of course. In case that is
> not planned why don't you do br_fdb_find_port_rtnl() instead with RCU inside (thus not blocking
> learning) and ASSERT_RTNL() to avoid some complexity ?

OK, sounds good.

I'll spin a v2.

Thanks,
Petr

^ permalink raw reply

* [PATCH] rsi_91x_usb: fix uninitialized variable
From: Gustavo A. R. Silva @ 2018-04-26 13:13 UTC (permalink / raw)
  To: Amitkumar Karwar, Kalle Valo
  Cc: linux-wireless, netdev, linux-kernel, Gustavo A. R. Silva

There is a potential execution path in which variable ret is returned
without being properly initialized previously.

Fix this by storing the value returned by function
rsi_usb_master_reg_write into _ret_.

Addresses-Coverity-ID: 1468407 ("Uninitialized scalar variable")
Fixes: 16d3bb7b2f37 ("rsi: disable fw watchdog timer during reset")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
---
 drivers/net/wireless/rsi/rsi_91x_usb.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/wireless/rsi/rsi_91x_usb.c b/drivers/net/wireless/rsi/rsi_91x_usb.c
index b065438..6ce6b75 100644
--- a/drivers/net/wireless/rsi/rsi_91x_usb.c
+++ b/drivers/net/wireless/rsi/rsi_91x_usb.c
@@ -687,9 +687,10 @@ static int rsi_reset_card(struct rsi_hw *adapter)
 	 */
 	msleep(100);
 
-	if (rsi_usb_master_reg_write(adapter, SWBL_REGOUT,
-				     RSI_FW_WDT_DISABLE_REQ,
-				     RSI_COMMON_REG_SIZE) < 0) {
+	ret = rsi_usb_master_reg_write(adapter, SWBL_REGOUT,
+				       RSI_FW_WDT_DISABLE_REQ,
+				       RSI_COMMON_REG_SIZE);
+	if (ret < 0) {
 		rsi_dbg(ERR_ZONE, "Disabling firmware watchdog timer failed\n");
 		goto fail;
 	}
-- 
2.7.4

^ permalink raw reply related

* [PATCH net-next] net: sch: prio: Set bands to default on delete instead of noop
From: Nogah Frankel @ 2018-04-26 13:32 UTC (permalink / raw)
  To: netdev; +Cc: davem, jiri, jhs, xiyou.wangcong, mlxsw, Nogah Frankel

When a band is created, it is set to the default qdisc, which is
"invisible" pfifo.
However, if a band is set to a qdisc that is later being deleted, it will
be set to noop qdisc. This can cause a packet loss, while there is no clear
user indication for it. ("invisible" qdisc are not being shown by default).
This patch sets a band to the default qdisc, rather then the noop qdisc, on
delete operation.

Signed-off-by: Nogah Frankel <nogahf@mellanox.com>
---
 net/sched/sch_prio.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

diff --git a/net/sched/sch_prio.c b/net/sched/sch_prio.c
index 222e53d..6862d23 100644
--- a/net/sched/sch_prio.c
+++ b/net/sched/sch_prio.c
@@ -314,8 +314,15 @@ static int prio_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new,
 	bool any_qdisc_is_offloaded;
 	int err;
 
-	if (new == NULL)
-		new = &noop_qdisc;
+	if (!new) {
+		new = qdisc_create_dflt(sch->dev_queue, &pfifo_qdisc_ops,
+					TC_H_MAKE(sch->handle, band + 1),
+					extack);
+		if (!new)
+			new = &noop_qdisc;
+		else
+			qdisc_hash_add(new, true);
+	}
 
 	*old = qdisc_replace(sch, new, &q->queues[band]);
 
@@ -332,7 +339,7 @@ static int prio_graft(struct Qdisc *sch, unsigned long arg, struct Qdisc *new,
 					    &graft_offload);
 
 	/* Don't report error if the graft is part of destroy operation. */
-	if (err && new != &noop_qdisc) {
+	if (err && new->handle) {
 		/* Don't report error if the parent, the old child and the new
 		 * one are not offloaded.
 		 */
-- 
2.4.11

^ permalink raw reply related

* i.MX6S/DL and QCA8334 switch using DSA driver - CPU port not working
From: Michal Vokáč @ 2018-04-26 13:37 UTC (permalink / raw)
  To: netdev
  Cc: Andrew Lunn <andrew@lunn.ch>; Vivien Didelot <vivien.didelot@savoirfairelinux.com>; Florian Fainelli
In-Reply-To: <037faf3c-8e8f-a696-8312-d1380c3b8656@gmail.com>

Ahoj,

Sorry for bothering you guys but I could not come up with a more appropriate
place to ask my question.

I would very much appreciate some adivce to the DSA/QCA8K driver usage.
I am not (yet) very educated in networking internals so my wording may not be
right and my expectations may be wrong.

I am working on custom boards based on i.MX6S and i.MX6DL with QCA8334 four
port switch chip. The switch is connected to CPU using MDIO and RGMII
interface.
   _____                 _____________
  |     |               |   QCA8334   |
  |     |  RGMII        |        port2| eth2 <---> LAN (DHCP server)
  | CPU | <------> eth0 |port0        |
  |     |   MDIO        |        port3| eth1 <---> Some device
  |_____| <------>      |_____________|

HW issue is out of question as we are using the HW for a long time with
older kernel and a custom PHY driver for the switch.

I am now in a process of upgrading the kernel from 3.14 to 4.9 and I would
like to get rid of our ugly driver and utilize the mainline code.

I went through a lot of documentation/slides/video regarding the DSA subsystem.
I am still having a hard time to make the switch work as I expect with DSA and
qca8k driver.

I will describe my problem in three steps:

  - 1. The old version - this is what we were using so far.
  - 2. Current state - this is what I was able to come up with.
  - 3. Questions

------------------
1. The old version
------------------

  - Linux 3.14 (Freescale 3.14-1.1.x-imx branch)
  - Custom PHY driver for the switch chip

Device tree snippet:

&fec {
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_enet_4>;
	phy-mode = "rgmii";
	phy-reset-gpios = <&gpio1 25 0>;
	phy-reset-duration = <0x14>;
	phy-supply = <&sw2_reg>;
	phy-handle = <&ethphy0>;
	status = "okay";

	mdio {
		#address-cells = <1>;
		#size-cells = <0>;
		ethphy0: ethernet-phy@0 {
			compatible = "ethernet-phy-id004d.d036";
			reg = <0>;
			max-speed = <1000>;
		};
	};
};

In my old setup I see only one ethernet interface.
The driver leaves most internal registers of the switch in its default state.
The major change is that forwarding is enabled not just across all the user
ports (HW default) but to the CPU port as well.

	qca8k_write(priv, 0x0624, 0x007F7F7F);

Regardless of state of the eth0 interface, communication is possible through
the switch. Device connected to port3 can get IP address from local DHCP server.
This is what you get by default without touching a single bit of the switch
internal registers.

Then when I bring up the eth0 interface also the CPU itself can get IP address
from local DHCP server and communicate to LAN or with the device.

----------------
2. Current state
----------------

  - Linux 4.9.84 (Freescale 4.9-1.0.x-imx branch)
  - Mainline DSA drivers
  - Mainline qca8k driver

The Freescale branch does not introduce any changes to the DSA nor to the QCA8K
drivers from mainline.

The QCA8K driver currently supports only the seven-port variant QCA8337.
AFAIK the only difference between the QCA8337 and QCA8334 we use is the number
of ports - 7 vs 4 respectively. So I added "qca,qca8334" compatible string to
the qca8k_of_match table in qca8k.c driver.

These corresponding configs are enabled:

  CONFIG_HAVE_NET_DSA=y
  CONFIG_NET_DSA=y
  CONFIG_NET_DSA_TAG_QCA=y
  CONFIG_NET_DSA_QCA8K=y

Device tree snippet:

&fec {
	pinctrl-names = "default";
	pinctrl-0 = <&pinctrl_enet_4>;
	phy-mode = "rgmii";
	phy-reset-gpios = <&gpio1 25 0>;
	phy-reset-duration = <0x14>;
	phy-supply = <&sw2_reg>;
	phy-handle = <&ethphy0>;
	status = "okay";

	mdio {
		#address-cells = <1>;
		#size-cells = <0>;

		phy_port2: phy@1 {
		    reg = <1>;
		};

		phy_port3: phy@2 {
		    reg = <2>;
		};

		switch@0 {
			compatible = "qca,qca8334";
			#address-cells = <1>;
			#size-cells = <0>;
			reg = <0>;

			ports {
				#address-cells = <1>;
				#size-cells = <0>;
				ethphy0: port@0 {
					reg = <0>;
					label = "cpu";
					phy-mode = "rgmii";
					ethernet = <&fec>;
					fixed-link {
						speed = <1000>;
						full-duplex;
					};
				};
				ethphy2: port@2 {
					reg = <2>;
					label = "eth2";
					phy-handle = <&phy_port2>;
				};

				ethphy3: port@3 {
					reg = <3>;
					label = "eth1";
					phy-handle = <&phy_port3>;
				};
			};
		};
	};
};

  # dmesg
  [    1.954903] libphy: fec_enet_mii_bus: probed
  [    1.955457] mdio_bus 2188000.ethernet-1:00: mdio_device_register
  [    1.955763] dsa2: enabling port: 2
  [    1.955770] dsa2: enabling port: 3
  [    1.956493] fec 2188000.ethernet eth0: registered PHC device 0
  ...
  [    2.920810] dsa2: enabling port: 2
  [    2.920817] dsa2: enabling port: 3
  [    2.920878] DSA: switch 0 0 parsed
  [    2.920884] DSA: tree 0 parsed
  [    2.927628] libphy: dsa slave smi: probed
  [    2.928161] Generic PHY 2188000.ethernet-1:01: attached PHY driver [Generic PHY] (mii_bus:phy_addr=2188000.ethernet-1:01, irq=-1)
  [    2.929326] Generic PHY dsa-0.0:02: attached PHY driver [Generic PHY] (mii_bus:phy_addr=dsa-0.0:02, irq=-1)
  
  # ip a
  2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
       link/ether 68:7c:d5:04:01:5a brd ff:ff:ff:ff:ff:ff
  3: eth2@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
       link/ether 68:7c:d5:04:01:5a brd ff:ff:ff:ff:ff:ff
  4: eth1@eth0: <BROADCAST,MULTICAST,M-DOWN> mtu 1500 qdisc noop state DOWN group default qlen 1000
       link/ether 68:7c:d5:04:01:5a brd ff:ff:ff:ff:ff:ff

At this point I am able to bring up all three interfaces, create a bridge, add
eth1 and eth2 to the bridge and bring up the bridge. But the bridge does not
work. No communication is possible from one port to the other.

To make the bridge work I need to enable forwarding across all the switch ports
at setup.

--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -578,12 +578,12 @@ qca8k_setup(struct dsa_switch *ds)
    		if (ds->enabled_port_mask & BIT(i))
    			qca8k_port_set_status(priv, i, 0);
    
-	/* Forward all unknown frames to CPU port for Linux processing */
+	/* Forward all unknown frames to all pors */
    	qca8k_write(priv, QCA8K_REG_GLOBAL_FW_CTRL1,
    		    BIT(0) << QCA8K_GLOBAL_FW_CTRL1_IGMP_DP_S |
-		    BIT(0) << QCA8K_GLOBAL_FW_CTRL1_BC_DP_S |
-		    BIT(0) << QCA8K_GLOBAL_FW_CTRL1_MC_DP_S |
-		    BIT(0) << QCA8K_GLOBAL_FW_CTRL1_UC_DP_S);
+		    0x7f << QCA8K_GLOBAL_FW_CTRL1_BC_DP_S |
+		    0x7f << QCA8K_GLOBAL_FW_CTRL1_MC_DP_S |
+		    0x7f << QCA8K_GLOBAL_FW_CTRL1_UC_DP_S);
    
    	/* Setup connection between CPU port & user ports */
    	for (i = 0; i < DSA_MAX_PORTS; i++) {
--

Then when I do:

  # ifconfig eth0 up
  # ifconfig eth1 up
  # ifconfig eth2 up
  # brctl addbr br0
  # brctl addif eth1
  # brctl addif eth2
  # ifconfig br0 up

the bridge works fine.
But I am still not able to make work the CPU port though.

  # udhcpc -i eth2
  Sending discover...
  [FOREVER]

The same for eth1, eth2 and br0.

I suspect the problem may be at different levels:

  - The RGMII interface is not properly configured
   -- at the CPU side, or
   -- at the switch chip side.
  - Some setup that I have not done needs to be done (in userspace).

------------
3. Questions
------------

Q: What is the correct interface that the CPU should use to talk to the outside
world? I am convinced any of the eth1, eth2, br0 interfaces can be used.

Q: Is there something else that needs to be done to get networking going on the
CPU interface?

Q: Is my device tree correct?

Q: Lets assume the CPU port is working. Is NFS boot of the CPU possible from at
least one port of the switch?

Any hints how to deal with that will be much appreciated.
BR, Michal

^ permalink raw reply

* Re: [PATCH v2 net-next 1/2] tcp: add TCP_ZEROCOPY_RECEIVE support for zerocopy receive
From: Ka-Cheong Poon @ 2018-04-26 13:40 UTC (permalink / raw)
  To: Eric Dumazet, David S . Miller
  Cc: netdev, Andy Lutomirski, linux-kernel, linux-mm, Eric Dumazet,
	Soheil Hassas Yeganeh
In-Reply-To: <20180425214307.159264-2-edumazet@google.com>

On 04/26/2018 05:43 AM, Eric Dumazet wrote:
> When adding tcp mmap() implementation, I forgot that socket lock
> had to be taken before current->mm->mmap_sem. syzbot eventually caught
> the bug.
> 
> Since we can not lock the socket in tcp mmap() handler we have to
> split the operation in two phases.
> 
> 1) mmap() on a tcp socket simply reserves VMA space, and nothing else.
>    This operation does not involve any TCP locking.
> 
> 2) setsockopt(fd, IPPROTO_TCP, TCP_ZEROCOPY_RECEIVE, ...) implements
>   the transfert of pages from skbs to one VMA.
>    This operation only uses down_read(&current->mm->mmap_sem) after
>    holding TCP lock, thus solving the lockdep issue.


A quick question.  Is it a normal practice to return a result
in setsockopt() given that the optval parameter is supposed to
be a const void *?




-- 
K. Poon
ka-cheong.poon@oracle.com

^ permalink raw reply

* [RFC PATCH] bpf: bpf_push_seg6_encap() can be static
From: kbuild test robot @ 2018-04-26 13:45 UTC (permalink / raw)
  To: Mathieu Xhonneux; +Cc: kbuild-all, netdev, dlebrun, alexei.starovoitov
In-Reply-To: <1d9d533b5d2640fe958d599ac0944132c3b7d61b.1524591163.git.m.xhonneux@gmail.com>


Fixes: 14ab8554dc46 ("bpf: Add IPv6 Segment Routing helpers")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 filter.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/core/filter.c b/net/core/filter.c
index 8e67c42..4d1204f 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -3725,7 +3725,7 @@ static const struct bpf_func_proto bpf_bind_proto = {
 	.arg3_type	= ARG_CONST_SIZE,
 };
 
-int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len)
+static int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len)
 {
 	int err;
 	struct ipv6_sr_hdr *srh = (struct ipv6_sr_hdr *)hdr;

^ permalink raw reply related

* Re: [PATCH net-next v2 2/5] bpf: Add IPv6 Segment Routing helpers
From: kbuild test robot @ 2018-04-26 13:45 UTC (permalink / raw)
  To: Mathieu Xhonneux; +Cc: kbuild-all, netdev, dlebrun, alexei.starovoitov
In-Reply-To: <1d9d533b5d2640fe958d599ac0944132c3b7d61b.1524591163.git.m.xhonneux@gmail.com>

Hi Mathieu,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net-next/master]

url:    https://github.com/0day-ci/linux/commits/Mathieu-Xhonneux/ipv6-sr-introduce-seg6local-End-BPF-action/20180426-082209
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   net/core/filter.c:110:48: sparse: expression using sizeof(void)
   net/core/filter.c:110:48: sparse: expression using sizeof(void)
   net/core/filter.c:323:33: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:326:33: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:329:33: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:332:33: sparse: subtraction of functions? Share your drugs
   net/core/filter.c:335:33: sparse: subtraction of functions? Share your drugs
   include/linux/filter.h:612:16: sparse: expression using sizeof(void)
   include/linux/filter.h:612:16: sparse: expression using sizeof(void)
   include/linux/filter.h:612:16: sparse: expression using sizeof(void)
   include/linux/filter.h:612:16: sparse: expression using sizeof(void)
   net/core/filter.c:1189:39: sparse: incorrect type in argument 1 (different address spaces) @@    expected struct sock_filter const *filter @@    got struct sockstruct sock_filter const *filter @@
   net/core/filter.c:1189:39:    expected struct sock_filter const *filter
   net/core/filter.c:1189:39:    got struct sock_filter [noderef] <asn:1>*filter
   include/linux/filter.h:612:16: sparse: expression using sizeof(void)
   include/linux/filter.h:612:16: sparse: expression using sizeof(void)
   net/core/filter.c:1291:39: sparse: incorrect type in argument 1 (different address spaces) @@    expected struct sock_filter const *filter @@    got struct sockstruct sock_filter const *filter @@
   net/core/filter.c:1291:39:    expected struct sock_filter const *filter
   net/core/filter.c:1291:39:    got struct sock_filter [noderef] <asn:1>*filter
   include/linux/filter.h:612:16: sparse: expression using sizeof(void)
   net/core/filter.c:1552:43: sparse: incorrect type in argument 2 (different base types) @@    expected restricted __wsum [usertype] diff @@    got unsigned lonrestricted __wsum [usertype] diff @@
   net/core/filter.c:1552:43:    expected restricted __wsum [usertype] diff
   net/core/filter.c:1552:43:    got unsigned long long [unsigned] [usertype] to
   net/core/filter.c:1555:36: sparse: incorrect type in argument 2 (different base types) @@    expected restricted __be16 [usertype] old @@    got unsigned lonrestricted __be16 [usertype] old @@
   net/core/filter.c:1555:36:    expected restricted __be16 [usertype] old
   net/core/filter.c:1555:36:    got unsigned long long [unsigned] [usertype] from
   net/core/filter.c:1555:42: sparse: incorrect type in argument 3 (different base types) @@    expected restricted __be16 [usertype] new @@    got unsigned lonrestricted __be16 [usertype] new @@
   net/core/filter.c:1555:42:    expected restricted __be16 [usertype] new
   net/core/filter.c:1555:42:    got unsigned long long [unsigned] [usertype] to
   net/core/filter.c:1558:36: sparse: incorrect type in argument 2 (different base types) @@    expected restricted __be32 [usertype] from @@    got unsigned lonrestricted __be32 [usertype] from @@
   net/core/filter.c:1558:36:    expected restricted __be32 [usertype] from
   net/core/filter.c:1558:36:    got unsigned long long [unsigned] [usertype] from
   net/core/filter.c:1558:42: sparse: incorrect type in argument 3 (different base types) @@    expected restricted __be32 [usertype] to @@    got unsigned lonrestricted __be32 [usertype] to @@
   net/core/filter.c:1558:42:    expected restricted __be32 [usertype] to
   net/core/filter.c:1558:42:    got unsigned long long [unsigned] [usertype] to
   net/core/filter.c:1603:59: sparse: incorrect type in argument 3 (different base types) @@    expected restricted __wsum [usertype] diff @@    got unsigned lonrestricted __wsum [usertype] diff @@
   net/core/filter.c:1603:59:    expected restricted __wsum [usertype] diff
   net/core/filter.c:1603:59:    got unsigned long long [unsigned] [usertype] to
   net/core/filter.c:1606:52: sparse: incorrect type in argument 3 (different base types) @@    expected restricted __be16 [usertype] from @@    got unsigned lonrestricted __be16 [usertype] from @@
   net/core/filter.c:1606:52:    expected restricted __be16 [usertype] from
   net/core/filter.c:1606:52:    got unsigned long long [unsigned] [usertype] from
   net/core/filter.c:1606:58: sparse: incorrect type in argument 4 (different base types) @@    expected restricted __be16 [usertype] to @@    got unsigned lonrestricted __be16 [usertype] to @@
   net/core/filter.c:1606:58:    expected restricted __be16 [usertype] to
   net/core/filter.c:1606:58:    got unsigned long long [unsigned] [usertype] to
   net/core/filter.c:1609:52: sparse: incorrect type in argument 3 (different base types) @@    expected restricted __be32 [usertype] from @@    got unsigned lonrestricted __be32 [usertype] from @@
   net/core/filter.c:1609:52:    expected restricted __be32 [usertype] from
   net/core/filter.c:1609:52:    got unsigned long long [unsigned] [usertype] from
   net/core/filter.c:1609:58: sparse: incorrect type in argument 4 (different base types) @@    expected restricted __be32 [usertype] to @@    got unsigned lonrestricted __be32 [usertype] to @@
   net/core/filter.c:1609:58:    expected restricted __be32 [usertype] to
   net/core/filter.c:1609:58:    got unsigned long long [unsigned] [usertype] to
   net/core/filter.c:1655:28: sparse: incorrect type in return expression (different base types) @@    expected unsigned long long @@    got nsigned long long @@
   net/core/filter.c:1655:28:    expected unsigned long long
   net/core/filter.c:1655:28:    got restricted __wsum
   net/core/filter.c:1677:35: sparse: incorrect type in return expression (different base types) @@    expected unsigned long long @@    got restricted unsigned long long @@
   net/core/filter.c:1677:35:    expected unsigned long long
   net/core/filter.c:1677:35:    got restricted __wsum [usertype] csum
   net/core/filter.c:3462:41: sparse: expression using sizeof(void)
   net/core/filter.c:3466:41: sparse: expression using sizeof(void)
   net/core/filter.c:3470:46: sparse: expression using sizeof(void)
   net/core/filter.c:3470:46: sparse: expression using sizeof(void)
   net/core/filter.c:3538:47: sparse: expression using sizeof(void)
>> net/core/filter.c:3728:5: sparse: symbol 'bpf_push_seg6_encap' was not declared. Should it be static?

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox