Netdev List


* Re: [PATCH bpf-next 0/4] bpf: fixes for lockdep and deadlock
From: Alexei Starovoitov @ 2019-01-30  4:07 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S. Miller, Daniel Borkmann, Peter Zijlstra, Eric Dumazet,
	Jann Horn, Network Development, Kernel Team
In-Reply-To: <20190130040458.2544340-1-ast@kernel.org>

On Tue, Jan 29, 2019 at 8:06 PM Alexei Starovoitov <ast@kernel.org> wrote:
>
> In addition to preempt_disable patch for socket filters
> https://patchwork.ozlabs.org/patch/1032437/
> the first three patches fix various lockdep false positives.
> Last patch fixes potential deadlock in stackmap access from
> tracing bpf prog and from syscall.

Typo in subject.
All patches are for 'bpf' tree.


* Re: [RFC 03/14] net: hstats: add basic/core functionality
From: David Ahern @ 2019-01-30  4:18 UTC (permalink / raw)
  To: Jakub Kicinski, davem
  Cc: oss-drivers, netdev, jiri, f.fainelli, andrew, mkubecek,
	simon.horman, jesse.brandeburg, maciejromanfijalkowski,
	vasundhara-v.volam, michael.chan, shalomt, idosch
In-Reply-To: <20190128234507.32028-4-jakub.kicinski@netronome.com>

On 1/28/19 4:44 PM, Jakub Kicinski wrote:
> @@ -4946,6 +4964,9 @@ static size_t if_nlmsg_stats_size(const struct net_device *dev,
>  		rcu_read_unlock();
>  	}
>  
> +	if (stats_attr_valid(filter_mask, IFLA_STATS_LINK_HSTATS, 0))

filter_mask is populated by RTEXT_FILTER_ from
include/uapi/linux/rtnetlink.h

> +		size += rtnl_get_link_hstats_size(dev);

rtnl_get_link_hstats_size() boils down to __rtnl_get_link_hstats(), which
can return < 0, so its result should be checked before it is added to size.

> +
>  	return size;
>  }
>  
> 

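To illustrate the concern above about a size helper that can return a negative value: the following is a minimal user-space sketch, not kernel code. hstats_size_or_err(), total_size_unchecked() and total_size_checked() are hypothetical names invented for this example, and the values 64 and -22 are arbitrary. It shows that adding a negative int to a size_t does not fail loudly; the error is folded into the total, which either shrinks or wraps.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a size helper that may fail with a negative code. */
static int hstats_size_or_err(int fail)
{
	return fail ? -22 : 64;
}

/* Unchecked accumulation: a negative return is converted to size_t. */
static size_t total_size_unchecked(size_t base, int fail)
{
	size_t size = base;

	size += hstats_size_or_err(fail);
	return size;
}

/* Checked accumulation: propagate the error instead of adding it. */
static int total_size_checked(size_t base, int fail, size_t *out)
{
	int extra = hstats_size_or_err(fail);

	assert(out != NULL);
	if (extra < 0)
		return extra;

	*out = base + (size_t)extra;
	return 0;
}
```

With a base of 100 the unchecked variant silently yields a total smaller than the base, and with a base of 10 it wraps to a very large value. The checked variant returns the error to the caller, which is one possible way to address the comment; the actual fix is up to the patch author.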


* Re: [PATCH iproute2-next] Introduce ip-brctl shell script
From: David Ahern @ 2019-01-30  4:51 UTC (permalink / raw)
  To: Stefano Brivio
  Cc: Phil Sutter, Eric Garver, Tomas Dolezal, Stephen Hemminger,
	Lennert Buytenhek, netdev
In-Reply-To: <ed6b04eab48a70d6416a6b021f04f9901f7e9f01.1547830302.git.sbrivio@redhat.com>

On 1/18/19 10:00 AM, Stefano Brivio wrote:
> This script wraps 'ip' and 'bridge' tools to provide a drop-in replacement
> of the standalone 'brctl' utility.
> 
> It's bug-to-bug compatible with brctl as of bridge-utils version 1.6,
> has no dependencies other than a POSIX shell, and it's less than half
> the binary size of brctl on x86_64.
> 
> As many users (including myself) seem to find brctl usage vastly more
> intuitive than ip-link, possibly due to habit, this might be a lightweight
> approach to provide brctl syntax without the need to maintain bridge-utils
> any longer.
> 
> Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
> Acked-by: Phil Sutter <phil@nwl.cc>
> ---
>  man/man8/Makefile   |   5 +-
>  man/man8/ip-brctl.8 | 187 +++++++++++++++
>  misc/Makefile       |   9 +-
>  misc/ip-brctl.in    | 572 ++++++++++++++++++++++++++++++++++++++++++++
>  4 files changed, 770 insertions(+), 3 deletions(-)
>  create mode 100644 man/man8/ip-brctl.8
>  create mode 100755 misc/ip-brctl.in

I get your intent, but this seems more appropriate for you / Red Hat to
carry than something we want to distribute as part of iproute2.


* Re: [PATCH] ipmr: ip6mr: Create new sockopt to clear mfc cache only
From: kbuild test robot @ 2019-01-30  5:42 UTC (permalink / raw)
  To: Callum Sinclair
  Cc: kbuild-all, davem, kuznet, yoshfuji, nikolay, netdev,
	linux-kernel, Callum Sinclair
In-Reply-To: <20190130022509.25303-2-callum.sinclair@alliedtelesis.co.nz>

[-- Attachment #1: Type: text/plain, Size: 4325 bytes --]

Hi Callum,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net/master]
[also build test ERROR on v5.0-rc4 next-20190129]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Callum-Sinclair/ipmr-ip6mr-Create-new-sockopt-to-clear-mfc-cache-only/20190130-104146
config: arm-allmodconfig (attached as .config)
compiler: arm-linux-gnueabi-gcc (Debian 8.2.0-11) 8.2.0
reproduce:
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # save the attached .config to linux build tree
        GCC_VERSION=8.2.0 make.cross ARCH=arm 

All errors (new ones prefixed by >>):

   net/ipv4/ipmr.c: In function 'mroute_clean_cache':
>> net/ipv4/ipmr.c:1312:3: error: 'cache' undeclared (first use in this function); did you mean 'cacheid'?
      cache = (struct mfc_cache *)c;
      ^~~~~
      cacheid
   net/ipv4/ipmr.c:1312:3: note: each undeclared identifier is reported only once for each function it appears in
   net/ipv4/ipmr.c:1313:33: error: 'net' undeclared (first use in this function)
      call_ipmr_mfc_entry_notifiers(net, FIB_EVENT_ENTRY_DEL, cache,
                                    ^~~
   net/ipv4/ipmr.c: In function 'mroute_clean_tables':
   net/ipv4/ipmr.c:1334:14: warning: unused variable 'net' [-Wunused-variable]
     struct net *net = read_pnet(&mrt->net);
                 ^~~

vim +1312 net/ipv4/ipmr.c

^1da177e4 Linus Torvalds      2005-04-16  1300  
7ba7b80d1 Callum Sinclair     2019-01-30  1301  /* Clear the vif tables */
7ba7b80d1 Callum Sinclair     2019-01-30  1302  static void mroute_clean_cache(struct mr_table *mrt, bool all)
^1da177e4 Linus Torvalds      2005-04-16  1303  {
494fff563 Yuval Mintz         2018-02-28  1304  	struct mr_mfc *c, *tmp;
^1da177e4 Linus Torvalds      2005-04-16  1305  
a8cb16dd9 Eric Dumazet        2010-10-01  1306  	/* Wipe the cache */
8fb472c09 Nikolay Aleksandrov 2017-01-12  1307  	list_for_each_entry_safe(c, tmp, &mrt->mfc_cache_list, list) {
0e615e960 Nikolay Aleksandrov 2015-11-20  1308  		if (!all && (c->mfc_flags & MFC_STATIC))
^1da177e4 Linus Torvalds      2005-04-16  1309  			continue;
8fb472c09 Nikolay Aleksandrov 2017-01-12  1310  		rhltable_remove(&mrt->mfc_hash, &c->mnode, ipmr_rht_params);
a8c9486b8 Eric Dumazet        2010-10-01  1311  		list_del_rcu(&c->list);
494fff563 Yuval Mintz         2018-02-28 @1312  		cache = (struct mfc_cache *)c;
494fff563 Yuval Mintz         2018-02-28  1313  		call_ipmr_mfc_entry_notifiers(net, FIB_EVENT_ENTRY_DEL, cache,
b362053a7 Yotam Gigi          2017-09-27  1314  					      mrt->id);
494fff563 Yuval Mintz         2018-02-28  1315  		mroute_netlink_event(mrt, cache, RTM_DELROUTE);
8c13af2a2 Yuval Mintz         2018-03-26  1316  		mr_cache_put(c);
^1da177e4 Linus Torvalds      2005-04-16  1317  	}
^1da177e4 Linus Torvalds      2005-04-16  1318  
0c12295a7 Patrick McHardy     2010-04-13  1319  	if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
^1da177e4 Linus Torvalds      2005-04-16  1320  		spin_lock_bh(&mfc_unres_lock);
8fb472c09 Nikolay Aleksandrov 2017-01-12  1321  		list_for_each_entry_safe(c, tmp, &mrt->mfc_unres_queue, list) {
862465f2e Patrick McHardy     2010-04-13  1322  			list_del(&c->list);
494fff563 Yuval Mintz         2018-02-28  1323  			cache = (struct mfc_cache *)c;
494fff563 Yuval Mintz         2018-02-28  1324  			mroute_netlink_event(mrt, cache, RTM_DELROUTE);
494fff563 Yuval Mintz         2018-02-28  1325  			ipmr_destroy_unres(mrt, cache);
^1da177e4 Linus Torvalds      2005-04-16  1326  		}
^1da177e4 Linus Torvalds      2005-04-16  1327  		spin_unlock_bh(&mfc_unres_lock);
^1da177e4 Linus Torvalds      2005-04-16  1328  	}
^1da177e4 Linus Torvalds      2005-04-16  1329  }
^1da177e4 Linus Torvalds      2005-04-16  1330  

:::::: The code at line 1312 was first introduced by commit
:::::: 494fff56379c4ad5b8fe36a5b7ffede4044ca7bb ipmr, ip6mr: Make mfc_cache a common structure

:::::: TO: Yuval Mintz <yuvalm@mellanox.com>
:::::: CC: David S. Miller <davem@davemloft.net>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 68417 bytes --]


* Re: [PATCH iproute2-next 2/2] ss: add AF_XDP support
From: Björn Töpel @ 2019-01-30  6:00 UTC (permalink / raw)
  To: David Ahern
  Cc: Netdev, Stephen Hemminger, Björn Töpel,
	Karlsson, Magnus, Magnus Karlsson
In-Reply-To: <46eae118-0017-8de2-3608-a427a1ed398b@gmail.com>

On Wed, 30 Jan 2019 at 03:39, David Ahern <dsahern@gmail.com> wrote:
>
[...]
>
> AF_XDP is not currently defined for a number of distributions. Add a
> definition to include/utils.h similar to what is done for MPLS.
>
> Also, please add example output to the commit log.

Ok, I'll address AF_XDP/AF_MAX and fix the commit log in v2.

Thanks, David!
Björn


* Re: [PATCH net-next v8 0/8] devlink: Add configuration parameters support for devlink_port
From: David Miller @ 2019-01-30  6:13 UTC (permalink / raw)
  To: vasundhara-v.volam; +Cc: michael.chan, jiri, jakub.kicinski, mkubecek, netdev
In-Reply-To: <1548678627-21938-1-git-send-email-vasundhara-v.volam@broadcom.com>

From: Vasundhara Volam <vasundhara-v.volam@broadcom.com>
Date: Mon, 28 Jan 2019 18:00:19 +0530

> This patchset adds support for configuration parameters setting through
> devlink_port.  Each device registers supported configuration parameters
> table.
> 
> The user can retrieve data on these parameters by
> "devlink port param show" command and can set new value to a
> parameter by "devlink port param set" command.
> All configuration modes supported by devlink_dev are supported
> by devlink_port also.
> 
> Command examples and output:
 ...

Series applied to net-next, thanks.


* Re: [PATCH net] net: tls: Save iv in tls_rec for async crypto requests
From: David Miller @ 2019-01-30  6:14 UTC (permalink / raw)
  To: davejwatson; +Cc: netdev
In-Reply-To: <20190129172215.cnzhluzzqpj3l6sd@davejwatson-mba.local>

From: Dave Watson <davejwatson@fb.com>
Date: Tue, 29 Jan 2019 17:21:41 +0000

> Can we get a net->net-next merge when convenient?

This has now been done.


* Re: [PATCH net 0/4] various compat ioctl fixes
From: David Miller @ 2019-01-30  6:19 UTC (permalink / raw)
  To: johannes; +Cc: netdev, viro, robert
In-Reply-To: <149d1ddec433d7cb766c99eeb78b220b33090287.camel@sipsolutions.net>

From: Johannes Berg <johannes@sipsolutions.net>
Date: Mon, 28 Jan 2019 22:32:30 +0100

> Al, care to speak up about this here?

I'll give Al one day to respond.

I'll apply this series if he agrees or fails to give feedback.


* [Patch net] xfrm: destroy xfrm_state synchronously on net exit path
From: Cong Wang @ 2019-01-30  6:27 UTC (permalink / raw)
  To: netdev; +Cc: Cong Wang, syzbot+e9aebef558e3ed673934, Steffen Klassert

xfrm_state_put() moves struct xfrm_state to the GC list
and schedules the GC work to clean it up. On net exit call
path, xfrm_state_flush() is called to clean up and
xfrm_flush_gc() is called to wait for the GC work to complete
before exit.

However, this doesn't work because one of the ->destructor(),
ipcomp_destroy(), schedules the same GC work again inside
the GC work. It is hard to wait for such a nested async
callback. This is also why syzbot still reports the following
warning:

 WARNING: CPU: 1 PID: 33 at net/ipv6/xfrm6_tunnel.c:351 xfrm6_tunnel_net_exit+0x2cb/0x500 net/ipv6/xfrm6_tunnel.c:351
 ...
  ops_exit_list.isra.0+0xb0/0x160 net/core/net_namespace.c:153
  cleanup_net+0x51d/0xb10 net/core/net_namespace.c:551
  process_one_work+0xd0c/0x1ce0 kernel/workqueue.c:2153
  worker_thread+0x143/0x14a0 kernel/workqueue.c:2296
  kthread+0x357/0x430 kernel/kthread.c:246
  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352

In fact, it is perfectly fine to bypass GC and destroy xfrm_state
synchronously on net exit call path, because it is in process context
and doesn't need a work struct to do any blocking work.

This patch introduces xfrm_state_put_sync() which simply bypasses
GC, and lets its callers decide whether to use this synchronous
version. On net exit path, xfrm_state_fini() and xfrm6_tunnel_net_exit()
use it. And, as ipcomp_destroy() itself is blocking, it can use
xfrm_state_put_sync() directly too.

Also rename xfrm_state_gc_destroy() to ___xfrm_state_destroy() to
reflect this change.

Fixes: b48c05ab5d32 ("xfrm: Fix warning in xfrm6_tunnel_net_exit.")
Reported-by: syzbot+e9aebef558e3ed673934@syzkaller.appspotmail.com
Cc: Steffen Klassert <steffen.klassert@secunet.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
 include/net/xfrm.h      | 13 +++++++++----
 net/ipv6/xfrm6_tunnel.c |  3 +--
 net/key/af_key.c        |  2 +-
 net/xfrm/xfrm_state.c   | 37 +++++++++++++++++++------------------
 net/xfrm/xfrm_user.c    |  2 +-
 5 files changed, 31 insertions(+), 26 deletions(-)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index 7298a53b9702..88f1f16ecac4 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -383,7 +383,6 @@ struct xfrm_input_afinfo {
 int xfrm_input_register_afinfo(const struct xfrm_input_afinfo *afinfo);
 int xfrm_input_unregister_afinfo(const struct xfrm_input_afinfo *afinfo);
 
-void xfrm_flush_gc(void);
 void xfrm_state_delete_tunnel(struct xfrm_state *x);
 
 struct xfrm_type {
@@ -853,7 +852,7 @@ static inline void xfrm_pols_put(struct xfrm_policy **pols, int npols)
 		xfrm_pol_put(pols[i]);
 }
 
-void __xfrm_state_destroy(struct xfrm_state *);
+void __xfrm_state_destroy(struct xfrm_state *, bool);
 
 static inline void __xfrm_state_put(struct xfrm_state *x)
 {
@@ -863,7 +862,13 @@ static inline void __xfrm_state_put(struct xfrm_state *x)
 static inline void xfrm_state_put(struct xfrm_state *x)
 {
 	if (refcount_dec_and_test(&x->refcnt))
-		__xfrm_state_destroy(x);
+		__xfrm_state_destroy(x, false);
+}
+
+static inline void xfrm_state_put_sync(struct xfrm_state *x)
+{
+	if (refcount_dec_and_test(&x->refcnt))
+		__xfrm_state_destroy(x, true);
 }
 
 static inline void xfrm_state_hold(struct xfrm_state *x)
@@ -1590,7 +1595,7 @@ struct xfrmk_spdinfo {
 
 struct xfrm_state *xfrm_find_acq_byseq(struct net *net, u32 mark, u32 seq);
 int xfrm_state_delete(struct xfrm_state *x);
-int xfrm_state_flush(struct net *net, u8 proto, bool task_valid);
+int xfrm_state_flush(struct net *net, u8 proto, bool task_valid, bool sync);
 int xfrm_dev_state_flush(struct net *net, struct net_device *dev, bool task_valid);
 void xfrm_sad_getinfo(struct net *net, struct xfrmk_sadinfo *si);
 void xfrm_spd_getinfo(struct net *net, struct xfrmk_spdinfo *si);
diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c
index f5b4febeaa25..08bf374a80eb 100644
--- a/net/ipv6/xfrm6_tunnel.c
+++ b/net/ipv6/xfrm6_tunnel.c
@@ -344,8 +344,7 @@ static void __net_exit xfrm6_tunnel_net_exit(struct net *net)
 	struct xfrm6_tunnel_net *xfrm6_tn = xfrm6_tunnel_pernet(net);
 	unsigned int i;
 
-	xfrm_state_flush(net, IPSEC_PROTO_ANY, false);
-	xfrm_flush_gc();
+	xfrm_state_flush(net, IPSEC_PROTO_ANY, false, true);
 
 	for (i = 0; i < XFRM6_TUNNEL_SPI_BYADDR_HSIZE; i++)
 		WARN_ON_ONCE(!hlist_empty(&xfrm6_tn->spi_byaddr[i]));
diff --git a/net/key/af_key.c b/net/key/af_key.c
index 655c787f9d54..637030f43b67 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -1783,7 +1783,7 @@ static int pfkey_flush(struct sock *sk, struct sk_buff *skb, const struct sadb_m
 	if (proto == 0)
 		return -EINVAL;
 
-	err = xfrm_state_flush(net, proto, true);
+	err = xfrm_state_flush(net, proto, true, false);
 	err2 = unicast_flush_resp(sk, hdr);
 	if (err || err2) {
 		if (err == -ESRCH) /* empty table - go quietly */
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index 23c92891758a..83133bfaf724 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -432,7 +432,7 @@ void xfrm_state_free(struct xfrm_state *x)
 }
 EXPORT_SYMBOL(xfrm_state_free);
 
-static void xfrm_state_gc_destroy(struct xfrm_state *x)
+static void ___xfrm_state_destroy(struct xfrm_state *x)
 {
 	tasklet_hrtimer_cancel(&x->mtimer);
 	del_timer_sync(&x->rtimer);
@@ -474,7 +474,7 @@ static void xfrm_state_gc_task(struct work_struct *work)
 	synchronize_rcu();
 
 	hlist_for_each_entry_safe(x, tmp, &gc_list, gclist)
-		xfrm_state_gc_destroy(x);
+		___xfrm_state_destroy(x);
 }
 
 static enum hrtimer_restart xfrm_timer_handler(struct hrtimer *me)
@@ -598,14 +598,19 @@ struct xfrm_state *xfrm_state_alloc(struct net *net)
 }
 EXPORT_SYMBOL(xfrm_state_alloc);
 
-void __xfrm_state_destroy(struct xfrm_state *x)
+void __xfrm_state_destroy(struct xfrm_state *x, bool sync)
 {
 	WARN_ON(x->km.state != XFRM_STATE_DEAD);
 
-	spin_lock_bh(&xfrm_state_gc_lock);
-	hlist_add_head(&x->gclist, &xfrm_state_gc_list);
-	spin_unlock_bh(&xfrm_state_gc_lock);
-	schedule_work(&xfrm_state_gc_work);
+	if (sync) {
+		synchronize_rcu();
+		___xfrm_state_destroy(x);
+	} else {
+		spin_lock_bh(&xfrm_state_gc_lock);
+		hlist_add_head(&x->gclist, &xfrm_state_gc_list);
+		spin_unlock_bh(&xfrm_state_gc_lock);
+		schedule_work(&xfrm_state_gc_work);
+	}
 }
 EXPORT_SYMBOL(__xfrm_state_destroy);
 
@@ -708,7 +713,7 @@ xfrm_dev_state_flush_secctx_check(struct net *net, struct net_device *dev, bool
 }
 #endif
 
-int xfrm_state_flush(struct net *net, u8 proto, bool task_valid)
+int xfrm_state_flush(struct net *net, u8 proto, bool task_valid, bool sync)
 {
 	int i, err = 0, cnt = 0;
 
@@ -730,7 +735,10 @@ int xfrm_state_flush(struct net *net, u8 proto, bool task_valid)
 				err = xfrm_state_delete(x);
 				xfrm_audit_state_delete(x, err ? 0 : 1,
 							task_valid);
-				xfrm_state_put(x);
+				if (sync)
+					xfrm_state_put_sync(x);
+				else
+					xfrm_state_put(x);
 				if (!err)
 					cnt++;
 
@@ -2200,12 +2208,6 @@ struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family)
 	return afinfo;
 }
 
-void xfrm_flush_gc(void)
-{
-	flush_work(&xfrm_state_gc_work);
-}
-EXPORT_SYMBOL(xfrm_flush_gc);
-
 /* Temporarily located here until net/xfrm/xfrm_tunnel.c is created */
 void xfrm_state_delete_tunnel(struct xfrm_state *x)
 {
@@ -2215,7 +2217,7 @@ void xfrm_state_delete_tunnel(struct xfrm_state *x)
 		if (atomic_read(&t->tunnel_users) == 2)
 			xfrm_state_delete(t);
 		atomic_dec(&t->tunnel_users);
-		xfrm_state_put(t);
+		xfrm_state_put_sync(t);
 		x->tunnel = NULL;
 	}
 }
@@ -2375,8 +2377,7 @@ void xfrm_state_fini(struct net *net)
 	unsigned int sz;
 
 	flush_work(&net->xfrm.state_hash_work);
-	xfrm_state_flush(net, IPSEC_PROTO_ANY, false);
-	flush_work(&xfrm_state_gc_work);
+	xfrm_state_flush(net, IPSEC_PROTO_ANY, false, true);
 
 	WARN_ON(!list_empty(&net->xfrm.state_all));
 
diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index c6d26afcf89d..a131f9ff979e 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -1932,7 +1932,7 @@ static int xfrm_flush_sa(struct sk_buff *skb, struct nlmsghdr *nlh,
 	struct xfrm_usersa_flush *p = nlmsg_data(nlh);
 	int err;
 
-	err = xfrm_state_flush(net, p->proto, true);
+	err = xfrm_state_flush(net, p->proto, true, false);
 	if (err) {
 		if (err == -ESRCH) /* empty table */
 			return 0;
-- 
2.20.1

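The put/destroy split described in the patch above can be modelled in user space. This is only a sketch under stated assumptions: struct obj, obj_put(), obj_destroy() and gc_run() are hypothetical names, the plain int refcount is not atomic, and the GC "work" is an ordinary function call rather than a workqueue item. It shows the one idea the patch relies on: the final put either destroys the object immediately (sync) or queues it for a later GC pass.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct obj {
	int refcnt;
	bool destroyed;
	struct obj *gc_next;
};

/* Objects waiting for deferred destruction. */
static struct obj *gc_list;

/* The actual teardown, shared by both paths. */
static void obj_destroy_now(struct obj *o)
{
	o->destroyed = true;
}

/* Destroy immediately when sync is set, otherwise queue for the GC pass. */
static void obj_destroy(struct obj *o, bool sync)
{
	if (sync) {
		obj_destroy_now(o);
	} else {
		o->gc_next = gc_list;
		gc_list = o;
	}
}

/* Drop one reference; the last put triggers destruction. */
static void obj_put(struct obj *o, bool sync)
{
	assert(o->refcnt > 0);
	if (--o->refcnt == 0)
		obj_destroy(o, sync);
}

/* Stand-in for the GC work: drain the queue, return how many were destroyed. */
static int gc_run(void)
{
	int n = 0;

	while (gc_list) {
		struct obj *o = gc_list;

		gc_list = o->gc_next;
		o->gc_next = NULL;
		obj_destroy_now(o);
		n++;
	}
	return n;
}
```

In this model a caller that must observe the object as gone before returning, such as a teardown path, passes sync = true and never has to wait for gc_run(). A deferred put leaves the object alive until the next GC pass.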


* [PATCH net v4] l2tp: fix reading optional fields of L2TPv3
From: Jacob Wen @ 2019-01-30  6:55 UTC (permalink / raw)
  To: netdev; +Cc: eric.dumazet, gnault

Use pskb_may_pull() to make sure the optional fields are in skb linear
parts, so we can safely read them later.

It's easy to reproduce the issue with a net driver that supports paged
skb data. Just create an L2TPv3 over IP tunnel and then generate some
network traffic.
Once reproduced, rx err in /sys/kernel/debug/l2tp/tunnels will increase.

Changes in v4:
1. s/l2tp_v3_pull_opt/l2tp_v3_ensure_opt_in_linear/
2. s/tunnel->version != L2TP_HDR_VER_2/tunnel->version == L2TP_HDR_VER_3/
3. Add 'Fixes' in commit messages.

Changes in v3:
1. To keep consistency, move the code out of l2tp_recv_common.
2. Use "net" instead of "net-next", since this is a bug fix.

Changes in v2:
1. Only fix L2TPv3 to make code simple.
   To fix both L2TPv3 and L2TPv2, we'd better refactor l2tp_recv_common.
   It's complicated to do so.
2. Reload pointers after pskb_may_pull.

Fixes: f7faffa3ff8e ("l2tp: Add L2TPv3 protocol support")
Fixes: 0d76751fad77 ("l2tp: Add L2TPv3 IP encapsulation (no UDP) support")
Fixes: a32e0eec7042 ("l2tp: introduce L2TPv3 IP encapsulation support for IPv6")

Signed-off-by: Jacob Wen <jian.w.wen@oracle.com>
Acked-by: Guillaume Nault <gnault@redhat.com>
---
 net/l2tp/l2tp_core.c |  4 ++++
 net/l2tp/l2tp_core.h | 20 ++++++++++++++++++++
 net/l2tp/l2tp_ip.c   |  3 +++
 net/l2tp/l2tp_ip6.c  |  3 +++
 4 files changed, 30 insertions(+)

diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
index 26f1d435696a..dd5ba0c11ab3 100644
--- a/net/l2tp/l2tp_core.c
+++ b/net/l2tp/l2tp_core.c
@@ -884,6 +884,10 @@ static int l2tp_udp_recv_core(struct l2tp_tunnel *tunnel, struct sk_buff *skb)
 		goto error;
 	}
 
+	if (tunnel->version == L2TP_HDR_VER_3 &&
+	    l2tp_v3_ensure_opt_in_linear(session, skb, &ptr, &optr))
+		goto error;
+
 	l2tp_recv_common(session, skb, ptr, optr, hdrflags, length);
 	l2tp_session_dec_refcount(session);
 
diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
index 9c9afe94d389..b2ce90260c35 100644
--- a/net/l2tp/l2tp_core.h
+++ b/net/l2tp/l2tp_core.h
@@ -301,6 +301,26 @@ static inline bool l2tp_tunnel_uses_xfrm(const struct l2tp_tunnel *tunnel)
 }
 #endif
 
+static inline int l2tp_v3_ensure_opt_in_linear(struct l2tp_session *session, struct sk_buff *skb,
+					       unsigned char **ptr, unsigned char **optr)
+{
+	int opt_len = session->peer_cookie_len + l2tp_get_l2specific_len(session);
+
+	if (opt_len > 0) {
+		int off = *ptr - *optr;
+
+		if (!pskb_may_pull(skb, off + opt_len))
+			return -1;
+
+		if (skb->data != *optr) {
+			*optr = skb->data;
+			*ptr = skb->data + off;
+		}
+	}
+
+	return 0;
+}
+
 #define l2tp_printk(ptr, type, func, fmt, ...)				\
 do {									\
 	if (((ptr)->debug) & (type))					\
diff --git a/net/l2tp/l2tp_ip.c b/net/l2tp/l2tp_ip.c
index 35f6f86d4dcc..d4c60523c549 100644
--- a/net/l2tp/l2tp_ip.c
+++ b/net/l2tp/l2tp_ip.c
@@ -165,6 +165,9 @@ static int l2tp_ip_recv(struct sk_buff *skb)
 		print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, ptr, length);
 	}
 
+	if (l2tp_v3_ensure_opt_in_linear(session, skb, &ptr, &optr))
+		goto discard_sess;
+
 	l2tp_recv_common(session, skb, ptr, optr, 0, skb->len);
 	l2tp_session_dec_refcount(session);
 
diff --git a/net/l2tp/l2tp_ip6.c b/net/l2tp/l2tp_ip6.c
index 237f1a4a0b0c..0ae6899edac0 100644
--- a/net/l2tp/l2tp_ip6.c
+++ b/net/l2tp/l2tp_ip6.c
@@ -178,6 +178,9 @@ static int l2tp_ip6_recv(struct sk_buff *skb)
 		print_hex_dump_bytes("", DUMP_PREFIX_OFFSET, ptr, length);
 	}
 
+	if (l2tp_v3_ensure_opt_in_linear(session, skb, &ptr, &optr))
+		goto discard_sess;
+
 	l2tp_recv_common(session, skb, ptr, optr, 0, skb->len);
 	l2tp_session_dec_refcount(session);
 
-- 
2.17.1

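The pointer reload in l2tp_v3_ensure_opt_in_linear() above follows a general pattern: an operation that makes more bytes contiguous may move the buffer, so cached pointers into it have to be recomputed from their offset. Below is a user-space sketch of that pattern, not the kernel implementation. struct buf, buf_may_pull() and ensure_opt_in_linear() are hypothetical stand-ins for the skb, pskb_may_pull() and the new helper. Unlike the patch, the sketch reloads the pointers unconditionally, which avoids comparing against a pointer to freed memory in portable C.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Toy packet buffer: a heap-allocated linear part plus one extra fragment. */
struct buf {
	unsigned char *data;
	size_t linear_len;
	const unsigned char *frag;
	size_t frag_len;
};

/* Make the first len bytes contiguous. Returns 1 on success, 0 on failure.
 * On success b->data may point to a new allocation. */
static int buf_may_pull(struct buf *b, size_t len)
{
	unsigned char *n;
	size_t need;

	if (len <= b->linear_len)
		return 1;
	if (len > b->linear_len + b->frag_len)
		return 0;

	need = len - b->linear_len;
	n = malloc(len);
	if (!n)
		return 0;

	memcpy(n, b->data, b->linear_len);
	memcpy(n + b->linear_len, b->frag, need);
	free(b->data);
	b->data = n;
	b->linear_len = len;
	b->frag += need;
	b->frag_len -= need;
	return 1;
}

/* Ensure opt_len bytes after *ptr are linear, then rebase *ptr and *optr. */
static int ensure_opt_in_linear(struct buf *b, size_t opt_len,
				unsigned char **ptr, unsigned char **optr)
{
	if (opt_len > 0) {
		ptrdiff_t off = *ptr - *optr;

		assert(off >= 0);
		if (!buf_may_pull(b, (size_t)off + opt_len))
			return -1;

		/* The buffer may have moved: recompute both pointers. */
		*optr = b->data;
		*ptr = b->data + off;
	}
	return 0;
}
```

The offset is computed before the pull because it is the only piece of information that survives a reallocation. After a successful call, reading opt_len bytes through *ptr stays inside the linear area.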


* [PATCH iproute2-next v2] ss: add AF_XDP support
From: bjorn.topel @ 2019-01-30  6:57 UTC (permalink / raw)
  To: netdev, stephen, dsahern
  Cc: Björn Töpel, magnus.karlsson, magnus.karlsson

From: Björn Töpel <bjorn.topel@intel.com>

AF_XDP is an address family that is optimized for high performance
packet processing.

This patch adds AF_XDP support to ss(8) so that sockets can be queried
and monitored.

Example:
$ sudo ss --xdp -e -p -m
Recv-Q      Send-Q           Local Address:Port             Peer Address:Port

0           0                   enp134s0f0:q20                          *
 users:(("xdpsock",pid=17787,fd=3)) ino:39424 sk:4
        rx(entries:2048)
        tx(entries:2048)
        umem(id:1,size:8388608,num_pages:2048,chunk_size:2048,headroom:0,ifindex:7,
qid:20,zc:0,refs:1)
        fr(entries:2048)
        cr(entries:2048) skmem:(r0,rb212992,t0,tb212992,f0,w0,o0,bl0,d0)
0           0                    enp24s0f0:q0                           *
 users:(("xdpsock",pid=17780,fd=3)) ino:37384 sk:5
        rx(entries:2048)
        tx(entries:2048)
        umem(id:0,size:8388608,num_pages:2048,chunk_size:2048,headroom:0,ifindex:6,
qid:0,zc:1,refs:1)
        fr(entries:2048)
        cr(entries:2048) skmem:(r0,rb212992,t0,tb212992,f0,w0,o0,bl0,d0)

Signed-off-by: Björn Töpel <bjorn.topel@intel.com>
---
v1->v2:
  * Define/redefine AF_XDP/AF_MAX if missing. (David)
  * Add example to the commit log. (David)
---
 include/utils.h |   8 +++
 man/man8/ss.8   |   9 ++-
 misc/ss.c       | 168 +++++++++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 180 insertions(+), 5 deletions(-)

diff --git a/include/utils.h b/include/utils.h
index 92bbe82d3366..8a9c302082b2 100644
--- a/include/utils.h
+++ b/include/utils.h
@@ -127,6 +127,14 @@ struct dn_naddr
 # define CLOCK_TAI 11
 #endif
 
+#ifndef AF_XDP
+# define AF_XDP 44
+# if AF_MAX < 45
+#  undef AF_MAX
+#  define AF_MAX 45
+# endif
+#endif
+
 __u32 get_addr32(const char *name);
 int get_addr_1(inet_prefix *dst, const char *arg, int family);
 int get_prefix_1(inet_prefix *dst, char *arg, int family);
diff --git a/man/man8/ss.8 b/man/man8/ss.8
index 553a6cf46f0e..bc120fd78716 100644
--- a/man/man8/ss.8
+++ b/man/man8/ss.8
@@ -324,16 +324,19 @@ Display SCTP sockets.
 .B \-\-vsock
 Display vsock sockets (alias for -f vsock).
 .TP
+.B \-\-xdp
+Display XDP sockets (alias for -f xdp).
+.TP
 .B \-f FAMILY, \-\-family=FAMILY
 Display sockets of type FAMILY.
-Currently the following families are supported: unix, inet, inet6, link, netlink, vsock.
+Currently the following families are supported: unix, inet, inet6, link, netlink, vsock, xdp.
 .TP
 .B \-A QUERY, \-\-query=QUERY, \-\-socket=QUERY
 List of socket tables to dump, separated by commas. The following identifiers
 are understood: all, inet, tcp, udp, raw, unix, packet, netlink, unix_dgram,
 unix_stream, unix_seqpacket, packet_raw, packet_dgram, dccp, sctp,
-vsock_stream, vsock_dgram. Any item in the list may optionally be prefixed by
-an exclamation mark
+vsock_stream, vsock_dgram, xdp. Any item in the list may optionally be
+prefixed by an exclamation mark
 .RB ( ! )
 to exclude that socket table from being dumped.
 .TP
diff --git a/misc/ss.c b/misc/ss.c
index 3589ebedc5a0..8e5cc16b6f52 100644
--- a/misc/ss.c
+++ b/misc/ss.c
@@ -42,6 +42,7 @@
 #include <linux/unix_diag.h>
 #include <linux/netdevice.h>	/* for MAX_ADDR_LEN */
 #include <linux/filter.h>
+#include <linux/xdp_diag.h>
 #include <linux/packet_diag.h>
 #include <linux/netlink_diag.h>
 #include <linux/sctp.h>
@@ -198,6 +199,7 @@ enum {
 	VSOCK_ST_DB,
 	VSOCK_DG_DB,
 	TIPC_DB,
+	XDP_DB,
 	MAX_DB
 };
 
@@ -309,6 +311,10 @@ static const struct filter default_dbs[MAX_DB] = {
 		.states   = TIPC_SS_CONN,
 		.families = FAMILY_MASK(AF_TIPC),
 	},
+	[XDP_DB] = {
+		.states   = (1 << SS_CLOSE),
+		.families = FAMILY_MASK(AF_XDP),
+	},
 };
 
 static const struct filter default_afs[AF_MAX] = {
@@ -340,6 +346,10 @@ static const struct filter default_afs[AF_MAX] = {
 		.dbs    = (1 << TIPC_DB),
 		.states = TIPC_SS_CONN,
 	},
+	[AF_XDP] = {
+		.dbs    = (1 << XDP_DB),
+		.states = (1 << SS_CLOSE),
+	},
 };
 
 static int do_default = 1;
@@ -366,7 +376,7 @@ static int filter_db_parse(struct filter *f, const char *s)
 		ENTRY(all, UDP_DB, DCCP_DB, TCP_DB, RAW_DB,
 			   UNIX_ST_DB, UNIX_DG_DB, UNIX_SQ_DB,
 			   PACKET_R_DB, PACKET_DG_DB, NETLINK_DB,
-			   SCTP_DB, VSOCK_ST_DB, VSOCK_DG_DB),
+			   SCTP_DB, VSOCK_ST_DB, VSOCK_DG_DB, XDP_DB),
 		ENTRY(inet, UDP_DB, DCCP_DB, TCP_DB, SCTP_DB, RAW_DB),
 		ENTRY(udp, UDP_DB),
 		ENTRY(dccp, DCCP_DB),
@@ -391,6 +401,7 @@ static int filter_db_parse(struct filter *f, const char *s)
 		ENTRY(v_str, VSOCK_ST_DB),	/* alias for vsock_stream */
 		ENTRY(vsock_dgram, VSOCK_DG_DB),
 		ENTRY(v_dgr, VSOCK_DG_DB),	/* alias for vsock_dgram */
+		ENTRY(xdp, XDP_DB),
 #undef ENTRY
 	};
 	bool enable = true;
@@ -1331,6 +1342,9 @@ static void sock_state_print(struct sockstat *s)
 	case AF_VSOCK:
 		sock_name = vsock_netid_name(s->type);
 		break;
+	case AF_XDP:
+		sock_name = "xdp";
+		break;
 	default:
 		sock_name = "unknown";
 	}
@@ -4055,6 +4069,142 @@ static int packet_show(struct filter *f)
 	return rc;
 }
 
+static int xdp_stats_print(struct sockstat *s, const struct filter *f)
+{
+	const char *addr, *port;
+	char q_str[16];
+
+	s->local.family = s->remote.family = AF_XDP;
+
+	if (f->f) {
+		if (run_ssfilter(f->f, s) == 0)
+			return 1;
+	}
+
+	sock_state_print(s);
+
+	if (s->iface) {
+		addr = xll_index_to_name(s->iface);
+		snprintf(q_str, sizeof(q_str), "q%d", s->lport);
+		port = q_str;
+		sock_addr_print(addr, ":", port, NULL);
+	} else {
+		sock_addr_print("", "*", "", NULL);
+	}
+
+	sock_addr_print("", "*", "", NULL);
+
+	proc_ctx_print(s);
+
+	if (show_details)
+		sock_details_print(s);
+
+	return 0;
+}
+
+static void xdp_show_ring(const char *name, struct xdp_diag_ring *ring)
+{
+	out("\n\t%s(", name);
+	out("entries:%u", ring->entries);
+	out(")");
+}
+
+static void xdp_show_umem(struct xdp_diag_umem *umem, struct xdp_diag_ring *fr,
+			  struct xdp_diag_ring *cr)
+{
+	out("\n\tumem(");
+	out("id:%u", umem->id);
+	out(",size:%llu", umem->size);
+	out(",num_pages:%u", umem->num_pages);
+	out(",chunk_size:%u", umem->chunk_size);
+	out(",headroom:%u", umem->headroom);
+	out(",ifindex:%u", umem->ifindex);
+	out(",qid:%u", umem->queue_id);
+	out(",zc:%u", umem->flags & XDP_DU_F_ZEROCOPY);
+	out(",refs:%u", umem->refs);
+	out(")");
+
+	if (fr)
+		xdp_show_ring("fr", fr);
+	if (cr)
+		xdp_show_ring("cr", cr);
+}
+
+static int xdp_show_sock(struct nlmsghdr *nlh, void *arg)
+{
+	struct xdp_diag_ring *rx = NULL, *tx = NULL, *fr = NULL, *cr = NULL;
+	struct xdp_diag_msg *msg = NLMSG_DATA(nlh);
+	struct rtattr *tb[XDP_DIAG_MAX + 1];
+	struct xdp_diag_info *info = NULL;
+	struct xdp_diag_umem *umem = NULL;
+	const struct filter *f = arg;
+	struct sockstat stat = {};
+
+	parse_rtattr(tb, XDP_DIAG_MAX, (struct rtattr *)(msg + 1),
+		     nlh->nlmsg_len - NLMSG_LENGTH(sizeof(*msg)));
+
+	stat.type = msg->xdiag_type;
+	stat.ino = msg->xdiag_ino;
+	stat.state = SS_CLOSE;
+	stat.sk = cookie_sk_get(&msg->xdiag_cookie[0]);
+
+	if (tb[XDP_DIAG_INFO]) {
+		info = RTA_DATA(tb[XDP_DIAG_INFO]);
+		stat.iface = info->ifindex;
+		stat.lport = info->queue_id;
+	}
+
+	if (tb[XDP_DIAG_UID])
+		stat.uid = rta_getattr_u32(tb[XDP_DIAG_UID]);
+	if (tb[XDP_DIAG_RX_RING])
+		rx = RTA_DATA(tb[XDP_DIAG_RX_RING]);
+	if (tb[XDP_DIAG_TX_RING])
+		tx = RTA_DATA(tb[XDP_DIAG_TX_RING]);
+	if (tb[XDP_DIAG_UMEM])
+		umem = RTA_DATA(tb[XDP_DIAG_UMEM]);
+	if (tb[XDP_DIAG_UMEM_FILL_RING])
+		fr = RTA_DATA(tb[XDP_DIAG_UMEM_FILL_RING]);
+	if (tb[XDP_DIAG_UMEM_COMPLETION_RING])
+		cr = RTA_DATA(tb[XDP_DIAG_UMEM_COMPLETION_RING]);
+	if (tb[XDP_DIAG_MEMINFO]) {
+		__u32 *skmeminfo = RTA_DATA(tb[XDP_DIAG_MEMINFO]);
+
+		stat.rq = skmeminfo[SK_MEMINFO_RMEM_ALLOC];
+	}
+
+	if (xdp_stats_print(&stat, f))
+		return 0;
+
+	if (show_details) {
+		if (rx)
+			xdp_show_ring("rx", rx);
+		if (tx)
+			xdp_show_ring("tx", tx);
+		if (umem)
+			xdp_show_umem(umem, fr, cr);
+	}
+
+	if (show_mem)
+		print_skmeminfo(tb, XDP_DIAG_MEMINFO); // really?
+
+
+	return 0;
+}
+
+static int xdp_show(struct filter *f)
+{
+	DIAG_REQUEST(req, struct xdp_diag_req r);
+
+	if (!filter_af_get(f, AF_XDP) || !(f->states & (1 << SS_CLOSE)))
+		return 0;
+
+	req.r.sdiag_family = AF_XDP;
+	req.r.xdiag_show = XDP_SHOW_INFO | XDP_SHOW_RING_CFG | XDP_SHOW_UMEM |
+			   XDP_SHOW_MEMINFO;
+
+	return handle_netlink_request(f, &req.nlh, sizeof(req), xdp_show_sock);
+}
+
 static int netlink_show_one(struct filter *f,
 				int prot, int pid, unsigned int groups,
 				int state, int dst_pid, unsigned int dst_group,
@@ -4442,6 +4592,9 @@ static int generic_show_sock(struct nlmsghdr *nlh, void *arg)
 	case AF_VSOCK:
 		ret = vsock_show_sock(nlh, arg);
 		break;
+	case AF_XDP:
+		ret = xdp_show_sock(nlh, arg);
+		break;
 	default:
 		ret = -1;
 	}
@@ -4679,7 +4832,7 @@ static void _usage(FILE *dest)
 "       --tipc          display only TIPC sockets\n"
 "       --vsock         display only vsock sockets\n"
 "   -f, --family=FAMILY display sockets of type FAMILY\n"
-"       FAMILY := {inet|inet6|link|unix|netlink|vsock|tipc|help}\n"
+"       FAMILY := {inet|inet6|link|unix|netlink|vsock|tipc|xdp|help}\n"
 "\n"
 "   -K, --kill          forcibly close sockets, display what was closed\n"
 "   -H, --no-header     Suppress header line\n"
@@ -4765,6 +4918,9 @@ static int scan_state(const char *state)
 #define OPT_TIPCSOCK 257
 #define OPT_TIPCINFO 258
 
+/* Values of 'x' are already used so a non-character is used */
+#define OPT_XDPSOCK 259
+
 static const struct option long_opts[] = {
 	{ "numeric", 0, 0, 'n' },
 	{ "resolve", 0, 0, 'r' },
@@ -4802,6 +4958,7 @@ static const struct option long_opts[] = {
 	{ "tipcinfo", 0, 0, OPT_TIPCINFO},
 	{ "kill", 0, 0, 'K' },
 	{ "no-header", 0, 0, 'H' },
+	{ "xdp", 0, 0, OPT_XDPSOCK},
 	{ 0 }
 
 };
@@ -4889,6 +5046,9 @@ int main(int argc, char *argv[])
 		case '0':
 			filter_af_set(&current_filter, AF_PACKET);
 			break;
+		case OPT_XDPSOCK:
+			filter_af_set(&current_filter, AF_XDP);
+			break;
 		case 'f':
 			if (strcmp(optarg, "inet") == 0)
 				filter_af_set(&current_filter, AF_INET);
@@ -4904,6 +5064,8 @@ int main(int argc, char *argv[])
 				filter_af_set(&current_filter, AF_TIPC);
 			else if (strcmp(optarg, "vsock") == 0)
 				filter_af_set(&current_filter, AF_VSOCK);
+			else if (strcmp(optarg, "xdp") == 0)
+				filter_af_set(&current_filter, AF_XDP);
 			else if (strcmp(optarg, "help") == 0)
 				help();
 			else {
@@ -5101,6 +5263,8 @@ int main(int argc, char *argv[])
 		vsock_show(&current_filter);
 	if (current_filter.dbs & (1<<TIPC_DB))
 		tipc_show(&current_filter);
+	if (current_filter.dbs & (1<<XDP_DB))
+		xdp_show(&current_filter);
 
 	if (show_users || show_proc_ctx || show_sock_ctx)
 		user_ent_destroy();
-- 
2.19.1
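
The OPT_XDPSOCK hunks above rely on getopt_long() accepting any int as
an option value, so a long-only option can take a value outside the
character range and never collide with a short option letter. A minimal
sketch of that pattern follows; want_xdp() is a hypothetical helper for
illustration and is not part of ss.

```c
#include <getopt.h>
#include <stddef.h>

/* Long-only options get values above the 8-bit character range so they
 * cannot collide with a short option letter. 259 mirrors the patch.
 */
#define OPT_XDPSOCK 259

/* Hypothetical helper, not part of ss: returns 1 if --xdp was given,
 * 0 if it was not, and -1 on an unknown option.
 */
static int want_xdp(int argc, char *argv[])
{
	static const struct option long_opts[] = {
		{ "xdp", no_argument, NULL, OPT_XDPSOCK },
		{ 0 }
	};
	int found = 0;
	int ch;

	optind = 1;	/* restart scanning so the helper can be reused */
	while ((ch = getopt_long(argc, argv, "x", long_opts, NULL)) != -1) {
		switch (ch) {
		case OPT_XDPSOCK:
			found = 1;
			break;
		case 'x':	/* the short option keeps its own meaning */
			break;
		default:
			return -1;
		}
	}
	return found;
}
```

Calling want_xdp() with { "ss", "--xdp" } reports the option as set,
while { "ss", "-x" } does not, because the two values are distinct.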


^ permalink raw reply related

* Re: [PATCH net-next 02/24] sctp: use SCTP_FUTURE_ASSOC for SCTP_PEER_ADDR_PARAMS sockopt
From: Xin Long @ 2019-01-30  7:03 UTC (permalink / raw)
  To: Neil Horman; +Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, davem
In-Reply-To: <20190129212508.GB2592@neilslaptop.think-freely.org>

On Wed, Jan 30, 2019 at 5:25 AM Neil Horman <nhorman@tuxdriver.com> wrote:
>
> On Mon, Jan 28, 2019 at 03:08:24PM +0800, Xin Long wrote:
> > Check with SCTP_FUTURE_ASSOC instead in
> > sctp_/setgetsockopt_peer_addr_params, it's compatible with 0.
> >
> > Signed-off-by: Xin Long <lucien.xin@gmail.com>
> > ---
> >  net/sctp/socket.c | 18 ++++++++++--------
> >  1 file changed, 10 insertions(+), 8 deletions(-)
> >
> > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > index a52d132..4c43b95 100644
> > --- a/net/sctp/socket.c
> > +++ b/net/sctp/socket.c
> > @@ -2750,12 +2750,13 @@ static int sctp_setsockopt_peer_addr_params(struct sock *sk,
> >                       return -EINVAL;
> >       }
> >
> > -     /* Get association, if assoc_id != 0 and the socket is a one
> > -      * to many style socket, and an association was not found, then
> > -      * the id was invalid.
> > +     /* Get association, if assoc_id != SCTP_FUTURE_ASSOC and the
> > +      * socket is a one to many style socket, and an association
> > +      * was not found, then the id was invalid.
> >        */
> >       asoc = sctp_id2assoc(sk, params.spp_assoc_id);
> > -     if (!asoc && params.spp_assoc_id && sctp_style(sk, UDP))
> > +     if (!asoc && params.spp_assoc_id != SCTP_FUTURE_ASSOC &&
> Sorry to follow up, but I misspoke in my previous email, I should have said, why
> do we only allow future associations as the only special case association id
> here?  Since the function is meant to set a specific association id, it seems to
> me that you would want to:
>
> a) allow setting of a specific id
> b) allow setting of all association ids on the socket
> (SCTP_CURRENT_ASSOC)
> c) allow recording of a set of params to apply to all current and future
> associations (FUTURE/ALL).
>
> (a) is already handled clearly, but (b) and (c) require more work on this
> function than just checking association id on entry.
Hi, Neil,

Note that not all sockopts support both of them: some sockopts
cannot be applied to current assocs, and some cannot be applied
to future assocs.

SCTP_FUTURE_ASSOC means the sock's xxxx only.
SCTP_CURRENT_ASSOC means all sock->asocs' xxxx only.

If we only check assoc_id != SCTP_FUTURE_ASSOC, it means this
sockopt doesn't support setting xxxx on all sock->asocs, or
getting it (as is the case for all sockopt getters).
If we only check assoc_id != SCTP_CURRENT_ASSOC, it means this
sockopt doesn't support setting the sock's xxxx (like Patch 11).

As for SCTP_ALL_ASSOC, it means both the sock's and sock->asocs'
xxxx, not either of them. So we don't allow it here.

As you can see, in this patchset:
1. for sockopt setting:
   those that only support FUTURE check SCTP_FUTURE_ASSOC, like Patch 2-10.
   those that only support CURRENT check SCTP_CURRENT_ASSOC, like Patch 11.
   those that support both FUTURE and CURRENT check
   assoc_id > SCTP_ALL_ASSOC, like Patch 12-24.
2. for sockopt getting:
   all sockopts check SCTP_FUTURE_ASSOC (as it's impossible to
   get all sock->asocs' xxxx).
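
The three entry checks described above can be sketched as one small
predicate. This is an illustration only, not code from the series: the
constant values 0/1/2 are assumed to match what patch 01 adds to
include/uapi/linux/sctp.h, and enum sockopt_scope and assoc_id_invalid()
are hypothetical names.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed values; the series defines these in include/uapi/linux/sctp.h. */
#define SCTP_FUTURE_ASSOC	0
#define SCTP_CURRENT_ASSOC	1
#define SCTP_ALL_ASSOC		2

typedef int32_t sctp_assoc_t;

/* Hypothetical classification of a sockopt setter. */
enum sockopt_scope {
	SCOPE_FUTURE_ONLY,	/* like Patch 2-10 */
	SCOPE_CURRENT_ONLY,	/* like Patch 11 */
	SCOPE_BOTH,		/* like Patch 12-24 */
};

/* True when a call on a one-to-many socket should fail with -EINVAL:
 * no association matched the id, and the id is not one of the reserved
 * constants this sockopt accepts.
 */
static bool assoc_id_invalid(bool asoc_found, sctp_assoc_t id,
			     enum sockopt_scope scope)
{
	if (asoc_found)
		return false;

	switch (scope) {
	case SCOPE_FUTURE_ONLY:
		return id != SCTP_FUTURE_ASSOC;
	case SCOPE_CURRENT_ONLY:
		return id != SCTP_CURRENT_ASSOC;
	case SCOPE_BOTH:
		return id > SCTP_ALL_ASSOC;
	}
	return true;
}
```

Getters would always use the SCOPE_FUTURE_ONLY form, since one call
cannot return a value for every association on the socket.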

>
> I think this comment may apply to all the socket option functions
>
> > +         sctp_style(sk, UDP))
> >               return -EINVAL;
> >
> >       /* Heartbeat demand can only be sent on a transport or
> > @@ -5676,12 +5677,13 @@ static int sctp_getsockopt_peer_addr_params(struct sock *sk, int len,
> >               }
> >       }
> >
> > -     /* Get association, if assoc_id != 0 and the socket is a one
> > -      * to many style socket, and an association was not found, then
> > -      * the id was invalid.
> > +     /* Get association, if assoc_id != SCTP_FUTURE_ASSOC and the
> > +      * socket is a one to many style socket, and an association
> > +      * was not found, then the id was invalid.
> >        */
> >       asoc = sctp_id2assoc(sk, params.spp_assoc_id);
> > -     if (!asoc && params.spp_assoc_id && sctp_style(sk, UDP)) {
> > +     if (!asoc && params.spp_assoc_id != SCTP_FUTURE_ASSOC &&
> > +         sctp_style(sk, UDP)) {
> >               pr_debug("%s: failed no association\n", __func__);
> >               return -EINVAL;
> >       }
> > --
> > 2.1.0
> >
> >

^ permalink raw reply

* Re: [PATCH net v3] net: l2tp: fix reading optional fields of L2TPv3
From: Jacob Wen @ 2019-01-30  7:06 UTC (permalink / raw)
  To: Guillaume Nault; +Cc: netdev, eric.dumazet
In-Reply-To: <20190129223727.GA4062@linux.home>

Thanks for the detailed review.

On 1/30/19 6:37 AM, Guillaume Nault wrote:
> On Tue, Jan 29, 2019 at 02:18:13PM +0800, Jacob Wen wrote:
>> Use pskb_may_pull() to make sure the optional fields are in skb linear
>> parts, so we can safely read them in l2tp_recv_common.
>>
> Looks fine to me. Just a few nitpicks. Not sure if they're worth a repost.
> But if you send a v4, you can:
>    * Add the proper Fixes tag.
>    * Drop 'net:' from the subsystem prefix ('l2tp:' is enough).
>    * Move your patch history inside the commit description.
>    * Keep my Acked-by tag.
Done in v4.
>> diff --git a/net/l2tp/l2tp_core.c b/net/l2tp/l2tp_core.c
>> index 26f1d435696a..82c28008b438 100644
>> --- a/net/l2tp/l2tp_core.c
>> +++ b/net/l2tp/l2tp_core.c
>> @@ -884,6 +884,10 @@ static int l2tp_udp_recv_core(struct l2tp_tunnel *tunnel, struct sk_buff *skb)
>>   		goto error;
>>   	}
>>   
>> +	if (tunnel->version != L2TP_HDR_VER_2 &&
>>
> Using tunnel->version == L2TP_HDR_VER_3 would have been clearer.
Ditto.
>
>> diff --git a/net/l2tp/l2tp_core.h b/net/l2tp/l2tp_core.h
>> index 9c9afe94d389..870f8ccf95f7 100644
>> --- a/net/l2tp/l2tp_core.h
>> +++ b/net/l2tp/l2tp_core.h
>> @@ -301,6 +301,27 @@ static inline bool l2tp_tunnel_uses_xfrm(const struct l2tp_tunnel *tunnel)
>>   }
>>   #endif
>>   
>> +/* Pull optional fields of L2TPv3 */
>> +static inline int l2tp_v3_pull_opt(struct l2tp_session *session, struct sk_buff *skb,
>>
> The comment and function name are a bit misleading: nothing is pulled
> here.
>
You are right. The misleading name is inherited from pskb_may_pull.
s/l2tp_v3_pull_opt/l2tp_v3_ensure_opt_in_linear/ in v4.
>
> BTW, Do you plan to also fix L2TPv2?
> It looks like defining L2TP_HDR_SIZE_MAX to 14 (size of L2TPv2 header
> with all optional fields) and using it in place of L2TP_HDR_SIZE_SEQ in
> l2tp_udp_recv_core() should be enough.
I will do that with your Suggested-by. Thanks.
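
For reference, the figure of 14 bytes can be checked by adding up the
L2TPv2 header fields. The sketch below assumes the flag bits from
RFC 2661 section 3.1; l2tpv2_hdr_len() is a hypothetical helper and not
a function in l2tp_core.c, and it ignores any offset padding.

```c
#include <stdint.h>

/* L2TPv2 header flag bits (RFC 2661, section 3.1). */
#define L2TP_HDRFLAG_L	0x4000	/* Length field present */
#define L2TP_HDRFLAG_S	0x0800	/* Ns and Nr fields present */
#define L2TP_HDRFLAG_O	0x0200	/* Offset Size field present */

/* Hypothetical helper: bytes of fixed plus optional header fields,
 * excluding any offset padding that may follow the Offset Size field.
 */
static unsigned int l2tpv2_hdr_len(uint16_t flags)
{
	/* flags/version, tunnel ID and session ID are always present */
	unsigned int len = 2 + 2 + 2;

	if (flags & L2TP_HDRFLAG_L)
		len += 2;	/* Length */
	if (flags & L2TP_HDRFLAG_S)
		len += 4;	/* Ns + Nr */
	if (flags & L2TP_HDRFLAG_O)
		len += 2;	/* Offset Size */

	return len;
}
```

With L, S and O all set the result is 14, and with none set it is 6.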

-- 
Jacob


^ permalink raw reply

* Re: [PATCH] ipmr: ip6mr: Create new sockopt to clear mfc cache only
From: kbuild test robot @ 2019-01-30  7:13 UTC (permalink / raw)
  To: Callum Sinclair
  Cc: kbuild-all, davem, kuznet, yoshfuji, nikolay, netdev,
	linux-kernel, Callum Sinclair
In-Reply-To: <20190130022509.25303-2-callum.sinclair@alliedtelesis.co.nz>

[-- Attachment #1: Type: text/plain, Size: 4146 bytes --]

Hi Callum,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on net/master]
[also build test ERROR on v5.0-rc4 next-20190129]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Callum-Sinclair/ipmr-ip6mr-Create-new-sockopt-to-clear-mfc-cache-only/20190130-104146
config: i386-defconfig (attached as .config)
compiler: gcc-8 (Debian 8.2.0-14) 8.2.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

   net/ipv4/ipmr.c: In function 'mroute_clean_cache':
>> net/ipv4/ipmr.c:1312:3: error: 'cache' undeclared (first use in this function); did you mean 'hh_cache'?
      cache = (struct mfc_cache *)c;
      ^~~~~
      hh_cache
   net/ipv4/ipmr.c:1312:3: note: each undeclared identifier is reported only once for each function it appears in
>> net/ipv4/ipmr.c:1313:33: error: 'net' undeclared (first use in this function)
      call_ipmr_mfc_entry_notifiers(net, FIB_EVENT_ENTRY_DEL, cache,
                                    ^~~
   net/ipv4/ipmr.c: In function 'mroute_clean_tables':
   net/ipv4/ipmr.c:1334:14: warning: unused variable 'net' [-Wunused-variable]
     struct net *net = read_pnet(&mrt->net);
                 ^~~

vim +1312 net/ipv4/ipmr.c

^1da177e4 Linus Torvalds      2005-04-16  1300  
7ba7b80d1 Callum Sinclair     2019-01-30  1301  /* Clear the vif tables */
7ba7b80d1 Callum Sinclair     2019-01-30  1302  static void mroute_clean_cache(struct mr_table *mrt, bool all)
^1da177e4 Linus Torvalds      2005-04-16  1303  {
494fff563 Yuval Mintz         2018-02-28  1304  	struct mr_mfc *c, *tmp;
^1da177e4 Linus Torvalds      2005-04-16  1305  
a8cb16dd9 Eric Dumazet        2010-10-01  1306  	/* Wipe the cache */
8fb472c09 Nikolay Aleksandrov 2017-01-12  1307  	list_for_each_entry_safe(c, tmp, &mrt->mfc_cache_list, list) {
0e615e960 Nikolay Aleksandrov 2015-11-20  1308  		if (!all && (c->mfc_flags & MFC_STATIC))
^1da177e4 Linus Torvalds      2005-04-16  1309  			continue;
8fb472c09 Nikolay Aleksandrov 2017-01-12  1310  		rhltable_remove(&mrt->mfc_hash, &c->mnode, ipmr_rht_params);
a8c9486b8 Eric Dumazet        2010-10-01  1311  		list_del_rcu(&c->list);
494fff563 Yuval Mintz         2018-02-28 @1312  		cache = (struct mfc_cache *)c;
494fff563 Yuval Mintz         2018-02-28 @1313  		call_ipmr_mfc_entry_notifiers(net, FIB_EVENT_ENTRY_DEL, cache,
b362053a7 Yotam Gigi          2017-09-27  1314  					      mrt->id);
494fff563 Yuval Mintz         2018-02-28  1315  		mroute_netlink_event(mrt, cache, RTM_DELROUTE);
8c13af2a2 Yuval Mintz         2018-03-26  1316  		mr_cache_put(c);
^1da177e4 Linus Torvalds      2005-04-16  1317  	}
^1da177e4 Linus Torvalds      2005-04-16  1318  
0c12295a7 Patrick McHardy     2010-04-13  1319  	if (atomic_read(&mrt->cache_resolve_queue_len) != 0) {
^1da177e4 Linus Torvalds      2005-04-16  1320  		spin_lock_bh(&mfc_unres_lock);
8fb472c09 Nikolay Aleksandrov 2017-01-12  1321  		list_for_each_entry_safe(c, tmp, &mrt->mfc_unres_queue, list) {
862465f2e Patrick McHardy     2010-04-13  1322  			list_del(&c->list);
494fff563 Yuval Mintz         2018-02-28  1323  			cache = (struct mfc_cache *)c;
494fff563 Yuval Mintz         2018-02-28  1324  			mroute_netlink_event(mrt, cache, RTM_DELROUTE);
494fff563 Yuval Mintz         2018-02-28  1325  			ipmr_destroy_unres(mrt, cache);
^1da177e4 Linus Torvalds      2005-04-16  1326  		}
^1da177e4 Linus Torvalds      2005-04-16  1327  		spin_unlock_bh(&mfc_unres_lock);
^1da177e4 Linus Torvalds      2005-04-16  1328  	}
^1da177e4 Linus Torvalds      2005-04-16  1329  }
^1da177e4 Linus Torvalds      2005-04-16  1330  

:::::: The code at line 1312 was first introduced by commit
:::::: 494fff56379c4ad5b8fe36a5b7ffede4044ca7bb ipmr, ip6mr: Make mfc_cache a common structure

:::::: TO: Yuval Mintz <yuvalm@mellanox.com>
:::::: CC: David S. Miller <davem@davemloft.net>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 26942 bytes --]

^ permalink raw reply

* [PATCH net-next] strparser: Return if socket does not have required number of bytes
From: Vakul Garg @ 2019-01-30  7:31 UTC (permalink / raw)
  To: netdev@vger.kernel.org
  Cc: borisp@mellanox.com, aviadye@mellanox.com, davejwatson@fb.com,
	davem@davemloft.net, doronrk@fb.com, Vakul Garg

Function strp_data_ready() should peek the associated socket to check
whether it has the required number of bytes available before queueing
work or initiating socket read via strp_read_sock(). This saves cpu
cycles because strp_read_sock() is called only when required amount of
data is available.

Signed-off-by: Vakul Garg <vakul.garg@nxp.com>
---
 net/strparser/strparser.c | 13 ++++++++-----
 1 file changed, 8 insertions(+), 5 deletions(-)

diff --git a/net/strparser/strparser.c b/net/strparser/strparser.c
index da1a676860ca..38f8d8d8f4ad 100644
--- a/net/strparser/strparser.c
+++ b/net/strparser/strparser.c
@@ -384,6 +384,14 @@ void strp_data_ready(struct strparser *strp)
 	if (unlikely(strp->stopped) || strp->paused)
 		return;
 
+	/* If the socket does not contain the number of bytes required by
+	 * stream parser context to proceed, return silently.
+	 */
+	if (strp->need_bytes) {
+		if (strp_peek_len(strp) < strp->need_bytes)
+			return;
+	}
+
 	/* This check is needed to synchronize with do_strp_work.
 	 * do_strp_work acquires a process lock (lock_sock) whereas
 	 * the lock held here is bh_lock_sock. The two locks can be
@@ -396,11 +404,6 @@ void strp_data_ready(struct strparser *strp)
 		return;
 	}
 
-	if (strp->need_bytes) {
-		if (strp_peek_len(strp) < strp->need_bytes)
-			return;
-	}
-
 	if (strp_read_sock(strp) == -ENOMEM)
 		queue_work(strp_wq, &strp->work);
 }
-- 
2.13.6
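
The effect of moving the need_bytes check can be modelled outside the
kernel. The following is a simplified sketch of the decision order after
the patch, not the strparser code itself: struct parser_model, enum
action and data_ready() are hypothetical names, and "queued" stands in
for the value strp_peek_len() would return.

```c
#include <stddef.h>

/* Hypothetical, simplified stand-in for struct strparser. */
struct parser_model {
	size_t need_bytes;	/* bytes still needed for the next message */
	int stopped;
	int paused;
};

enum action { ACT_RETURN, ACT_QUEUE_WORK, ACT_READ };

/* Decision order after the patch: the byte-count check runs before the
 * socket-ownership check, so no work is queued while data is short.
 */
static enum action data_ready(const struct parser_model *p, size_t queued,
			      int owned_by_user)
{
	if (p->stopped || p->paused)
		return ACT_RETURN;

	if (p->need_bytes && queued < p->need_bytes)
		return ACT_RETURN;

	if (owned_by_user)
		return ACT_QUEUE_WORK;

	return ACT_READ;
}
```

Before the patch, a short socket owned by the user would have reached
the queue-work branch; in this model it returns without doing anything.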


^ permalink raw reply related

* Re: [PATCH net-next v2 01/12] net: bridge: multicast: Propagate br_mc_disabled_update() return
From: Ido Schimmel @ 2019-01-30  7:36 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev@vger.kernel.org, andrew@lunn.ch, vivien.didelot@gmail.com,
	davem@davemloft.net, Jiri Pirko, ilias.apalodimas@linaro.org,
	ivan.khoronzhuk@linaro.org, roopa@cumulusnetworks.com,
	nikolay@cumulusnetworks.com
In-Reply-To: <20190130005548.2212-2-f.fainelli@gmail.com>

On Tue, Jan 29, 2019 at 04:55:37PM -0800, Florian Fainelli wrote:
> Some Ethernet switches might not be able to support disabling multicast
> flooding globally when e.g: several bridges span the same physical
> device, propagate the return value of br_mc_disabled_update() such that
> this propagates correctly to user-space.
> 
> Signed-off-by: Florian Fainelli <f.fainelli@gmail.com>
> ---
>  net/bridge/br_multicast.c | 23 ++++++++++++++++-------
>  1 file changed, 16 insertions(+), 7 deletions(-)
> 
> diff --git a/net/bridge/br_multicast.c b/net/bridge/br_multicast.c
> index 3aeff0895669..aff5e003d34f 100644
> --- a/net/bridge/br_multicast.c
> +++ b/net/bridge/br_multicast.c
> @@ -813,20 +813,22 @@ static void br_ip6_multicast_port_query_expired(struct timer_list *t)
>  }
>  #endif
>  
> -static void br_mc_disabled_update(struct net_device *dev, bool value)
> +static int br_mc_disabled_update(struct net_device *dev, bool value)
>  {
>  	struct switchdev_attr attr = {
>  		.orig_dev = dev,
>  		.id = SWITCHDEV_ATTR_ID_BRIDGE_MC_DISABLED,
> -		.flags = SWITCHDEV_F_DEFER,
> +		.flags = SWITCHDEV_F_DEFER | SWITCHDEV_F_SKIP_EOPNOTSUPP,

Actually, since the operation is deferred I don't think the return value
from the driver is ever checked. Can you test it?

I think it would be good to convert the attributes to use the switchdev
notifier like commit d17d9f5e5143 ("switchdev: Replace port obj add/del
SDO with a notification") did for objects. Then you can have your
listener veto the operation in the same context it is happening.
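
To illustrate the concern about deferred operations, here is a toy model
in plain C. It is not switchdev code: attr_set(), process_deferred() and
driver_set_attr() are hypothetical names, and the only point is that a
deferred call reports success at enqueue time, before the driver runs.

```c
#include <errno.h>
#include <stddef.h>

#define MAX_DEFERRED 8

typedef int (*attr_fn)(int value);

/* Hypothetical queue of deferred attribute operations. */
static struct {
	attr_fn fn;
	int value;
} deferred[MAX_DEFERRED];
static size_t n_deferred;

/* Hypothetical driver hook that always rejects the attribute. */
static int driver_set_attr(int value)
{
	(void)value;
	return -EOPNOTSUPP;
}

/* Immediate mode returns the driver's verdict. Deferred mode can only
 * report whether the operation was queued.
 */
static int attr_set(attr_fn fn, int value, int defer)
{
	if (!defer)
		return fn(value);

	if (n_deferred == MAX_DEFERRED)
		return -ENOMEM;

	deferred[n_deferred].fn = fn;
	deferred[n_deferred].value = value;
	n_deferred++;
	return 0;
}

/* Runs later, in another context; the original caller has already
 * returned, so this error has nobody to go back to.
 */
static int process_deferred(void)
{
	int err = 0;
	size_t i;

	for (i = 0; i < n_deferred; i++) {
		int ret = deferred[i].fn(deferred[i].value);

		if (ret)
			err = ret;
	}
	n_deferred = 0;
	return err;
}
```

In this model the deferred attr_set() returns 0 even though the driver
later fails with -EOPNOTSUPP, which is why a veto has to happen in the
same context as the request.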

>  		.u.mc_disabled = !value,
>  	};
>  
> -	switchdev_port_attr_set(dev, &attr);
> +	return switchdev_port_attr_set(dev, &attr);
>  }
>  
>  int br_multicast_add_port(struct net_bridge_port *port)
>  {
> +	int ret;
> +
>  	port->multicast_router = MDB_RTR_TYPE_TEMP_QUERY;
>  
>  	timer_setup(&port->multicast_router_timer,
> @@ -837,8 +839,11 @@ int br_multicast_add_port(struct net_bridge_port *port)
>  	timer_setup(&port->ip6_own_query.timer,
>  		    br_ip6_multicast_port_query_expired, 0);
>  #endif
> -	br_mc_disabled_update(port->dev,
> -			      br_opt_get(port->br, BROPT_MULTICAST_ENABLED));
> +	ret = br_mc_disabled_update(port->dev,
> +				    br_opt_get(port->br,
> +					       BROPT_MULTICAST_ENABLED));
> +	if (ret)
> +		return ret;
>  
>  	port->mcast_stats = netdev_alloc_pcpu_stats(struct bridge_mcast_stats);
>  	if (!port->mcast_stats)
> @@ -1937,12 +1942,16 @@ static void br_multicast_start_querier(struct net_bridge *br,
>  int br_multicast_toggle(struct net_bridge *br, unsigned long val)
>  {
>  	struct net_bridge_port *port;
> +	int err = 0;
>  
>  	spin_lock_bh(&br->multicast_lock);
>  	if (!!br_opt_get(br, BROPT_MULTICAST_ENABLED) == !!val)
>  		goto unlock;
>  
> -	br_mc_disabled_update(br->dev, val);
> +	err = br_mc_disabled_update(br->dev, val);
> +	if (err)
> +		goto unlock;
> +
>  	br_opt_toggle(br, BROPT_MULTICAST_ENABLED, !!val);
>  	if (!br_opt_get(br, BROPT_MULTICAST_ENABLED))
>  		goto unlock;
> @@ -1957,7 +1966,7 @@ int br_multicast_toggle(struct net_bridge *br, unsigned long val)
>  unlock:
>  	spin_unlock_bh(&br->multicast_lock);
>  
> -	return 0;
> +	return err;
>  }
>  
>  bool br_multicast_enabled(const struct net_device *dev)
> -- 
> 2.17.1
> 

^ permalink raw reply

* Re: [PATCH net-next 02/24] sctp: use SCTP_FUTURE_ASSOC for SCTP_PEER_ADDR_PARAMS sockopt
From: Neil Horman @ 2019-01-30  7:37 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, davem
In-Reply-To: <CADvbK_fPbz+Rso6ADXh2PpUYoMRtHnbJdu4VygckAjDiW1MzSw@mail.gmail.com>

On Wed, Jan 30, 2019 at 03:03:01PM +0800, Xin Long wrote:
> On Wed, Jan 30, 2019 at 5:25 AM Neil Horman <nhorman@tuxdriver.com> wrote:
> >
> > On Mon, Jan 28, 2019 at 03:08:24PM +0800, Xin Long wrote:
> > > Check with SCTP_FUTURE_ASSOC instead in
> > > sctp_/setgetsockopt_peer_addr_params, it's compatible with 0.
> > >
> > > Signed-off-by: Xin Long <lucien.xin@gmail.com>
> > > ---
> > >  net/sctp/socket.c | 18 ++++++++++--------
> > >  1 file changed, 10 insertions(+), 8 deletions(-)
> > >
> > > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > > index a52d132..4c43b95 100644
> > > --- a/net/sctp/socket.c
> > > +++ b/net/sctp/socket.c
> > > @@ -2750,12 +2750,13 @@ static int sctp_setsockopt_peer_addr_params(struct sock *sk,
> > >                       return -EINVAL;
> > >       }
> > >
> > > -     /* Get association, if assoc_id != 0 and the socket is a one
> > > -      * to many style socket, and an association was not found, then
> > > -      * the id was invalid.
> > > +     /* Get association, if assoc_id != SCTP_FUTURE_ASSOC and the
> > > +      * socket is a one to many style socket, and an association
> > > +      * was not found, then the id was invalid.
> > >        */
> > >       asoc = sctp_id2assoc(sk, params.spp_assoc_id);
> > > -     if (!asoc && params.spp_assoc_id && sctp_style(sk, UDP))
> > > +     if (!asoc && params.spp_assoc_id != SCTP_FUTURE_ASSOC &&
> > Sorry to follow up, but I misspoke in my previous email, I should have said, why
> > do we only allow future associations as the only special case association id
> > here?  Since the function is meant to set a specific association id, it seems to
> > me that you would want to:
> >
> > a) allow setting of a specific id
> > b) allow setting of all association ids on the socket
> > (SCTP_CURRENT_ASSOC)
> > c) allow recording of a set of params to apply to all current and future
> > associations (FUTURE/ALL).
> >
> > (a) is already handled clearly, but (b) and (c) require more work on this
> > function than just checking association id on entry.
> Hi, Neil,
> 
> Note that not all sockopts support both of them: some sockopts
> cannot be applied to current assocs, and some cannot be applied
> to future assocs.
> 
> SCTP_FUTURE_ASSOC means the sock's xxxx only.
> SCTP_CURRENT_ASSOC means all sock->asocs' xxxx only.
> 
> If we only check assoc_id != SCTP_FUTURE_ASSOC, it means this
> sockopt doesn't support setting xxxx on all sock->asocs, or
> getting it (as is the case for all sockopt getters).
> If we only check assoc_id != SCTP_CURRENT_ASSOC, it means this
> sockopt doesn't support setting the sock's xxxx (like Patch 11).
> 
> As for SCTP_ALL_ASSOC, it means both the sock's and sock->asocs'
> xxxx, not either of them. So we don't allow it here.
> 
> As you can see, in this patchset:
> 1. for sockopt setting:
>    those that only support FUTURE check SCTP_FUTURE_ASSOC, like Patch 2-10.
>    those that only support CURRENT check SCTP_CURRENT_ASSOC, like Patch 11.
>    those that support both FUTURE and CURRENT check
>    assoc_id > SCTP_ALL_ASSOC, like Patch 12-24.
> 2. for sockopt getting:
>    all sockopts check SCTP_FUTURE_ASSOC (as it's impossible to
>    get all sock->asocs' xxxx).
> 
> >
> > I think this comment may apply to all the socket option functions
> >
> > > +         sctp_style(sk, UDP))
> > >               return -EINVAL;
> > >
> > >       /* Heartbeat demand can only be sent on a transport or
> > > @@ -5676,12 +5677,13 @@ static int sctp_getsockopt_peer_addr_params(struct sock *sk, int len,
> > >               }
> > >       }
> > >
> > > -     /* Get association, if assoc_id != 0 and the socket is a one
> > > -      * to many style socket, and an association was not found, then
> > > -      * the id was invalid.
> > > +     /* Get association, if assoc_id != SCTP_FUTURE_ASSOC and the
> > > +      * socket is a one to many style socket, and an association
> > > +      * was not found, then the id was invalid.
> > >        */
> > >       asoc = sctp_id2assoc(sk, params.spp_assoc_id);
> > > -     if (!asoc && params.spp_assoc_id && sctp_style(sk, UDP)) {
> > > +     if (!asoc && params.spp_assoc_id != SCTP_FUTURE_ASSOC &&
> > > +         sctp_style(sk, UDP)) {
> > >               pr_debug("%s: failed no association\n", __func__);
> > >               return -EINVAL;
> > >       }
> > > --
> > > 2.1.0
> > >
> > >
> 
Ah, ok, apologies for misunderstanding.  I'm apparently still jet lagged.
Neil


^ permalink raw reply

* Re: [PATCH net-next v2 00/12] net: dsa: management mode for bcm_sf2
From: Ido Schimmel @ 2019-01-30  7:38 UTC (permalink / raw)
  To: Florian Fainelli
  Cc: netdev@vger.kernel.org, andrew@lunn.ch, vivien.didelot@gmail.com,
	davem@davemloft.net, Jiri Pirko, ilias.apalodimas@linaro.org,
	ivan.khoronzhuk@linaro.org, roopa@cumulusnetworks.com,
	nikolay@cumulusnetworks.com
In-Reply-To: <20190130005548.2212-1-f.fainelli@gmail.com>

On Tue, Jan 29, 2019 at 04:55:36PM -0800, Florian Fainelli wrote:
> Hi all,
> 
> This patch series does a number of things in order to enable management
> mode for bcm_sf2 (which could be easily extended to b53 with proper
> testing later on). In order to get there, there were several use cases
> that did not work correctly and that needed to be fixed:
> 
> - VLAN devices on top of switch ports not being member of a bridge, with
>   other switch ports being bridged, with the bridge having VLAN
>   filtering enabled.
> 
> - lack of multicast filtering by default on network ports which should
>   be happening in order for the non-bridged DSA ports to behave strictly
>   as Ethernet NICs with proper filtering. This is accomplished by hooking
>   a ndo_set_rx_mode() function to the DSA slave network devices
> 
> - when VLAN filtering is globally enabled on the switch (because at
>   least a bridge device requires it), then we also need to make sure
>   that when doing multicast over VLAN devices over a switch port
>   (bridged or not) happens with the correct MDB address *and* VID
> 
> Hopefully the changes to net/8021q and net/bridge are deemed acceptable.

You're not touching net/8021q :) Probably leftover from v1

...

> 
>  drivers/net/dsa/b53/b53_common.c           | 257 +++++++++++++++++++--
>  drivers/net/dsa/b53/b53_priv.h             |  14 +-
>  drivers/net/dsa/b53/b53_regs.h             |  22 ++
>  drivers/net/dsa/bcm_sf2.c                  |  56 +++--
>  drivers/net/dsa/bcm_sf2_regs.h             |   5 +
>  drivers/net/ethernet/broadcom/bcmsysport.c |   4 +
>  include/net/dsa.h                          |   2 +
>  net/bridge/br_multicast.c                  |  23 +-
>  net/dsa/dsa_priv.h                         |  22 +-
>  net/dsa/port.c                             |  42 +++-
>  net/dsa/slave.c                            | 107 ++++++++-
>  net/dsa/switch.c                           |  57 +++++
>  12 files changed, 552 insertions(+), 59 deletions(-)
> 
> -- 
> 2.17.1
> 

^ permalink raw reply

* Re: [PATCH net] net: b44: replace dev_kfree_skb_xxx by dev_consume_skb_xxx for drop profiles
From: Sergei Shtylyov @ 2019-01-30  7:43 UTC (permalink / raw)
  To: Yang Wei, netdev; +Cc: michael.chan, davem, yang.wei9
In-Reply-To: <1548774280-5757-1-git-send-email-albin_yang@163.com>

Hello!

On 29.01.2019 18:04, Yang Wei wrote:

> From: Yang Wei <yang.wei9@zte.com.cn>
> 
> The skb should be freed by dev_consume_skb_any() in b44_start_xmit()
> when bounce_skb is used. The skb is be replaced by bounce_skb, so the

    s/be/being/?

> original skb should be consumed(not drop).
> 
> dev_consume_skb_irq() should be called in b44_tx() when skb xmit
> done. It makes drop profiles(dropwatch, perf) more friendly.
> 
> Signed-off-by: Yang Wei <yang.wei9@zte.com.cn>
[...]

MBR, Sergei

^ permalink raw reply

* Re: [PATCH net-next 00/24] sctp: support SCTP_FUTURE/CURRENT/ALL_ASSOC
From: Neil Horman @ 2019-01-30  7:46 UTC (permalink / raw)
  To: Xin Long; +Cc: network dev, linux-sctp, Marcelo Ricardo Leitner, davem
In-Reply-To: <cover.1548659198.git.lucien.xin@gmail.com>

On Mon, Jan 28, 2019 at 03:08:22PM +0800, Xin Long wrote:
> This patchset adds the support for 3 assoc_id constants: SCTP_FUTURE_ASSOC
> SCTP_CURRENT_ASSOC, SCTP_ALL_ASSOC, described in rfc6458#section-7.2:
> 
>    All socket options set on a one-to-one style listening socket also
>    apply to all future accepted sockets.  For one-to-many style sockets,
>    often a socket option will pass a structure that includes an assoc_id
>    field.  This field can be filled with the association identifier of a
>    particular association and unless otherwise specified can be filled
>    with one of the following constants:
> 
>    SCTP_FUTURE_ASSOC:  Specifies that only future associations created
>       after this socket option will be affected by this call.
> 
>    SCTP_CURRENT_ASSOC:  Specifies that only currently existing
>       associations will be affected by this call, and future
>       associations will still receive the previous default value.
> 
>    SCTP_ALL_ASSOC:  Specifies that all current and future associations
>       will be affected by this call.
> 
> The functions for many other sockopts that use assoc_id also need to be
> updated accordingly.
> 
> Xin Long (24):
>   sctp: introduce SCTP_FUTURE/CURRENT/ALL_ASSOC
>   sctp: use SCTP_FUTURE_ASSOC for SCTP_PEER_ADDR_PARAMS sockopt
>   sctp: use SCTP_FUTURE_ASSOC for SCTP_RTOINFO sockopt
>   sctp: use SCTP_FUTURE_ASSOC for SCTP_ASSOCINFO sockopt
>   sctp: use SCTP_FUTURE_ASSOC for SCTP_MAXSEG sockopt
>   sctp: use SCTP_FUTURE_ASSOC for SCTP_LOCAL_AUTH_CHUNKS sockopt
>   sctp: add SCTP_FUTURE_ASSOC for SCTP_PEER_ADDR_THLDS sockopt
>   sctp: use SCTP_FUTURE_ASSOC for SCTP_PR_SUPPORTED sockopt
>   sctp: use SCTP_FUTURE_ASSOC for SCTP_RECONFIG_SUPPORTED sockopt
>   sctp: use SCTP_FUTURE_ASSOC for SCTP_INTERLEAVING_SUPPORTED sockopt
>   sctp: add SCTP_CURRENT_ASSOC for SCTP_STREAM_SCHEDULER_VALUE sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_DELAYED_SACK sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_DEFAULT_SEND_PARAM sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_DEFAULT_SNDINFO sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_CONTEXT sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_MAX_BURST sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_AUTH_KEY sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_AUTH_ACTIVE_KEY sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_AUTH_DELETE_KEY sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_AUTH_DEACTIVATE_KEY sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_DEFAULT_PRINFO sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for
>     SCTP_ENABLE_STREAM_RESET sockopt
>   sctp: use SCTP_FUTURE_ASSOC and add SCTP_CURRENT_ASSOC for SCTP_EVENT
>     sockopt
>   sctp: add SCTP_FUTURE_ASOC and SCTP_CURRENT_ASSOC for
>     SCTP_STREAM_SCHEDULER sockopt
> 
>  include/net/sctp/structs.h |   4 +
>  include/uapi/linux/sctp.h  |   4 +
>  net/sctp/associola.c       |   9 +-
>  net/sctp/outqueue.c        |   2 +-
>  net/sctp/socket.c          | 773 ++++++++++++++++++++++++++++++---------------
>  5 files changed, 525 insertions(+), 267 deletions(-)
> 
> -- 
> 2.1.0
> 
> 
Ok, Dave, thank you for waiting on me for this. I've looked at this series, and
after Xin's explanation in reply to my question, I have no issue with this change:

Acked-by: Neil Horman <nhorman@tuxdriver.com>


^ permalink raw reply

* Re: [PATCH] can: mark expected switch fall-throughs
From: Nicolas.Ferre @ 2019-01-30  8:11 UTC (permalink / raw)
  To: gustavo, wg, mkl, davem, alexandre.belloni, Ludovic.Desroches
  Cc: linux-can, netdev, linux-arm-kernel, linux-kernel
In-Reply-To: <20190129180612.GA28650@embeddedor>

On 29/01/2019 at 19:06, Gustavo A. R. Silva wrote:
> In preparation to enabling -Wimplicit-fallthrough, mark switch cases
> where we are expecting to fall through.
> 
> This patch fixes the following warnings:
> 
> drivers/net/can/peak_canfd/peak_pciefd_main.c:668:3: warning: this statement may fall through [-Wimplicit-fallthrough=]
> drivers/net/can/spi/mcp251x.c:875:7: warning: this statement may fall through [-Wimplicit-fallthrough=]
> drivers/net/can/usb/peak_usb/pcan_usb.c:422:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
> drivers/net/can/at91_can.c:895:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
> drivers/net/can/at91_can.c:953:15: warning: this statement may fall through [-Wimplicit-fallthrough=]
> drivers/net/can/usb/peak_usb/pcan_usb.c: In function ‘pcan_usb_decode_error’:
> drivers/net/can/usb/peak_usb/pcan_usb.c:422:6: warning: this statement may fall through [-Wimplicit-fallthrough=]
>     if (n & PCAN_USB_ERROR_BUS_LIGHT) {
>        ^
> drivers/net/can/usb/peak_usb/pcan_usb.c:428:2: note: here
>    case CAN_STATE_ERROR_WARNING:
>    ^~~~
> 
> Warning level 3 was used: -Wimplicit-fallthrough=3
> 
> This patch is part of the ongoing efforts to enabling
> -Wimplicit-fallthrough.
> 
> Notice that in some cases spelling mistakes were fixed.
> In other cases, the /* fall through */ comment is placed
> at the bottom of the case statement, which is what GCC
> is expecting to find.
> 
> Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
> ---
>   drivers/net/can/at91_can.c                    | 6 ++++--

For this one:
Acked-by: Nicolas Ferre <nicolas.ferre@microchip.com>

>   drivers/net/can/peak_canfd/peak_pciefd_main.c | 2 +-
>   drivers/net/can/spi/mcp251x.c                 | 3 ++-
>   drivers/net/can/usb/peak_usb/pcan_usb.c       | 2 +-
>   4 files changed, 8 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/net/can/at91_can.c b/drivers/net/can/at91_can.c
> index d98c69045b17..1718c20f9c99 100644
> --- a/drivers/net/can/at91_can.c
> +++ b/drivers/net/can/at91_can.c
> @@ -902,7 +902,8 @@ static void at91_irq_err_state(struct net_device *dev,
>   				CAN_ERR_CRTL_TX_WARNING :
>   				CAN_ERR_CRTL_RX_WARNING;
>   		}
> -	case CAN_STATE_ERROR_WARNING:	/* fallthrough */
> +		/* fall through */
> +	case CAN_STATE_ERROR_WARNING:
>   		/*
>   		 * from: ERROR_ACTIVE, ERROR_WARNING
>   		 * to  : ERROR_PASSIVE, BUS_OFF
> @@ -951,7 +952,8 @@ static void at91_irq_err_state(struct net_device *dev,
>   		netdev_dbg(dev, "Error Active\n");
>   		cf->can_id |= CAN_ERR_PROT;
>   		cf->data[2] = CAN_ERR_PROT_ACTIVE;
> -	case CAN_STATE_ERROR_WARNING:	/* fallthrough */
> +		/* fall through */
> +	case CAN_STATE_ERROR_WARNING:
>   		reg_idr = AT91_IRQ_ERRA | AT91_IRQ_WARN | AT91_IRQ_BOFF;
>   		reg_ier = AT91_IRQ_ERRP;
>   		break;
> diff --git a/drivers/net/can/peak_canfd/peak_pciefd_main.c b/drivers/net/can/peak_canfd/peak_pciefd_main.c
> index c458d5fdc8d3..e4f4d65a76b4 100644
> --- a/drivers/net/can/peak_canfd/peak_pciefd_main.c
> +++ b/drivers/net/can/peak_canfd/peak_pciefd_main.c
> @@ -668,7 +668,7 @@ static int pciefd_can_probe(struct pciefd_board *pciefd)
>   		pciefd_can_writereg(priv, CANFD_CLK_SEL_80MHZ,
>   				    PCIEFD_REG_CAN_CLK_SEL);
>   
> -		/* fallthough */
> +		/* fall through */
>   	case CANFD_CLK_SEL_80MHZ:
>   		priv->ucan.can.clock.freq = 80 * 1000 * 1000;
>   		break;
> diff --git a/drivers/net/can/spi/mcp251x.c b/drivers/net/can/spi/mcp251x.c
> index e90817608645..17257c73c302 100644
> --- a/drivers/net/can/spi/mcp251x.c
> +++ b/drivers/net/can/spi/mcp251x.c
> @@ -875,7 +875,8 @@ static irqreturn_t mcp251x_can_ist(int irq, void *dev_id)
>   			if (new_state >= CAN_STATE_ERROR_WARNING &&
>   			    new_state <= CAN_STATE_BUS_OFF)
>   				priv->can.can_stats.error_warning++;
> -		case CAN_STATE_ERROR_WARNING:	/* fallthrough */
> +			/* fall through */
> +		case CAN_STATE_ERROR_WARNING:
>   			if (new_state >= CAN_STATE_ERROR_PASSIVE &&
>   			    new_state <= CAN_STATE_BUS_OFF)
>   				priv->can.can_stats.error_passive++;
> diff --git a/drivers/net/can/usb/peak_usb/pcan_usb.c b/drivers/net/can/usb/peak_usb/pcan_usb.c
> index 13238a72a338..eca785532b6b 100644
> --- a/drivers/net/can/usb/peak_usb/pcan_usb.c
> +++ b/drivers/net/can/usb/peak_usb/pcan_usb.c
> @@ -423,7 +423,7 @@ static int pcan_usb_decode_error(struct pcan_usb_msg_context *mc, u8 n,
>   			new_state = CAN_STATE_ERROR_WARNING;
>   			break;
>   		}
> -		/* else: fall through */
> +		/* fall through */
>   
>   	case CAN_STATE_ERROR_WARNING:
>   		if (n & PCAN_USB_ERROR_BUS_HEAVY) {
> 
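For readers unfamiliar with the convention, here is a minimal user-space
sketch of the pattern the patch applies, loosely modelled on the mcp251x
hunk. The enum, struct, and function names are hypothetical stand-ins, not
the driver's definitions. The only point is that the comment is the last
thing before the next case label, which is where the commit message says
GCC expects to find it.

```c
/* Hypothetical stand-ins for the CAN state enum and the stats counters. */
enum toy_can_state {
	TOY_STATE_ERROR_ACTIVE,
	TOY_STATE_ERROR_WARNING,
	TOY_STATE_ERROR_PASSIVE,
	TOY_STATE_BUS_OFF
};

struct toy_can_stats {
	int error_warning;
	int error_passive;
};

/* Count state transitions; ERROR_ACTIVE deliberately runs into the
 * ERROR_WARNING case so that a jump from active to passive or bus-off
 * bumps both counters.
 */
void toy_count_transition(enum toy_can_state old_state,
			  enum toy_can_state new_state,
			  struct toy_can_stats *stats)
{
	switch (old_state) {
	case TOY_STATE_ERROR_ACTIVE:
		if (new_state >= TOY_STATE_ERROR_WARNING &&
		    new_state <= TOY_STATE_BUS_OFF)
			stats->error_warning++;
		/* fall through */
	case TOY_STATE_ERROR_WARNING:
		if (new_state >= TOY_STATE_ERROR_PASSIVE &&
		    new_state <= TOY_STATE_BUS_OFF)
			stats->error_passive++;
		break;
	default:
		break;
	}
}
```

Building this with -Wimplicit-fallthrough=3 should stay quiet, whereas
moving the comment after the case label (as in the old at91_can code) is
the placement the warnings quoted above complain about.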


-- 
Nicolas Ferre

^ permalink raw reply

* Re: INFO: task hung in vhost_init_device_iotlb
From: Dmitry Vyukov @ 2019-01-30  8:12 UTC (permalink / raw)
  To: Michael S. Tsirkin
  Cc: syzbot, Jason Wang, KVM list, LKML, netdev, syzkaller-bugs,
	virtualization
In-Reply-To: <20190129105957-mutt-send-email-mst@kernel.org>

On Tue, Jan 29, 2019 at 5:06 PM Michael S. Tsirkin <mst@redhat.com> wrote:
>
> On Tue, Jan 29, 2019 at 01:22:02AM -0800, syzbot wrote:
> > Hello,
> >
> > syzbot found the following crash on:
> >
> > HEAD commit:    983542434e6b Merge tag 'edac_fix_for_5.0' of git://git.ker..
> > git tree:       upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=17476498c00000
> > kernel config:  https://syzkaller.appspot.com/x/.config?x=505743eba4e4f68
> > dashboard link: https://syzkaller.appspot.com/bug?extid=40e28a8bd59d10ed0c42
> > compiler:       gcc (GCC) 9.0.0 20181231 (experimental)
> >
> > Unfortunately, I don't have any reproducer for this crash yet.
>
> Hmm nothing obvious below. Generic corruption elsewhere?

Hard to say; silent memory corruption is definitely possible.
If nothing obvious stands out, let's wait. Maybe syzbot will come up with
a repro, or we will get more such hangs, so that it becomes possible to
rule out flakes/corruptions.


> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: syzbot+40e28a8bd59d10ed0c42@syzkaller.appspotmail.com
> >
> > protocol 88fb is buggy, dev hsr_slave_1
> > protocol 88fb is buggy, dev hsr_slave_0
> > protocol 88fb is buggy, dev hsr_slave_1
> > protocol 88fb is buggy, dev hsr_slave_0
> > protocol 88fb is buggy, dev hsr_slave_1
> > INFO: task syz-executor5:9417 blocked for more than 140 seconds.
> >       Not tainted 5.0.0-rc3+ #48
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > syz-executor5   D27576  9417   8469 0x00000004
> > Call Trace:
> >  context_switch kernel/sched/core.c:2831 [inline]
> >  __schedule+0x897/0x1e60 kernel/sched/core.c:3472
> >  schedule+0xfe/0x350 kernel/sched/core.c:3516
> > protocol 88fb is buggy, dev hsr_slave_0
> > protocol 88fb is buggy, dev hsr_slave_1
> >  schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3574
> >  __mutex_lock_common kernel/locking/mutex.c:1002 [inline]
> >  __mutex_lock+0xa3b/0x1670 kernel/locking/mutex.c:1072
> >  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
> >  vhost_init_device_iotlb+0x124/0x280 drivers/vhost/vhost.c:1606
> >  vhost_net_set_features drivers/vhost/net.c:1674 [inline]
> >  vhost_net_ioctl+0x1282/0x1c00 drivers/vhost/net.c:1739
> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >  file_ioctl fs/ioctl.c:509 [inline]
> >  do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696
> >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >  do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
> > protocol 88fb is buggy, dev hsr_slave_0
> > protocol 88fb is buggy, dev hsr_slave_1
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x458099
> > Code: Bad RIP value.
> > RSP: 002b:00007efd7ca9bc78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099
> > RDX: 0000000020000080 RSI: 000000004008af00 RDI: 0000000000000003
> > RBP: 000000000073bfa0 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 00007efd7ca9c6d4
> > R13: 00000000004c295b R14: 00000000004d5280 R15: 00000000ffffffff
> > INFO: task syz-executor5:9418 blocked for more than 140 seconds.
> >       Not tainted 5.0.0-rc3+ #48
> > "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > syz-executor5   D27800  9418   8469 0x00000004
> > Call Trace:
> >  context_switch kernel/sched/core.c:2831 [inline]
> >  __schedule+0x897/0x1e60 kernel/sched/core.c:3472
> >  schedule+0xfe/0x350 kernel/sched/core.c:3516
> >  schedule_preempt_disabled+0x13/0x20 kernel/sched/core.c:3574
> >  __mutex_lock_common kernel/locking/mutex.c:1002 [inline]
> >  __mutex_lock+0xa3b/0x1670 kernel/locking/mutex.c:1072
> >  mutex_lock_nested+0x16/0x20 kernel/locking/mutex.c:1087
> >  vhost_net_set_owner drivers/vhost/net.c:1697 [inline]
> >  vhost_net_ioctl+0x426/0x1c00 drivers/vhost/net.c:1754
> >  vfs_ioctl fs/ioctl.c:46 [inline]
> >  file_ioctl fs/ioctl.c:509 [inline]
> >  do_vfs_ioctl+0x107b/0x17d0 fs/ioctl.c:696
> >  ksys_ioctl+0xab/0xd0 fs/ioctl.c:713
> >  __do_sys_ioctl fs/ioctl.c:720 [inline]
> >  __se_sys_ioctl fs/ioctl.c:718 [inline]
> >  __x64_sys_ioctl+0x73/0xb0 fs/ioctl.c:718
> >  do_syscall_64+0x1a3/0x800 arch/x86/entry/common.c:290
> >  entry_SYSCALL_64_after_hwframe+0x49/0xbe
> > RIP: 0033:0x458099
> > Code: Bad RIP value.
> > RSP: 002b:00007efd7ca7ac78 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> > RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000458099
> > RDX: 0000000000000000 RSI: 000040010000af01 RDI: 0000000000000003
> > RBP: 000000000073c040 R08: 0000000000000000 R09: 0000000000000000
> > R10: 0000000000000000 R11: 0000000000000246 R12: 00007efd7ca7b6d4
> > R13: 00000000004c33a4 R14: 00000000004d5e80 R15: 00000000ffffffff
> >
> > Showing all locks held in the system:
> > 1 lock held by khungtaskd/1040:
> >  #0: 00000000b7479fbe (rcu_read_lock){....}, at:
> > debug_show_all_locks+0xc6/0x41d kernel/locking/lockdep.c:4389
> > 1 lock held by rsyslogd/8285:
> >  #0: 000000006d9ccf7d (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1b3/0x1f0
> > fs/file.c:795
> > 2 locks held by getty/8406:
> >  #0: 00000000052e805b (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> > drivers/tty/tty_ldsem.c:341
> >  #1: 00000000b90dc267 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> > 2 locks held by getty/8407:
> >  #0: 000000009fdef632 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> > drivers/tty/tty_ldsem.c:341
> >  #1: 00000000ff2b1a16 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> > 2 locks held by getty/8408:
> >  #0: 00000000e48a8e78 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> > drivers/tty/tty_ldsem.c:341
> >  #1: 000000008fcf2060 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> > 2 locks held by getty/8409:
> >  #0: 0000000063f3f4f5 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> > drivers/tty/tty_ldsem.c:341
> >  #1: 000000001dc973ca (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> > 2 locks held by getty/8410:
> >  #0: 00000000f3c14150 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> > drivers/tty/tty_ldsem.c:341
> >  #1: 000000007987cec5 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> > 2 locks held by getty/8411:
> >  #0: 00000000d04f4305 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> > drivers/tty/tty_ldsem.c:341
> >  #1: 000000003f47e3a6 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> > 2 locks held by getty/8412:
> >  #0: 0000000082430560 (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x33/0x40
> > drivers/tty/tty_ldsem.c:341
> >  #1: 0000000094609d81 (&ldata->atomic_read_lock){+.+.}, at:
> > n_tty_read+0x30a/0x1eb0 drivers/tty/n_tty.c:2154
> > 2 locks held by syz-executor5/9417:
> >  #0: 0000000020a0f0a1 (&dev->mutex#4){+.+.}, at: vhost_net_set_features
> > drivers/vhost/net.c:1668 [inline]
> >  #0: 0000000020a0f0a1 (&dev->mutex#4){+.+.}, at:
> > vhost_net_ioctl+0x204/0x1c00 drivers/vhost/net.c:1739
> >  #1: 00000000a7b5872b (&vq->mutex){+.+.}, at:
> > vhost_init_device_iotlb+0x124/0x280 drivers/vhost/vhost.c:1606
> > 1 lock held by syz-executor5/9418:
> >  #0: 0000000020a0f0a1 (&dev->mutex#4){+.+.}, at: vhost_net_set_owner
> > drivers/vhost/net.c:1697 [inline]
> >  #0: 0000000020a0f0a1 (&dev->mutex#4){+.+.}, at:
> > vhost_net_ioctl+0x426/0x1c00 drivers/vhost/net.c:1754
> > 1 lock held by vhost-9408/9413:
> >
> > =============================================
> >
> > NMI backtrace for cpu 0
> > CPU: 0 PID: 1040 Comm: khungtaskd Not tainted 5.0.0-rc3+ #48
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Call Trace:
> >  __dump_stack lib/dump_stack.c:77 [inline]
> >  dump_stack+0x1db/0x2d0 lib/dump_stack.c:113
> >  nmi_cpu_backtrace.cold+0x63/0xa4 lib/nmi_backtrace.c:101
> >  nmi_trigger_cpumask_backtrace+0x1be/0x236 lib/nmi_backtrace.c:62
> >  arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
> >  trigger_all_cpu_backtrace include/linux/nmi.h:146 [inline]
> >  check_hung_uninterruptible_tasks kernel/hung_task.c:203 [inline]
> >  watchdog+0xbbb/0x1170 kernel/hung_task.c:287
> >  kthread+0x357/0x430 kernel/kthread.c:246
> >  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
> > Sending NMI from CPU 0 to CPUs 1:
> > NMI backtrace for cpu 1
> > CPU: 1 PID: 7 Comm: kworker/u4:0 Not tainted 5.0.0-rc3+ #48
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
> > Google 01/01/2011
> > Workqueue: bat_events batadv_nc_worker
> > RIP: 0010:__sanitizer_cov_trace_const_cmp1+0x15/0x20 kernel/kcov.c:174
> > Code: 00 48 89 e5 48 8b 4d 08 e8 18 ff ff ff 5d c3 66 0f 1f 44 00 00 55 40
> > 0f b6 d6 40 0f b6 f7 bf 01 00 00 00 48 89 e5 48 8b 4d 08 <e8> f6 fe ff ff 5d
> > c3 0f 1f 40 00 55 0f b7 d6 0f b7 f7 bf 03 00 00
> > RSP: 0018:ffff8880a947f8a8 EFLAGS: 00000246
> > RAX: ffff8880a94701c0 RBX: ffff8880a05efc40 RCX: ffffffff87d36c97
> > RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000001
> > RBP: ffff8880a947f8a8 R08: ffff8880a94701c0 R09: ffffed1015ce5b90
> > R10: ffffed1015ce5b8f R11: ffff8880ae72dc7b R12: 0000000000000000
> > R13: 0000000000000000 R14: 000000000000019e R15: dffffc0000000000
> > FS:  0000000000000000(0000) GS:ffff8880ae700000(0000) knlGS:0000000000000000
> > CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > CR2: ffffffffff600400 CR3: 00000000a005a000 CR4: 00000000001426e0
> > DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> > DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> > Call Trace:
> >  rcu_read_unlock include/linux/rcupdate.h:657 [inline]
> >  batadv_nc_purge_orig_hash net/batman-adv/network-coding.c:423 [inline]
> >  batadv_nc_worker+0x2f7/0x920 net/batman-adv/network-coding.c:730
> >  process_one_work+0xd0c/0x1ce0 kernel/workqueue.c:2153
> >  worker_thread+0x143/0x14a0 kernel/workqueue.c:2296
> >  kthread+0x357/0x430 kernel/kthread.c:246
> >  ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:352
> >
> >
> > ---
> > This bug is generated by a bot. It may contain errors.
> > See https://goo.gl/tpsmEJ for more information about syzbot.
> > syzbot engineers can be reached at syzkaller@googlegroups.com.
> >
> > syzbot will keep track of this bug report. See:
> > https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
> > syzbot.
>

^ permalink raw reply

* Re: [RFC PATCH net-next 6/6 v2] net/sched: act_ct: Add tc recirc id set/del support
From: Paul Blakey @ 2019-01-30  8:20 UTC (permalink / raw)
  To: Marcelo Leitner
  Cc: Paul Blakey, Guy Shattah, Aaron Conole, John Hurley, Simon Horman,
	Justin Pettit, Gregory Rose, Eelco Chaudron, Flavio Leitner,
	Florian Westphal, Jiri Pirko, Rashid Khan, Sushil Kulkarni,
	Andy Gospodarek, Roi Dayan, Yossi Kuperman, Or Gerlitz,
	Rony Efraim, davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <20190129181221.GY10660@localhost.localdomain>



On 29/01/2019 20:12, Marcelo Leitner wrote:
> On Tue, Jan 29, 2019 at 10:02:06AM +0200, Paul Blakey wrote:
>> Sets or clears (frees) the skb tc recirc id extension.
>> If used with OVS, OVS can clear this recirc id after it reads it.
>>
>> Signed-off-by: Paul Blakey <paulb@mellanox.com>
>> ---
>>  include/net/tc_act/tc_ct.h        |  2 ++
>>  include/uapi/linux/tc_act/tc_ct.h |  2 ++
>>  net/sched/act_ct.c                | 18 ++++++++++++++++++
>>  3 files changed, 22 insertions(+)
>>
>> diff --git a/include/net/tc_act/tc_ct.h b/include/net/tc_act/tc_ct.h
>> index 4a16375..6ea19d8 100644
>> --- a/include/net/tc_act/tc_ct.h
>> +++ b/include/net/tc_act/tc_ct.h
>> @@ -16,6 +16,8 @@ struct tcf_ct {
>>  	u32 mark_mask;
>>  	u16 zone;
>>  	bool commit;
>> +	uint32_t set_recirc;
>> +	bool del_recirc;
>>  };
>>  
>>  #define to_ct(a) ((struct tcf_ct *)a)
>> diff --git a/include/uapi/linux/tc_act/tc_ct.h b/include/uapi/linux/tc_act/tc_ct.h
>> index 6dbd771..2279a9b 100644
>> --- a/include/uapi/linux/tc_act/tc_ct.h
>> +++ b/include/uapi/linux/tc_act/tc_ct.h
>> @@ -14,6 +14,8 @@ struct tc_ct {
>>  	__u32 labels_mask[4];
>>  	__u32 mark;
>>  	__u32 mark_mask;
>> +	uint32_t set_recirc;
>> +	bool del_recirc;
>>  	bool commit;
>>  };
> 
> Have you considered adding a specific action for this? Asking because
> setting recirc_id can be useful outside of ct context too: decap, set
> recirc_id, reclassify.
> 

Yes, I did. I wasn't sure whether it would be overkill until a use case
besides CT, such as the one you suggest, comes up.

One thing I'm not sure of is who will delete the recirc id. It seems
it's freed only via skb_ext_put(skb) in skb_release_head_state(),
but I might be missing a del in skb_scrub_packet(), like nf_reset() does.
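
To make the control flow of the quoted tcf_ct_act() hunks easier to follow,
here is a small user-space model. Everything in it (toy_pkt, toy_ct_action,
toy_ct_act, the ct_done flag) is a hypothetical stand-in and not kernel API.
It only mirrors what the patch text shows: a "del" action drops the id and
returns before any conntrack work, and a non-zero set_recirc is written
after the conntrack step.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical packet: has_recirc models "the extension is present". */
struct toy_pkt {
	bool has_recirc;
	uint32_t recirc_id;
	bool ct_done;		/* placeholder for "conntrack ran" */
};

/* Hypothetical action parameters, named after the fields in the patch. */
struct toy_ct_action {
	uint32_t set_recirc;	/* 0 means "do not set" */
	bool del_recirc;
};

void toy_ct_act(struct toy_pkt *pkt, const struct toy_ct_action *act)
{
	if (act->del_recirc) {
		/* Early return: the conntrack step is skipped entirely. */
		pkt->has_recirc = false;
		pkt->recirc_id = 0;
		return;
	}

	pkt->ct_done = true;	/* stands in for the conntrack processing */

	if (act->set_recirc) {
		pkt->has_recirc = true;
		pkt->recirc_id = act->set_recirc;
	}
}
```

In this model nothing but an explicit "del" action ever clears the id,
which is the open question above: whether some other path should also
drop it.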

>>  
>> diff --git a/net/sched/act_ct.c b/net/sched/act_ct.c
>> index 61155cc..7822385 100644
>> --- a/net/sched/act_ct.c
>> +++ b/net/sched/act_ct.c
>> @@ -117,6 +117,11 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>>  	u_int8_t family;
>>  	bool cached;
>>  
>> +	if (ca->del_recirc) {
>> +		skb_ext_del(skb, SKB_EXT_TC_RECIRC_ID);
>> +		return ca->tcf_action;
>> +	}
>> +
>>  	/* The conntrack module expects to be working at L3. */
>>  	nh_ofs = skb_network_offset(skb);
>>  	skb_pull_rcsum(skb, nh_ofs);
>> @@ -243,6 +248,15 @@ static int tcf_ct_act(struct sk_buff *skb, const struct tc_action *a,
>>  	skb_postpush_rcsum(skb, skb->data, nh_ofs);
>>  
>>  	spin_unlock(&ca->tcf_lock);
>> +
>> +	if (ca->set_recirc) {
>> +		u32 recirc = ca->set_recirc;
>> +		uint32_t *recircp = skb_ext_add(skb, SKB_EXT_TC_RECIRC_ID);
>> +
>> +		if (recircp)
>> +			*recircp = recirc;
>> +	}
>> +
>>  	return ca->tcf_action;
>>  
>>  drop:
>> @@ -305,6 +319,8 @@ static int tcf_ct_init(struct net *net, struct nlattr *nla,
>>  		ci->net = net;
>>  		ci->commit = parm->commit;
>>  		ci->zone = parm->zone;
>> +		ci->set_recirc = parm->set_recirc;
>> +		ci->del_recirc = parm->del_recirc;
>>  #if !IS_ENABLED(CONFIG_NF_CONNTRACK_MARK)
>>  		if (parm->mark_mask) {
>>  			NL_SET_ERR_MSG_MOD(extack, "Mark not supported by kernel config");
>> @@ -378,6 +394,8 @@ static inline int tcf_ct_dump(struct sk_buff *skb, struct tc_action *a,
>>  	opt.mark_mask = ci->mark_mask,
>>  	memcpy(opt.labels, ci->labels, sizeof(opt.labels));
>>  	memcpy(opt.labels_mask, ci->labels_mask, sizeof(opt.labels_mask));
>> +	opt.set_recirc = ci->set_recirc;
>> +	opt.del_recirc = ci->del_recirc;
>>  
>>  	if (nla_put(skb, TCA_CT_PARMS, sizeof(opt), &opt))
>>  		goto nla_put_failure;
>> -- 
>> 1.8.3.1
>>

^ permalink raw reply

* Re: [RFC PATCH net-next 3/6 v2] net/sched: cls_flower: Add ematch support
From: Paul Blakey @ 2019-01-30  8:40 UTC (permalink / raw)
  To: Marcelo Leitner
  Cc: Paul Blakey, Guy Shattah, Aaron Conole, John Hurley, Simon Horman,
	Justin Pettit, Gregory Rose, Eelco Chaudron, Flavio Leitner,
	Florian Westphal, Jiri Pirko, Rashid Khan, Sushil Kulkarni,
	Andy Gospodarek, Roi Dayan, Yossi Kuperman, Or Gerlitz,
	Rony Efraim, davem@davemloft.net, netdev@vger.kernel.org
In-Reply-To: <20190129180810.GX10660@localhost.localdomain>



On 29/01/2019 20:08, Marcelo Leitner wrote:
> On Tue, Jan 29, 2019 at 10:02:03AM +0200, Paul Blakey wrote:
>> TODO: handle EEXIST.
>>
>> Signed-off-by: Paul Blakey <paulb@mellanox.com>
>> ---
>>  include/uapi/linux/pkt_cls.h |  2 ++
>>  net/sched/cls_flower.c       | 22 ++++++++++++++++++----
>>  2 files changed, 20 insertions(+), 4 deletions(-)
>>
>> diff --git a/include/uapi/linux/pkt_cls.h b/include/uapi/linux/pkt_cls.h
>> index 121f1ef..d848d6d 100644
>> --- a/include/uapi/linux/pkt_cls.h
>> +++ b/include/uapi/linux/pkt_cls.h
>> @@ -506,6 +506,8 @@ enum {
>>  	TCA_FLOWER_KEY_CT_LABELS,
>>  	TCA_FLOWER_KEY_CT_LABELS_MASK,
>>  
>> +	TCA_FLOWER_EMATCHES,
>> +
>>  	__TCA_FLOWER_MAX,
>>  };
>>  
>> diff --git a/net/sched/cls_flower.c b/net/sched/cls_flower.c
>> index bf74a31..f11fda0 100644
>> --- a/net/sched/cls_flower.c
>> +++ b/net/sched/cls_flower.c
>> @@ -104,6 +104,7 @@ struct cls_fl_filter {
>>  	struct rhash_head ht_node;
>>  	struct fl_flow_key mkey;
>>  	struct tcf_exts exts;
>> +	struct tcf_ematch_tree ematches;
>>  	struct tcf_result res;
>>  	struct fl_flow_key key;
>>  	struct list_head list;
>> @@ -332,10 +333,14 @@ static int fl_classify(struct sk_buff *skb, const struct tcf_proto *tp,
>>  		fl_set_masked_key(&skb_mkey, &skb_key, mask);
>>  
>>  		f = fl_lookup(mask, &skb_mkey, &skb_key);
>> -		if (f && !tc_skip_sw(f->flags)) {
>> -			*res = f->res;
>> -			return tcf_exts_exec(skb, &f->exts, res);
>> -		}
>> +		if (!f || tc_skip_sw(f->flags))
>> +			continue;
>> +
>> +		if (!tcf_em_tree_match(skb, &f->ematches, NULL))
>> +			continue;
> 
> Considering just the recirc_id (and not the other fields supported by
> ematch), have you considered integrating recirc_id match on flow
> dissector instead?  It would avoid the matching in 2 steps here and
> benefit from the hashing.
> 

Yes.
Although ematch is a no-op when not used, I would actually have to convert
flower to an rhl hashtable, since we can have the same flower keys but
different ematches (the ematch tree is a pointer, which is why I have the
TODO in the commit msg). Then all similar flows, differing only by the
recirc id ematch, could end up on the same list, and this would be slow.
I'm not sure how realistic this example is, but I agree.

So I'll change it in the next version of the patch, unless someone thinks
differently.

Thanks.
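
A toy user-space sketch of the concern, under the assumption that filters
with an identical key end up chained on one list and the recirc id is only
checked as a second step. All names here (toy_key, toy_filter,
toy_classify) are made up for illustration and have nothing to do with the
actual cls_flower structures.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy flow key; the real flower key has many more fields. */
struct toy_key {
	uint32_t src_ip;
	uint32_t dst_ip;
};

/* One filter: a key plus a recirc id checked after the key matched. */
struct toy_filter {
	struct toy_key key;
	uint32_t recirc_id;		/* stands in for the ematch */
	int classid;
	struct toy_filter *next;	/* filters sharing the same key */
};

static int toy_key_equal(const struct toy_key *a, const struct toy_key *b)
{
	return a->src_ip == b->src_ip && a->dst_ip == b->dst_ip;
}

/* Two-step lookup over one list: first the key, then the recirc id.
 * *steps reports how many list entries had to be visited.
 */
int toy_classify(const struct toy_filter *list, const struct toy_key *key,
		 uint32_t recirc_id, int *steps)
{
	const struct toy_filter *f;

	*steps = 0;
	for (f = list; f; f = f->next) {
		(*steps)++;
		if (!toy_key_equal(&f->key, key))
			continue;
		if (f->recirc_id != recirc_id)
			continue;
		return f->classid;
	}
	return -1;
}
```

With N filters that differ only in recirc id, the walk is linear in N.
If the recirc id were part of the hashed key instead, as suggested, each
of those filters would get its own hash entry and the second step would
disappear.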

>> +
>> +		*res = f->res;
>> +		return tcf_exts_exec(skb, &f->exts, res);
>>  	}
>>  	return -1;
>>  }
>> @@ -388,6 +393,7 @@ static bool fl_mask_put(struct cls_fl_head *head, struct fl_flow_mask *mask,
>>  static void __fl_destroy_filter(struct cls_fl_filter *f)
>>  {
>>  	tcf_exts_destroy(&f->exts);
>> +	tcf_em_tree_destroy(&f->ematches);
>>  	tcf_exts_put_net(&f->exts);
>>  	kfree(f);
>>  }
>> @@ -523,6 +529,7 @@ static void *fl_get(struct tcf_proto *tp, u32 handle)
>>  static const struct nla_policy fl_policy[TCA_FLOWER_MAX + 1] = {
>>  	[TCA_FLOWER_UNSPEC]		= { .type = NLA_UNSPEC },
>>  	[TCA_FLOWER_CLASSID]		= { .type = NLA_U32 },
>> +	[TCA_FLOWER_EMATCHES]		= { .type = NLA_NESTED },
>>  	[TCA_FLOWER_INDEV]		= { .type = NLA_STRING,
>>  					    .len = IFNAMSIZ },
>>  	[TCA_FLOWER_KEY_ETH_DST]	= { .len = ETH_ALEN },
>> @@ -1348,6 +1355,10 @@ static int fl_set_parms(struct net *net, struct tcf_proto *tp,
>>  	if (err < 0)
>>  		return err;
>>  
>> +	err = tcf_em_tree_validate(tp, tb[TCA_FLOWER_EMATCHES], &f->ematches);
>> +	if (err < 0)
>> +		return err;
>> +
>>  	if (tb[TCA_FLOWER_CLASSID]) {
>>  		f->res.classid = nla_get_u32(tb[TCA_FLOWER_CLASSID]);
>>  		tcf_bind_filter(tp, &f->res, base);
>> @@ -2143,6 +2154,9 @@ static int fl_dump(struct net *net, struct tcf_proto *tp, void *fh,
>>  	    nla_put_u32(skb, TCA_FLOWER_CLASSID, f->res.classid))
>>  		goto nla_put_failure;
>>  
>> +	if (tcf_em_tree_dump(skb, &f->ematches, TCA_FLOWER_EMATCHES) < 0)
>> +		goto nla_put_failure;
>> +
>>  	key = &f->key;
>>  	mask = &f->mask->key;
>>  
>> -- 
>> 1.8.3.1
>>

^ permalink raw reply

* Re: [PATCH net-next 00/24] sctp: support SCTP_FUTURE/CURRENT/ALL_ASSOC
From: David Miller @ 2019-01-30  8:44 UTC (permalink / raw)
  To: nhorman; +Cc: lucien.xin, netdev, linux-sctp, marcelo.leitner
In-Reply-To: <20190130074605.GB6120@neilslaptop.think-freely.org>

From: Neil Horman <nhorman@tuxdriver.com>
Date: Wed, 30 Jan 2019 02:46:05 -0500

> OK, Dave, thank you for waiting on me for this. I've looked at this series,
> and after Xin's explanation in answer to my question, I've no issue with this
> change:
> 
> Acked-by: Neil Horman <nhorman@tuxdriver.com>

Awesome, series applied.

^ permalink raw reply
