Netdev List


* [PATCH 05/53] netfilter: expect: Make sure the max_expected limit is effective
From: Pablo Neira Ayuso @ 2017-05-01 10:46 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493635640-24325-1-git-send-email-pablo@netfilter.org>

From: Gao Feng <fgao@ikuai8.com>

Because expecting, a member of nf_conn_help, is a u8, it overflows
after reaching U8_MAX (255). So the limit is not effective when
max_expected is configured above 255 in the expect policy.

Add a check for max_expected and return -EINVAL when it exceeds
the limit.

Signed-off-by: Gao Feng <fgao@ikuai8.com>
Signed-off-by: Pablo Neira Ayuso <pablo@netfilter.org>
---
 include/net/netfilter/nf_conntrack_expect.h | 1 +
 net/netfilter/nf_conntrack_helper.c         | 3 +++
 net/netfilter/nf_conntrack_irc.c            | 6 ++++++
 net/netfilter/nfnetlink_cthelper.c          | 6 ++++++
 4 files changed, 16 insertions(+)

diff --git a/include/net/netfilter/nf_conntrack_expect.h b/include/net/netfilter/nf_conntrack_expect.h
index 65cc2cb005d9..e84df8d3bf37 100644
--- a/include/net/netfilter/nf_conntrack_expect.h
+++ b/include/net/netfilter/nf_conntrack_expect.h
@@ -73,6 +73,7 @@ struct nf_conntrack_expect_policy {
 };
 
 #define NF_CT_EXPECT_CLASS_DEFAULT	0
+#define NF_CT_EXPECT_MAX_CNT		255
 
 int nf_conntrack_expect_pernet_init(struct net *net);
 void nf_conntrack_expect_pernet_fini(struct net *net);
diff --git a/net/netfilter/nf_conntrack_helper.c b/net/netfilter/nf_conntrack_helper.c
index 6dc44d9b4190..752a977e9eef 100644
--- a/net/netfilter/nf_conntrack_helper.c
+++ b/net/netfilter/nf_conntrack_helper.c
@@ -385,6 +385,9 @@ int nf_conntrack_helper_register(struct nf_conntrack_helper *me)
 	BUG_ON(me->expect_class_max >= NF_CT_MAX_EXPECT_CLASSES);
 	BUG_ON(strlen(me->name) > NF_CT_HELPER_NAME_LEN - 1);
 
+	if (me->expect_policy->max_expected > NF_CT_EXPECT_MAX_CNT)
+		return -EINVAL;
+
 	mutex_lock(&nf_ct_helper_mutex);
 	hlist_for_each_entry(cur, &nf_ct_helper_hash[h], hnode) {
 		if (nf_ct_tuple_src_mask_cmp(&cur->tuple, &me->tuple, &mask)) {
diff --git a/net/netfilter/nf_conntrack_irc.c b/net/netfilter/nf_conntrack_irc.c
index 1972a149f958..1a5af4d4af2d 100644
--- a/net/netfilter/nf_conntrack_irc.c
+++ b/net/netfilter/nf_conntrack_irc.c
@@ -243,6 +243,12 @@ static int __init nf_conntrack_irc_init(void)
 		return -EINVAL;
 	}
 
+	if (max_dcc_channels > NF_CT_EXPECT_MAX_CNT) {
+		pr_err("max_dcc_channels must not be more than %u\n",
+		       NF_CT_EXPECT_MAX_CNT);
+		return -EINVAL;
+	}
+
 	irc_exp_policy.max_expected = max_dcc_channels;
 	irc_exp_policy.timeout = dcc_timeout;
 
diff --git a/net/netfilter/nfnetlink_cthelper.c b/net/netfilter/nfnetlink_cthelper.c
index d45558178da5..d5025cc25df3 100644
--- a/net/netfilter/nfnetlink_cthelper.c
+++ b/net/netfilter/nfnetlink_cthelper.c
@@ -150,6 +150,9 @@ nfnl_cthelper_expect_policy(struct nf_conntrack_expect_policy *expect_policy,
 		nla_data(tb[NFCTH_POLICY_NAME]), NF_CT_HELPER_NAME_LEN);
 	expect_policy->max_expected =
 		ntohl(nla_get_be32(tb[NFCTH_POLICY_EXPECT_MAX]));
+	if (expect_policy->max_expected > NF_CT_EXPECT_MAX_CNT)
+		return -EINVAL;
+
 	expect_policy->timeout =
 		ntohl(nla_get_be32(tb[NFCTH_POLICY_EXPECT_TIMEOUT]));
 
@@ -290,6 +293,9 @@ nfnl_cthelper_update_policy_one(const struct nf_conntrack_expect_policy *policy,
 
 	new_policy->max_expected =
 		ntohl(nla_get_be32(tb[NFCTH_POLICY_EXPECT_MAX]));
+	if (new_policy->max_expected > NF_CT_EXPECT_MAX_CNT)
+		return -EINVAL;
+
 	new_policy->timeout =
 		ntohl(nla_get_be32(tb[NFCTH_POLICY_EXPECT_TIMEOUT]));
 
-- 
2.1.4
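
A minimal userspace sketch of the wraparound this patch guards against,
assuming a u8 counter like the one described above. Only
NF_CT_EXPECT_MAX_CNT comes from the patch; the demo_* names are
hypothetical and not kernel API.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Mirrors the constant added by the patch. */
#define NF_CT_EXPECT_MAX_CNT 255

/* The limit only makes sense if it equals the top of the u8 range. */
static_assert(NF_CT_EXPECT_MAX_CNT == UINT8_MAX,
	      "limit must match the u8 counter range");

/* Hypothetical stand-in for the u8 counter in nf_conn_help. */
struct demo_help {
	uint8_t expecting;
};

/* Reject a configured limit that the u8 counter can never reach. */
static int demo_check_max_expected(unsigned int max_expected)
{
	if (max_expected > NF_CT_EXPECT_MAX_CNT)
		return -EINVAL;
	return 0;
}

/* Add n expectations and return the counter value afterwards. */
static unsigned int demo_count_after(unsigned int n)
{
	struct demo_help help = { .expecting = 0 };
	unsigned int i;

	for (i = 0; i < n; i++)
		help.expecting++;	/* wraps from 255 back to 0 */
	return help.expecting;
}
```

With a limit above 255 the counter wraps before it can ever compare
greater than the limit, which is why the patch rejects such values up
front.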


^ permalink raw reply related

* [PATCH 02/53] netfilter: ipvs: Replace kzalloc with kcalloc.
From: Pablo Neira Ayuso @ 2017-05-01 10:46 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493635640-24325-1-git-send-email-pablo@netfilter.org>

From: Varsha Rao <rvarsha016@gmail.com>

Replace kzalloc with kcalloc, as kcalloc is preferred over kzalloc for
allocating an array. This patch fixes the checkpatch issue.

Signed-off-by: Varsha Rao <rvarsha016@gmail.com>
---
 net/netfilter/ipvs/ip_vs_sync.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_sync.c b/net/netfilter/ipvs/ip_vs_sync.c
index b03c28084f81..30d6b2cc00a0 100644
--- a/net/netfilter/ipvs/ip_vs_sync.c
+++ b/net/netfilter/ipvs/ip_vs_sync.c
@@ -1849,7 +1849,7 @@ int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c,
 	if (state == IP_VS_STATE_MASTER) {
 		struct ipvs_master_sync_state *ms;
 
-		ipvs->ms = kzalloc(count * sizeof(ipvs->ms[0]), GFP_KERNEL);
+		ipvs->ms = kcalloc(count, sizeof(ipvs->ms[0]), GFP_KERNEL);
 		if (!ipvs->ms)
 			goto out;
 		ms = ipvs->ms;
@@ -1862,7 +1862,7 @@ int start_sync_thread(struct netns_ipvs *ipvs, struct ipvs_sync_daemon_cfg *c,
 			ms->ipvs = ipvs;
 		}
 	} else {
-		array = kzalloc(count * sizeof(struct task_struct *),
+		array = kcalloc(count, sizeof(struct task_struct *),
 				GFP_KERNEL);
 		if (!array)
 			goto out;
-- 
2.1.4
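
One common reason the two-argument form is preferred for arrays is that
it can check the count * size product for overflow. A rough userspace
sketch of that idea follows. The demo_* helpers are hypothetical and only
approximate what an array allocator may do; they are not the kernel
implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Zeroed array allocation that refuses a count * size product that
 * would overflow size_t (hypothetical analogue of an array allocator). */
static void *demo_calloc_checked(size_t n, size_t size)
{
	void *p;

	assert(size > 0);		/* element types are never empty */
	if (n > SIZE_MAX / size)
		return NULL;		/* product would wrap */
	p = malloc(n * size);
	if (p)
		memset(p, 0, n * size);
	return p;
}

/* The open-coded product a single-size allocator call would be given. */
static size_t demo_unchecked_bytes(size_t n, size_t size)
{
	return n * size;		/* wraps silently on overflow */
}
```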


^ permalink raw reply related

* [PATCH 01/53] netfilter: ipvs: don't check for presence of nat extension
From: Pablo Neira Ayuso @ 2017-05-01 10:46 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev
In-Reply-To: <1493635640-24325-1-git-send-email-pablo@netfilter.org>

From: Florian Westphal <fw@strlen.de>

Check for the NAT status bits instead. They are set once conntrack needs NAT in
the source or reply direction. This is slightly faster than nfct_nat(), which has
to check the extension area.

Signed-off-by: Florian Westphal <fw@strlen.de>
---
 net/netfilter/ipvs/ip_vs_ftp.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index d30c327bb578..2e2bf7428cd1 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -260,7 +260,7 @@ static int ip_vs_ftp_out(struct ip_vs_app *app, struct ip_vs_conn *cp,
 		buf_len = strlen(buf);
 
 		ct = nf_ct_get(skb, &ctinfo);
-		if (ct && !nf_ct_is_untracked(ct) && nfct_nat(ct)) {
+		if (ct && !nf_ct_is_untracked(ct) && (ct->status & IPS_NAT_MASK)) {
 			/* If mangling fails this function will return 0
 			 * which will cause the packet to be dropped.
 			 * Mangling can only fail under memory pressure,
-- 
2.1.4
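
A small sketch of the status-bit test used above, written for userspace.
The DEMO_* constants and struct demo_conn are hypothetical stand-ins; the
bit positions are an assumption chosen to mirror the conntrack status
enum and should be checked against the uapi header.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Assumed bit positions, for illustration only. */
enum {
	DEMO_IPS_SRC_NAT_BIT = 4,
	DEMO_IPS_DST_NAT_BIT = 5,
	DEMO_IPS_SRC_NAT = 1 << DEMO_IPS_SRC_NAT_BIT,
	DEMO_IPS_DST_NAT = 1 << DEMO_IPS_DST_NAT_BIT,
	DEMO_IPS_NAT_MASK = DEMO_IPS_SRC_NAT | DEMO_IPS_DST_NAT,
};

static_assert(DEMO_IPS_NAT_MASK == 0x30, "mask covers both NAT bits");

/* Hypothetical stand-in for a conntrack entry. */
struct demo_conn {
	unsigned long status;
};

/* One load and one AND, with no walk of an extension area. */
static bool demo_needs_nat(const struct demo_conn *ct)
{
	return ct != NULL && (ct->status & DEMO_IPS_NAT_MASK) != 0;
}
```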


^ permalink raw reply related

* [PATCH 00/53] Netfilter/IPVS updates for net-next
From: Pablo Neira Ayuso @ 2017-05-01 10:46 UTC (permalink / raw)
  To: netfilter-devel; +Cc: davem, netdev

Hi David,

The following patchset contains Netfilter updates for your net-next
tree, mostly a large bunch of code cleanups. They are:

1) Check for ct->status bit instead of using nfct_nat() from IPVS and
   Netfilter codebase, patch from Florian Westphal.

2) Use kcalloc() wherever possible in the IPVS code, from Varsha Rao.

3) Simplify FTP IPVS helper module registration path, from Arushi Singhal.

4) Introduce nft_is_base_chain() helper function.

5) Enforce expectation limit from userspace conntrack helper,
   from Gao Feng.

6) Add nf_ct_remove_expect() helper function, from Gao Feng.

7) NAT mangle helper function return boolean, from Gao Feng.

8) ctnetlink_alloc_expect() should only work for conntrack with
   helpers, from Gao Feng.

9) Add nfnl_msg_type() helper function to nfnetlink to build the
   netlink message type.

10) Get rid of unnecessary cast on void, from simran singhal.

11) Use seq_puts()/seq_putc() instead of seq_printf() where possible,
    also from simran singhal.

12) Use list_prev_entry() from nf_tables, from simran singhal.

13) Remove unnecessary & on pointer function in the Netfilter and IPVS
    code.

14) Remove obsolete comment on set of rules per CPU in ip6_tables,
    no longer true. From Arushi Singhal.

15) Remove duplicated nf_conntrack_l4proto_udplite4, from Gao Feng.

16) Remove unnecessary nested rcu_read_lock() in
    __nf_nat_decode_session(). Code running from hooks is already
    guaranteed to run under RCU read side.

17) Remove deadcode in nf_tables_getobj(), from Aaron Conole.

18) Remove double assignment in nf_ct_l4proto_pernet_unregister_one(),
    also from Aaron.

19) Get rid of unused __ip_set_get_netlink(), from Aaron Conole.

20) Don't propagate NF_DROP error to userspace via ctnetlink in
    __nf_nat_alloc_null_binding() function, from Gao Feng.

21) Revisit nf_ct_deliver_cached_events() to remove unnecessary checks,
    from Gao Feng.

22) Kill the fake untracked conntrack objects, use ctinfo instead to
    annotate that a conntrack object is untracked, from Florian Westphal.

23) Remove nf_ct_is_untracked(), now obsolete since we have no
    conntrack template anymore, from Florian.

24) Add event mask support to nft_ct, also from Florian.

25) Move nf_conn_help structure to
    include/net/netfilter/nf_conntrack_helper.h.

26) Add a fixed 32 bytes scratchpad area for conntrack helpers.
    Thus, we don't deal with variable conntrack extensions anymore.
    Make sure userspace conntrack helper doesn't go over that size.
    Remove variable size ct extension infrastructure now that this code
    has no more clients. From Florian Westphal.

27) Restore offset and length of nf_ct_ext structure to 8 bytes now
    that wraparound is not possible any longer, also from Florian.

28) Allow to get rid of unassured flows under stress in conntrack,
    this applies to DCCP, SCTP and TCP protocols, from Florian.

29) Shrink size of nf_conntrack_ecache structure, from Florian.

30) Use TCP_MAX_WSCALE instead of hardcoded 14 in TCP tracker,
    from Gao Feng.

31) Register SYNPROXY hooks on demand, from Florian Westphal.

32) Use pernet hook whenever possible, instead of global hook
    registration, from Florian Westphal.

33) Pass hook structure to ebt_register_table() to consolidate some
    infrastructure code, from Florian Westphal.

34) Use consume_skb() and return NF_STOLEN, instead of NF_DROP in the
    SYNPROXY code, to make sure device stats are not fooled, patch
    from Gao Feng.

35) Remove NF_CT_EXT_F_PREALLOC. This kills quite some code that we
    don't need anymore if we just select a fixed size instead of an
    expensive runtime calculation. From Florian.

36) Constify nf_ct_extend_register() and nf_ct_extend_unregister(),
    from Florian.

37) Simplify nf_ct_ext_add(), this kills nf_ct_ext_create(), from
    Florian.

38) Attach NAT extension on-demand from masquerade and pptp helper
    path, from Florian.

39) Get rid of useless ip_vs_set_state_timeout(), from Aaron Conole.

40) Speed up netns by selective calls of synchronize_net(), from
    Florian Westphal.

41) Silence gcc stack size warning on 32-bit arches in snmp helper,
    from Florian.

42) Unconditionally call nf_ct_ext_destroy(), even if we have no
    extensions, to deal with the NF_NAT_MANIP_SRC case. Patch from
    Liping Zhang.

You can pull these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git

Thanks!

----------------------------------------------------------------

The following changes since commit 6f14f443d3e773439fb9cc6f2685ba90d5d026c5:

  Merge git://git.kernel.org/pub/scm/linux/kernel/git/davem/net (2017-04-06 08:24:51 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git HEAD

for you to fetch changes up to 8eeef2350453aa012d846457eb6ecd012a35d99b:

  netfilter: nf_ct_ext: invoke destroy even when ext is not attached (2017-05-01 11:48:49 +0200)

----------------------------------------------------------------
Aaron Conole (5):
      netfilter: nf_tables: remove double return statement
      netfilter: nf_conntrack: remove double assignment
      ipset: remove unused function __ip_set_get_netlink
      ipvs: remove unused function ip_vs_set_state_timeout
      ipvs: change comparison on sync_refresh_period

Arushi Singhal (3):
      ipvs: remove unused variable
      netfilter: Remove exceptional & on function name
      netfilter: ip6_tables: Remove unneccessary comments

Florian Westphal (28):
      netfilter: ipvs: don't check for presence of nat extension
      netfilter: nat: avoid use of nf_conn_nat extension
      netfilter: kill the fake untracked conntrack objects
      netfilter: remove nf_ct_is_untracked
      netfilter: nft_ct: allow to set ctnetlink event types of a connection
      netfilter: conntrack: move helper struct to nf_conntrack_helper.h
      netfilter: helper: add build-time asserts for helper data size
      netfilter: nfnetlink_cthelper: reject too large userspace allocation requests
      netfilter: helpers: remove data_len usage for inkernel helpers
      netfilter: remove last traces of variable-sized extensions
      netfilter: conntrack: use u8 for extension sizes again
      netfilter: allow early drop of assured conntracks
      nefilter: eache: reduce struct size from 32 to 24 byte
      netfilter: ipvs: fix incorrect conflict resolution
      netfilter: synproxy: only register hooks when needed
      ipvs: convert to use pernet nf_hook api
      netfilter: decnet: only register hooks in init namespace
      ebtables: remove nf_hook_register usage
      netfilter: conntrack: remove prealloc support
      netfilter: conntrack: mark extension structs as const
      netfilter: conntrack: handle initial extension alloc via krealloc
      netfilter: masquerade: attach nat extension if not present
      netfilter: pptp: attach nat extension when needed
      netfilter: don't attach a nat extension by default
      netfilter: batch synchronize_net calls during hook unregister
      netfilter: nf_log: don't call synchronize_rcu in nf_log_unset
      netfilter: nf_queue: only call synchronize_net twice if nf_queue is active
      netfilter: snmp: avoid stack size warning

Gao Feng (9):
      netfilter: expect: Make sure the max_expected limit is effective
      netfilter: nf_ct_expect: Add nf_ct_remove_expect()
      netfilter: nat: nf_nat_mangle_{udp,tcp}_packet returns boolean
      netfilter: ctnetlink: Expectations must have a conntrack helper area
      netfilter: udplite: Remove duplicated udplite4/6 declaration
      netfilter: nf_nat: Fix return NF_DROP in nfnetlink_parse_nat_setup
      netfilter: ecache: Refine the nf_ct_deliver_cached_events
      netfilter: tcp: Use TCP_MAX_WSCALE instead of literal 14
      netfilter: SYNPROXY: Return NF_STOLEN instead of NF_DROP during handshaking

Liping Zhang (1):
      netfilter: nf_ct_ext: invoke destroy even when ext is not attached

Pablo Neira Ayuso (4):
      netfilter: nf_tables: add nft_is_base_chain() helper
      netfilter: Add nfnl_msg_type() helper function
      Merge tag 'ipvs2-for-v4.12' of https://git.kernel.org/.../horms/ipvs-next
      Merge tag 'ipvs3-for-v4.12' of http://git.kernel.org/.../horms/ipvs-next

Taehee Yoo (1):
      netfilter: nat: remove rcu_read_lock in __nf_nat_decode_session.

Varsha Rao (1):
      netfilter: ipvs: Replace kzalloc with kcalloc.

simran singhal (3):
      netfilter: Remove unnecessary cast on void pointer
      netfilter: Use seq_puts()/seq_putc() where possible
      net: netfilter: Use list_{next/prev}_entry instead of list_entry

 include/linux/netfilter/nfnetlink.h                |   5 +
 include/linux/netfilter_bridge/ebtables.h          |   6 +-
 include/net/ip_vs.h                                |  12 +-
 include/net/netfilter/ipv4/nf_conntrack_ipv4.h     |   1 -
 include/net/netfilter/ipv6/nf_conntrack_ipv6.h     |   1 -
 include/net/netfilter/nf_conntrack.h               |  32 ------
 include/net/netfilter/nf_conntrack_core.h          |   2 +-
 include/net/netfilter/nf_conntrack_ecache.h        |   4 +-
 include/net/netfilter/nf_conntrack_expect.h        |   2 +
 include/net/netfilter/nf_conntrack_extend.h        |  29 +----
 include/net/netfilter/nf_conntrack_helper.h        |  31 ++++-
 include/net/netfilter/nf_conntrack_l4proto.h       |   3 +
 include/net/netfilter/nf_conntrack_synproxy.h      |   2 +
 include/net/netfilter/nf_nat.h                     |   2 +-
 include/net/netfilter/nf_nat_helper.h              |  36 +++---
 include/net/netfilter/nf_queue.h                   |   3 +-
 include/net/netfilter/nf_tables.h                  |   5 +
 include/uapi/linux/netfilter/nf_conntrack_common.h |   9 +-
 include/uapi/linux/netfilter/nf_tables.h           |   2 +
 net/bridge/netfilter/ebtable_broute.c              |   4 +-
 net/bridge/netfilter/ebtable_filter.c              |  15 +--
 net/bridge/netfilter/ebtable_nat.c                 |  15 +--
 net/bridge/netfilter/ebtables.c                    |  63 +++++++----
 net/bridge/netfilter/nft_meta_bridge.c             |   2 +-
 net/decnet/netfilter/dn_rtmsg.c                    |   4 +-
 net/ipv4/netfilter/arp_tables.c                    |  21 ++--
 net/ipv4/netfilter/ip_tables.c                     |  20 ++--
 net/ipv4/netfilter/ipt_SYNPROXY.c                  |  94 ++++++++-------
 net/ipv4/netfilter/nf_dup_ipv4.c                   |   3 +-
 net/ipv4/netfilter/nf_nat_l3proto_ipv4.c           |   8 +-
 net/ipv4/netfilter/nf_nat_masquerade_ipv4.c        |   5 +-
 net/ipv4/netfilter/nf_nat_pptp.c                   |  45 +++++---
 net/ipv4/netfilter/nf_nat_snmp_basic.c             |  12 +-
 net/ipv4/netfilter/nf_socket_ipv4.c                |   2 +-
 net/ipv4/netfilter/nft_fib_ipv4.c                  |   2 +-
 net/ipv6/netfilter/ip6_tables.c                    |  29 ++---
 net/ipv6/netfilter/ip6t_SYNPROXY.c                 |  93 ++++++++-------
 net/ipv6/netfilter/nf_conntrack_proto_icmpv6.c     |   3 +-
 net/ipv6/netfilter/nf_dup_ipv6.c                   |   3 +-
 net/ipv6/netfilter/nf_nat_l3proto_ipv6.c           |   8 +-
 net/ipv6/netfilter/nf_nat_masquerade_ipv6.c        |   5 +-
 net/ipv6/netfilter/nft_fib_ipv6.c                  |   2 +-
 net/netfilter/core.c                               |  53 +++++++--
 net/netfilter/ipset/ip_set_bitmap_gen.h            |   5 +-
 net/netfilter/ipset/ip_set_core.c                  |  14 +--
 net/netfilter/ipvs/ip_vs_core.c                    |  19 ++--
 net/netfilter/ipvs/ip_vs_ctl.c                     |  12 +-
 net/netfilter/ipvs/ip_vs_ftp.c                     |  20 ++--
 net/netfilter/ipvs/ip_vs_nfct.c                    |   4 +-
 net/netfilter/ipvs/ip_vs_proto.c                   |  22 ----
 net/netfilter/ipvs/ip_vs_sync.c                    |   6 +-
 net/netfilter/ipvs/ip_vs_xmit.c                    |   8 +-
 net/netfilter/nf_conntrack_acct.c                  |   2 +-
 net/netfilter/nf_conntrack_amanda.c                |   2 +
 net/netfilter/nf_conntrack_core.c                  | 126 ++++++++++++++-------
 net/netfilter/nf_conntrack_ecache.c                |   9 +-
 net/netfilter/nf_conntrack_expect.c                |  36 +++---
 net/netfilter/nf_conntrack_extend.c                | 114 ++++---------------
 net/netfilter/nf_conntrack_ftp.c                   |   8 +-
 net/netfilter/nf_conntrack_h323_main.c             |   6 +-
 net/netfilter/nf_conntrack_helper.c                |  18 ++-
 net/netfilter/nf_conntrack_irc.c                   |   8 +-
 net/netfilter/nf_conntrack_labels.c                |   2 +-
 net/netfilter/nf_conntrack_netbios_ns.c            |   2 +
 net/netfilter/nf_conntrack_netlink.c               |  55 +++------
 net/netfilter/nf_conntrack_pptp.c                  |  15 ++-
 net/netfilter/nf_conntrack_proto.c                 |   5 +-
 net/netfilter/nf_conntrack_proto_dccp.c            |  16 +++
 net/netfilter/nf_conntrack_proto_sctp.c            |  16 +++
 net/netfilter/nf_conntrack_proto_tcp.c             |  25 +++-
 net/netfilter/nf_conntrack_sane.c                  |   8 +-
 net/netfilter/nf_conntrack_seqadj.c                |   2 +-
 net/netfilter/nf_conntrack_sip.c                   |  18 ++-
 net/netfilter/nf_conntrack_standalone.c            |   6 +-
 net/netfilter/nf_conntrack_tftp.c                  |   6 +-
 net/netfilter/nf_conntrack_timeout.c               |   2 +-
 net/netfilter/nf_conntrack_timestamp.c             |   2 +-
 net/netfilter/nf_internals.h                       |   2 +-
 net/netfilter/nf_log.c                             |   5 +-
 net/netfilter/nf_nat_amanda.c                      |  11 +-
 net/netfilter/nf_nat_core.c                        |  37 ++----
 net/netfilter/nf_nat_helper.c                      |  40 +++----
 net/netfilter/nf_nat_irc.c                         |   9 +-
 net/netfilter/nf_queue.c                           |   7 +-
 net/netfilter/nf_synproxy_core.c                   |  10 +-
 net/netfilter/nf_tables_api.c                      |  54 ++++-----
 net/netfilter/nf_tables_netdev.c                   |   2 +-
 net/netfilter/nf_tables_trace.c                    |   3 +-
 net/netfilter/nfnetlink.c                          |   2 +-
 net/netfilter/nfnetlink_acct.c                     |   2 +-
 net/netfilter/nfnetlink_cthelper.c                 |  18 ++-
 net/netfilter/nfnetlink_cttimeout.c                |   4 +-
 net/netfilter/nfnetlink_log.c                      |   6 +-
 net/netfilter/nfnetlink_queue.c                    |  24 ++--
 net/netfilter/nft_compat.c                         |  13 ++-
 net/netfilter/nft_ct.c                             |  41 +++++--
 net/netfilter/nft_exthdr.c                         |   2 +-
 net/netfilter/nft_hash.c                           |   2 +-
 net/netfilter/nft_meta.c                           |   2 +-
 net/netfilter/nft_numgen.c                         |   2 +-
 net/netfilter/nft_queue.c                          |   2 +-
 net/netfilter/nft_set_hash.c                       |   2 +-
 net/netfilter/xt_CT.c                              |  16 +--
 net/netfilter/xt_HMARK.c                           |   2 +-
 net/netfilter/xt_cluster.c                         |   3 -
 net/netfilter/xt_connlabel.c                       |   2 +-
 net/netfilter/xt_connmark.c                        |   4 +-
 net/netfilter/xt_conntrack.c                       |  11 +-
 net/netfilter/xt_hashlimit.c                       |  10 +-
 net/netfilter/xt_ipvs.c                            |   2 +-
 net/netfilter/xt_recent.c                          |   2 +-
 net/netfilter/xt_state.c                           |  13 +--
 net/openvswitch/conntrack.c                        |   5 -
 113 files changed, 853 insertions(+), 836 deletions(-)

^ permalink raw reply

* Re: [PATCH net-next 1/4] netlink: add NULL-friendly helper for setting extended ACK message
From: Daniel Borkmann @ 2017-05-01 10:45 UTC (permalink / raw)
  To: Jakub Kicinski, netdev
  Cc: davem, johannes, dsa, alexei.starovoitov, bblanco, john.fastabend,
	kubakici, oss-drivers, brouer, jhs
In-Reply-To: <20170501044648.13022-2-jakub.kicinski@netronome.com>

On 05/01/2017 06:46 AM, Jakub Kicinski wrote:
> As we propagate extended ack reporting throughout various paths in
> the kernel it may be that the same function is called with the
> extended ack parameter passed as NULL.  One place where that happens
> is in drivers which have a centralized reconfiguration function
> called both from ndos and from ethtool_ops.  Add a new helper for
> setting the error message in such conditions.
>
> Existing helper is left as is to encourage propagating the ext ack
> fully wherever possible.  It also makes it clear in the code which
> messages may be lost due to ext ack being NULL.
>
> Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>
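
A rough sketch of the idea in the quoted description: a strict setter
that requires a valid extended ack pointer, and a NULL-friendly variant
for shared reconfiguration paths. All names here (struct demo_extack,
the DEMO_* macros, demo_reconfigure, the MTU limit) are hypothetical and
are not the helpers added by the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for the extended ack structure. */
struct demo_extack {
	const char *msg;
};

/* Strict setter: the caller must have a valid extack. */
#define DEMO_SET_ERR_MSG(extack, m) do {		\
	assert((extack) != NULL);			\
	(extack)->msg = (m);				\
} while (0)

/* NULL-friendly setter: the message is dropped if extack is NULL. */
#define DEMO_TRY_SET_ERR_MSG(extack, m) do {		\
	struct demo_extack *e__ = (extack);		\
	if (e__)					\
		e__->msg = (m);				\
} while (0)

#define DEMO_MAX_XDP_MTU 4096	/* illustrative limit only */

/* Shared path: one caller passes an extack, another passes NULL. */
static int demo_reconfigure(unsigned int mtu, struct demo_extack *extack)
{
	if (mtu > DEMO_MAX_XDP_MTU) {
		DEMO_TRY_SET_ERR_MSG(extack, "MTU too large w/ XDP enabled");
		return -EINVAL;
	}
	return 0;
}
```

Keeping the strict variant separate makes it visible in the code which
messages can be lost because the pointer may be NULL.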

^ permalink raw reply

* Re: [PATCH/RFC net-next 0/4] net/sched: cls_flower: avoid false matching of truncated packets
From: Simon Horman @ 2017-05-01 10:36 UTC (permalink / raw)
  To: Jamal Hadi Salim
  Cc: Jiri Pirko, Cong Wang, Dinan Gunawardena, netdev, oss-drivers
In-Reply-To: <0e993860-4dfd-3562-5ccb-c5ad24e5f970@mojatatu.com>

On Sun, Apr 30, 2017 at 09:51:30AM -0400, Jamal Hadi Salim wrote:
> On 17-04-28 10:14 AM, Simon Horman wrote:
> >On Fri, Apr 28, 2017 at 09:41:00AM -0400, Jamal Hadi Salim wrote:
> >>On 17-04-28 09:11 AM, Simon Horman wrote:
> [..]
> >>A default lower prio match all on udp or icmp?
> >
> >I'm certainly not opposed to exploring ideas here.
> >
> >The way that flower currently works is that a match on ip_proto ==
> >UDP/TCP/SCTP/ICMP but not fields in the L4 header itself would not result in
> >the dissector only dissecting the packet's L4 header and thus would not
> >discover (or as is currently the case, silently ignore) the absence of the
> >ports/ICMP type and code in the L4 header.
> >
> >What my patch attempts to do is to describe a policy of what to do if
> >a given classifier invokes the dissector (to pull out the headers needed for
> >the match in question) and that dissection fails. It's basically describing
> >the error-path.
> >
> 
> Understood - I was struggling with whether error-path is the same as
> "didn't match".
> 
> >
> >There are two issues:
> >
> >1. As things stand, without this patch-set, flower does not differentiate
> >   between a packet truncated at the end of the IP header and a packet with
> >   zero ports. Likewise for icmp type and code of zero.
> >
> >   The first three patches of this series address that so that a match for
> >   port == zero only matches if ports are present in the packet. Again,
> >   likewise for ICMP.
> >
> >   This is a bug-fix to my way of thinking.
> 
> Agreed to bug fix. I would have said there is never a legit packet with
> TCP/UDP but I think some fingerprinting apps use it. And one would need
> to distinguish between the two at classification time.

Yes, that is basically what I thought too.

> ICMP type 0 is certainly used.

Agreed.

> minimal some flag should qualify it as "truncated".

Would changing TCA_FLOWER_HEADER_PARSE_ERR_ACT to
TCA_FLOWER_META_TRUNCATED help?

> >2. The behaviour described above, prior to this patchset, might have been
> >   utilised to f.e. drop packets that are either truncated or have port == 0
> >   (because flower didn't differentiate between these cases).
> >
> >   So the question becomes if/how to provide such a feature.
> >   The last patch is my attempt to answer that question.
> 
> It almost feels like you need metadata matching as well - one being
> "truncated".
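
The first issue discussed above (a truncated packet being
indistinguishable from a packet with port == 0) can be sketched in
userspace as follows. This is only an illustration of the distinction,
assuming a 4-byte source/destination port pair at the start of the L4
header; the demo_* names are hypothetical and unrelated to the flower or
flow dissector code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Result of dissecting the ports, with an explicit presence flag. */
struct demo_ports {
	bool present;	/* false when the header was truncated */
	uint16_t dst;
};

/* Read the destination port only if all four port bytes are there. */
static struct demo_ports demo_dissect_ports(const uint8_t *l4, size_t len)
{
	struct demo_ports p = { .present = false, .dst = 0 };

	assert(l4 != NULL || len == 0);
	if (len < 4)
		return p;	/* truncated: dst stays 0 but is not valid */
	p.present = true;
	p.dst = (uint16_t)((l4[2] << 8) | l4[3]);
	return p;
}

/* A match on port 0 succeeds only when the ports are really present. */
static bool demo_match_dst_port(const uint8_t *l4, size_t len, uint16_t want)
{
	struct demo_ports p = demo_dissect_ports(l4, len);

	return p.present && p.dst == want;
}
```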

^ permalink raw reply

* Re: [oss-drivers] [PATCH net-next 0/4] xdp: use netlink extended ACK reporting
From: Simon Horman @ 2017-05-01 10:32 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, davem, johannes, dsa, daniel, alexei.starovoitov, bblanco,
	john.fastabend, kubakici, oss-drivers, brouer, jhs
In-Reply-To: <20170501044648.13022-1-jakub.kicinski@netronome.com>

On Sun, Apr 30, 2017 at 09:46:44PM -0700, Jakub Kicinski wrote:
> Hi!
> 
> This series is an attempt to make XDP more user friendly by
> exploiting the recently added netlink extended ACK
> reporting to carry messages to user space.
> 
> David Ahern's iproute2 ext ack patches for ip link are sufficient
> to show the errors like this:
> 
> # ip link set dev p4p1 xdp obj ipip_prepend.o sec ".text"
> Error: nfp: MTU too large w/ XDP enabled
> 
> Where the message is coming directly from the driver.  There could
> still be a bit of a leap for a complete novice from the message 
> above to the right settings, but it's a big improvement over the
> standard "Invalid argument" message.
> 
> v1/non-rfc:
>  - add a separate macro in patch 1;
>  - add KBUILD_MODNAME as part of the message (Daniel);
>  - don't print the error to logs in patch 1.
> 
> Jakub Kicinski (4):
>   netlink: add NULL-friendly helper for setting extended ACK message
>   xdp: propagate extended ack to XDP setup
>   nfp: make use of extended ack message reporting
>   virtio_net: make use of extended ack message reporting

Reviewed-by: Simon Horman <simon.horman@netronome.com>

^ permalink raw reply

* Re: [net-next PATCH 2/2] samples/bpf: fix XDP_FLAGS_SKB_MODE detach for xdp_tx_iptunnel
From: Daniel Borkmann @ 2017-05-01  9:56 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, dsa, netdev; +Cc: Daniel Borkmann, Alexei Starovoitov
In-Reply-To: <149363078062.17600.11958698183756401134.stgit@firesoul>

On 05/01/2017 11:26 AM, Jesper Dangaard Brouer wrote:
> The xdp_tx_iptunnel program can be terminated in two ways, after
> N-seconds or via Ctrl-C SIGINT.  The SIGINT code path does not
> handle detaching the correct XDP program in case the program
> was attached with XDP_FLAGS_SKB_MODE.
>
> Fix this by storing the XDP flags as a global variable, which is
> available for the SIGINT handler function.
>
> Fixes: 3993f2cb983b ("samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel")
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

^ permalink raw reply

* Re: [net-next PATCH 1/2] samples/bpf: fix SKB_MODE flag to be a 32-bit unsigned int
From: Daniel Borkmann @ 2017-05-01  9:55 UTC (permalink / raw)
  To: Jesper Dangaard Brouer, dsa, netdev; +Cc: Daniel Borkmann, Alexei Starovoitov
In-Reply-To: <149363077553.17600.5609466598508053296.stgit@firesoul>

On 05/01/2017 11:26 AM, Jesper Dangaard Brouer wrote:
> The kernel side of XDP_FLAGS_SKB_MODE is unsigned, and the rtnetlink
> IFLA_XDP_FLAGS is defined as NLA_U32. Thus, userspace programs under
> samples/bpf/ should use the correct type.
>
> Fixes: 3993f2cb983b ("samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel")
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>

Acked-by: Daniel Borkmann <daniel@iogearbox.net>

^ permalink raw reply

* Re: [GIT PULL 0/2] Third Round of IPVS Updates for v4.12
From: Pablo Neira Ayuso @ 2017-05-01  9:47 UTC (permalink / raw)
  To: Simon Horman
  Cc: lvs-devel, netdev, netfilter-devel, Wensong Zhang,
	Julian Anastasov
In-Reply-To: <20170428101159.9810-1-horms@verge.net.au>

On Fri, Apr 28, 2017 at 12:11:57PM +0200, Simon Horman wrote:
> Hi Pablo,
> 
> please consider these enhancements to IPVS for v4.12.
> If it is too late for v4.12 then please consider them for v4.13.
> 
> * Remove unused function
> * Correct comparison of unsigned value

Pulled, thanks Simon.

^ permalink raw reply

* [net-next PATCH 2/2] samples/bpf: fix XDP_FLAGS_SKB_MODE detach for xdp_tx_iptunnel
From: Jesper Dangaard Brouer @ 2017-05-01  9:26 UTC (permalink / raw)
  To: dsa, netdev; +Cc: Daniel Borkmann, Alexei Starovoitov, Jesper Dangaard Brouer
In-Reply-To: <149363073213.17600.4480290736818479957.stgit@firesoul>

The xdp_tx_iptunnel program can be terminated in two ways, after
N-seconds or via Ctrl-C SIGINT.  The SIGINT code path does not
handle detaching the correct XDP program in case the program
was attached with XDP_FLAGS_SKB_MODE.

Fix this by storing the XDP flags as a global variable, which is
available for the SIGINT handler function.

Fixes: 3993f2cb983b ("samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 samples/bpf/xdp_tx_iptunnel_user.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/samples/bpf/xdp_tx_iptunnel_user.c b/samples/bpf/xdp_tx_iptunnel_user.c
index 880dd4aebfa4..92b8bde9337c 100644
--- a/samples/bpf/xdp_tx_iptunnel_user.c
+++ b/samples/bpf/xdp_tx_iptunnel_user.c
@@ -25,11 +25,12 @@
 #define STATS_INTERVAL_S 2U
 
 static int ifindex = -1;
+static __u32 xdp_flags = 0;
 
 static void int_exit(int sig)
 {
 	if (ifindex > -1)
-		set_link_xdp_fd(ifindex, -1, 0);
+		set_link_xdp_fd(ifindex, -1, xdp_flags);
 	exit(0);
 }
 
@@ -142,7 +143,6 @@ int main(int argc, char **argv)
 	struct iptnl_info tnl = {};
 	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct vip vip = {};
-	__u32 xdp_flags = 0;
 	char filename[256];
 	int opt;
 	int i;
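
The pattern in this fix, sketched outside the sample for illustration:
the flags live at file scope so the signal path detaches with the same
flags that were used to attach. The demo_* names are hypothetical,
demo_set_link_xdp_fd() is only a recording stub for the real helper, and
the flag value is an assumption.

```c
#include <assert.h>
#include <stdint.h>

#define DEMO_XDP_FLAGS_SKB_MODE (1U << 1)	/* assumed value */

static int demo_ifindex = -1;
static uint32_t demo_xdp_flags;		/* file scope: visible to the handler */

static uint32_t demo_last_detach_flags;
static int demo_detach_calls;

/* Recording stub standing in for set_link_xdp_fd(ifindex, -1, flags). */
static int demo_set_link_xdp_fd(int ifindex, int fd, uint32_t flags)
{
	assert(fd == -1);		/* this sketch only models detach */
	(void)ifindex;
	demo_last_detach_flags = flags;
	demo_detach_calls++;
	return 0;
}

/* Body of the SIGINT handler, without the final exit(0). */
static void demo_on_sigint(int sig)
{
	(void)sig;
	if (demo_ifindex > -1)
		demo_set_link_xdp_fd(demo_ifindex, -1, demo_xdp_flags);
}
```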

^ permalink raw reply related

* [net-next PATCH 1/2] samples/bpf: fix SKB_MODE flag to be a 32-bit unsigned int
From: Jesper Dangaard Brouer @ 2017-05-01  9:26 UTC (permalink / raw)
  To: dsa, netdev; +Cc: Daniel Borkmann, Alexei Starovoitov, Jesper Dangaard Brouer
In-Reply-To: <149363073213.17600.4480290736818479957.stgit@firesoul>

The kernel side of XDP_FLAGS_SKB_MODE is unsigned, and the rtnetlink
IFLA_XDP_FLAGS is defined as NLA_U32. Thus, userspace programs under
samples/bpf/ should use the correct type.

Fixes: 3993f2cb983b ("samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel")
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 samples/bpf/bpf_load.c             |    3 ++-
 samples/bpf/bpf_load.h             |    2 +-
 samples/bpf/xdp1_user.c            |    8 ++++----
 samples/bpf/xdp_tx_iptunnel_user.c |    8 ++++----
 4 files changed, 11 insertions(+), 10 deletions(-)

diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
index 0ec0dea3c41e..4221dc359453 100644
--- a/samples/bpf/bpf_load.c
+++ b/samples/bpf/bpf_load.c
@@ -14,6 +14,7 @@
 #include <linux/perf_event.h>
 #include <linux/netlink.h>
 #include <linux/rtnetlink.h>
+#include <linux/types.h>
 #include <sys/types.h>
 #include <sys/socket.h>
 #include <sys/syscall.h>
@@ -585,7 +586,7 @@ struct ksym *ksym_search(long key)
 	return &syms[0];
 }
 
-int set_link_xdp_fd(int ifindex, int fd, int flags)
+int set_link_xdp_fd(int ifindex, int fd, __u32 flags)
 {
 	struct sockaddr_nl sa;
 	int sock, seq = 0, len, ret = -1;
diff --git a/samples/bpf/bpf_load.h b/samples/bpf/bpf_load.h
index 6bfd75ec6a16..05822f83173a 100644
--- a/samples/bpf/bpf_load.h
+++ b/samples/bpf/bpf_load.h
@@ -47,5 +47,5 @@ struct ksym {
 
 int load_kallsyms(void);
 struct ksym *ksym_search(long key);
-int set_link_xdp_fd(int ifindex, int fd, int flags);
+int set_link_xdp_fd(int ifindex, int fd, __u32 flags);
 #endif
diff --git a/samples/bpf/xdp1_user.c b/samples/bpf/xdp1_user.c
index deb05e630d84..378850c70eb8 100644
--- a/samples/bpf/xdp1_user.c
+++ b/samples/bpf/xdp1_user.c
@@ -20,11 +20,11 @@
 #include "libbpf.h"
 
 static int ifindex;
-static int flags;
+static __u32 xdp_flags;
 
 static void int_exit(int sig)
 {
-	set_link_xdp_fd(ifindex, -1, flags);
+	set_link_xdp_fd(ifindex, -1, xdp_flags);
 	exit(0);
 }
 
@@ -75,7 +75,7 @@ int main(int argc, char **argv)
 	while ((opt = getopt(argc, argv, optstr)) != -1) {
 		switch (opt) {
 		case 'S':
-			flags |= XDP_FLAGS_SKB_MODE;
+			xdp_flags |= XDP_FLAGS_SKB_MODE;
 			break;
 		default:
 			usage(basename(argv[0]));
@@ -103,7 +103,7 @@ int main(int argc, char **argv)
 
 	signal(SIGINT, int_exit);
 
-	if (set_link_xdp_fd(ifindex, prog_fd[0], flags) < 0) {
+	if (set_link_xdp_fd(ifindex, prog_fd[0], xdp_flags) < 0) {
 		printf("link set xdp fd failed\n");
 		return 1;
 	}
diff --git a/samples/bpf/xdp_tx_iptunnel_user.c b/samples/bpf/xdp_tx_iptunnel_user.c
index cb2bda7b5346..880dd4aebfa4 100644
--- a/samples/bpf/xdp_tx_iptunnel_user.c
+++ b/samples/bpf/xdp_tx_iptunnel_user.c
@@ -142,8 +142,8 @@ int main(int argc, char **argv)
 	struct iptnl_info tnl = {};
 	struct rlimit r = {RLIM_INFINITY, RLIM_INFINITY};
 	struct vip vip = {};
+	__u32 xdp_flags = 0;
 	char filename[256];
-	int flags = 0;
 	int opt;
 	int i;
 
@@ -204,7 +204,7 @@ int main(int argc, char **argv)
 			kill_after_s = atoi(optarg);
 			break;
 		case 'S':
-			flags |= XDP_FLAGS_SKB_MODE;
+			xdp_flags |= XDP_FLAGS_SKB_MODE;
 			break;
 		default:
 			usage(argv[0]);
@@ -248,14 +248,14 @@ int main(int argc, char **argv)
 		}
 	}
 
-	if (set_link_xdp_fd(ifindex, prog_fd[0], flags) < 0) {
+	if (set_link_xdp_fd(ifindex, prog_fd[0], xdp_flags) < 0) {
 		printf("link set xdp fd failed\n");
 		return 1;
 	}
 
 	poll_stats(kill_after_s);
 
-	set_link_xdp_fd(ifindex, -1, flags);
+	set_link_xdp_fd(ifindex, -1, xdp_flags);
 
 	return 0;
 }
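
For illustration only (not part of the patch): a rough sketch of why a fixed-width unsigned type suits a flag word that travels as a 32-bit attribute. The helpers demo_put_u32(), demo_get_u32() and demo_has_flag(), and both DEMO_FLAG_* values, are hypothetical and do not exist in samples/bpf.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical flag bits; the values are illustrative only. */
#define DEMO_FLAG_LOW  (1U << 1)
#define DEMO_FLAG_HIGH (1U << 31)

/* Copies a flag word into a 4-byte payload, as a u32 attribute would carry it. */
static void demo_put_u32(unsigned char payload[4], uint32_t flags)
{
	memcpy(payload, &flags, sizeof(flags));
}

/* Reads the flag word back from the 4-byte payload. */
static uint32_t demo_get_u32(const unsigned char payload[4])
{
	uint32_t flags;

	memcpy(&flags, payload, sizeof(flags));
	return flags;
}

/* Every bit, including bit 31, is a plain flag with no sign interpretation. */
static int demo_has_flag(uint32_t flags, uint32_t flag)
{
	return (flags & flag) != 0;
}
```

With uint32_t (or __u32 in the samples) the payload size and the variable size match by definition, which a plain int only guarantees by convention.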


* [net-next PATCH 0/2] samples/bpf: two bug fixes to XDP_FLAGS_SKB_MODE attaching
From: Jesper Dangaard Brouer @ 2017-05-01  9:26 UTC (permalink / raw)
  To: dsa, netdev; +Cc: Daniel Borkmann, Alexei Starovoitov, Jesper Dangaard Brouer
In-Reply-To: <5607e461-b74f-f8b7-8d47-a5341259ddff@cumulusnetworks.com>

Two small bugfixes for:
 commit 3993f2cb983b ("samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel")

---

Jesper Dangaard Brouer (2):
      samples/bpf: fix SKB_MODE flag to be a 32-bit unsigned int
      samples/bpf: fix XDP_FLAGS_SKB_MODE detach for xdp_tx_iptunnel


 samples/bpf/bpf_load.c             |    3 ++-
 samples/bpf/bpf_load.h             |    2 +-
 samples/bpf/xdp1_user.c            |    8 ++++----
 samples/bpf/xdp_tx_iptunnel_user.c |   10 +++++-----
 4 files changed, 12 insertions(+), 11 deletions(-)


* Re: [PATCH net-next] samples/bpf: Add support for SKB_MODE to xdp1 and xdp_tx_iptunnel
From: Jesper Dangaard Brouer @ 2017-05-01  9:09 UTC (permalink / raw)
  To: David Ahern; +Cc: netdev, ast, daniel, brouer
In-Reply-To: <5607e461-b74f-f8b7-8d47-a5341259ddff@cumulusnetworks.com>

On Sun, 30 Apr 2017 17:46:13 -0600
David Ahern <dsa@cumulusnetworks.com> wrote:

> On 4/28/17 3:40 PM, Jesper Dangaard Brouer wrote:
> > [...]  
> >> diff --git a/samples/bpf/bpf_load.c b/samples/bpf/bpf_load.c
> >> index 0d449d8032d1..d4433a47e6c3 100644
> >> --- a/samples/bpf/bpf_load.c
> >> +++ b/samples/bpf/bpf_load.c
> >> @@ -563,7 +563,7 @@ struct ksym *ksym_search(long key)
> >>  	return &syms[0];
> >>  }
> >>  
> >> -int set_link_xdp_fd(int ifindex, int fd)
> >> +int set_link_xdp_fd(int ifindex, int fd, int flags)  
> > Shouldn't the flags be an unsigned int, actually a __u32 ?
> >   
> 
> sure. I'll send a patch

I found another bug in xdp_tx_iptunnel ... I'll send a patch for both issues.

-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer


* [PATCH net-next 4/4] virtio_net: make use of extended ack message reporting
From: Jakub Kicinski @ 2017-05-01  4:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, johannes, dsa, daniel, alexei.starovoitov, bblanco,
	john.fastabend, kubakici, oss-drivers, brouer, jhs,
	Jakub Kicinski
In-Reply-To: <20170501044648.13022-1-jakub.kicinski@netronome.com>

Try to carry error messages to the user via the netlink extended
ack message attribute.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/virtio_net.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/drivers/net/virtio_net.c b/drivers/net/virtio_net.c
index 82f1c3a73345..046c60619c59 100644
--- a/drivers/net/virtio_net.c
+++ b/drivers/net/virtio_net.c
@@ -1878,7 +1878,8 @@ static int virtnet_reset(struct virtnet_info *vi, int curr_qp, int xdp_qp)
 	return ret;
 }
 
-static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
+static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog,
+			   struct netlink_ext_ack *extack)
 {
 	unsigned long int max_sz = PAGE_SIZE - sizeof(struct padded_vnet_hdr);
 	struct virtnet_info *vi = netdev_priv(dev);
@@ -1890,16 +1891,17 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 	    virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_TSO6) ||
 	    virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_ECN) ||
 	    virtio_has_feature(vi->vdev, VIRTIO_NET_F_GUEST_UFO)) {
-		netdev_warn(dev, "can't set XDP while host is implementing LRO, disable LRO first\n");
+		NL_SET_ERR_MSG(extack, "can't set XDP while host is implementing LRO, disable LRO first");
 		return -EOPNOTSUPP;
 	}
 
 	if (vi->mergeable_rx_bufs && !vi->any_header_sg) {
-		netdev_warn(dev, "XDP expects header/data in single page, any_header_sg required\n");
+		NL_SET_ERR_MSG(extack, "XDP expects header/data in single page, any_header_sg required");
 		return -EINVAL;
 	}
 
 	if (dev->mtu > max_sz) {
+		NL_SET_ERR_MSG(extack, "MTU too large to enable XDP");
 		netdev_warn(dev, "XDP requires MTU less than %lu\n", max_sz);
 		return -EINVAL;
 	}
@@ -1910,6 +1912,7 @@ static int virtnet_xdp_set(struct net_device *dev, struct bpf_prog *prog)
 
 	/* XDP requires extra queues for XDP_TX */
 	if (curr_qp + xdp_qp > vi->max_queue_pairs) {
+		NL_SET_ERR_MSG(extack, "Too few free TX rings available");
 		netdev_warn(dev, "request %i queues but max is %i\n",
 			    curr_qp + xdp_qp, vi->max_queue_pairs);
 		return -ENOMEM;
@@ -1971,7 +1974,7 @@ static int virtnet_xdp(struct net_device *dev, struct netdev_xdp *xdp)
 {
 	switch (xdp->command) {
 	case XDP_SETUP_PROG:
-		return virtnet_xdp_set(dev, xdp->prog);
+		return virtnet_xdp_set(dev, xdp->prog, xdp->extack);
 	case XDP_QUERY_PROG:
 		xdp->prog_attached = virtnet_xdp_query(dev);
 		return 0;
-- 
2.11.0
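
For illustration only (not part of the patch): the MTU and queue checks above keep netdev_warn() next to the new extack message, presumably because the setter stores a pointer to a fixed string and cannot format values. A userspace sketch of that split follows. struct demo_ext_ack, DEMO_SET_ERR_MSG() and demo_check_mtu() are hypothetical mock-ups, not the kernel definitions.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical mock of the extended ack: it only carries a message pointer. */
struct demo_ext_ack {
	const char *msg;
};

/*
 * The message initialises a static array, so it must be a string
 * literal; there is no room for printf-style arguments.
 */
#define DEMO_SET_ERR_MSG(extack, literal) do {		\
	static const char demo_msg_[] = literal;	\
	(extack)->msg = demo_msg_;			\
} while (0)

/*
 * Fixed text goes to the extended ack, the numeric detail goes to a
 * log buffer (standing in for the kernel log).
 */
static int demo_check_mtu(unsigned long mtu, unsigned long max_sz,
			  struct demo_ext_ack *extack,
			  char *log, size_t log_len)
{
	if (mtu > max_sz) {
		DEMO_SET_ERR_MSG(extack, "MTU too large to enable XDP");
		snprintf(log, log_len, "XDP requires MTU less than %lu",
			 max_sz);
		return -EINVAL;
	}

	return 0;
}
```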


* [PATCH net-next 2/4] xdp: propagate extended ack to XDP setup
From: Jakub Kicinski @ 2017-05-01  4:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, johannes, dsa, daniel, alexei.starovoitov, bblanco,
	john.fastabend, kubakici, oss-drivers, brouer, jhs,
	Jakub Kicinski
In-Reply-To: <20170501044648.13022-1-jakub.kicinski@netronome.com>

Drivers usually have a number of restrictions for running XDP,
the most common being buffer sizes, LRO and number of rings.
Even though some drivers try to be helpful and print error
messages, experience shows that users don't often consult
kernel logs on netlink errors.  Try to use the new extended
ack mechanism to carry the message back to user space.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 include/linux/netdevice.h | 10 ++++++++--
 net/core/dev.c            |  5 ++++-
 net/core/rtnetlink.c      | 13 ++++++++-----
 3 files changed, 20 insertions(+), 8 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 6847714a5ae3..9c23bd2efb56 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -813,11 +813,16 @@ enum xdp_netdev_command {
 	XDP_QUERY_PROG,
 };
 
+struct netlink_ext_ack;
+
 struct netdev_xdp {
 	enum xdp_netdev_command command;
 	union {
 		/* XDP_SETUP_PROG */
-		struct bpf_prog *prog;
+		struct {
+			struct bpf_prog *prog;
+			struct netlink_ext_ack *extack;
+		};
 		/* XDP_QUERY_PROG */
 		bool prog_attached;
 	};
@@ -3291,7 +3296,8 @@ int dev_get_phys_port_id(struct net_device *dev,
 int dev_get_phys_port_name(struct net_device *dev,
 			   char *name, size_t len);
 int dev_change_proto_down(struct net_device *dev, bool proto_down);
-int dev_change_xdp_fd(struct net_device *dev, int fd, u32 flags);
+int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
+		      int fd, u32 flags);
 struct sk_buff *validate_xmit_skb_list(struct sk_buff *skb, struct net_device *dev);
 struct sk_buff *dev_hard_start_xmit(struct sk_buff *skb, struct net_device *dev,
 				    struct netdev_queue *txq, int *ret);
diff --git a/net/core/dev.c b/net/core/dev.c
index 8371a01eee87..35a06cebb282 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -6854,12 +6854,14 @@ EXPORT_SYMBOL(dev_change_proto_down);
 /**
  *	dev_change_xdp_fd - set or clear a bpf program for a device rx path
  *	@dev: device
+ *	@extack: netlink extended ack
  *	@fd: new program fd or negative value to clear
  *	@flags: xdp-related flags
  *
  *	Set or clear a bpf program for a device
  */
-int dev_change_xdp_fd(struct net_device *dev, int fd, u32 flags)
+int dev_change_xdp_fd(struct net_device *dev, struct netlink_ext_ack *extack,
+		      int fd, u32 flags)
 {
 	int (*xdp_op)(struct net_device *dev, struct netdev_xdp *xdp);
 	const struct net_device_ops *ops = dev->netdev_ops;
@@ -6892,6 +6894,7 @@ int dev_change_xdp_fd(struct net_device *dev, int fd, u32 flags)
 
 	memset(&xdp, 0, sizeof(xdp));
 	xdp.command = XDP_SETUP_PROG;
+	xdp.extack = extack;
 	xdp.prog = prog;
 
 	err = xdp_op(dev, &xdp);
diff --git a/net/core/rtnetlink.c b/net/core/rtnetlink.c
index 9031a6c8bfa7..6e67315ec368 100644
--- a/net/core/rtnetlink.c
+++ b/net/core/rtnetlink.c
@@ -1919,6 +1919,7 @@ static int do_set_master(struct net_device *dev, int ifindex)
 #define DO_SETLINK_NOTIFY	0x03
 static int do_setlink(const struct sk_buff *skb,
 		      struct net_device *dev, struct ifinfomsg *ifm,
+		      struct netlink_ext_ack *extack,
 		      struct nlattr **tb, char *ifname, int status)
 {
 	const struct net_device_ops *ops = dev->netdev_ops;
@@ -2201,7 +2202,7 @@ static int do_setlink(const struct sk_buff *skb,
 		}
 
 		if (xdp[IFLA_XDP_FD]) {
-			err = dev_change_xdp_fd(dev,
+			err = dev_change_xdp_fd(dev, extack,
 						nla_get_s32(xdp[IFLA_XDP_FD]),
 						xdp_flags);
 			if (err)
@@ -2261,7 +2262,7 @@ static int rtnl_setlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 	if (err < 0)
 		goto errout;
 
-	err = do_setlink(skb, dev, ifm, tb, ifname, 0);
+	err = do_setlink(skb, dev, ifm, extack, tb, ifname, 0);
 errout:
 	return err;
 }
@@ -2423,6 +2424,7 @@ EXPORT_SYMBOL(rtnl_create_link);
 static int rtnl_group_changelink(const struct sk_buff *skb,
 		struct net *net, int group,
 		struct ifinfomsg *ifm,
+		struct netlink_ext_ack *extack,
 		struct nlattr **tb)
 {
 	struct net_device *dev, *aux;
@@ -2430,7 +2432,7 @@ static int rtnl_group_changelink(const struct sk_buff *skb,
 
 	for_each_netdev_safe(net, dev, aux) {
 		if (dev->group == group) {
-			err = do_setlink(skb, dev, ifm, tb, NULL, 0);
+			err = do_setlink(skb, dev, ifm, extack, tb, NULL, 0);
 			if (err < 0)
 				return err;
 		}
@@ -2576,14 +2578,15 @@ static int rtnl_newlink(struct sk_buff *skb, struct nlmsghdr *nlh,
 				status |= DO_SETLINK_NOTIFY;
 			}
 
-			return do_setlink(skb, dev, ifm, tb, ifname, status);
+			return do_setlink(skb, dev, ifm, extack, tb, ifname,
+					  status);
 		}
 
 		if (!(nlh->nlmsg_flags & NLM_F_CREATE)) {
 			if (ifm->ifi_index == 0 && tb[IFLA_GROUP])
 				return rtnl_group_changelink(skb, net,
 						nla_get_u32(tb[IFLA_GROUP]),
-						ifm, tb);
+						ifm, extack, tb);
 			return -ENODEV;
 		}
 
-- 
2.11.0
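
For illustration only (not part of the patch): the netdev_xdp change relies on an anonymous struct inside the union, so the setup command can carry two pointers while the query result still shares the same storage. A userspace sketch under that assumption follows. All demo_* names are hypothetical, and the "driver" here simply refuses every setup request.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum demo_command { DEMO_SETUP_PROG, DEMO_QUERY_PROG };

/* Hypothetical stand-ins for struct bpf_prog and struct netlink_ext_ack. */
struct demo_prog { int id; };
struct demo_ext_ack { const char *msg; };

struct demo_xdp {
	enum demo_command command;
	union {
		/* DEMO_SETUP_PROG: both members are valid at the same time */
		struct {
			struct demo_prog *prog;
			struct demo_ext_ack *extack;
		};
		/* DEMO_QUERY_PROG: overlays the struct above */
		bool prog_attached;
	};
};

/* Mock driver callback: rejects setup and explains why when it can. */
static int demo_xdp_op(struct demo_xdp *xdp)
{
	switch (xdp->command) {
	case DEMO_SETUP_PROG:
		if (xdp->extack)
			xdp->extack->msg = "demo: XDP not supported";
		return -EOPNOTSUPP;
	case DEMO_QUERY_PROG:
		xdp->prog_attached = false;
		return 0;
	}

	return -EINVAL;
}

/* Mock of the core helper: fills the request, then calls the driver. */
static int demo_change_prog(struct demo_prog *prog,
			    struct demo_ext_ack *extack)
{
	struct demo_xdp xdp = { .command = DEMO_SETUP_PROG };

	xdp.extack = extack;
	xdp.prog = prog;

	return demo_xdp_op(&xdp);
}
```

Anonymous structs and unions need C11 or the GNU extensions. Existing users of xdp->prog compile unchanged because the member name is still reachable directly.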


* [PATCH net-next 3/4] nfp: make use of extended ack message reporting
From: Jakub Kicinski @ 2017-05-01  4:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, johannes, dsa, daniel, alexei.starovoitov, bblanco,
	john.fastabend, kubakici, oss-drivers, brouer, jhs,
	Jakub Kicinski
In-Reply-To: <20170501044648.13022-1-jakub.kicinski@netronome.com>

Try to carry error messages to the user via the netlink extended
ack message attribute.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 drivers/net/ethernet/netronome/nfp/nfp_net.h       |  3 ++-
 .../net/ethernet/netronome/nfp/nfp_net_common.c    | 22 +++++++++++++---------
 .../net/ethernet/netronome/nfp/nfp_net_ethtool.c   |  4 ++--
 3 files changed, 17 insertions(+), 12 deletions(-)

diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net.h b/drivers/net/ethernet/netronome/nfp/nfp_net.h
index 38b41fdeaa8f..fcf81b3be830 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net.h
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net.h
@@ -818,7 +818,8 @@ nfp_net_irqs_assign(struct nfp_net *nn, struct msix_entry *irq_entries,
 		    unsigned int n);
 
 struct nfp_net_dp *nfp_net_clone_dp(struct nfp_net *nn);
-int nfp_net_ring_reconfig(struct nfp_net *nn, struct nfp_net_dp *new);
+int nfp_net_ring_reconfig(struct nfp_net *nn, struct nfp_net_dp *new,
+			  struct netlink_ext_ack *extack);
 
 bool nfp_net_link_changed_read_clear(struct nfp_net *nn);
 int nfp_net_refresh_eth_port(struct nfp_net *nn);
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
index b9f3548bb65f..db20376260f5 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_common.c
@@ -2524,24 +2524,27 @@ struct nfp_net_dp *nfp_net_clone_dp(struct nfp_net *nn)
 	return new;
 }
 
-static int nfp_net_check_config(struct nfp_net *nn, struct nfp_net_dp *dp)
+static int
+nfp_net_check_config(struct nfp_net *nn, struct nfp_net_dp *dp,
+		     struct netlink_ext_ack *extack)
 {
 	/* XDP-enabled tests */
 	if (!dp->xdp_prog)
 		return 0;
 	if (dp->fl_bufsz > PAGE_SIZE) {
-		nn_warn(nn, "MTU too large w/ XDP enabled\n");
+		NL_MOD_TRY_SET_ERR_MSG(extack, "MTU too large w/ XDP enabled");
 		return -EINVAL;
 	}
 	if (dp->num_tx_rings > nn->max_tx_rings) {
-		nn_warn(nn, "Insufficient number of TX rings w/ XDP enabled\n");
+		NL_MOD_TRY_SET_ERR_MSG(extack, "Insufficient number of TX rings w/ XDP enabled");
 		return -EINVAL;
 	}
 
 	return 0;
 }
 
-int nfp_net_ring_reconfig(struct nfp_net *nn, struct nfp_net_dp *dp)
+int nfp_net_ring_reconfig(struct nfp_net *nn, struct nfp_net_dp *dp,
+			  struct netlink_ext_ack *extack)
 {
 	int r, err;
 
@@ -2553,7 +2556,7 @@ int nfp_net_ring_reconfig(struct nfp_net *nn, struct nfp_net_dp *dp)
 
 	dp->num_r_vecs = max(dp->num_rx_rings, dp->num_stack_tx_rings);
 
-	err = nfp_net_check_config(nn, dp);
+	err = nfp_net_check_config(nn, dp, extack);
 	if (err)
 		goto exit_free_dp;
 
@@ -2628,7 +2631,7 @@ static int nfp_net_change_mtu(struct net_device *netdev, int new_mtu)
 
 	dp->mtu = new_mtu;
 
-	return nfp_net_ring_reconfig(nn, dp);
+	return nfp_net_ring_reconfig(nn, dp, NULL);
 }
 
 static void nfp_net_stat64(struct net_device *netdev,
@@ -2944,9 +2947,10 @@ static int nfp_net_xdp_offload(struct nfp_net *nn, struct bpf_prog *prog)
 	return ret;
 }
 
-static int nfp_net_xdp_setup(struct nfp_net *nn, struct bpf_prog *prog)
+static int nfp_net_xdp_setup(struct nfp_net *nn, struct netdev_xdp *xdp)
 {
 	struct bpf_prog *old_prog = nn->dp.xdp_prog;
+	struct bpf_prog *prog = xdp->prog;
 	struct nfp_net_dp *dp;
 	int err;
 
@@ -2969,7 +2973,7 @@ static int nfp_net_xdp_setup(struct nfp_net *nn, struct bpf_prog *prog)
 	dp->rx_dma_off = prog ? XDP_PACKET_HEADROOM - nn->dp.rx_offset : 0;
 
 	/* We need RX reconfig to remap the buffers (BIDIR vs FROM_DEV) */
-	err = nfp_net_ring_reconfig(nn, dp);
+	err = nfp_net_ring_reconfig(nn, dp, xdp->extack);
 	if (err)
 		return err;
 
@@ -2987,7 +2991,7 @@ static int nfp_net_xdp(struct net_device *netdev, struct netdev_xdp *xdp)
 
 	switch (xdp->command) {
 	case XDP_SETUP_PROG:
-		return nfp_net_xdp_setup(nn, xdp->prog);
+		return nfp_net_xdp_setup(nn, xdp);
 	case XDP_QUERY_PROG:
 		xdp->prog_attached = !!nn->dp.xdp_prog;
 		return 0;
diff --git a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
index a704efd4e314..abbb47e60cc3 100644
--- a/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
+++ b/drivers/net/ethernet/netronome/nfp/nfp_net_ethtool.c
@@ -309,7 +309,7 @@ static int nfp_net_set_ring_size(struct nfp_net *nn, u32 rxd_cnt, u32 txd_cnt)
 	dp->rxd_cnt = rxd_cnt;
 	dp->txd_cnt = txd_cnt;
 
-	return nfp_net_ring_reconfig(nn, dp);
+	return nfp_net_ring_reconfig(nn, dp, NULL);
 }
 
 static int nfp_net_set_ringparam(struct net_device *netdev,
@@ -880,7 +880,7 @@ static int nfp_net_set_num_rings(struct nfp_net *nn, unsigned int total_rx,
 	if (dp->xdp_prog)
 		dp->num_tx_rings += total_rx;
 
-	return nfp_net_ring_reconfig(nn, dp);
+	return nfp_net_ring_reconfig(nn, dp, NULL);
 }
 
 static int nfp_net_set_channels(struct net_device *netdev,
-- 
2.11.0


* [PATCH net-next 1/4] netlink: add NULL-friendly helper for setting extended ACK message
From: Jakub Kicinski @ 2017-05-01  4:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, johannes, dsa, daniel, alexei.starovoitov, bblanco,
	john.fastabend, kubakici, oss-drivers, brouer, jhs,
	Jakub Kicinski
In-Reply-To: <20170501044648.13022-1-jakub.kicinski@netronome.com>

As we propagate extended ack reporting throughout various paths in
the kernel it may be that the same function is called with the
extended ack parameter passed as NULL.  One place where that happens
is in drivers which have a centralized reconfiguration function
called both from ndos and from ethtool_ops.  Add a new helper for
setting the error message in such conditions.

The existing helper is left as is to encourage propagating the ext ack
fully wherever possible.  It also makes it clear in the code which
messages may be lost due to ext ack being NULL.

Signed-off-by: Jakub Kicinski <jakub.kicinski@netronome.com>
---
 include/linux/netlink.h | 8 ++++++++
 1 file changed, 8 insertions(+)

diff --git a/include/linux/netlink.h b/include/linux/netlink.h
index 8d2a8924705c..c20395edf2de 100644
--- a/include/linux/netlink.h
+++ b/include/linux/netlink.h
@@ -92,6 +92,14 @@ struct netlink_ext_ack {
 	(extack)->_msg = _msg;			\
 } while (0)

+#define NL_MOD_TRY_SET_ERR_MSG(extack, msg) do {		\
+	static const char _msg[] = KBUILD_MODNAME ": " msg;	\
+	struct netlink_ext_ack *_extack = (extack);		\
+								\
+	if (_extack)						\
+		_extack->_msg = _msg;				\
+} while (0)
+
 extern void netlink_kernel_release(struct sock *sk);
 extern int __netlink_change_ngroups(struct sock *sk, unsigned int groups);
 extern int netlink_change_ngroups(struct sock *sk, unsigned int groups);
-- 
2.11.0
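
For illustration only (not part of the patch): a userspace mock of the NULL-friendly helper, showing one central function called both with and without an extended ack. DEMO_MODNAME stands in for KBUILD_MODNAME, and struct demo_ext_ack, DEMO_TRY_SET_ERR_MSG() and demo_reconfig() are hypothetical names made up for this sketch.

```c
#include <errno.h>
#include <stddef.h>

/* Stands in for KBUILD_MODNAME, which the build system normally provides. */
#define DEMO_MODNAME "demo_drv"

/* Hypothetical mock of the extended ack: it only carries a message pointer. */
struct demo_ext_ack {
	const char *msg;
};

/*
 * Adjacent string literals are concatenated at compile time, so the
 * module prefix costs nothing at run time.  The extack argument is
 * evaluated once and may be NULL.
 */
#define DEMO_TRY_SET_ERR_MSG(extack, literal) do {			\
	static const char demo_msg_[] = DEMO_MODNAME ": " literal;	\
	struct demo_ext_ack *demo_ack_ = (extack);			\
									\
	if (demo_ack_)							\
		demo_ack_->msg = demo_msg_;				\
} while (0)

/*
 * Central reconfiguration check.  A netlink path passes a real ack,
 * an ethtool-like path passes NULL and only gets the error code.
 */
static int demo_reconfig(unsigned int rings, unsigned int max_rings,
			 struct demo_ext_ack *extack)
{
	if (rings > max_rings) {
		DEMO_TRY_SET_ERR_MSG(extack, "too many rings requested");
		return -EINVAL;
	}

	return 0;
}
```

In the NULL case the message is silently dropped, which is the trade-off the commit message describes.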


* [PATCH net-next 0/4] xdp: use netlink extended ACK reporting
From: Jakub Kicinski @ 2017-05-01  4:46 UTC (permalink / raw)
  To: netdev
  Cc: davem, johannes, dsa, daniel, alexei.starovoitov, bblanco,
	john.fastabend, kubakici, oss-drivers, brouer, jhs,
	Jakub Kicinski

Hi!

This series is an attempt to make XDP more user friendly by
using the recently added netlink extended ACK reporting to
carry messages to user space.

David Ahern's iproute2 ext ack patches for ip link are sufficient
to show errors like this:

# ip link set dev p4p1 xdp obj ipip_prepend.o sec ".text"
Error: nfp: MTU too large w/ XDP enabled

The message comes directly from the driver.  There could
still be a bit of a leap for a complete novice from the message
above to the right settings, but it's a big improvement over the
standard "Invalid argument" message.

v1/non-rfc:
 - add a separate macro in patch 1;
 - add KBUILD_MODNAME as part of the message (Daniel);
 - don't print the error to logs in patch 1.

Jakub Kicinski (4):
  netlink: add NULL-friendly helper for setting extended ACK message
  xdp: propagate extended ack to XDP setup
  nfp: make use of extended ack message reporting
  virtio_net: make use of extended ack message reporting

 drivers/net/ethernet/netronome/nfp/nfp_net.h       |  3 ++-
 .../net/ethernet/netronome/nfp/nfp_net_common.c    | 22 +++++++++++++---------
 .../net/ethernet/netronome/nfp/nfp_net_ethtool.c   |  4 ++--
 drivers/net/virtio_net.c                           | 11 +++++++----
 include/linux/netdevice.h                          | 10 ++++++++--
 include/linux/netlink.h                            |  8 ++++++++
 net/core/dev.c                                     |  5 ++++-
 net/core/rtnetlink.c                               | 13 ++++++++-----
 8 files changed, 52 insertions(+), 24 deletions(-)

-- 
2.11.0


* Re: [PATCH net-next] mlxsw: spectrum_router: Simplify VRF enslavement
From: David Miller @ 2017-05-01  3:04 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, mlxsw
In-Reply-To: <20170430164714.7303-1-idosch@mellanox.com>

From: <idosch@mellanox.com>
Date: Sun, 30 Apr 2017 19:47:14 +0300

> From: Ido Schimmel <idosch@mellanox.com>
> 
> When a netdev is enslaved to a VRF master, its router interface (RIF)
> needs to be destroyed (if it exists) and a new one created using the
> corresponding virtual router (VR).
> 
> From the driver's perspective, the above is equivalent to an inetaddr
> event sent for this netdev. Therefore, when a port netdev (or one of
> its uppers) is enslaved to a VRF master, call the same function that
> would've been called had a NETDEV_UP been sent for this netdev in the
> inetaddr notification chain.
> 
> This patch also fixes a bug when a LAG netdev with an existing RIF is
> enslaved to a VRF. Before this patch, each LAG port would drop the
> reference on the RIF, but would re-join the same one (in the wrong VR)
> soon after. With this patch, the corresponding RIF is first destroyed
> and a new one is created using the correct VR.
> 
> Fixes: 7179eb5acd59 ("mlxsw: spectrum_router: Add support for VRFs")
> Signed-off-by: Ido Schimmel <idosch@mellanox.com>
> Reviewed-by: Jiri Pirko <jiri@mellanox.com>

Applied, thanks.


* Re: pull request: bluetooth-next 2017-04-30
From: David Miller @ 2017-05-01  3:03 UTC (permalink / raw)
  To: johan.hedberg; +Cc: linux-bluetooth, netdev
In-Reply-To: <20170430140928.GA2753@x1c>

From: Johan Hedberg <johan.hedberg@gmail.com>
Date: Sun, 30 Apr 2017 17:09:28 +0300

> Here's one last batch of Bluetooth patches in the bluetooth-next tree
> targeting the 4.12 kernel.
> 
>  - Remove custom ECDH implementation and use new KPP API instead
>  - Add protocol checks to hci_ldisc
>  - Add module license to HCI UART Nokia H4+ driver
>  - Minor fix for 32bit user space - 64 bit kernel combination
> 
> Please let me know if there are any issues pulling. Thanks.

Pulled, thanks Johan.


* Re: [pull request][net-next 00/15] Mellanox, mlx5 updates 2017-04-30
From: David Miller @ 2017-05-01  3:02 UTC (permalink / raw)
  To: saeedm; +Cc: netdev, ogerlitz, hadarh, ilyal, roid
In-Reply-To: <20170430132016.27012-1-saeedm@mellanox.com>

From: Saeed Mahameed <saeedm@mellanox.com>
Date: Sun, 30 Apr 2017 16:20:01 +0300

> This series contains two sets of patches to the mlx5 driver,
> 1. Nine patches (mostly from Hadar) to add 'mlx5 neigh update' feature.
> 2. Six misc patches.
> 
> For more details please see below.
> 
> Sorry for the last minute submission, originally I planned to submit before
> weekend, but in order to provide clean patches, we had to deal with some
> auto build issues first.
> 
> Please pull and let me know if there's any problem.

Pulled, thanks.


* Re: [PATCH net-next] qed: Prevent warning without CONFIG_RFS_ACCEL
From: David Miller @ 2017-05-01  3:01 UTC (permalink / raw)
  To: Yuval.Mintz; +Cc: netdev, Sudarsana.Kalluru
In-Reply-To: <1493543684-7038-1-git-send-email-Yuval.Mintz@cavium.com>

From: Yuval Mintz <Yuval.Mintz@cavium.com>
Date: Sun, 30 Apr 2017 12:14:44 +0300

> After removing the PTP related initialization from slowpath start,
> the remaining PTT entry is required only in case CONFIG_RFS_ACCEL is set.
> Otherwise, it leads to a warning due to it being unused.
> 
> Fixes: d179bd1699fc ("qed: Acquire/release ptt_ptp lock when enabling/disabling PTP")
> Signed-off-by: Yuval Mintz <Yuval.Mintz@cavium.com>

Also applied, thanks.


* Re: [PATCH net-next 0/6] qed: RoCE related pseudo-fixes
From: David Miller @ 2017-05-01  2:59 UTC (permalink / raw)
  To: Yuval.Mintz; +Cc: netdev, Ram.Amrani
In-Reply-To: <1493542150-21826-1-git-send-email-Yuval.Mintz@cavium.com>

From: Yuval Mintz <Yuval.Mintz@cavium.com>
Date: Sun, 30 Apr 2017 11:49:04 +0300

> This series contains multiple small corrections to the RoCE logic
> in qed plus some debug information and inter-module parameter
> meant to prevent issues further along.
> 
>  - #1, #6 Share information with protocol driver
>    [either new or filling missing bits in existing API].
>  - #2, #3 correct error flows in qed.
>  - #4 add debug related information.
>  - #5 fixes a minor issue in the HW configuration.

Series applied, thanks.


* Re: [PATCH net-next] bpf: enhance verifier to understand stack pointer arithmetic
From: David Miller @ 2017-05-01  2:57 UTC (permalink / raw)
  To: ast; +Cc: daniel, netdev, kernel-team
In-Reply-To: <20170430055242.2536070-1-ast@fb.com>

From: Alexei Starovoitov <ast@fb.com>
Date: Sat, 29 Apr 2017 22:52:42 -0700

> From: Yonghong Song <yhs@fb.com>
> 
> llvm 4.0 and above generate code like the following:
> ....
> 440: (b7) r1 = 15
> 441: (05) goto pc+73
> 515: (79) r6 = *(u64 *)(r10 -152)
> 516: (bf) r7 = r10
> 517: (07) r7 += -112
> 518: (bf) r2 = r7
> 519: (0f) r2 += r1
> 520: (71) r1 = *(u8 *)(r8 +0)
> 521: (73) *(u8 *)(r2 +45) = r1
> ....
> and the verifier complains "R2 invalid mem access 'inv'" for insn #521.
> This is because the verifier marks register r2 as an unknown value after
> #519, where r2 is a stack pointer and r1 holds a constant value.
> 
> Teach the verifier to recognize "stack_ptr + imm" and
> "stack_ptr + reg with const val" as a valid stack_ptr with a new offset.
> 
> Signed-off-by: Yonghong Song <yhs@fb.com>
> Acked-by: Martin KaFai Lau <kafai@fb.com>
> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>
> ---
> technically it's 'net' material, but it's too late for 'net',
> hence 'net-next' tag.
> No 'Fixes' tag, since it's only seen with newer llvm.

Applied to net-next, but I'll queue this up to -stable.
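
For illustration only: a toy model of the rule described in the quoted commit message, replaying instructions 516 to 521 from the listing. This is not the kernel verifier's data structure or algorithm. The register types, the helper names and the 512-byte stack bound are all assumptions made for the sketch.

```c
#include <stdint.h>

/* Assumed stack size for the toy model. */
#define DEMO_STACK_SIZE 512

enum demo_reg_type { DEMO_UNKNOWN, DEMO_CONST_IMM, DEMO_PTR_TO_STACK };

struct demo_reg {
	enum demo_reg_type type;
	int64_t val;	/* constant value, or offset from the frame pointer */
};

/* dst += imm: a constant stays a constant, a stack pointer stays one. */
static void demo_add_imm(struct demo_reg *dst, int64_t imm)
{
	if (dst->type == DEMO_UNKNOWN)
		return;
	dst->val += imm;
}

/* dst += src: only a source with a known constant value keeps dst tracked. */
static void demo_add_reg(struct demo_reg *dst, const struct demo_reg *src)
{
	if (src->type == DEMO_CONST_IMM) {
		demo_add_imm(dst, src->val);
		return;
	}
	dst->type = DEMO_UNKNOWN;
	dst->val = 0;
}

/* An access is fine if it lies entirely inside the stack frame. */
static int demo_stack_access_ok(const struct demo_reg *reg, int64_t off,
				int64_t size)
{
	int64_t start;

	if (reg->type != DEMO_PTR_TO_STACK)
		return 0;
	start = reg->val + off;
	return start >= -DEMO_STACK_SIZE && start + size <= 0;
}

/* Returns the final stack offset of the store, or 1 if it is rejected. */
static int64_t demo_replay(void)
{
	struct demo_reg r1 = { DEMO_CONST_IMM, 15 };	/* 440: r1 = 15 */
	struct demo_reg r10 = { DEMO_PTR_TO_STACK, 0 };	/* frame pointer */
	struct demo_reg r7 = r10;			/* 516: r7 = r10 */
	struct demo_reg r2;

	demo_add_imm(&r7, -112);			/* 517: r7 += -112 */
	r2 = r7;					/* 518: r2 = r7 */
	demo_add_reg(&r2, &r1);				/* 519: r2 += r1 */
	if (!demo_stack_access_ok(&r2, 45, 1))		/* 521: *(u8 *)(r2 +45) */
		return 1;

	return r2.val + 45;
}
```

In this model the store lands at -112 + 15 + 45 = -52 relative to the frame pointer, so it is accepted. If the addition at 519 degraded r2 to an unknown value, the access check would fail.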
