* [PATCH net-next 0/9] warning cleanups
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
This series addresses a number of warnings in common networking
code visible when kernel is built with W=1
The DCB patch needs review from John Fastbend the original author
of the netlink interface since it adds missing checks.
Stephen Hemminger (9):
dcb: enforce minimum length on IEEE_APPS attribute
ila: propagate error code in ila_output
udp: make local function static
inet: fix warning about missing prototype
tcpnv: do not export local function
xfrm: make xfrm_dev_register static
fou: make local function static
ipv6: drop unused variables in seg6_genl_dumphac
ipv6: remove unused variables in esp6
net/dcb/dcbnl.c | 11 ++++--
net/ipv4/fou.c | 82 ++++++++++++++++++++---------------------
net/ipv4/inet_connection_sock.c | 1 +
net/ipv4/tcp_nv.c | 5 +--
net/ipv4/udp.c | 2 +-
net/ipv6/esp6.c | 5 ---
net/ipv6/fou6.c | 14 +++----
net/ipv6/ila/ila_lwt.c | 2 +-
net/ipv6/seg6.c | 4 --
net/ipv6/udp.c | 2 +-
net/xfrm/xfrm_device.c | 2 +-
11 files changed, 61 insertions(+), 69 deletions(-)
--
2.11.0
^ permalink raw reply
* [PATCH net-next 1/9] dcb: enforce minimum length on IEEE_APPS attribute
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20170519165556.483-1-sthemmin@microsoft.com>
Found by reviewing the warning about unused policy table.
The code implies that it meant to check for size, but since
it unrolled the loop for attribute validation that is never used.
Instead do explicit check for attribute.
Compile tested only. Needs review by original author.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
net/dcb/dcbnl.c | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/net/dcb/dcbnl.c b/net/dcb/dcbnl.c
index 93106120f987..733f523707ac 100644
--- a/net/dcb/dcbnl.c
+++ b/net/dcb/dcbnl.c
@@ -178,10 +178,6 @@ static const struct nla_policy dcbnl_ieee_policy[DCB_ATTR_IEEE_MAX + 1] = {
[DCB_ATTR_IEEE_QCN_STATS] = {.len = sizeof(struct ieee_qcn_stats)},
};
-static const struct nla_policy dcbnl_ieee_app[DCB_ATTR_IEEE_APP_MAX + 1] = {
- [DCB_ATTR_IEEE_APP] = {.len = sizeof(struct dcb_app)},
-};
-
/* DCB number of traffic classes nested attributes. */
static const struct nla_policy dcbnl_featcfg_nest[DCB_FEATCFG_ATTR_MAX + 1] = {
[DCB_FEATCFG_ATTR_ALL] = {.type = NLA_FLAG},
@@ -1463,8 +1459,15 @@ static int dcbnl_ieee_set(struct net_device *netdev, struct nlmsghdr *nlh,
nla_for_each_nested(attr, ieee[DCB_ATTR_IEEE_APP_TABLE], rem) {
struct dcb_app *app_data;
+
if (nla_type(attr) != DCB_ATTR_IEEE_APP)
continue;
+
+ if (nla_len(attr) < sizeof(struct dcb_app)) {
+ err = -ERANGE;
+ goto err;
+ }
+
app_data = nla_data(attr);
if (ops->ieee_setapp)
err = ops->ieee_setapp(netdev, app_data);
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 2/9] ila: propagate error code in ila_output
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20170519165556.483-1-sthemmin@microsoft.com>
This warning:
net/ipv6/ila/ila_lwt.c: In function ‘ila_output’:
net/ipv6/ila/ila_lwt.c:42:6: warning: variable ‘err’ set but not used [-Wunused-but-set-variable]
It looks like the code attempts to set propagate different error
values, but always returned -EINVAL.
Compile tested only. Needs review by original author.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
net/ipv6/ila/ila_lwt.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/ipv6/ila/ila_lwt.c b/net/ipv6/ila/ila_lwt.c
index b3df03e3faa0..f4a413aba423 100644
--- a/net/ipv6/ila/ila_lwt.c
+++ b/net/ipv6/ila/ila_lwt.c
@@ -91,7 +91,7 @@ static int ila_output(struct net *net, struct sock *sk, struct sk_buff *skb)
drop:
kfree_skb(skb);
- return -EINVAL;
+ return err;
}
static int ila_input(struct sk_buff *skb)
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 3/9] udp: make local function static
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20170519165556.483-1-sthemmin@microsoft.com>
udp_queue_rcv_skb was global but only used in one file.
Identified by this warning:
net/ipv4/udp.c:1775:5: warning: no previous prototype for ‘udp_queue_rcv_skb’ [-Wmissing-prototypes]
int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
net/ipv4/udp.c | 2 +-
net/ipv6/udp.c | 2 +-
2 files changed, 2 insertions(+), 2 deletions(-)
diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
index e7b6cfcca627..8d7980ca27a0 100644
--- a/net/ipv4/udp.c
+++ b/net/ipv4/udp.c
@@ -1772,7 +1772,7 @@ EXPORT_SYMBOL(udp_encap_enable);
* Note that in the success and error cases, the skb is assumed to
* have either been requeued or freed.
*/
-int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
+static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
{
struct udp_sock *up = udp_sk(sk);
int is_udplite = IS_UDPLITE(sk);
diff --git a/net/ipv6/udp.c b/net/ipv6/udp.c
index f78fdf8c9f0f..d558c1fef6fd 100644
--- a/net/ipv6/udp.c
+++ b/net/ipv6/udp.c
@@ -570,7 +570,7 @@ void udpv6_encap_enable(void)
}
EXPORT_SYMBOL(udpv6_encap_enable);
-int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
+static int udpv6_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
{
struct udp_sock *up = udp_sk(sk);
int is_udplite = IS_UDPLITE(sk);
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 4/9] inet: fix warning about missing prototype
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20170519165556.483-1-sthemmin@microsoft.com>
The prototype for inet_rcv_saddr_equal was not being included.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
net/ipv4/inet_connection_sock.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/net/ipv4/inet_connection_sock.c b/net/ipv4/inet_connection_sock.c
index 1054d330bf9d..82dec8825d28 100644
--- a/net/ipv4/inet_connection_sock.c
+++ b/net/ipv4/inet_connection_sock.c
@@ -25,6 +25,7 @@
#include <net/xfrm.h>
#include <net/tcp.h>
#include <net/sock_reuseport.h>
+#include <net/addrconf.h>
#ifdef INET_CSK_DEBUG
const char inet_csk_timer_bug_msg[] = "inet_csk BUG: unknown timer value\n";
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 5/9] tcpnv: do not export local function
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20170519165556.483-1-sthemmin@microsoft.com>
The TCP New Vegas congestion control was exporting an internal
function tcpnv_get_info which is not used by any other in tree
kernel code. Make it static.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
net/ipv4/tcp_nv.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/net/ipv4/tcp_nv.c b/net/ipv4/tcp_nv.c
index 5de82a8d4d87..6d650ed3cb59 100644
--- a/net/ipv4/tcp_nv.c
+++ b/net/ipv4/tcp_nv.c
@@ -424,8 +424,8 @@ static void tcpnv_acked(struct sock *sk, const struct ack_sample *sample)
}
/* Extract info for Tcp socket info provided via netlink */
-size_t tcpnv_get_info(struct sock *sk, u32 ext, int *attr,
- union tcp_cc_info *info)
+static size_t tcpnv_get_info(struct sock *sk, u32 ext, int *attr,
+ union tcp_cc_info *info)
{
const struct tcpnv *ca = inet_csk_ca(sk);
@@ -440,7 +440,6 @@ size_t tcpnv_get_info(struct sock *sk, u32 ext, int *attr,
}
return 0;
}
-EXPORT_SYMBOL_GPL(tcpnv_get_info);
static struct tcp_congestion_ops tcpnv __read_mostly = {
.init = tcpnv_init,
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 7/9] fou: make local function static
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20170519165556.483-1-sthemmin@microsoft.com>
The build header functions are not used by any other code.
net/ipv6/fou6.c:36:5: warning: no previous prototype for ‘fou6_build_header’ [-Wmissing-prototypes]
net/ipv6/fou6.c:54:5: warning: no previous prototype for ‘gue6_build_header’ [-Wmissing-prototypes]
Need to do some code rearranging to satisfy different Kconfig possiblities.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
net/ipv4/fou.c | 82 ++++++++++++++++++++++++++++-----------------------------
net/ipv6/fou6.c | 14 +++++-----
2 files changed, 47 insertions(+), 49 deletions(-)
diff --git a/net/ipv4/fou.c b/net/ipv4/fou.c
index 805f6607f8d9..8e0257d01200 100644
--- a/net/ipv4/fou.c
+++ b/net/ipv4/fou.c
@@ -8,6 +8,7 @@
#include <linux/kernel.h>
#include <net/genetlink.h>
#include <net/gue.h>
+#include <net/fou.h>
#include <net/ip.h>
#include <net/protocol.h>
#include <net/udp.h>
@@ -859,25 +860,6 @@ size_t gue_encap_hlen(struct ip_tunnel_encap *e)
}
EXPORT_SYMBOL(gue_encap_hlen);
-static void fou_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
- struct flowi4 *fl4, u8 *protocol, __be16 sport)
-{
- struct udphdr *uh;
-
- skb_push(skb, sizeof(struct udphdr));
- skb_reset_transport_header(skb);
-
- uh = udp_hdr(skb);
-
- uh->dest = e->dport;
- uh->source = sport;
- uh->len = htons(skb->len);
- udp_set_csum(!(e->flags & TUNNEL_ENCAP_FLAG_CSUM), skb,
- fl4->saddr, fl4->daddr, skb->len);
-
- *protocol = IPPROTO_UDP;
-}
-
int __fou_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
u8 *protocol, __be16 *sport, int type)
{
@@ -894,24 +876,6 @@ int __fou_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
}
EXPORT_SYMBOL(__fou_build_header);
-int fou_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
- u8 *protocol, struct flowi4 *fl4)
-{
- int type = e->flags & TUNNEL_ENCAP_FLAG_CSUM ? SKB_GSO_UDP_TUNNEL_CSUM :
- SKB_GSO_UDP_TUNNEL;
- __be16 sport;
- int err;
-
- err = __fou_build_header(skb, e, protocol, &sport, type);
- if (err)
- return err;
-
- fou_build_udp(skb, e, fl4, protocol, sport);
-
- return 0;
-}
-EXPORT_SYMBOL(fou_build_header);
-
int __gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
u8 *protocol, __be16 *sport, int type)
{
@@ -985,8 +949,46 @@ int __gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
}
EXPORT_SYMBOL(__gue_build_header);
-int gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
- u8 *protocol, struct flowi4 *fl4)
+#ifdef CONFIG_NET_FOU_IP_TUNNELS
+
+static void fou_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
+ struct flowi4 *fl4, u8 *protocol, __be16 sport)
+{
+ struct udphdr *uh;
+
+ skb_push(skb, sizeof(struct udphdr));
+ skb_reset_transport_header(skb);
+
+ uh = udp_hdr(skb);
+
+ uh->dest = e->dport;
+ uh->source = sport;
+ uh->len = htons(skb->len);
+ udp_set_csum(!(e->flags & TUNNEL_ENCAP_FLAG_CSUM), skb,
+ fl4->saddr, fl4->daddr, skb->len);
+
+ *protocol = IPPROTO_UDP;
+}
+
+static int fou_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+ u8 *protocol, struct flowi4 *fl4)
+{
+ int type = e->flags & TUNNEL_ENCAP_FLAG_CSUM ? SKB_GSO_UDP_TUNNEL_CSUM :
+ SKB_GSO_UDP_TUNNEL;
+ __be16 sport;
+ int err;
+
+ err = __fou_build_header(skb, e, protocol, &sport, type);
+ if (err)
+ return err;
+
+ fou_build_udp(skb, e, fl4, protocol, sport);
+
+ return 0;
+}
+
+static int gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+ u8 *protocol, struct flowi4 *fl4)
{
int type = e->flags & TUNNEL_ENCAP_FLAG_CSUM ? SKB_GSO_UDP_TUNNEL_CSUM :
SKB_GSO_UDP_TUNNEL;
@@ -1001,9 +1003,7 @@ int gue_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
return 0;
}
-EXPORT_SYMBOL(gue_build_header);
-#ifdef CONFIG_NET_FOU_IP_TUNNELS
static const struct ip_tunnel_encap_ops fou_iptun_ops = {
.encap_hlen = fou_encap_hlen,
diff --git a/net/ipv6/fou6.c b/net/ipv6/fou6.c
index 9ea249b9451e..6de3c04b0f30 100644
--- a/net/ipv6/fou6.c
+++ b/net/ipv6/fou6.c
@@ -14,6 +14,8 @@
#include <net/udp.h>
#include <net/udp_tunnel.h>
+#if IS_ENABLED(CONFIG_IPV6_FOU_TUNNEL)
+
static void fou6_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
struct flowi6 *fl6, u8 *protocol, __be16 sport)
{
@@ -33,8 +35,8 @@ static void fou6_build_udp(struct sk_buff *skb, struct ip_tunnel_encap *e,
*protocol = IPPROTO_UDP;
}
-int fou6_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
- u8 *protocol, struct flowi6 *fl6)
+static int fou6_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+ u8 *protocol, struct flowi6 *fl6)
{
__be16 sport;
int err;
@@ -49,10 +51,9 @@ int fou6_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
return 0;
}
-EXPORT_SYMBOL(fou6_build_header);
-int gue6_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
- u8 *protocol, struct flowi6 *fl6)
+static int gue6_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
+ u8 *protocol, struct flowi6 *fl6)
{
__be16 sport;
int err;
@@ -67,9 +68,6 @@ int gue6_build_header(struct sk_buff *skb, struct ip_tunnel_encap *e,
return 0;
}
-EXPORT_SYMBOL(gue6_build_header);
-
-#if IS_ENABLED(CONFIG_IPV6_FOU_TUNNEL)
static const struct ip6_tnl_encap_ops fou_ip6tun_ops = {
.encap_hlen = fou_encap_hlen,
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 8/9] ipv6: drop unused variables in seg6_genl_dumphac
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20170519165556.483-1-sthemmin@microsoft.com>
THe seg6_pernet_data variable was set but never used.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
net/ipv6/seg6.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/net/ipv6/seg6.c b/net/ipv6/seg6.c
index 5f44ffed2576..15fba55e3da8 100644
--- a/net/ipv6/seg6.c
+++ b/net/ipv6/seg6.c
@@ -303,13 +303,9 @@ static int seg6_genl_dumphmac_done(struct netlink_callback *cb)
static int seg6_genl_dumphmac(struct sk_buff *skb, struct netlink_callback *cb)
{
struct rhashtable_iter *iter = (struct rhashtable_iter *)cb->args[0];
- struct net *net = sock_net(skb->sk);
- struct seg6_pernet_data *sdata;
struct seg6_hmac_info *hinfo;
int ret;
- sdata = seg6_pernet(net);
-
ret = rhashtable_walk_start(iter);
if (ret && ret != -EAGAIN)
goto done;
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 6/9] xfrm: make xfrm_dev_register static
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20170519165556.483-1-sthemmin@microsoft.com>
This function is only used in this file and should not be global.
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
net/xfrm/xfrm_device.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/xfrm/xfrm_device.c b/net/xfrm/xfrm_device.c
index 8ec8a3fcf8d4..50ec73399b48 100644
--- a/net/xfrm/xfrm_device.c
+++ b/net/xfrm/xfrm_device.c
@@ -138,7 +138,7 @@ bool xfrm_dev_offload_ok(struct sk_buff *skb, struct xfrm_state *x)
}
EXPORT_SYMBOL_GPL(xfrm_dev_offload_ok);
-int xfrm_dev_register(struct net_device *dev)
+static int xfrm_dev_register(struct net_device *dev)
{
if ((dev->features & NETIF_F_HW_ESP) && !dev->xfrmdev_ops)
return NOTIFY_BAD;
--
2.11.0
^ permalink raw reply related
* [PATCH net-next 9/9] ipv6: remove unused variables in esp6
From: Stephen Hemminger @ 2017-05-19 16:55 UTC (permalink / raw)
To: davem; +Cc: netdev, Stephen Hemminger
In-Reply-To: <20170519165556.483-1-sthemmin@microsoft.com>
Resolves warnings:
net/ipv6/esp6.c: In function ‘esp_ssg_unref’:
net/ipv6/esp6.c:121:10: warning: variable ‘seqhi’ set but not used [-Wunused-but-set-variable]
net/ipv6/esp6.c: In function ‘esp6_output_head’:
net/ipv6/esp6.c:227:21: warning: variable ‘esph’ set but not used [-Wunused-but-set-variable]
Signed-off-by: Stephen Hemminger <sthemmin@microsoft.com>
---
net/ipv6/esp6.c | 5 -----
1 file changed, 5 deletions(-)
diff --git a/net/ipv6/esp6.c b/net/ipv6/esp6.c
index 1fe99ba8066c..53b6b870b935 100644
--- a/net/ipv6/esp6.c
+++ b/net/ipv6/esp6.c
@@ -118,7 +118,6 @@ static inline struct scatterlist *esp_req_sg(struct crypto_aead *aead,
static void esp_ssg_unref(struct xfrm_state *x, void *tmp)
{
- __be32 *seqhi;
struct crypto_aead *aead = x->data;
int seqhilen = 0;
u8 *iv;
@@ -128,7 +127,6 @@ static void esp_ssg_unref(struct xfrm_state *x, void *tmp)
if (x->props.flags & XFRM_STATE_ESN)
seqhilen += sizeof(__be32);
- seqhi = esp_tmp_seqhi(tmp);
iv = esp_tmp_iv(aead, tmp, seqhilen);
req = esp_tmp_req(aead, iv);
@@ -224,12 +222,9 @@ int esp6_output_head(struct xfrm_state *x, struct sk_buff *skb, struct esp_info
u8 *vaddr;
int nfrags;
struct page *page;
- struct ip_esp_hdr *esph;
struct sk_buff *trailer;
int tailen = esp->tailen;
- esph = ip_esp_hdr(skb);
-
if (!skb_cloned(skb)) {
if (tailen <= skb_availroom(skb)) {
nfrags = 1;
--
2.11.0
^ permalink raw reply related
* [PATCH iproute2 1/1] tc: fix Makefile to build skbmod
From: Roman Mashak @ 2017-05-19 17:05 UTC (permalink / raw)
To: stephen; +Cc: netdev, Roman Mashak
Signed-off-by: Roman Mashak <mrv@mojatatu.com>
---
tc/Makefile | 1 +
1 file changed, 1 insertion(+)
diff --git a/tc/Makefile b/tc/Makefile
index 9a6bb1d..678b302 100644
--- a/tc/Makefile
+++ b/tc/Makefile
@@ -45,6 +45,7 @@ TCMODULES += m_nat.o
TCMODULES += m_pedit.o
TCMODULES += m_ife.o
TCMODULES += m_skbedit.o
+TCMODULES += m_skbmod.o
TCMODULES += m_csum.o
TCMODULES += m_simple.o
TCMODULES += m_vlan.o
--
1.9.1
^ permalink raw reply related
* Re: [PATCH net] bridge: fix hello and hold timers starting/stopping
From: Ivan Vecera @ 2017-05-19 17:06 UTC (permalink / raw)
To: Nikolay Aleksandrov; +Cc: Xin Long, network dev, bridge, David Miller
In-Reply-To: <4fb3ce7a-b3d8-b62e-2263-eb3fab2ff2fe@cumulusnetworks.com>
2017-05-19 18:55 GMT+02:00 Nikolay Aleksandrov <nikolay@cumulusnetworks.com>:
> On 5/19/17 7:25 PM, Ivan Vecera wrote:
>>
>> Current bridge code incorrectly handles starting/stopping of hello and
>> hold timers during STP enable/disable.
>>
>> 1. Timers are stopped in br_stp_start() during NO_STP->USER_STP
>> transition. This is not correct as the timers are stopped in NO_STP
>> case.
>
>
> This really is a noop, but ok.
Yes, stopping of stopped timers are safe but confusing.
>> 2. Timers are started in br_stp_stop() during USER_STP->NO_STP transition.
>> This is not also correct as the timers should be stopped in NO_STP
>> state.
>
>
> Indeed, but the actual end result is almost as them being stopped because
> in the timers there are specific checks if the STP == KERNEL_STP (see
> br_transmit_config()) and the hold_timers will simply expire and not rearm
> in any other mode. The only real problem is the hello_timer which continues
> to rearm itself, but with Xin's earlier patch that is taken care of too.
Yes, this is clean-up as well. The starting of timers are more
confusing than dangerous
but from a reader's point of view the starting of timers is non-sense
when STP is
going to be disabled.
>>
>> 3. Timers are NOT stopped in br_stp_stop() during KERNEL_STP->NO_STP
>> transition. They should be stopped as they are running in KERNEL_STP
>> state and should not run in NO_STP case.
>
>
> Same comment as for point 2.
This can be removed... and leave hello_timer handler to stop itself.
>> The patch is a follow-up for "bridge: start hello_timer when enabling
>> KERNEL_STP in br_stp_start" patch from Xin Long.
>>
>
> I'd say this is more of a cleanup/improvement after Xin's patch and thus
> would
> suggest targeting net-next. The only real issue is fixed by his patch.
Agree... will send resend against net-next.
Thanks for comments,
Ivan
^ permalink raw reply
* Re: [RFC net-next PATCH 3/5] net: introduce XDP driver features interface
From: Daniel Borkmann @ 2017-05-19 17:13 UTC (permalink / raw)
To: Jesper Dangaard Brouer, Daniel Borkmann, Alexei Starovoitov
Cc: John Fastabend, netdev
In-Reply-To: <149512210317.14733.15039298820296846614.stgit@firesoul>
On 05/18/2017 05:41 PM, Jesper Dangaard Brouer wrote:
> There is a fundamental difference between normal eBPF programs
> and (XDP) eBPF programs getting attached in a driver. For normal
> eBPF programs it is easy to add a new bpf feature, like a bpf
> helper, because is it strongly tied to the feature being
> available in the current core kernel code. When drivers invoke a
> bpf_prog, then it is not sufficient to simply relying on whether
> a bpf_helper exists or not. When a driver haven't implemented a
> given feature yet, then it is possible to expose uninitialized
> parts of xdp_buff. The driver pass in a pointer to xdp_buff,
> usually "allocated" on the stack, which must not be exposed.
When xdp_buff is being extended, then we should at least zero
initialize all in-tree users that don't support or populate this
field, thus that it's not uninitialized memory. Better would be
to have a way to reject the prog in the first place until it's
implemented (but further comments on feature bits below).
> Only two user visible NETIF_F_XDP_* net_device feature flags are
> exposed via ethtool (-k) seen as "xdp" and "xdp-partial".
> The "xdp-partial" is detected when there is not feature equality
> between kernel and driver, and a netdev_warn is given.
I think having something like a NETIF_F_XDP_BIT for ethtool to
indicate support as "xdp" is quite useful. Avoids having to grep
the kernel tree for ndo_xdp callback. ;) A "xdp-partial" would
still be unclear/confusing to the user whether his program loads
or doesn't which is the only thing a user (or some loading infra)
cares about eventually, so one still needs to go trying to load
the XDP code to see whether that fails for the native case.
> The idea is that XDP_DRV_* feature bits define a contract between
> the driver and the kernel, giving a reliable way to know that XDP
> features a driver promised to implement. Thus, knowing what bpf
> side features are safe to allow.
>
> There are 3 levels of features: "required", "devel" and "optional".
>
> The motivation is pushing driver vendors forward to support all
> the new XDP features. Once a given feature bit is moved into
> the "required" features, the kernel will reject loading XDP
> program if feature isn't implemented by driver. Features under
> developement, require help from the bpf infrastrucure to detect
> when a given helper or direct-access is used, using a bpf_prog
> bit to mark a need for the feature, and pulling in this bit in
> the xdp_features_check(). When all drivers have implemented
> a "devel" feature, it can be moved to the "required" feature and
The problem is that once you add bits markers to bpf_prog like we
used to do in the past, then as you do in patch 4/5 with the
xdp_rxhash_needed bit, they will need to be turned /on/ unconditionally
when a prog has tail calls.
Meaning, xdp_features_check() would then bail out when you have
XDP_DRV_F_RXHASH set. This has the effect that while XDP prog X
was running fine on kernel Y, it will suddenly get rejected on
later kernel (Y + 1), without using the feature (but tail calls).
And more complex networking progs are likely to use tail calls,
so that would break all of them unfortunately as they cannot load
anymore.
> the bpf_prog bit can be refurbished. The "optional" features are
> for things that are handled safely runtime, but drivers will
> still get flagged as "xdp-partial" if not implementing those.
^ permalink raw reply
* Re: [PATCH v2 1/3] bpf: Use 1<<16 as ceiling for immediate alignment in verifier.
From: Edward Cree @ 2017-05-19 17:17 UTC (permalink / raw)
To: Alexei Starovoitov, David Miller, Daniel Borkmann
Cc: alexei.starovoitov, netdev
In-Reply-To: <db8cd2e8-9f51-dc8b-5e55-4c75d13d4858@fb.com>
On 19/05/17 15:55, Alexei Starovoitov wrote:
> On 5/19/17 7:21 AM, Edward Cree wrote:
>> I'm currently translating the algos to C. But for the kernel patch,
>> I'll need to read & understand the existing verifier code, so it
>> might take a while :) (I don't suppose there's any design document
>> or hacking-notes you could point me at?)
>
> Dave just gave a good overview of the verifier:
> https://www.spinics.net/lists/xdp-newbies/msg00185.html
>
> Few more details in
> Documentation/networking/filter.txt
> and in top comments in kernel/bpf/verifier.c
>
> General bpf arch overview:
> http://docs.cilium.io/en/latest/bpf/#instruction-set
Thanks for the links! I'm digging into implementation now.
One question: is there a way to build the verifier as userland code
(or at least as a module), or will I have to reboot every time I
want to test a change?
-Ed
^ permalink raw reply
* Re: [PATCH] net: sched: fix a use-after-free error on chain on the error exit path
From: Cong Wang @ 2017-05-19 17:17 UTC (permalink / raw)
To: Colin King
Cc: Jamal Hadi Salim, Jiri Pirko, David S . Miller,
Linux Kernel Network Developers, kernel-janitors, LKML
In-Reply-To: <20170518140702.6072-1-colin.king@canonical.com>
On Thu, May 18, 2017 at 7:07 AM, Colin King <colin.king@canonical.com> wrote:
> diff --git a/net/sched/cls_api.c b/net/sched/cls_api.c
> index 4020b8d932a1..82ebdc3fcb2e 100644
> --- a/net/sched/cls_api.c
> +++ b/net/sched/cls_api.c
> @@ -511,6 +511,7 @@ static int tc_ctl_tfilter(struct sk_buff *skb, struct nlmsghdr *n,
> if (n->nlmsg_type == RTM_DELTFILTER && prio == 0) {
> tfilter_notify_chain(net, skb, n, chain, RTM_DELTFILTER);
> tcf_chain_destroy(chain);
Jiri, how does this work...? An action could hold a refcnt to a filter
chain, but here you destroy a whole chain without respecting
the refcnt???
> + chain = NULL;
> err = 0;
> goto errout;
Colin, not your fault, I think we may miss something more serious
when reviewing Jiri's patchset. ;)
Thanks.
^ permalink raw reply
* Re: [PATCH net] bridge: fix hello and hold timers starting/stopping
From: Xin Long @ 2017-05-19 17:26 UTC (permalink / raw)
To: Ivan Vecera; +Cc: Nikolay Aleksandrov, network dev, bridge, davem
In-Reply-To: <20170519162543.31670-1-cera@cera.cz>
On Sat, May 20, 2017 at 12:25 AM, Ivan Vecera <cera@cera.cz> wrote:
[...]
> @@ -197,13 +191,14 @@ static void br_stp_stop(struct net_bridge *br)
> br_err(br, "failed to stop userspace STP (%d)\n", err);
>
> /* To start timers on any ports left in blocking */
> - mod_timer(&br->hello_timer, jiffies + br->hello_time);
> - list_for_each_entry(p, &br->port_list, list)
> - mod_timer(&p->hold_timer,
> - round_jiffies(jiffies + BR_HOLD_TIME));
> spin_lock_bh(&br->lock);
> br_port_state_selection(br);
> spin_unlock_bh(&br->lock);
> + } else {
> + /* BR_KERNEL_STP - stop hello and hold timers */
> + del_timer(&br->hello_timer);
> + list_for_each_entry(p, &br->port_list, list)
> + del_timer(&p->hold_timer);
> }
>
> br->stp_enabled = BR_NO_STP;
I have a question here, br->stp_enabled is not atomic, and it is being
changed without holding br->lock here, while it may be checked in
br_hello_timer_expired, is it safe ?
(sorry if I misunderstood or overthought about it)
> --
> 2.13.0
>
^ permalink raw reply
* [PATCH net] fix BUG: scheduling while atomic in netlink broadcast
From: Akshay Narayan @ 2017-05-19 17:22 UTC (permalink / raw)
To: netdev; +Cc: davem
netlink_broadcast_filtered() calls yield() when a slow listener causes
the buffer to fill. yield() is the wrong choice here, as pointed out by
Commit 8e3fabfde4 (sched: Update yield() docs); in some cases, its use causes
"BUG: scheduling while atomic" and, when fewer cores are available,
kernel hangs:
(note: "ccp" is a kernel module which multicasts netlink messages upon
certain TCP events)
May 17 17:33:56 ccp kernel: [ 394.493978] BUG: scheduling while
atomic: iperf/4744/0x00000101
May 17 17:33:56 ccp kernel: [ 394.493979] Modules linked in: ccp(OE)
snd_intel8x0 snd_ac97_codec ac97_bus snd_pcm crct10dif_pclmul
crc32_pclmul ghash_clmulni_intel snd_seq_midi snd_seq_midi_event
snd_rawmidi snd_seq joydev pcbc snd_seq
_device snd_timer snd vboxvideo aesni_intel aes_x86_64 i2c_piix4
input_leds crypto_simd glue_helper cryptd soundcore mac_hid vboxguest
intel_rapl_perf serio_raw ttm drm_kms_helper drm fb_sys_fops
syscopyarea sysfillrect sysimgblt parpor
t_pc ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid
hid psmouse ahci e1000 libahci pata_acpi fjes video [last unloaded:
ccp]
May 17 17:33:56 ccp kernel: [ 394.493997] CPU: 0 PID: 4744 Comm:
iperf Tainted: G W OE 4.10.0-21-generic #23-Ubuntu
May 17 17:33:56 ccp kernel: [ 394.493997] Hardware name: innotek GmbH
VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
May 17 17:33:56 ccp kernel: [ 394.493997] Call Trace:
May 17 17:33:56 ccp kernel: [ 394.493998] <IRQ>
May 17 17:33:56 ccp kernel: [ 394.494000] dump_stack+0x63/0x81
May 17 17:33:56 ccp kernel: [ 394.494001] __schedule_bug+0x54/0x70
May 17 17:33:56 ccp kernel: [ 394.494002] __schedule+0x536/0x6f0
May 17 17:33:56 ccp kernel: [ 394.494004] schedule+0x36/0x80
May 17 17:33:56 ccp kernel: [ 394.494005] sys_sched_yield+0x4f/0x60
May 17 17:33:56 ccp kernel: [ 394.494005] yield+0x33/0x40
May 17 17:33:56 ccp kernel: [ 394.494006]
netlink_broadcast_filtered+0x29b/0x3c0
May 17 17:33:56 ccp kernel: [ 394.494007] netlink_broadcast+0x1d/0x20
...
May 17 17:33:56 ccp kernel: [ 394.494080] softirq: huh, entered
softirq 3 NET_RX ffffffff85fb5c60 with preempt_count 00000101, exited
with 00000000?
Changing this call to cond_resched() prevents this.
---
net/netlink/af_netlink.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/net/netlink/af_netlink.c b/net/netlink/af_netlink.c
index 161b628..f70716c 100644
--- a/net/netlink/af_netlink.c
+++ b/net/netlink/af_netlink.c
@@ -1441,7 +1441,7 @@ int netlink_broadcast_filtered(struct sock *ssk,
struct sk_buff *skb, u32 portid
if (info.delivered) {
if (info.congested && gfpflags_allow_blocking(allocation))
- yield();
+ cond_resched();
return 0;
}
return -ESRCH;
^ permalink raw reply related
* [PATCH net-next v2] bridge: fix hello and hold timers starting/stopping
From: Ivan Vecera @ 2017-05-19 17:30 UTC (permalink / raw)
To: netdev; +Cc: davem, sashok, stephen, bridge, lucien.xin, nikolay
Current bridge code incorrectly handles starting/stopping of hello and
hold timers during STP enable/disable.
1. Timers are stopped in br_stp_start() during NO_STP->USER_STP
transition. The timers are already stopped in NO_STP state so
this is confusing no-op.
2. During USER_STP->NO_STP transition the timers are started. This
does not make sense and is confusion because the timer should not be
active in NO_STP state.
Cc: davem@davemloft.net
Cc: sashok@cumulusnetworks.com
Cc: stephen@networkplumber.org
Cc: bridge@lists.linux-foundation.org
Cc: lucien.xin@gmail.com
Cc: nikolay@cumulusnetworks.com
Signed-off-by: Ivan Vecera <cera@cera.cz>
---
net/bridge/br_stp_if.c | 11 -----------
1 file changed, 11 deletions(-)
diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
index 08341d2aa9c9..a05027027513 100644
--- a/net/bridge/br_stp_if.c
+++ b/net/bridge/br_stp_if.c
@@ -150,7 +150,6 @@ static int br_stp_call_user(struct net_bridge *br, char *arg)
static void br_stp_start(struct net_bridge *br)
{
- struct net_bridge_port *p;
int err = -ENOENT;
if (net_eq(dev_net(br->dev), &init_net))
@@ -169,11 +168,6 @@ static void br_stp_start(struct net_bridge *br)
if (!err) {
br->stp_enabled = BR_USER_STP;
br_debug(br, "userspace STP started\n");
-
- /* Stop hello and hold timers */
- del_timer(&br->hello_timer);
- list_for_each_entry(p, &br->port_list, list)
- del_timer(&p->hold_timer);
} else {
br->stp_enabled = BR_KERNEL_STP;
br_debug(br, "using kernel STP\n");
@@ -187,7 +181,6 @@ static void br_stp_start(struct net_bridge *br)
static void br_stp_stop(struct net_bridge *br)
{
- struct net_bridge_port *p;
int err;
if (br->stp_enabled == BR_USER_STP) {
@@ -196,10 +189,6 @@ static void br_stp_stop(struct net_bridge *br)
br_err(br, "failed to stop userspace STP (%d)\n", err);
/* To start timers on any ports left in blocking */
- mod_timer(&br->hello_timer, jiffies + br->hello_time);
- list_for_each_entry(p, &br->port_list, list)
- mod_timer(&p->hold_timer,
- round_jiffies(jiffies + BR_HOLD_TIME));
spin_lock_bh(&br->lock);
br_port_state_selection(br);
spin_unlock_bh(&br->lock);
--
2.13.0
^ permalink raw reply related
* Re: [PATCH net-next v2] bridge: fix hello and hold timers starting/stopping
From: Xin Long @ 2017-05-19 17:35 UTC (permalink / raw)
To: Ivan Vecera; +Cc: Nikolay Aleksandrov, network dev, bridge, davem
In-Reply-To: <20170519173043.10201-1-cera@cera.cz>
On Sat, May 20, 2017 at 1:30 AM, Ivan Vecera <cera@cera.cz> wrote:
> Current bridge code incorrectly handles starting/stopping of hello and
> hold timers during STP enable/disable.
>
> 1. Timers are stopped in br_stp_start() during NO_STP->USER_STP
> transition. The timers are already stopped in NO_STP state so
> this is confusing no-op.
>
> 2. During USER_STP->NO_STP transition the timers are started. This
> does not make sense and is confusion because the timer should not be
> active in NO_STP state.
>
> Cc: davem@davemloft.net
> Cc: sashok@cumulusnetworks.com
> Cc: stephen@networkplumber.org
> Cc: bridge@lists.linux-foundation.org
> Cc: lucien.xin@gmail.com
> Cc: nikolay@cumulusnetworks.com
> Signed-off-by: Ivan Vecera <cera@cera.cz>
Reviewed-by: Xin Long <lucien.xin@gmail.com>
> ---
> net/bridge/br_stp_if.c | 11 -----------
> 1 file changed, 11 deletions(-)
>
> diff --git a/net/bridge/br_stp_if.c b/net/bridge/br_stp_if.c
> index 08341d2aa9c9..a05027027513 100644
> --- a/net/bridge/br_stp_if.c
> +++ b/net/bridge/br_stp_if.c
> @@ -150,7 +150,6 @@ static int br_stp_call_user(struct net_bridge *br, char *arg)
>
> static void br_stp_start(struct net_bridge *br)
> {
> - struct net_bridge_port *p;
> int err = -ENOENT;
>
> if (net_eq(dev_net(br->dev), &init_net))
> @@ -169,11 +168,6 @@ static void br_stp_start(struct net_bridge *br)
> if (!err) {
> br->stp_enabled = BR_USER_STP;
> br_debug(br, "userspace STP started\n");
> -
> - /* Stop hello and hold timers */
> - del_timer(&br->hello_timer);
> - list_for_each_entry(p, &br->port_list, list)
> - del_timer(&p->hold_timer);
> } else {
> br->stp_enabled = BR_KERNEL_STP;
> br_debug(br, "using kernel STP\n");
> @@ -187,7 +181,6 @@ static void br_stp_start(struct net_bridge *br)
>
> static void br_stp_stop(struct net_bridge *br)
> {
> - struct net_bridge_port *p;
> int err;
>
> if (br->stp_enabled == BR_USER_STP) {
> @@ -196,10 +189,6 @@ static void br_stp_stop(struct net_bridge *br)
> br_err(br, "failed to stop userspace STP (%d)\n", err);
>
> /* To start timers on any ports left in blocking */
> - mod_timer(&br->hello_timer, jiffies + br->hello_time);
> - list_for_each_entry(p, &br->port_list, list)
> - mod_timer(&p->hold_timer,
> - round_jiffies(jiffies + BR_HOLD_TIME));
> spin_lock_bh(&br->lock);
> br_port_state_selection(br);
> spin_unlock_bh(&br->lock);
> --
> 2.13.0
>
^ permalink raw reply
* Re: [PATCH net] fix BUG: scheduling while atomic in netlink broadcast
From: Cong Wang @ 2017-05-19 17:46 UTC (permalink / raw)
To: Akshay Narayan; +Cc: Linux Kernel Network Developers, David Miller
In-Reply-To: <CAN-7y0oY=LyeLTxhC5L+Vjy-NivmRJSBpqPP9+NYT7a=NFV0_A@mail.gmail.com>
On Fri, May 19, 2017 at 10:22 AM, Akshay Narayan <akshayn@mit.edu> wrote:
> netlink_broadcast_filtered() calls yield() when a slow listener causes
> the buffer to fill. yield() is the wrong choice here, as pointed out by
> Commit 8e3fabfde4 (sched: Update yield() docs); in some cases, its use causes
> "BUG: scheduling while atomic" and, when fewer cores are available,
> kernel hangs:
I don't want to defend the use of yield() but it looks like there is other
problem.
>
> (note: "ccp" is a kernel module which multicasts netlink messages upon
> certain TCP events)
Does this module call netlink_broadcast() with __GFP_DIRECT_RECLAIM
in IRQ context? If so you should adjust the gfp flags.
^ permalink raw reply
* [Patch net] vsock: use new wait API for vsock_stream_sendmsg()
From: Cong Wang @ 2017-05-19 18:21 UTC (permalink / raw)
To: netdev
Cc: Cong Wang, Stefan Hajnoczi, Jorgen Hansen, Michael S. Tsirkin,
Claudio Imbrenda
As reported by Michal, vsock_stream_sendmsg() could still
sleep at vsock_stream_has_space() after prepare_to_wait():
vsock_stream_has_space
vmci_transport_stream_has_space
vmci_qpair_produce_free_space
qp_lock
qp_acquire_queue_mutex
mutex_lock
Just switch to the new wait API like we did for commit
d9dc8b0f8b4e ("net: fix sleeping for sk_wait_event()").
Reported-by: Michal Kubecek <mkubecek@suse.cz>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Jorgen Hansen <jhansen@vmware.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>
---
net/vmw_vsock/af_vsock.c | 21 ++++++++-------------
1 file changed, 8 insertions(+), 13 deletions(-)
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 6f7f675..dfc8c51e 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1540,8 +1540,7 @@ static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
long timeout;
int err;
struct vsock_transport_send_notify_data send_data;
-
- DEFINE_WAIT(wait);
+ DEFINE_WAIT_FUNC(wait, woken_wake_function);
sk = sock->sk;
vsk = vsock_sk(sk);
@@ -1584,11 +1583,10 @@ static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
if (err < 0)
goto out;
-
while (total_written < len) {
ssize_t written;
- prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
+ add_wait_queue(sk_sleep(sk), &wait);
while (vsock_stream_has_space(vsk) == 0 &&
sk->sk_err == 0 &&
!(sk->sk_shutdown & SEND_SHUTDOWN) &&
@@ -1597,33 +1595,30 @@ static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
/* Don't wait for non-blocking sockets. */
if (timeout == 0) {
err = -EAGAIN;
- finish_wait(sk_sleep(sk), &wait);
+ remove_wait_queue(sk_sleep(sk), &wait);
goto out_err;
}
err = transport->notify_send_pre_block(vsk, &send_data);
if (err < 0) {
- finish_wait(sk_sleep(sk), &wait);
+ remove_wait_queue(sk_sleep(sk), &wait);
goto out_err;
}
release_sock(sk);
- timeout = schedule_timeout(timeout);
+ timeout = wait_woken(&wait, TASK_INTERRUPTIBLE, timeout);
lock_sock(sk);
if (signal_pending(current)) {
err = sock_intr_errno(timeout);
- finish_wait(sk_sleep(sk), &wait);
+ remove_wait_queue(sk_sleep(sk), &wait);
goto out_err;
} else if (timeout == 0) {
err = -EAGAIN;
- finish_wait(sk_sleep(sk), &wait);
+ remove_wait_queue(sk_sleep(sk), &wait);
goto out_err;
}
-
- prepare_to_wait(sk_sleep(sk), &wait,
- TASK_INTERRUPTIBLE);
}
- finish_wait(sk_sleep(sk), &wait);
+ remove_wait_queue(sk_sleep(sk), &wait);
/* These checks occur both as part of and after the loop
* conditional since we need to check before and after
--
2.5.5
^ permalink raw reply related
* Re: [PATCH] e1000e: use disable_hardirq() also for MSIX vectors in e1000_netpoll()
From: Cong Wang @ 2017-05-19 18:34 UTC (permalink / raw)
To: Konstantin Khlebnikov
Cc: Linux Kernel Network Developers, intel-wired-lan, Jeff Kirsher,
Dave Jones, David S. Miller, Sabrina Dubroca, Thomas Gleixner
In-Reply-To: <149517832870.37403.16008624383220981105.stgit@buzz>
On Fri, May 19, 2017 at 12:18 AM, Konstantin Khlebnikov
<khlebnikov@yandex-team.ru> wrote:
> Replace disable_irq() which waits for threaded irq handlers with
> disable_hardirq() which waits only for hardirq part.
>
> Signed-off-by: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Fixes: 311191297125 ("e1000: use disable_hardirq() for e1000_netpoll()")
Thomas had a similar patch, I don't know why he never sends it
out formally. Anyway,
Acked-by: Cong Wang <xiyou.wangcong@gmail.com>
^ permalink raw reply
* Re: [PATCH net] sctp: fix ICMP processing if skb is non-linear
From: Xin Long @ 2017-05-19 18:40 UTC (permalink / raw)
To: Davide Caratti, Marcelo Ricardo Leitner
Cc: network dev, linux-sctp, David S. Miller
In-Reply-To: <cb2535a6a40a84684f26d7c7d26788cc3f8a2010.1495207495.git.dcaratti@redhat.com>
On Fri, May 19, 2017 at 11:34 PM, Davide Caratti <dcaratti@redhat.com> wrote:
> when the ICMP packet is carried by a paged skb, sctp_err_lookup() may fail
> validation even if the payload contents match an open socket: as a
> consequence, sometimes ICMPs are wrongly ignored. Use skb_header_pointer()
> to retrieve encapsulated SCTP headers, to ensure that ICMP payloads are
> validated correctly, also when skbs are non-linear.
>
> Signed-off-by: Davide Caratti <dcaratti@redhat.com>
> ---
> include/net/sctp/sctp.h | 2 +-
> net/sctp/input.c | 29 +++++++++++++++++++----------
> net/sctp/ipv6.c | 2 +-
> 3 files changed, 21 insertions(+), 12 deletions(-)
>
> diff --git a/include/net/sctp/sctp.h b/include/net/sctp/sctp.h
> index 069582e..1b8c16b 100644
> --- a/include/net/sctp/sctp.h
> +++ b/include/net/sctp/sctp.h
> @@ -152,7 +152,7 @@ void sctp_v4_err(struct sk_buff *skb, u32 info);
> void sctp_hash_endpoint(struct sctp_endpoint *);
> void sctp_unhash_endpoint(struct sctp_endpoint *);
> struct sock *sctp_err_lookup(struct net *net, int family, struct sk_buff *,
> - struct sctphdr *, struct sctp_association **,
> + struct sctp_association **,
> struct sctp_transport **);
> void sctp_err_finish(struct sock *, struct sctp_transport *);
> void sctp_icmp_frag_needed(struct sock *, struct sctp_association *,
> diff --git a/net/sctp/input.c b/net/sctp/input.c
> index 0e06a27..7f3f983 100644
> --- a/net/sctp/input.c
> +++ b/net/sctp/input.c
> @@ -469,19 +469,19 @@ void sctp_icmp_proto_unreachable(struct sock *sk,
>
> /* Common lookup code for icmp/icmpv6 error handler. */
> struct sock *sctp_err_lookup(struct net *net, int family, struct sk_buff *skb,
> - struct sctphdr *sctphdr,
> struct sctp_association **app,
> struct sctp_transport **tpp)
> {
> + struct sctp_init_chunk _chunkhdr, *chunkhdr;
> + struct sctphdr _sctphdr, *sctphdr;
> union sctp_addr saddr;
> union sctp_addr daddr;
> struct sctp_af *af;
> struct sock *sk = NULL;
> struct sctp_association *asoc;
> struct sctp_transport *transport = NULL;
> - struct sctp_init_chunk *chunkhdr;
> - __u32 vtag = ntohl(sctphdr->vtag);
> - int len = skb->len - ((void *)sctphdr - (void *)skb->data);
> + int offset;
> + __u32 vtag;
>
> *app = NULL; *tpp = NULL;
>
> @@ -515,14 +515,23 @@ struct sock *sctp_err_lookup(struct net *net, int family, struct sk_buff *skb,
> * or the chunk type or the Initiate Tag does not match, silently
> * discard the packet.
> */
> + offset = skb_transport_offset(skb);
> + sctphdr = skb_header_pointer(skb, offset, sizeof(_sctphdr), &_sctphdr);
> + if (unlikely(!sctphdr))
> + goto out;
> +
> + vtag = ntohl(sctphdr->vtag);
> if (vtag == 0) {
> - chunkhdr = (void *)sctphdr + sizeof(struct sctphdr);
> - if (len < sizeof(struct sctphdr) + sizeof(sctp_chunkhdr_t)
> - + sizeof(__be32) ||
> + offset += sizeof(_sctphdr);
will be nice to delete this line, and use
> + /* chunk header + first 4 octects of init header */
> + chunkhdr = skb_header_pointer(skb, offset,
chunkhdr = skb_header_pointer(skb, offset + sizeof(_sctphdr), ;)
wdyt?
> + sizeof(struct sctp_chunkhdr) +
> + sizeof(__be32), &_chunkhdr);
> + if (!chunkhdr ||
> chunkhdr->chunk_hdr.type != SCTP_CID_INIT ||
> - ntohl(chunkhdr->init_hdr.init_tag) != asoc->c.my_vtag) {
> + ntohl(chunkhdr->init_hdr.init_tag) != asoc->c.my_vtag)
> goto out;
> - }
> +
> } else if (vtag != asoc->c.peer_vtag) {
> goto out;
> }
> @@ -585,7 +594,7 @@ void sctp_v4_err(struct sk_buff *skb, __u32 info)
> savesctp = skb->transport_header;
> skb_reset_network_header(skb);
> skb_set_transport_header(skb, ihlen);
> - sk = sctp_err_lookup(net, AF_INET, skb, sctp_hdr(skb), &asoc, &transport);
> + sk = sctp_err_lookup(net, AF_INET, skb, &asoc, &transport);
> /* Put back, the original values. */
> skb->network_header = saveip;
> skb->transport_header = savesctp;
> diff --git a/net/sctp/ipv6.c b/net/sctp/ipv6.c
> index 142b70e..d72c8d5 100644
> --- a/net/sctp/ipv6.c
> +++ b/net/sctp/ipv6.c
> @@ -157,7 +157,7 @@ static void sctp_v6_err(struct sk_buff *skb, struct inet6_skb_parm *opt,
> savesctp = skb->transport_header;
> skb_reset_network_header(skb);
> skb_set_transport_header(skb, offset);
> - sk = sctp_err_lookup(net, AF_INET6, skb, sctp_hdr(skb), &asoc, &transport);
> + sk = sctp_err_lookup(net, AF_INET6, skb, &asoc, &transport);
> /* Put back, the original pointers. */
> skb->network_header = saveip;
> skb->transport_header = savesctp;
> --
> 2.7.4
>
^ permalink raw reply
* [PATCH net] bonding: fix randomly populated arp target array
From: Jarod Wilson @ 2017-05-19 18:46 UTC (permalink / raw)
To: linux-kernel
Cc: Jarod Wilson, Mahesh Bandewar, Jay Vosburgh, Veaceslav Falico,
Andy Gospodarek, netdev, stable
In commit dc9c4d0fe023, the arp_target array moved from a static global
to a local variable. By the nature of static globals, the array used to
be initialized to all 0. At present, it's full of random data, which
that gets interpreted as arp_target values, when none have actually been
specified. Systems end up booting with spew along these lines:
[ 32.161783] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.168475] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.175089] 8021q: adding VLAN 0 to HW filter on device lacp0
[ 32.193091] IPv6: ADDRCONF(NETDEV_UP): lacp0: link is not ready
[ 32.204892] lacp0: Setting MII monitoring interval to 100
[ 32.211071] lacp0: Removing ARP target 216.124.228.17
[ 32.216824] lacp0: Removing ARP target 218.160.255.255
[ 32.222646] lacp0: Removing ARP target 185.170.136.184
[ 32.228496] lacp0: invalid ARP target 255.255.255.255 specified for removal
[ 32.236294] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
[ 32.243987] lacp0: Removing ARP target 56.125.228.17
[ 32.249625] lacp0: Removing ARP target 218.160.255.255
[ 32.255432] lacp0: Removing ARP target 15.157.233.184
[ 32.261165] lacp0: invalid ARP target 255.255.255.255 specified for removal
[ 32.268939] lacp0: option arp_ip_target: invalid value (-255.255.255.255)
[ 32.276632] lacp0: Removing ARP target 16.0.0.0
[ 32.281755] lacp0: Removing ARP target 218.160.255.255
[ 32.287567] lacp0: Removing ARP target 72.125.228.17
[ 32.293165] lacp0: Removing ARP target 218.160.255.255
[ 32.298970] lacp0: Removing ARP target 8.125.228.17
[ 32.304458] lacp0: Removing ARP target 218.160.255.255
None of these were actually specified as ARP targets, and the driver does
seem to clean up the mess okay, but it's rather noisy and confusing, leaks
values to userspace, and the 255.255.255.255 spew shows up even when debug
prints are disabled.
The fix: just zero out arp_target at init time.
While we're in here, init arp_all_targets_value in the right place.
Fixes: dc9c4d0fe023 ("bonding: reduce scope of some global variables")
CC: Mahesh Bandewar <maheshb@google.com>
CC: Jay Vosburgh <j.vosburgh@gmail.com>
CC: Veaceslav Falico <vfalico@gmail.com>
CC: Andy Gospodarek <andy@greyhouse.net>
CC: netdev@vger.kernel.org
CC: stable@vger.kernel.org
Signed-off-by: Jarod Wilson <jarod@redhat.com>
---
drivers/net/bonding/bond_main.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c
index 2be78807fd6e..73313318399c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -4271,10 +4271,10 @@ static int bond_check_params(struct bond_params *params)
int arp_validate_value, fail_over_mac_value, primary_reselect_value, i;
struct bond_opt_value newval;
const struct bond_opt_value *valptr;
- int arp_all_targets_value;
+ int arp_all_targets_value = 0;
u16 ad_actor_sys_prio = 0;
u16 ad_user_port_key = 0;
- __be32 arp_target[BOND_MAX_ARP_TARGETS];
+ __be32 arp_target[BOND_MAX_ARP_TARGETS] = { 0 };
int arp_ip_count;
int bond_mode = BOND_MODE_ROUNDROBIN;
int xmit_hashtype = BOND_XMIT_POLICY_LAYER2;
@@ -4501,7 +4501,6 @@ static int bond_check_params(struct bond_params *params)
arp_validate_value = 0;
}
- arp_all_targets_value = 0;
if (arp_all_targets) {
bond_opt_initstr(&newval, arp_all_targets);
valptr = bond_opt_parse(bond_opt_get(BOND_OPT_ARP_ALL_TARGETS),
--
2.12.1
^ permalink raw reply related
* Re: [PATCH net] fix BUG: scheduling while atomic in netlink broadcast
From: Akshay Narayan @ 2017-05-19 18:47 UTC (permalink / raw)
To: Cong Wang; +Cc: Linux Kernel Network Developers, David Miller
In-Reply-To: <CAM_iQpXraspjEvh9Wy5jOscFz7zKS4Z3XsigxCJFJ5_RuCOnyw@mail.gmail.com>
> I don't want to defend the use of yield() but it looks like there is other
> problem.
I believe this use of yield() should be replaced with cond_resched()
even if it turns out there is an unrelated problem.
> Does this module call netlink_broadcast() with __GFP_DIRECT_RECLAIM
> in IRQ context? If so you should adjust the gfp flags.
The module only calls netlink_broadcast() from a pluggable TCP
function; from my understanding this is not in the IRQ context. Full
trace, perhaps more clear, attached below.
May 19 14:30:44 ccp kernel: [ 178.885546] BUG: scheduling while
atomic: mm-link/3105/0x00000200
May 19 14:30:44 ccp kernel: [ 178.885552] Modules linked in:
ipt_MASQUERADE nf_nat_masquerade_ipv4 iptable_nat nf_conntrack_ipv4
nf_defrag_ipv4 nf_nat_ipv4 nf_nat libcrc32c xt_connmark nf_conntrack
ccp(OE) crct10dif_pclmul crc32_pclmul
ghash_clmulni_intel snd_intel8x0 pcbc snd_ac97_codec joydev ac97_bus
snd_pcm snd_seq_midi snd_seq_midi_event snd_rawmidi aesni_intel
snd_seq aes_x86_64 crypto_simd snd_seq_device snd_timer snd input_leds
i2c_piix4 glue_helper cryptd so
undcore mac_hid serio_raw vboxvideo ttm drm_kms_helper drm fb_sys_fops
syscopyarea sysfillrect sysimgblt vboxguest intel_rapl_perf parport_pc
ppdev lp parport ip_tables x_tables autofs4 hid_generic usbhid hid
e1000 ahci libahci psmouse
fjes pata_acpi video
May 19 14:30:44 ccp kernel: [ 178.885665] CPU: 0 PID: 3105 Comm:
mm-link Tainted: G W OE 4.10.0-21-generic #23-Ubuntu
May 19 14:30:44 ccp kernel: [ 178.885666] Hardware name: innotek GmbH
VirtualBox/VirtualBox, BIOS VirtualBox 12/01/2006
May 19 14:30:44 ccp kernel: [ 178.885667] Call Trace:
May 19 14:30:44 ccp kernel: [ 178.885674] dump_stack+0x63/0x81
May 19 14:30:44 ccp kernel: [ 178.885678] __schedule_bug+0x54/0x70
May 19 14:30:44 ccp kernel: [ 178.885682] __schedule+0x536/0x6f0
May 19 14:30:44 ccp kernel: [ 178.885685] schedule+0x36/0x80
May 19 14:30:44 ccp kernel: [ 178.885687] sys_sched_yield+0x4f/0x60
May 19 14:30:44 ccp kernel: [ 178.885688] yield+0x33/0x40
May 19 14:30:44 ccp kernel: [ 178.885691]
netlink_broadcast_filtered+0x29b/0x3c0
May 19 14:30:44 ccp kernel: [ 178.885692] netlink_broadcast+0x1d/0x20
May 19 14:30:44 ccp kernel: [ 178.885697] nl_sendmsg+0xb8/0x664 [ccp]
May 19 14:30:44 ccp kernel: [ 178.885699] nl_send_ack_notif+0x7d/0x90 [ccp]
May 19 14:30:44 ccp kernel: [ 178.885702] tcp_ccp_cong_avoid+0x69/0x70 [ccp]
May 19 14:30:44 ccp kernel: [ 178.885704] tcp_ack+0x980/0xa60
May 19 14:30:44 ccp kernel: [ 178.885708] tcp_rcv_state_process+0x2be/0xda0
May 19 14:30:44 ccp kernel: [ 178.885712] ? security_sock_rcv_skb+0x3b/0x50
May 19 14:30:44 ccp kernel: [ 178.885715] ? sk_filter_trim_cap+0x3b/0x270
May 19 14:30:44 ccp kernel: [ 178.885717] tcp_v4_do_rcv+0xb2/0x200
May 19 14:30:44 ccp kernel: [ 178.885719] tcp_v4_rcv+0x90a/0xa00
May 19 14:30:44 ccp kernel: [ 178.885722] ip_local_deliver_finish+0x96/0x1c0
May 19 14:30:44 ccp kernel: [ 178.885725] ip_local_deliver+0x6f/0xe0
May 19 14:30:44 ccp kernel: [ 178.885727] ? ip_rcv_finish+0x3f0/0x3f0
May 19 14:30:44 ccp kernel: [ 178.885730] ip_rcv_finish+0x118/0x3f0
May 19 14:30:44 ccp kernel: [ 178.885732] ip_rcv+0x282/0x390
May 19 14:30:44 ccp kernel: [ 178.885735] ? inet_del_offload+0x40/0x40
May 19 14:30:44 ccp kernel: [ 178.885737] __netif_receive_skb_core+0x514/0xa40
May 19 14:30:44 ccp kernel: [ 178.885740] ? __check_object_size+0x10/0x1d7
May 19 14:30:44 ccp kernel: [ 178.885742] __netif_receive_skb+0x18/0x60
May 19 14:30:44 ccp kernel: [ 178.885744] netif_receive_skb_internal+0x32/0xa0
May 19 14:30:44 ccp kernel: [ 178.885746] netif_receive_skb+0x1c/0x70
May 19 14:30:44 ccp kernel: [ 178.885749] tun_get_user+0x425/0x800
May 19 14:30:44 ccp kernel: [ 178.885751] tun_chr_write_iter+0x57/0x70
May 19 14:30:44 ccp kernel: [ 178.885752] new_sync_write+0xd5/0x130
May 19 14:30:44 ccp kernel: [ 178.885754] __vfs_write+0x26/0x40
May 19 14:30:44 ccp kernel: [ 178.885756] vfs_write+0xb5/0x1a0
May 19 14:30:44 ccp kernel: [ 178.885757] SyS_write+0x55/0xc0
May 19 14:30:44 ccp kernel: [ 178.885760] entry_SYSCALL_64_fastpath+0x1e/0xad
May 19 14:30:44 ccp kernel: [ 178.885762] RIP: 0033:0x7f8d9abbf670
May 19 14:30:44 ccp kernel: [ 178.885763] RSP: 002b:00007ffc2f16d8b8
EFLAGS: 00000246 ORIG_RAX: 0000000000000001
May 19 14:30:44 ccp kernel: [ 178.885765] RAX: ffffffffffffffda RBX:
00007ffc2f16dde0 RCX: 00007f8d9abbf670
May 19 14:30:44 ccp kernel: [ 178.885767] RDX: 0000000000000038 RSI:
0000557403fd86e0 RDI: 0000000000000008
May 19 14:30:44 ccp kernel: [ 178.885768] RBP: 00007ffc2f16dcd0 R08:
0000557403fd8d40 R09: 0000557403fd8690
May 19 14:30:44 ccp kernel: [ 178.885769] R10: 000e20d41e89a5c5 R11:
0000000000000246 R12: 0000000000000108
May 19 14:30:44 ccp kernel: [ 178.885770] R13: 000000000000145c R14:
00007ffc2f16dd98 R15: 0000000000000003
^ permalink raw reply
page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox