Netdev List


* [PATCH net-next] flow_dissector: do not rely on implicit casts
From: Paolo Abeni @ 2018-05-07 10:06 UTC (permalink / raw)
  To: netdev; +Cc: David S. Miller

This change fixes a couple of type mismatches reported by the sparse
tool, explicitly using the requested type for the offending arguments.

Signed-off-by: Paolo Abeni <pabeni@redhat.com>
---
 include/net/tipc.h        | 4 ++--
 net/core/flow_dissector.c | 2 +-
 2 files changed, 3 insertions(+), 3 deletions(-)

diff --git a/include/net/tipc.h b/include/net/tipc.h
index 07670ec022a7..f0e7e6bc1bef 100644
--- a/include/net/tipc.h
+++ b/include/net/tipc.h
@@ -44,11 +44,11 @@ struct tipc_basic_hdr {
 	__be32 w[4];
 };
 
-static inline u32 tipc_hdr_rps_key(struct tipc_basic_hdr *hdr)
+static inline __be32 tipc_hdr_rps_key(struct tipc_basic_hdr *hdr)
 {
 	u32 w0 = ntohl(hdr->w[0]);
 	bool keepalive_msg = (w0 & KEEPALIVE_MSG_MASK) == KEEPALIVE_MSG_MASK;
-	int key;
+	__be32 key;
 
 	/* Return source node identity as key */
 	if (likely(!keepalive_msg))
diff --git a/net/core/flow_dissector.c b/net/core/flow_dissector.c
index 030d4ca177fb..4fc1e84d77ec 100644
--- a/net/core/flow_dissector.c
+++ b/net/core/flow_dissector.c
@@ -1316,7 +1316,7 @@ u32 skb_get_poff(const struct sk_buff *skb)
 {
 	struct flow_keys_basic keys;
 
-	if (!skb_flow_dissect_flow_keys_basic(skb, &keys, 0, 0, 0, 0, 0))
+	if (!skb_flow_dissect_flow_keys_basic(skb, &keys, NULL, 0, 0, 0, 0))
 		return 0;
 
 	return __skb_get_poff(skb, skb->data, &keys, skb_headlen(skb));
-- 
2.14.3
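
For readers unfamiliar with this class of warning: sparse treats __be32
as a distinct "bitwise" type, so a big-endian value held in a plain int
or u32 is flagged. The userspace sketch below only illustrates the
convention; be32_t, struct basic_hdr, hdr_key() and key_to_host() are
hypothetical names, not the TIPC code touched by the patch.

```c
#include <arpa/inet.h>	/* htonl(), ntohl() */
#include <stdint.h>

/* Hypothetical stand-in for the kernel's __be32: a value kept in
 * network (big-endian) byte order. In userspace this is only a naming
 * convention; in the kernel, sparse enforces the distinction.
 */
typedef uint32_t be32_t;

struct basic_hdr {
	be32_t w[4];	/* header words, stored as received from the wire */
};

/* Return one header word unchanged. The value is still big-endian, so
 * the return type says so instead of hiding it behind a host-order type.
 */
static inline be32_t hdr_key(const struct basic_hdr *hdr)
{
	return hdr->w[3];
}

/* Convert explicitly at the point where a host-order value is needed. */
static inline uint32_t key_to_host(be32_t key)
{
	return ntohl(key);
}
```

Keeping the byte order visible in the function signature is what lets
a checker spot an accidental mix of wire-order and host-order values.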

^ permalink raw reply related

* Re: [PATCH v2 net-next] net: stmmac: Add support for U32 TC filter using Flexible RX Parser
From: Jose Abreu @ 2018-05-07 10:02 UTC (permalink / raw)
  To: Jakub Kicinski, Jose Abreu
  Cc: netdev, David S. Miller, Joao Pinto, Vitor Soares,
	Giuseppe Cavallaro, Alexandre Torgue
In-Reply-To: <20180504183338.586f2903@cakuba.netronome.com>

Hi Jakub, David,

On 05-05-2018 02:33, Jakub Kicinski wrote:
> On Fri,  4 May 2018 10:01:38 +0100, Jose Abreu wrote:
>> This adds support for U32 filter by using an HW only feature called
>> Flexible RX Parser. This allows us to match any given packet field with a
>> pattern and accept/reject or even route the packet to a specific DMA
>> channel.
>>
>> Right now we only support acceptance or rejection of frames and we only
>> support simple rules. Though, the Parser has the flexibility of jumping to
>> specific rules as an if condition so complex rules can be established.
>>
>> This is only supported in GMAC5.10+.
>>
>> The following commands can be used to test this code:
>>
>> 	1) Set up an ingress qdisc:
>> 	# tc qdisc add dev eth0 handle ffff: ingress
>>
>> 	2) Set up a filter (e.g. filter by IP):
>> 	# tc filter add dev eth0 parent ffff: protocol ip u32 match ip \
>> 		src 192.168.0.3 skip_sw action drop
>>
>> In every test performed we always used the "skip_sw" flag to make sure
>> only the RX Parser was involved.
>>
>> Signed-off-by: Jose Abreu <joabreu@synopsys.com>
>> Cc: David S. Miller <davem@davemloft.net>
>> Cc: Joao Pinto <jpinto@synopsys.com>
>> Cc: Vitor Soares <soares@synopsys.com>
>> Cc: Giuseppe Cavallaro <peppe.cavallaro@st.com>
>> Cc: Alexandre Torgue <alexandre.torgue@st.com>
>> Cc: Jakub Kicinski <kubakici@wp.pl>
>> ---
>> Changes from v1:
>> 	- Follow Linux network coding style (David)
>> 	- Use tc_cls_can_offload_and_chain0() (Jakub)
> Thanks!
>
>> @@ -4223,6 +4277,11 @@ int stmmac_dvr_probe(struct device *device,
>>  	ndev->hw_features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_IPV6_CSUM |
>>  			    NETIF_F_RXCSUM;
>>  
>> +	ret = stmmac_tc_init(priv, priv);
>> +	if (!ret) {
>> +		ndev->hw_features |= NETIF_F_HW_TC;
>> +	}
>> +
>>  	if ((priv->plat->tso_en) && (priv->dma_cap.tsoen)) {
>>  		ndev->hw_features |= NETIF_F_TSO | NETIF_F_TSO6;
>>  		priv->tso = true;
> One more comment, but perhaps not a showstopper, it's considered good
> practice to disallow clearing/disabling this flag while filters are
> installed.  Driver should return -EBUSY from .ndo_set_features if TC
> rules are offloaded and user wants to disable HW_TC feature flag.

I can do that, but I saw in Patchwork that the patch was already
marked as accepted; David sent me no confirmation, though.

David,
Shall I respin or send a follow up patch?

Thanks and Best Regards,
Jose Miguel Abreu
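
A minimal sketch of the check Jakub describes, written as plain
userspace C so it can be compiled on its own. FEAT_HW_TC, struct
fake_priv and fake_set_features() are hypothetical stand-ins for
NETIF_F_HW_TC, the driver private data and the .ndo_set_features
callback; this is not the stmmac implementation.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stand-in for the NETIF_F_HW_TC feature bit. */
#define FEAT_HW_TC	(1u << 0)

struct fake_priv {
	uint32_t features;		/* currently enabled feature bits */
	unsigned int offloaded_rules;	/* TC rules installed in hardware */
};

/* Refuse to clear FEAT_HW_TC while offloaded rules exist; otherwise
 * accept the new feature set. Returns 0 or a negative errno.
 */
static int fake_set_features(struct fake_priv *priv, uint32_t wanted)
{
	uint32_t changed = priv->features ^ wanted;

	if ((changed & FEAT_HW_TC) && !(wanted & FEAT_HW_TC) &&
	    priv->offloaded_rules > 0)
		return -EBUSY;

	priv->features = wanted;
	return 0;
}
```

With one rule installed, clearing the flag fails with -EBUSY and the
feature set is left untouched; once the rule count drops to zero the
same request succeeds.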

^ permalink raw reply

* Re: WARNING in kernfs_add_one
From: Johannes Berg @ 2018-05-07  9:53 UTC (permalink / raw)
  To: Dmitry Vyukov
  Cc: Greg KH, linux-wireless, Eric Dumazet, netdev, syzbot, LKML,
	syzkaller-bugs, Tejun Heo
In-Reply-To: <CACT4Y+bWW8w7RJ1roUQ8dDk5Fv0syDsKtMH1wRnkkN4nWOootQ@mail.gmail.com>

On Mon, 2018-05-07 at 11:33 +0200, Dmitry Vyukov wrote:
> On Mon, May 7, 2018 at 10:43 AM, Johannes Berg
> <johannes@sipsolutions.net> wrote:
> > On Sat, 2018-05-05 at 15:07 -0700, Greg KH wrote:
> > 
> > > > > > syzbot found the following crash on:
> > 
> > Maybe it should learn to differentiate warnings, if it's going to set
> > panic_on_warn :-)
> 
> How?
> Note that this is not specific to syzbot. If you see WARNINGs in a
> subsystem that you have no idea about (or you are just a normal user),
> what do you do? Right, you report it to maintainers.

Yeah, no problem with that. Just some people seem to get so much more
upset about crashes ... but then again I get bug reports about WARN_ON
all the time anyway that say "my kernel crashed" so I guess it doesn't
really matter :-)

> > I get why, but still, at least differentiating in the emails wouldn't be
> > bad.
> 
> Well, the subject says "WARNING".
> But note there are _very_ bad WARNINGs too. Generally, a WARNING means
> a kernel bug, just one that the kernel can tolerate without bringing the
> system down (as opposed to BUG).

Yeah, fair point. I sort of missed the subject I guess.

johannes

^ permalink raw reply

* [PATCH net-next] net:sched: add gkprio scheduler
From: Nishanth Devarajan @ 2018-05-07  9:36 UTC (permalink / raw)
  To: xiyou.wangcong, jiri, jhs, davem; +Cc: netdev, doucette, michel

net/sched: add gkprio scheduler

Gkprio (Gatekeeper Priority Queue) is a queueing discipline that prioritizes
IPv4 and IPv6 packets according to their DSCP field. Although Gkprio can be
employed in any QoS scenario in which a higher DSCP field means a higher
priority packet, Gkprio was conceived as a solution for denial-of-service
defenses that need to route packets with different priorities.

Signed-off-by: Nishanth Devarajan <ndev2021@gmail.com>
Reviewed-by: Cody Doucette <doucette@bu.edu>
Reviewed-by: Michel Machado <michel@digirati.com.br>
Reviewed-by: Sachin Paryani <sachin.paryani@gmail.com>
---
 include/uapi/linux/pkt_sched.h |  11 ++
 net/sched/Kconfig              |  13 ++
 net/sched/Makefile             |   1 +
 net/sched/sch_gkprio.c         | 316 +++++++++++++++++++++++++++++++++++++++++
 4 files changed, 341 insertions(+)
 create mode 100644 net/sched/sch_gkprio.c

diff --git a/include/uapi/linux/pkt_sched.h b/include/uapi/linux/pkt_sched.h
index 37b5096..de8b5ca 100644
--- a/include/uapi/linux/pkt_sched.h
+++ b/include/uapi/linux/pkt_sched.h
@@ -124,6 +124,17 @@ struct tc_fifo_qopt {
 	__u32	limit;	/* Queue length: bytes for bfifo, packets for pfifo */
 };
 
+/* GKPRIO section */
+
+struct tc_gkprio_qopt {
+	__u32	limit; 	    	/* Queue length in packets. */
+	__u16	noip_dfltp; 	/* Default priority for non-IP packets. */
+
+	/* Stats. */
+	__u16 highest_prio; 	/* Highest priority currently in queue.  */
+	__u16 lowest_prio;  	/* Lowest priority currently in queue. */
+};
+
 /* PRIO section */
 
 #define TCQ_PRIO_BANDS	16
diff --git a/net/sched/Kconfig b/net/sched/Kconfig
index a01169f..9c47857 100644
--- a/net/sched/Kconfig
+++ b/net/sched/Kconfig
@@ -240,6 +240,19 @@ config NET_SCH_MQPRIO
 
 	  If unsure, say N.
 
+config NET_SCH_GKPRIO
+	tristate "Gatekeeper priority queue scheduler (GKPRIO)"
+	help
+	  Say Y here if you want to use the Gatekeeper priority queue
+	  scheduler. This schedules packets according to priorities based on
+	  the DSCP (IPv4) and DS (IPv6) fields, which is useful for request
+	  packets in DoS mitigation systems such as Gatekeeper.
+
+	  To compile this driver as a module, choose M here: the module will
+	  be called sch_gkprio.
+
+	  If unsure, say N.
+
 config NET_SCH_CHOKE
 	tristate "CHOose and Keep responsive flow scheduler (CHOKE)"
 	help
diff --git a/net/sched/Makefile b/net/sched/Makefile
index 8811d38..93a1fdb 100644
--- a/net/sched/Makefile
+++ b/net/sched/Makefile
@@ -46,6 +46,7 @@ obj-$(CONFIG_NET_SCH_NETEM)	+= sch_netem.o
 obj-$(CONFIG_NET_SCH_DRR)	+= sch_drr.o
 obj-$(CONFIG_NET_SCH_PLUG)	+= sch_plug.o
 obj-$(CONFIG_NET_SCH_MQPRIO)	+= sch_mqprio.o
+obj-$(CONFIG_NET_SCH_GKPRIO)	+= sch_gkprio.o
 obj-$(CONFIG_NET_SCH_CHOKE)	+= sch_choke.o
 obj-$(CONFIG_NET_SCH_QFQ)	+= sch_qfq.o
 obj-$(CONFIG_NET_SCH_CODEL)	+= sch_codel.o
diff --git a/net/sched/sch_gkprio.c b/net/sched/sch_gkprio.c
new file mode 100644
index 0000000..ad1227c
--- /dev/null
+++ b/net/sched/sch_gkprio.c
@@ -0,0 +1,316 @@
+/*
+ * net/sched/sch_gkprio.c  Gatekeeper Priority Queue.
+ *
+ *		This program is free software; you can redistribute it and/or
+ *		modify it under the terms of the GNU General Public License
+ *		as published by the Free Software Foundation; either version
+ *		2 of the License, or (at your option) any later version.
+ *
+ * Authors:	Nishanth Devarajan, <ndev_2021@gmail.com>
+ *	        original idea by Michel Machado, Cody Doucette, and Qiaobin Fu
+ */
+
+#include <linux/string.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/types.h>
+#include <linux/kernel.h>
+#include <linux/errno.h>
+#include <linux/skbuff.h>
+#include <net/pkt_sched.h>
+#include <net/sch_generic.h>
+#include <net/inet_ecn.h>
+
+/* Packets are assigned priorities [0, 63] due to the IP DSCP field limits. */
+#define GKPRIO_MAX_PRIORITY 64
+
+/*	  Gatekeeper Priority Queue
+ *	=================================
+ *
+ * This qdisc schedules a packet according to the value (0-63) of its DSCP
+ * (IPv4) or DS (IPv6) field, where a higher value places the packet closer
+ * to the exit of the queue. Non-IP packets are assigned a default priority
+ * specified to GKPRIO; if none is specified, default priority is set
+ * to 0. When the queue is full, the lowest priority packet in the queue is
+ * dropped to make room for the packet to be added if it has higher priority.
+ * If the packet to be added has lower priority than all packets in the queue,
+ * it is dropped.
+ *
+ * Without the Gatekeeper priority queue, queue length limits must be imposed
+ * for individual queues, and there is no easy way to enforce a global queue
+ * length limit across all priorities. With the Gatekeeper queue, a global
+ * queue length limit can be enforced while not restricting the queue lengths
+ * of individual priorities.
+ *
+ * This is especially useful for a denial-of-service defense system like
+ * Gatekeeper, which prioritizes packets in flows that demonstrate expected
+ * behavior of legitimate users. The queue is flexible to allow any number
+ * of packets of any priority up to the global limit of the scheduler
+ * without risking resource overconsumption by a flood of low priority packets.
+ *
+ * The Gatekeeper standalone codebase is found here:
+ *
+ *		https://github.com/AltraMayor/gatekeeper
+ */
+
+struct gkprio_sched_data {
+	/* Parameters. */
+	u32 max_limit;
+	u16 noip_dfltp;
+
+	/* Queue state. */
+	struct sk_buff_head qdiscs[GKPRIO_MAX_PRIORITY];
+	u16 highest_prio;
+	u16 lowest_prio;
+};
+
+static u16 calc_new_high_prio(const struct gkprio_sched_data *q)
+{
+	int prio;
+
+	for (prio = q->highest_prio - 1; prio >= q->lowest_prio; prio--) {
+		if (!skb_queue_empty(&q->qdiscs[prio]))
+			return prio;
+	}
+
+	/* GK queue is empty, return 0 (default highest priority setting). */
+	return 0;
+}
+
+static u16 calc_new_low_prio(const struct gkprio_sched_data *q)
+{
+	int prio;
+
+	for (prio = q->lowest_prio + 1; prio <= q->highest_prio; prio++) {
+		if (!skb_queue_empty(&q->qdiscs[prio]))
+			return prio;
+	}
+
+	/* GK queue is empty, return GKPRIO_MAX_PRIORITY - 1
+	 * (default lowest priority setting).
+	 */
+	return GKPRIO_MAX_PRIORITY - 1;
+}
+
+static int gkprio_enqueue(struct sk_buff *skb, struct Qdisc *sch,
+			  struct sk_buff **to_free)
+{
+	struct gkprio_sched_data *q = qdisc_priv(sch);
+	struct sk_buff_head *qdisc;
+	struct sk_buff_head *lp_qdisc;
+	struct sk_buff *to_drop;
+	int wlen;
+	u16 prio, lp;
+
+	/* Obtain the priority of @skb. */
+	wlen = skb_network_offset(skb);
+	switch (tc_skb_protocol(skb)) {
+	case htons(ETH_P_IP):
+		wlen += sizeof(struct iphdr);
+		if (!pskb_may_pull(skb, wlen))
+			goto drop;
+		prio = ipv4_get_dsfield(ip_hdr(skb)) >> 2;
+		break;
+
+	case htons(ETH_P_IPV6):
+		wlen += sizeof(struct ipv6hdr);
+		if (!pskb_may_pull(skb, wlen))
+			goto drop;
+		prio = ipv6_get_dsfield(ipv6_hdr(skb)) >> 2;
+		break;
+
+	default:
+		prio = q->noip_dfltp;
+		break;
+	}
+
+	qdisc = &q->qdiscs[prio];
+
+	if (sch->q.qlen < q->max_limit) {
+		__skb_queue_tail(qdisc, skb);
+		qdisc_qstats_backlog_inc(sch, skb);
+
+		/* Check to update highest and lowest priorities. */
+		if (prio > q->highest_prio)
+			q->highest_prio = prio;
+
+		if (prio < q->lowest_prio)
+			q->lowest_prio = prio;
+
+		sch->q.qlen++;
+		return NET_XMIT_SUCCESS;
+	}
+
+	/* If this packet has the lowest priority, drop it. */
+	lp = q->lowest_prio;
+	if (prio <= lp)
+		return qdisc_drop(skb, sch, to_free);
+
+	/* Drop the packet at the tail of the lowest priority qdisc. */
+	lp_qdisc = &q->qdiscs[lp];
+	to_drop = __skb_dequeue_tail(lp_qdisc);
+	BUG_ON(!to_drop);
+	qdisc_qstats_backlog_dec(sch, to_drop);
+	qdisc_drop(to_drop, sch, to_free);
+
+	__skb_queue_tail(qdisc, skb);
+	qdisc_qstats_backlog_inc(sch, skb);
+
+	/* Check to update highest and lowest priorities. */
+	if (skb_queue_empty(lp_qdisc)) {
+		if (q->lowest_prio == q->highest_prio) {
+			BUG_ON(sch->q.qlen != 1);
+			q->lowest_prio = prio;
+			q->highest_prio = prio;
+		} else {
+			q->lowest_prio = calc_new_low_prio(q);
+		}
+	}
+
+	if (prio > q->highest_prio)
+		q->highest_prio = prio;
+
+	return NET_XMIT_SUCCESS;
+drop:
+	qdisc_drop(skb, sch, to_free);
+	return NET_XMIT_SUCCESS | __NET_XMIT_BYPASS;
+}
+
+static struct sk_buff *gkprio_dequeue(struct Qdisc *sch)
+{
+	struct gkprio_sched_data *q = qdisc_priv(sch);
+	struct sk_buff_head *hpq = &q->qdiscs[q->highest_prio];
+	struct sk_buff *skb = __skb_dequeue(hpq);
+
+	if (unlikely(!skb))
+		return NULL;
+
+	sch->q.qlen--;
+	qdisc_qstats_backlog_dec(sch, skb);
+	qdisc_bstats_update(sch, skb);
+
+	/* Update highest priority field. */
+	if (skb_queue_empty(hpq)) {
+		if (q->lowest_prio == q->highest_prio) {
+			BUG_ON(sch->q.qlen);
+			q->highest_prio = 0;
+			q->lowest_prio = GKPRIO_MAX_PRIORITY - 1;
+		} else {
+			q->highest_prio = calc_new_high_prio(q);
+		}
+	}
+	return skb;
+}
+
+static int gkprio_change(struct Qdisc *sch, struct nlattr *opt,
+			struct netlink_ext_ack *extack)
+{
+	struct gkprio_sched_data *q = qdisc_priv(sch);
+	struct tc_gkprio_qopt *ctl = nla_data(opt);
+	unsigned int min_limit = 1;
+
+	if (ctl->limit == (typeof(ctl->limit))-1)
+		q->max_limit = max(qdisc_dev(sch)->tx_queue_len, min_limit);
+	else if (ctl->limit < 1 || ctl->limit > qdisc_dev(sch)->tx_queue_len)
+		return -EINVAL;
+	else
+		q->max_limit = ctl->limit;
+
+	if (ctl->noip_dfltp == (typeof(ctl->noip_dfltp))-1)
+		q->noip_dfltp = 0;
+	else if (ctl->noip_dfltp >= GKPRIO_MAX_PRIORITY)
+		return -EINVAL;
+	else
+		q->noip_dfltp = ctl->noip_dfltp;
+
+	return 0;
+}
+
+static int gkprio_init(struct Qdisc *sch, struct nlattr *opt,
+			struct netlink_ext_ack *extack)
+{
+	struct gkprio_sched_data *q = qdisc_priv(sch);
+	int prio;
+	unsigned int min_limit = 1;
+
+	/* Initialise all queues, one for each possible priority. */
+	for (prio = 0; prio < GKPRIO_MAX_PRIORITY; prio++)
+		__skb_queue_head_init(&q->qdiscs[prio]);
+
+	q->highest_prio = 0;
+	q->lowest_prio = GKPRIO_MAX_PRIORITY - 1;
+	if (!opt) {
+		q->max_limit = max(qdisc_dev(sch)->tx_queue_len, min_limit);
+		q->noip_dfltp = 0;
+		return 0;
+	}
+	return gkprio_change(sch, opt, extack);
+}
+
+static int gkprio_dump(struct Qdisc *sch, struct sk_buff *skb)
+{
+	struct gkprio_sched_data *q = qdisc_priv(sch);
+	struct tc_gkprio_qopt opt;
+
+	opt.limit = q->max_limit;
+	opt.noip_dfltp = q->noip_dfltp;
+	opt.highest_prio = q->highest_prio;
+	opt.lowest_prio = q->lowest_prio;
+
+	if (nla_put(skb, TCA_OPTIONS, sizeof(opt), &opt))
+		return -1;
+
+	return skb->len;
+}
+
+static void gkprio_reset(struct Qdisc *sch)
+{
+	struct gkprio_sched_data *q = qdisc_priv(sch);
+	int prio;
+
+	sch->qstats.backlog = 0;
+	sch->q.qlen = 0;
+
+	for (prio = 0; prio < GKPRIO_MAX_PRIORITY; prio++)
+		__skb_queue_purge(&q->qdiscs[prio]);
+	q->highest_prio = 0;
+	q->lowest_prio = GKPRIO_MAX_PRIORITY - 1;
+}
+
+static void gkprio_destroy(struct Qdisc *sch)
+{
+	struct gkprio_sched_data *q = qdisc_priv(sch);
+	int prio;
+
+	for (prio = 0; prio < GKPRIO_MAX_PRIORITY; prio++)
+		__skb_queue_purge(&q->qdiscs[prio]);
+}
+
+struct Qdisc_ops gkprio_qdisc_ops __read_mostly = {
+	.id		=	"gkprio",
+	.priv_size	=	sizeof(struct gkprio_sched_data),
+	.enqueue	=	gkprio_enqueue,
+	.dequeue	=	gkprio_dequeue,
+	.peek		=	qdisc_peek_dequeued,
+	.init		=	gkprio_init,
+	.reset		=	gkprio_reset,
+	.change		=	gkprio_change,
+	.dump		=	gkprio_dump,
+	.destroy	=	gkprio_destroy,
+	.owner		=	THIS_MODULE,
+};
+
+static int __init gkprio_module_init(void)
+{
+	return register_qdisc(&gkprio_qdisc_ops);
+}
+
+static void __exit gkprio_module_exit(void)
+{
+	unregister_qdisc(&gkprio_qdisc_ops);
+}
+
+module_init(gkprio_module_init)
+module_exit(gkprio_module_exit)
+
+MODULE_LICENSE("GPL");
-- 
1.9.1
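
The enqueue and dequeue policy described in the patch can be modelled
in a few lines of userspace C. This is a simplified sketch under stated
assumptions: it keeps per-priority packet counters instead of skb lists
and rescans for the lowest and highest priority on every call, whereas
the patch caches both bounds. struct prio_model and every function name
here are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define MAX_PRIORITY 64	/* DSCP is 6 bits wide: priorities 0..63 */

/* Hypothetical model: per-priority counters instead of skb lists. */
struct prio_model {
	unsigned int limit;	/* global limit, in packets */
	unsigned int qlen;	/* packets currently queued */
	unsigned int count[MAX_PRIORITY];
};

/* The DSCP is the upper six bits of the 8-bit DS field (the lower two
 * bits carry ECN), hence the shift by two.
 */
static unsigned int dsfield_to_prio(uint8_t dsfield)
{
	return dsfield >> 2;
}

/* Lowest non-empty priority, or -1 if nothing is queued. */
static int lowest_prio(const struct prio_model *m)
{
	for (int p = 0; p < MAX_PRIORITY; p++)
		if (m->count[p])
			return p;
	return -1;
}

/* Highest non-empty priority, or -1 if nothing is queued. */
static int highest_prio(const struct prio_model *m)
{
	for (int p = MAX_PRIORITY - 1; p >= 0; p--)
		if (m->count[p])
			return p;
	return -1;
}

/* Returns true if the packet was queued, false if it was dropped. When
 * the model is full, a packet is admitted only by evicting one packet
 * of strictly lower priority, so the total length stays at the limit.
 */
static bool model_enqueue(struct prio_model *m, unsigned int prio)
{
	if (prio >= MAX_PRIORITY)
		return false;

	if (m->qlen < m->limit) {
		m->count[prio]++;
		m->qlen++;
		return true;
	}

	int lp = lowest_prio(m);

	if (lp < 0 || (int)prio <= lp)
		return false;

	m->count[lp]--;		/* evict one lowest-priority packet */
	m->count[prio]++;	/* qlen is unchanged: one out, one in */
	return true;
}

/* Dequeue from the highest non-empty priority; returns it, or -1. */
static int model_dequeue(struct prio_model *m)
{
	int hp = highest_prio(m);

	if (hp < 0)
		return -1;

	m->count[hp]--;
	m->qlen--;
	return hp;
}
```

Because an eviction removes one packet and adds one, the queue length
does not change on that path, which is worth keeping in mind when
reading the invariants checked in gkprio_enqueue().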

^ permalink raw reply related

* [PATCH 01/18] docs: can.rst: fix a footnote reference
From: Mauro Carvalho Chehab @ 2018-05-07  9:35 UTC (permalink / raw)
  To: Linux Doc Mailing List
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab, linux-kernel,
	Jonathan Corbet, Oliver Hartkopp, Marc Kleine-Budde,
	David S. Miller, linux-can, netdev
In-Reply-To: <cover.1525684985.git.mchehab+samsung@kernel.org>

As stated at:
	http://www.sphinx-doc.org/en/master/usage/restructuredtext/basics.html#footnotes

A footnote should contain either a number, a reference or
an auto number, e. g.:
	[1], [#f1] or [#].

While using [*] accidentally works for html, it fails for other
document outputs. In particular, it causes an error with LaTeX
output, causing all books after networking to not be built.

So, replace it by a valid syntax.

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
---
 Documentation/networking/can.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Documentation/networking/can.rst b/Documentation/networking/can.rst
index d23c51abf8c6..2fd0b51a8c52 100644
--- a/Documentation/networking/can.rst
+++ b/Documentation/networking/can.rst
@@ -164,7 +164,7 @@ The Linux network devices (by default) just can handle the
 transmission and reception of media dependent frames. Due to the
 arbitration on the CAN bus the transmission of a low prio CAN-ID
 may be delayed by the reception of a high prio CAN frame. To
-reflect the correct [*]_ traffic on the node the loopback of the sent
+reflect the correct [#f1]_ traffic on the node the loopback of the sent
 data has to be performed right after a successful transmission. If
 the CAN network interface is not capable of performing the loopback for
 some reason the SocketCAN core can do this task as a fallback solution.
@@ -175,7 +175,7 @@ networking behaviour for CAN applications. Due to some requests from
 the RT-SocketCAN group the loopback optionally may be disabled for each
 separate socket. See sockopts from the CAN RAW sockets in :ref:`socketcan-raw-sockets`.
 
-.. [*] you really like to have this when you're running analyser
+.. [#f1] you really like to have this when you're running analyser
        tools like 'candump' or 'cansniffer' on the (same) node.
 
 
-- 
2.17.0

^ permalink raw reply related

* [PATCH 09/18] net: mac80211.h: fix a bad comment line
From: Mauro Carvalho Chehab @ 2018-05-07  9:35 UTC (permalink / raw)
  To: Linux Doc Mailing List
  Cc: Mauro Carvalho Chehab, Mauro Carvalho Chehab, linux-kernel,
	Jonathan Corbet, Johannes Berg, David S. Miller, linux-wireless,
	netdev
In-Reply-To: <cover.1525684985.git.mchehab+samsung@kernel.org>

Sphinx produces a lot of errors like this:
	./include/net/mac80211.h:2083: warning: bad line:  >

Signed-off-by: Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
---
 include/net/mac80211.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/include/net/mac80211.h b/include/net/mac80211.h
index d2279b2d61aa..b2f3a0c018e7 100644
--- a/include/net/mac80211.h
+++ b/include/net/mac80211.h
@@ -2080,7 +2080,7 @@ struct ieee80211_txq {
  *	virtual interface might not be given air time for the transmission of
  *	the frame, as it is not synced with the AP/P2P GO yet, and thus the
  *	deauthentication frame might not be transmitted.
- >
+ *
  * @IEEE80211_HW_DOESNT_SUPPORT_QOS_NDP: The driver (or firmware) doesn't
  *	support QoS NDP for AP probing - that's most likely a driver bug.
  *
-- 
2.17.0

^ permalink raw reply related

* Re: WARNING in kernfs_add_one
From: Dmitry Vyukov @ 2018-05-07  9:33 UTC (permalink / raw)
  To: Johannes Berg
  Cc: Greg KH, linux-wireless, Eric Dumazet, netdev, syzbot, LKML,
	syzkaller-bugs, Tejun Heo
In-Reply-To: <1525682589.6049.4.camel@sipsolutions.net>

On Mon, May 7, 2018 at 10:43 AM, Johannes Berg
<johannes@sipsolutions.net> wrote:
> On Sat, 2018-05-05 at 15:07 -0700, Greg KH wrote:
>
>> > > > syzbot found the following crash on:
>
> Maybe it should learn to differentiate warnings, if it's going to set
> panic_on_warn :-)

How?
Note that this is not specific to syzbot. If you see WARNINGs in a
subsystem that you have no idea about (or you are just a normal user),
what do you do? Right, you report it to maintainers.


> I get why, but still, at least differentiating in the emails wouldn't be
> bad.

Well, the subject says "WARNING".
But note there are _very_ bad WARNINGs too. Generally, a WARNING means
a kernel bug, just one that the kernel can tolerate without bringing the
system down (as opposed to BUG).


>> > > > kernfs: ns required in 'ieee80211' for 'phy3'
>
> Huh. What does that even mean?
>
>> > > > RIP: 0010:kernfs_add_one+0x406/0x4d0 fs/kernfs/dir.c:758
>> > > > RSP: 0018:ffff8801ca9eece0 EFLAGS: 00010286
>> > > > RAX: 000000000000002d RBX: ffffffff87d5cee0 RCX: ffffffff8160ba7d
>> > > > RDX: 0000000000000000 RSI: ffffffff81610731 RDI: ffff8801ca9ee840
>> > > > RBP: ffff8801ca9eed20 R08: ffff8801d9538500 R09: 0000000000000006
>> > > > R10: ffff8801d9538500 R11: 0000000000000000 R12: ffff8801ad1cb6c0
>> > > > R13: ffffffff885da640 R14: 0000000000000020 R15: 0000000000000000
>> > > >  kernfs_create_link+0x112/0x180 fs/kernfs/symlink.c:41
>> > > >  sysfs_do_create_link_sd.isra.2+0x90/0x130 fs/sysfs/symlink.c:43
>> > > >  sysfs_do_create_link fs/sysfs/symlink.c:79 [inline]
>> > > >  sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:91
>> > > >  device_add_class_symlinks drivers/base/core.c:1612 [inline]
>> > > >  device_add+0x7a0/0x16d0 drivers/base/core.c:1810
>> > > >  wiphy_register+0x178a/0x2430 net/wireless/core.c:806
>> > > >  ieee80211_register_hw+0x13cd/0x35d0 net/mac80211/main.c:1047
>> > > >  mac80211_hwsim_new_radio+0x1d9b/0x3410
>> > > > drivers/net/wireless/mac80211_hwsim.c:2772
>> > > >  hwsim_new_radio_nl+0x7a7/0xa60 drivers/net/wireless/mac80211_hwsim.c:3246
>> > > >  genl_family_rcv_msg+0x889/0x1120 net/netlink/genetlink.c:599
>
> Basically we're creating a new virtual radio, which in turn creates a
> new device, which we have to register.
>
> Something is going on with the context here that makes sysfs unhappy,
> but TBH I have no idea what.
>
> johannes

^ permalink raw reply

* Re: [PATCH 6/8] rhashtable: further improve stability of rhashtable_walk
From: Herbert Xu @ 2018-05-07  9:30 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel
In-Reply-To: <87vac1db5t.fsf@notabene.neil.brown.name>

On Sun, May 06, 2018 at 07:50:54AM +1000, NeilBrown wrote:
>
> Do we?  How could we fix it for both rhashtable and rhltable?

Well I suggested earlier to insert the walker object into the
hash table, which would be applicable regardless of whether it
is a normal rhashtable or a rhltable.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 8/8] rhashtable: don't hold lock on first table throughout insertion.
From: Herbert Xu @ 2018-05-07  9:29 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel
In-Reply-To: <878t8wcthy.fsf@notabene.neil.brown.name>

On Mon, May 07, 2018 at 08:24:41AM +1000, NeilBrown wrote:
>
> This is true, but I don't see how it is relevant.
> At some point, each thread will find that the table they have just
> locked for their search key, has a NULL 'future_tbl' pointer.
> At that point, the thread can know that the key is not in any table,
> and that no other thread can add the key until the lock is released.

The updating of future_tbl is not synchronised with insert threads.
Therefore it is entirely possible that two inserters end up on
different tables as their "latest" table.  This must not be allowed
to occur.

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH 2/8] rhashtable: remove nulls_base and related code.
From: Herbert Xu @ 2018-05-07  9:27 UTC (permalink / raw)
  To: NeilBrown; +Cc: Thomas Graf, netdev, linux-kernel
In-Reply-To: <877eoheqc2.fsf@notabene.neil.brown.name>

On Sun, May 06, 2018 at 07:37:49AM +1000, NeilBrown wrote:
> 
> I can see no evidence that this is required for anything, as it isn't
> used and I'm fairly sure that, in its current form, it cannot be used.

Search for nulls in net/ipv4.  This is definitely used throughout
the network stack.  As the aim is to convert as many existing hash
tables to rhashtable as possible, we want to keep this.
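
For context, a minimal userspace sketch of the idea behind a "nulls"
end-of-chain marker, assuming the usual encoding in which the low bit
of the terminating pointer is set and the remaining bits carry a value
such as the bucket number. The function names are hypothetical; this
illustrates the technique and is not the kernel or rhashtable code.

```c
#include <stdint.h>

/* Real list nodes are at least 2-byte aligned, so bit 0 of a valid
 * node pointer is 0. A chain terminator sets bit 0 and stores a value
 * in the remaining bits.
 */
static inline uintptr_t make_nulls(unsigned long value)
{
	return ((uintptr_t)value << 1) | 1u;
}

/* Non-zero if @ptr is a terminator rather than a node pointer. */
static inline int is_nulls(uintptr_t ptr)
{
	return (int)(ptr & 1u);
}

/* Value stored in a terminator. */
static inline unsigned long nulls_value(uintptr_t ptr)
{
	return (unsigned long)(ptr >> 1);
}

/* A lockless reader that started in bucket @start and reached the
 * terminator @end should restart if the terminator belongs to another
 * bucket: a node it followed was moved to a different chain.
 */
static inline int lookup_must_restart(uintptr_t end, unsigned long start)
{
	return is_nulls(end) && nulls_value(end) != start;
}
```

The check at the end of the walk is what lets a reader traverse a chain
without taking the bucket lock and still detect that it ended up in the
wrong chain.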

Cheers,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH bpf-next v3 00/15] Introducing AF_XDP support
From: Magnus Karlsson @ 2018-05-07  9:13 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: Daniel Borkmann, Björn Töpel, Karlsson, Magnus,
	Alexander Duyck, Alexander Duyck, John Fastabend,
	Alexei Starovoitov, Jesper Dangaard Brouer, Willem de Bruijn,
	Michael S. Tsirkin, Network Development, Björn Töpel,
	michael.lundkvist, Brandeburg, Jesse, Singhai, Anjali,
	Zhang, Qi Z
In-Reply-To: <20180505003445.zxaghbly54s2cre3@ast-mbp>

On Sat, May 5, 2018 at 2:34 AM, Alexei Starovoitov
<alexei.starovoitov@gmail.com> wrote:
> On Fri, May 04, 2018 at 01:22:17PM +0200, Magnus Karlsson wrote:
>> On Fri, May 4, 2018 at 1:38 AM, Alexei Starovoitov
>> <alexei.starovoitov@gmail.com> wrote:
>> > On Fri, May 04, 2018 at 12:49:09AM +0200, Daniel Borkmann wrote:
>> >> On 05/02/2018 01:01 PM, Björn Töpel wrote:
>> >> > From: Björn Töpel <bjorn.topel@intel.com>
>> >> >
>> >> > This patch set introduces a new address family called AF_XDP that is
>> >> > optimized for high performance packet processing and, in upcoming
>> >> > patch sets, zero-copy semantics. In this patch set, we have removed
>> >> > all zero-copy related code in order to make it smaller, simpler and
>> >> > hopefully more review friendly. This patch set only supports copy-mode
>> >> > for the generic XDP path (XDP_SKB) for both RX and TX and copy-mode
>> >> > for RX using the XDP_DRV path. Zero-copy support requires XDP and
>> >> > driver changes that Jesper Dangaard Brouer is working on. Some of his
>> >> > work has already been accepted. We will publish our zero-copy support
>> >> > for RX and TX on top of his patch sets at a later point in time.
>> >>
>> >> +1, would be great to see it land this cycle. Saw few minor nits here
>> >> and there but nothing to hold it up, for the series:
>> >>
>> >> Acked-by: Daniel Borkmann <daniel@iogearbox.net>
>> >>
>> >> Thanks everyone!
>> >
>> > Great stuff!
>> >
>> > Applied to bpf-next, with one condition.
>> > Upcoming zero-copy patches for both RX and TX need to be posted
>> > and reviewed within this release window.
>> > If netdev community as a whole won't be able to agree on the zero-copy
>> > bits we'd need to revert this feature before the next merge window.
>>
>> Thanks everyone for reviewing this. Highly appreciated.
>>
>> Just so we understand the purpose correctly:
>>
>> 1: Do you want to see the ZC patches in order to verify that the user
>> space API holds? If so, we can produce an additional RFC  patch set
>> using a big chunk of code that we had in RFC V1. We are not proud of
>> this code since it is clunky, but it hopefully proves the point with
>> the uapi being the same.
>>
>> 2: And/Or are you worried about us all (the netdev community) not
>> agreeing on a way to implement ZC internally in the drivers and the
>> XDP infrastructure? This is not going to be possible to finish during
>> this cycle since we do not like the implementation we had in RFC V1.
>> Too intrusive and now we also have nicer abstractions from Jesper that
>> we can use and extend to provide a (hopefully) much cleaner and less
>> intrusive solution.
>
> short answer: both.
>
> Cleanliness and performance of the ZC code are not as important as
> getting the API right. The main concern is that during the ZC review
> process we will find out that the existing API has issues, so we have
> to do this exercise before the merge window.
> And RFC won't fly. Send the patches for real. They have to go
> through the proper code review. The hackers of netdev community
> can accept a partial, or a bit unclean, or slightly inefficient
> implementation, since it can be and will be improved later,
> but API we cannot change once it goes into official release.
>
> Here is the example of API concern:
> this patch set added shared umem concept. It sounds good in theory,
> but will it perform well with ZC ? Earlier RFCs didn't have that
> feature. If it won't perform well then it shouldn't be in the tree.
> The key reason to let AF_XDP into the tree is its performance promise.
> If it doesn't perform we should rip it out and redesign.

That is a fair point. We will try to produce patch sets for zero-copy
RX and TX using the latest interfaces within this merge window. Just
note that we will focus on this for the next week(s) instead of the
review items that you and Daniel Borkmann submitted. If we get those
patch sets out in time and we agree that they are a possible way
forward, then we will produce patches with your fixes. These were mainly
small items, so that should be quick.

/Magnus

^ permalink raw reply

* [PATCH 4/5] change the comment of vti6_ioctl
From: Steffen Klassert @ 2018-05-07  9:01 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20180507090116.31222-1-steffen.klassert@secunet.com>

From: Sun Lianwen <sunlw.fnst@cn.fujitsu.com>

The kernel-doc comment for vti6_ioctl() is wrong: it names the function
vti6_tnl_ioctl instead of vti6_ioctl.

Signed-off-by: Sun Lianwen <sunlw.fnst@cn.fujitsu.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv6/ip6_vti.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index c214ffec02f0..deadc4c3703b 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -743,7 +743,7 @@ vti6_parm_to_user(struct ip6_tnl_parm2 *u, const struct __ip6_tnl_parm *p)
 }
 
 /**
- * vti6_tnl_ioctl - configure vti6 tunnels from userspace
+ * vti6_ioctl - configure vti6 tunnels from userspace
  *   @dev: virtual device associated with tunnel
  *   @ifr: parameters passed from userspace
  *   @cmd: command to be performed
-- 
2.14.1

^ permalink raw reply related

* [PATCH 5/5] xfrm: use a dedicated slab cache for struct xfrm_state
From: Steffen Klassert @ 2018-05-07  9:01 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20180507090116.31222-1-steffen.klassert@secunet.com>

From: Mathias Krause <minipli@googlemail.com>

struct xfrm_state is rather large (768 bytes here) and therefore wastes
quite a lot of memory as it falls into the kmalloc-1024 slab cache,
leaving 256 bytes of unused memory per XFRM state object -- a net waste
of 25%.

Using a dedicated slab cache for struct xfrm_state reduces the level of
internal fragmentation to a minimum.

On my configuration SLUB chooses to create a slab cache covering 4
pages holding 21 objects, resulting in an average memory waste of ~13
bytes per object -- a net waste of only 1.6%.

In my tests this led to memory savings of roughly 2.3MB for 10k XFRM
states.
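
As a rough check of the figures above, the arithmetic can be sketched in
plain userspace C. This is only an illustration, not kernel code: the
helper names are made up, and a 4096-byte page size is assumed.

```c
#include <assert.h>
#include <stddef.h>

/* Bytes wasted when one object of obj_size occupies a fixed-size slot,
 * e.g. a 768-byte object in a kmalloc-1024 slot. */
size_t slot_waste(size_t obj_size, size_t slot_size)
{
	assert(slot_size >= obj_size);
	return slot_size - obj_size;
}

/* Average leftover bytes per object in a slab of slab_bytes that holds
 * nr_objs objects of obj_size each (integer division rounds down). */
size_t slab_waste_per_obj(size_t obj_size, size_t slab_bytes, size_t nr_objs)
{
	assert(nr_objs > 0 && nr_objs * obj_size <= slab_bytes);
	return (slab_bytes - nr_objs * obj_size) / nr_objs;
}
```

With these inputs, slot_waste(768, 1024) gives 256 bytes (25% of the
slot), while slab_waste_per_obj(768, 4 * 4096, 21) gives 12 bytes with
integer division, close to the ~13 bytes quoted above.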

Signed-off-by: Mathias Krause <minipli@googlemail.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/xfrm/xfrm_state.c | 9 +++++++--
 1 file changed, 7 insertions(+), 2 deletions(-)

diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index f9d2f2233f09..f595797a20ce 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -42,6 +42,7 @@ static void xfrm_state_gc_task(struct work_struct *work);
 
 static unsigned int xfrm_state_hashmax __read_mostly = 1 * 1024 * 1024;
 static __read_mostly seqcount_t xfrm_state_hash_generation = SEQCNT_ZERO(xfrm_state_hash_generation);
+static struct kmem_cache *xfrm_state_cache __ro_after_init;
 
 static DECLARE_WORK(xfrm_state_gc_work, xfrm_state_gc_task);
 static HLIST_HEAD(xfrm_state_gc_list);
@@ -451,7 +452,7 @@ static void xfrm_state_gc_destroy(struct xfrm_state *x)
 	}
 	xfrm_dev_state_free(x);
 	security_xfrm_state_free(x);
-	kfree(x);
+	kmem_cache_free(xfrm_state_cache, x);
 }
 
 static void xfrm_state_gc_task(struct work_struct *work)
@@ -563,7 +564,7 @@ struct xfrm_state *xfrm_state_alloc(struct net *net)
 {
 	struct xfrm_state *x;
 
-	x = kzalloc(sizeof(struct xfrm_state), GFP_ATOMIC);
+	x = kmem_cache_alloc(xfrm_state_cache, GFP_ATOMIC | __GFP_ZERO);
 
 	if (x) {
 		write_pnet(&x->xs_net, net);
@@ -2307,6 +2308,10 @@ int __net_init xfrm_state_init(struct net *net)
 {
 	unsigned int sz;
 
+	if (net_eq(net, &init_net))
+		xfrm_state_cache = KMEM_CACHE(xfrm_state,
+					      SLAB_HWCACHE_ALIGN | SLAB_PANIC);
+
 	INIT_LIST_HEAD(&net->xfrm.state_all);
 
 	sz = sizeof(struct hlist_head) * 8;
-- 
2.14.1

^ permalink raw reply related

* [PATCH 2/5] udp: enable UDP checksum offload for ESP
From: Steffen Klassert @ 2018-05-07  9:01 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20180507090116.31222-1-steffen.klassert@secunet.com>

From: Jacek Kalwas <jacek.kalwas@intel.com>

Even if the NIC supports ESP TX checksum offload, skb->ip_summed is not
set to CHECKSUM_PARTIAL, so the checksum is calculated in software.

Enable ESP TX checksum offload for UDP by extending the condition with a
check for NETIF_F_HW_ESP_TX_CSUM.
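
The extended condition can be sketched as a standalone predicate. This is
a simplified userspace illustration that models only the feature,
MSG_MORE and exthdrlen terms of the check; the bit values and the
function name are hypothetical and do not match the kernel's NETIF_F_*
or MSG_* definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, for illustration only. */
#define F_HW_CSUM        (1u << 0)
#define F_IP_CSUM        (1u << 1)
#define F_HW_ESP_TX_CSUM (1u << 2)
#define FLAG_MSG_MORE    (1u << 0)

/* Returns true when checksum offload may be requested: the device can
 * checksum, no more data is pending, and either there is no extension
 * header or the device can checksum ESP-encapsulated packets. */
bool want_checksum_partial(uint32_t features, uint32_t flags, int exthdrlen)
{
	return (features & (F_HW_CSUM | F_IP_CSUM)) &&
	       !(flags & FLAG_MSG_MORE) &&
	       (!exthdrlen || (features & F_HW_ESP_TX_CSUM));
}
```

Before the change, any non-zero exthdrlen forced the software path; in
this sketch a device advertising both a checksum feature and
F_HW_ESP_TX_CSUM keeps the offload path.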

Signed-off-by: Jacek Kalwas <jacek.kalwas@intel.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv4/ip_output.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
index 4c11b810a447..a2dfb5a9ba76 100644
--- a/net/ipv4/ip_output.c
+++ b/net/ipv4/ip_output.c
@@ -907,7 +907,7 @@ static int __ip_append_data(struct sock *sk,
 	    length + fragheaderlen <= mtu &&
 	    rt->dst.dev->features & (NETIF_F_HW_CSUM | NETIF_F_IP_CSUM) &&
 	    !(flags & MSG_MORE) &&
-	    !exthdrlen)
+	    (!exthdrlen || (rt->dst.dev->features & NETIF_F_HW_ESP_TX_CSUM)))
 		csummode = CHECKSUM_PARTIAL;
 
 	cork->length += length;
-- 
2.14.1

^ permalink raw reply related

* [PATCH 1/5] selftests: add xfrm state-policy-monitor to rtnetlink.sh
From: Steffen Klassert @ 2018-05-07  9:01 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20180507090116.31222-1-steffen.klassert@secunet.com>

From: Shannon Nelson <shannon.nelson@oracle.com>

Add a simple set of tests for the IPsec xfrm commands.

Signed-off-by: Shannon Nelson <shannon.nelson@oracle.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 tools/testing/selftests/net/rtnetlink.sh | 103 +++++++++++++++++++++++++++++++
 1 file changed, 103 insertions(+)

diff --git a/tools/testing/selftests/net/rtnetlink.sh b/tools/testing/selftests/net/rtnetlink.sh
index e6f485235435..760faef2e12e 100755
--- a/tools/testing/selftests/net/rtnetlink.sh
+++ b/tools/testing/selftests/net/rtnetlink.sh
@@ -502,6 +502,108 @@ kci_test_macsec()
 	echo "PASS: macsec"
 }
 
+#-------------------------------------------------------------------
+# Example commands
+#   ip x s add proto esp src 14.0.0.52 dst 14.0.0.70 \
+#            spi 0x07 mode transport reqid 0x07 replay-window 32 \
+#            aead 'rfc4106(gcm(aes))' 1234567890123456dcba 128 \
+#            sel src 14.0.0.52/24 dst 14.0.0.70/24
+#   ip x p add dir out src 14.0.0.52/24 dst 14.0.0.70/24 \
+#            tmpl proto esp src 14.0.0.52 dst 14.0.0.70 \
+#            spi 0x07 mode transport reqid 0x07
+#
+# Subcommands not tested
+#    ip x s update
+#    ip x s allocspi
+#    ip x s deleteall
+#    ip x p update
+#    ip x p deleteall
+#    ip x p set
+#-------------------------------------------------------------------
+kci_test_ipsec()
+{
+	srcip="14.0.0.52"
+	dstip="14.0.0.70"
+	algo="aead rfc4106(gcm(aes)) 0x3132333435363738393031323334353664636261 128"
+
+	# flush to be sure there's nothing configured
+	ip x s flush ; ip x p flush
+	check_err $?
+
+	# start the monitor in the background
+	tmpfile=`mktemp ipsectestXXX`
+	ip x m > $tmpfile &
+	mpid=$!
+	sleep 0.2
+
+	ipsecid="proto esp src $srcip dst $dstip spi 0x07"
+	ip x s add $ipsecid \
+            mode transport reqid 0x07 replay-window 32 \
+            $algo sel src $srcip/24 dst $dstip/24
+	check_err $?
+
+	lines=`ip x s list | grep $srcip | grep $dstip | wc -l`
+	test $lines -eq 2
+	check_err $?
+
+	ip x s count | grep -q "SAD count 1"
+	check_err $?
+
+	lines=`ip x s get $ipsecid | grep $srcip | grep $dstip | wc -l`
+	test $lines -eq 2
+	check_err $?
+
+	ip x s delete $ipsecid
+	check_err $?
+
+	lines=`ip x s list | wc -l`
+	test $lines -eq 0
+	check_err $?
+
+	ipsecsel="dir out src $srcip/24 dst $dstip/24"
+	ip x p add $ipsecsel \
+		    tmpl proto esp src $srcip dst $dstip \
+		    spi 0x07 mode transport reqid 0x07
+	check_err $?
+
+	lines=`ip x p list | grep $srcip | grep $dstip | wc -l`
+	test $lines -eq 2
+	check_err $?
+
+	ip x p count | grep -q "SPD IN  0 OUT 1 FWD 0"
+	check_err $?
+
+	lines=`ip x p get $ipsecsel | grep $srcip | grep $dstip | wc -l`
+	test $lines -eq 2
+	check_err $?
+
+	ip x p delete $ipsecsel
+	check_err $?
+
+	lines=`ip x p list | wc -l`
+	test $lines -eq 0
+	check_err $?
+
+	# check the monitor results
+	kill $mpid
+	lines=`wc -l $tmpfile | cut "-d " -f1`
+	test $lines -eq 20
+	check_err $?
+	rm -rf $tmpfile
+
+	# clean up any leftovers
+	ip x s flush
+	check_err $?
+	ip x p flush
+	check_err $?
+
+	if [ $ret -ne 0 ]; then
+		echo "FAIL: ipsec"
+		return 1
+	fi
+	echo "PASS: ipsec"
+}
+
 kci_test_gretap()
 {
 	testns="testns"
@@ -755,6 +857,7 @@ kci_test_rtnl()
 	kci_test_vrf
 	kci_test_encap
 	kci_test_macsec
+	kci_test_ipsec
 
 	kci_del_dummy
 }
-- 
2.14.1

^ permalink raw reply related

* [PATCH 3/5] xfrm: remove VLA usage in __xfrm6_sort()
From: Steffen Klassert @ 2018-05-07  9:01 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20180507090116.31222-1-steffen.klassert@secunet.com>

From: Kees Cook <keescook@chromium.org>

In the quest to remove all stack VLA usage from the kernel[1],
just use XFRM_MAX_DEPTH as already done for the "class" array. One
caller runs this loop up to 5, the other up to 6.

[1] https://lkml.org/lkml/2018/3/7/621
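
The idea can be illustrated outside the kernel with a small counting sort
whose scratch array is sized by a compile-time bound instead of a runtime
value. This is a simplified sketch, not the __xfrm6_sort() code:
MAX_DEPTH stands in for XFRM_MAX_DEPTH (assumed to be 6 here), and
sort_by_class() with its int-based signature is hypothetical.

```c
#define MAX_DEPTH 6	/* assumed bound, standing in for XFRM_MAX_DEPTH */

/* Stable counting sort of n items by class, where each cls[i] must be in
 * [0, maxclass). The scratch array has a fixed size, so no VLA is needed.
 * Returns 0 on success, -1 on out-of-range input. */
int sort_by_class(int *dst, const int *src, const int *cls, int n, int maxclass)
{
	int start[MAX_DEPTH + 1] = { 0 };
	int i;

	if (n < 0 || maxclass < 1 || maxclass > MAX_DEPTH)
		return -1;

	/* Count the items in each class, shifted by one slot. */
	for (i = 0; i < n; i++) {
		if (cls[i] < 0 || cls[i] >= maxclass)
			return -1;
		start[cls[i] + 1]++;
	}

	/* Prefix sums: start[c] becomes the first output index of class c. */
	for (i = 1; i <= maxclass; i++)
		start[i] += start[i - 1];

	/* Place items in input order, which keeps the sort stable. */
	for (i = 0; i < n; i++)
		dst[start[cls[i]]++] = src[i];

	return 0;
}
```

The fixed bound costs a few unused stack slots when maxclass is smaller
than MAX_DEPTH, in exchange for a stack frame whose size is known at
compile time.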

Co-developed-by: Andreas Christoforou <andreaschristofo@gmail.com>
Signed-off-by: Kees Cook <keescook@chromium.org>
Acked-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv6/xfrm6_state.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/net/ipv6/xfrm6_state.c b/net/ipv6/xfrm6_state.c
index 16f434791763..5bdca3d5d6b7 100644
--- a/net/ipv6/xfrm6_state.c
+++ b/net/ipv6/xfrm6_state.c
@@ -60,11 +60,9 @@ xfrm6_init_temprop(struct xfrm_state *x, const struct xfrm_tmpl *tmpl,
 static int
 __xfrm6_sort(void **dst, void **src, int n, int (*cmp)(void *p), int maxclass)
 {
-	int i;
+	int count[XFRM_MAX_DEPTH] = { };
 	int class[XFRM_MAX_DEPTH];
-	int count[maxclass];
-
-	memset(count, 0, sizeof(count));
+	int i;
 
 	for (i = 0; i < n; i++) {
 		int c;
-- 
2.14.1

^ permalink raw reply related

* pull request (net-next): ipsec-next 2018-05-07
From: Steffen Klassert @ 2018-05-07  9:01 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev

1) Add selftests for the xfrm commands.
   From Shannon Nelson.

2) Enable hardware checksum offload for ESP encapsulated
   UDP packets if the hardware supports this.
   From Jacek Kalwas.

3) Remove VLA usage in __xfrm6_sort. From Kees Cook.

4) Fix a typo in the comment of vti6_ioctl.
   From Sun Lianwen.

5) Use a dedicated slab cache for struct xfrm_state,
   this reduces the memory usage of this struct
   by 25 percent. From Mathias Krause.

Please note that this pull request has a merge conflict
between commit:

  bec1f6f69736 ("udp: generate gso with UDP_SEGMENT")

from the net-next tree and commit:

  cd027a5433d6 ("udp: enable UDP checksum offload for ESP")

from the ipsec-next tree.

The conflict can be solved as done in linux-next.

Please pull or let me know if there are problems.

Thanks!

The following changes since commit ef53e9e14714de2ce26eaae0244c07c426064d69:

  net: Remove unused tcp_set_state tracepoint (2018-04-16 19:02:15 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec-next.git master

for you to fetch changes up to 565f0fa902b64020d5d147ff1708567e9e0b6e49:

  xfrm: use a dedicated slab cache for struct xfrm_state (2018-05-04 10:14:00 +0200)

----------------------------------------------------------------
Jacek Kalwas (1):
      udp: enable UDP checksum offload for ESP

Kees Cook (1):
      xfrm: remove VLA usage in __xfrm6_sort()

Mathias Krause (1):
      xfrm: use a dedicated slab cache for struct xfrm_state

Shannon Nelson (1):
      selftests: add xfrm state-policy-monitor to rtnetlink.sh

Sun Lianwen (1):
      change the comment of vti6_ioctl

 net/ipv4/ip_output.c                     |   2 +-
 net/ipv6/ip6_vti.c                       |   2 +-
 net/ipv6/xfrm6_state.c                   |   6 +-
 net/xfrm/xfrm_state.c                    |   9 ++-
 tools/testing/selftests/net/rtnetlink.sh | 103 +++++++++++++++++++++++++++++++
 5 files changed, 114 insertions(+), 8 deletions(-)

^ permalink raw reply

* [PATCH 1/3] af_key: Always verify length of provided sadb_key
From: Steffen Klassert @ 2018-05-07  8:43 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20180507084323.22165-1-steffen.klassert@secunet.com>

From: Kevin Easton <kevin@guarana.org>

Key extensions (struct sadb_key) include a user-specified number of key
bits.  The kernel uses that number to determine how much key data to copy
out of the message in pfkey_msg2xfrm_state().

The length of the sadb_key message must be verified to be long enough,
even in the case of SADB_X_AALG_NULL.  Furthermore, the sadb_key_len value
must be long enough to include both the key data and the struct sadb_key
itself.

Introduce a helper function verify_key_len(), and call it from
parse_exthdrs() where other exthdr types are similarly checked for
correctness.
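
The length check described above can be tried out in userspace with a
minimal sketch. It assumes the RFC 2367 layout of the key extension
header (four 16-bit fields, with sadb_key_len counted in 64-bit words);
struct sadb_key_hdr and required_key_len() are hypothetical names chosen
for this illustration, not the kernel's.

```c
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

#define DIV_ROUND_UP(n, d) (((n) + (d) - 1) / (d))

/* Userspace stand-in for the PF_KEY key extension header. */
struct sadb_key_hdr {
	uint16_t sadb_key_len;		/* extension length in 64-bit words */
	uint16_t sadb_key_exttype;
	uint16_t sadb_key_bits;		/* key length in bits */
	uint16_t sadb_key_reserved;
};

/* Minimum extension length, in 64-bit words, needed to hold the header
 * plus the number of key bits the sender claims. */
int required_key_len(const struct sadb_key_hdr *key)
{
	size_t key_bytes = DIV_ROUND_UP((size_t)key->sadb_key_bits, 8);

	return (int)DIV_ROUND_UP(sizeof(*key) + key_bytes, sizeof(uint64_t));
}

/* Reject extensions whose declared length cannot hold the claimed key. */
int verify_key_len(const struct sadb_key_hdr *key)
{
	if (required_key_len(key) > key->sadb_key_len)
		return -EINVAL;

	return 0;
}
```

For example, a 128-bit key needs 16 bytes of key data plus the 8-byte
header, i.e. 3 words, so an extension that declares sadb_key_len = 2 is
rejected.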

Signed-off-by: Kevin Easton <kevin@guarana.org>
Reported-by: syzbot+5022a34ca5a3d49b84223653fab632dfb7b4cf37@syzkaller.appspotmail.com
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/key/af_key.c | 45 +++++++++++++++++++++++++++++++++++----------
 1 file changed, 35 insertions(+), 10 deletions(-)

diff --git a/net/key/af_key.c b/net/key/af_key.c
index 7e2e7188e7f4..e62e52e8f141 100644
--- a/net/key/af_key.c
+++ b/net/key/af_key.c
@@ -437,6 +437,24 @@ static int verify_address_len(const void *p)
 	return 0;
 }
 
+static inline int sadb_key_len(const struct sadb_key *key)
+{
+	int key_bytes = DIV_ROUND_UP(key->sadb_key_bits, 8);
+
+	return DIV_ROUND_UP(sizeof(struct sadb_key) + key_bytes,
+			    sizeof(uint64_t));
+}
+
+static int verify_key_len(const void *p)
+{
+	const struct sadb_key *key = p;
+
+	if (sadb_key_len(key) > key->sadb_key_len)
+		return -EINVAL;
+
+	return 0;
+}
+
 static inline int pfkey_sec_ctx_len(const struct sadb_x_sec_ctx *sec_ctx)
 {
 	return DIV_ROUND_UP(sizeof(struct sadb_x_sec_ctx) +
@@ -533,16 +551,25 @@ static int parse_exthdrs(struct sk_buff *skb, const struct sadb_msg *hdr, void *
 				return -EINVAL;
 			if (ext_hdrs[ext_type-1] != NULL)
 				return -EINVAL;
-			if (ext_type == SADB_EXT_ADDRESS_SRC ||
-			    ext_type == SADB_EXT_ADDRESS_DST ||
-			    ext_type == SADB_EXT_ADDRESS_PROXY ||
-			    ext_type == SADB_X_EXT_NAT_T_OA) {
+			switch (ext_type) {
+			case SADB_EXT_ADDRESS_SRC:
+			case SADB_EXT_ADDRESS_DST:
+			case SADB_EXT_ADDRESS_PROXY:
+			case SADB_X_EXT_NAT_T_OA:
 				if (verify_address_len(p))
 					return -EINVAL;
-			}
-			if (ext_type == SADB_X_EXT_SEC_CTX) {
+				break;
+			case SADB_X_EXT_SEC_CTX:
 				if (verify_sec_ctx_len(p))
 					return -EINVAL;
+				break;
+			case SADB_EXT_KEY_AUTH:
+			case SADB_EXT_KEY_ENCRYPT:
+				if (verify_key_len(p))
+					return -EINVAL;
+				break;
+			default:
+				break;
 			}
 			ext_hdrs[ext_type-1] = (void *) p;
 		}
@@ -1104,14 +1131,12 @@ static struct xfrm_state * pfkey_msg2xfrm_state(struct net *net,
 	key = ext_hdrs[SADB_EXT_KEY_AUTH - 1];
 	if (key != NULL &&
 	    sa->sadb_sa_auth != SADB_X_AALG_NULL &&
-	    ((key->sadb_key_bits+7) / 8 == 0 ||
-	     (key->sadb_key_bits+7) / 8 > key->sadb_key_len * sizeof(uint64_t)))
+	    key->sadb_key_bits == 0)
 		return ERR_PTR(-EINVAL);
 	key = ext_hdrs[SADB_EXT_KEY_ENCRYPT-1];
 	if (key != NULL &&
 	    sa->sadb_sa_encrypt != SADB_EALG_NULL &&
-	    ((key->sadb_key_bits+7) / 8 == 0 ||
-	     (key->sadb_key_bits+7) / 8 > key->sadb_key_len * sizeof(uint64_t)))
+	    key->sadb_key_bits == 0)
 		return ERR_PTR(-EINVAL);
 
 	x = xfrm_state_alloc(net);
-- 
2.14.1

^ permalink raw reply related

* [PATCH 2/3] xfrm: Fix warning in xfrm6_tunnel_net_exit.
From: Steffen Klassert @ 2018-05-07  8:43 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20180507084323.22165-1-steffen.klassert@secunet.com>

We need to make sure that all states are really deleted
before we check that the state lists are empty. Otherwise
we trigger a warning.

Fixes: baeb0dbbb5659 ("xfrm6_tunnel: exit_net cleanup check added")
Reported-and-tested-by: syzbot+777bf170a89e7b326405@syzkaller.appspotmail.com
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 include/net/xfrm.h      | 1 +
 net/ipv6/xfrm6_tunnel.c | 3 +++
 net/xfrm/xfrm_state.c   | 6 ++++++
 3 files changed, 10 insertions(+)

diff --git a/include/net/xfrm.h b/include/net/xfrm.h
index a872379b69da..45e75c36b738 100644
--- a/include/net/xfrm.h
+++ b/include/net/xfrm.h
@@ -375,6 +375,7 @@ struct xfrm_input_afinfo {
 int xfrm_input_register_afinfo(const struct xfrm_input_afinfo *afinfo);
 int xfrm_input_unregister_afinfo(const struct xfrm_input_afinfo *afinfo);
 
+void xfrm_flush_gc(void);
 void xfrm_state_delete_tunnel(struct xfrm_state *x);
 
 struct xfrm_type {
diff --git a/net/ipv6/xfrm6_tunnel.c b/net/ipv6/xfrm6_tunnel.c
index f85f0d7480ac..4a46df8441c9 100644
--- a/net/ipv6/xfrm6_tunnel.c
+++ b/net/ipv6/xfrm6_tunnel.c
@@ -341,6 +341,9 @@ static void __net_exit xfrm6_tunnel_net_exit(struct net *net)
 	struct xfrm6_tunnel_net *xfrm6_tn = xfrm6_tunnel_pernet(net);
 	unsigned int i;
 
+	xfrm_state_flush(net, IPSEC_PROTO_ANY, false);
+	xfrm_flush_gc();
+
 	for (i = 0; i < XFRM6_TUNNEL_SPI_BYADDR_HSIZE; i++)
 		WARN_ON_ONCE(!hlist_empty(&xfrm6_tn->spi_byaddr[i]));
 
diff --git a/net/xfrm/xfrm_state.c b/net/xfrm/xfrm_state.c
index f9d2f2233f09..6c177ae7a6d9 100644
--- a/net/xfrm/xfrm_state.c
+++ b/net/xfrm/xfrm_state.c
@@ -2175,6 +2175,12 @@ struct xfrm_state_afinfo *xfrm_state_get_afinfo(unsigned int family)
 	return afinfo;
 }
 
+void xfrm_flush_gc(void)
+{
+	flush_work(&xfrm_state_gc_work);
+}
+EXPORT_SYMBOL(xfrm_flush_gc);
+
 /* Temporarily located here until net/xfrm/xfrm_tunnel.c is created */
 void xfrm_state_delete_tunnel(struct xfrm_state *x)
 {
-- 
2.14.1

^ permalink raw reply related

* [PATCH 3/3] vti6: Change minimum MTU to IPV4_MIN_MTU, vti6 can carry IPv4 too
From: Steffen Klassert @ 2018-05-07  8:43 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev
In-Reply-To: <20180507084323.22165-1-steffen.klassert@secunet.com>

From: Stefano Brivio <sbrivio@redhat.com>

A vti6 interface can carry IPv4 as well, so it makes no sense to
enforce a minimum MTU of IPV6_MIN_MTU.

If the user sets an MTU below IPV6_MIN_MTU, IPv6 will be
disabled on the interface, courtesy of addrconf_notify().
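
The effect of the new lower bound can be sketched with two small helpers.
This is a userspace illustration only: the constants carry the usual
values (68 for IPv4, 1280 for IPv6), and the function names are
hypothetical.

```c
#include <stdbool.h>

#define IPV4_MIN_MTU 68		/* minimum IPv4 MTU */
#define IPV6_MIN_MTU 1280	/* minimum IPv6 MTU */

/* Clamp a computed tunnel MTU to the IPv4 minimum, as the patch does. */
int clamp_tunnel_mtu(int mtu)
{
	return mtu > IPV4_MIN_MTU ? mtu : IPV4_MIN_MTU;
}

/* IPv6 is only usable on the interface when the MTU is at least 1280. */
bool mtu_allows_ipv6(int mtu)
{
	return mtu >= IPV6_MIN_MTU;
}
```

An MTU of 1000, for instance, is now accepted by the clamp but is too
small for IPv6, which matches the behaviour described above.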

Reported-by: Xin Long <lucien.xin@gmail.com>
Fixes: b96f9afee4eb ("ipv4/6: use core net MTU range checking")
Fixes: c6741fbed6dc ("vti6: Properly adjust vti6 MTU from MTU of lower device")
Fixes: 53c81e95df17 ("ip6_vti: adjust vti mtu according to mtu of lower device")
Signed-off-by: Stefano Brivio <sbrivio@redhat.com>
Signed-off-by: Steffen Klassert <steffen.klassert@secunet.com>
---
 net/ipv6/ip6_vti.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv6/ip6_vti.c b/net/ipv6/ip6_vti.c
index c214ffec02f0..ca957dd93a29 100644
--- a/net/ipv6/ip6_vti.c
+++ b/net/ipv6/ip6_vti.c
@@ -669,7 +669,7 @@ static void vti6_link_config(struct ip6_tnl *t, bool keep_mtu)
 	else
 		mtu = ETH_DATA_LEN - LL_MAX_HEADER - sizeof(struct ipv6hdr);
 
-	dev->mtu = max_t(int, mtu, IPV6_MIN_MTU);
+	dev->mtu = max_t(int, mtu, IPV4_MIN_MTU);
 }
 
 /**
@@ -881,7 +881,7 @@ static void vti6_dev_setup(struct net_device *dev)
 	dev->priv_destructor = vti6_dev_free;
 
 	dev->type = ARPHRD_TUNNEL6;
-	dev->min_mtu = IPV6_MIN_MTU;
+	dev->min_mtu = IPV4_MIN_MTU;
 	dev->max_mtu = IP_MAX_MTU - sizeof(struct ipv6hdr);
 	dev->flags |= IFF_NOARP;
 	dev->addr_len = sizeof(struct in6_addr);
-- 
2.14.1

^ permalink raw reply related

* pull request (net): ipsec 2018-05-07
From: Steffen Klassert @ 2018-05-07  8:43 UTC (permalink / raw)
  To: David Miller; +Cc: Herbert Xu, Steffen Klassert, netdev

1) Always verify length of provided sadb_key to fix a
   slab-out-of-bounds read in pfkey_add. From Kevin Easton.

2) Make sure that all states are really deleted
   before we check that the state lists are empty.
   Otherwise we trigger a warning.

3) Fix MTU handling of the VTI6 interfaces on
   interfamily tunnels. From Stefano Brivio.

Please pull or let me know if there are problems.

Thanks!

The following changes since commit 76327a35caabd1a932e83d6a42b967aa08584e5d:

  dp83640: Ensure against premature access to PHY registers after reset (2018-04-08 19:58:52 -0400)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/klassert/ipsec.git master

for you to fetch changes up to b4331a681822b420511b3258f1c3db35001fde48:

  vti6: Change minimum MTU to IPV4_MIN_MTU, vti6 can carry IPv4 too (2018-04-27 07:29:23 +0200)

----------------------------------------------------------------
Kevin Easton (1):
      af_key: Always verify length of provided sadb_key

Stefano Brivio (1):
      vti6: Change minimum MTU to IPV4_MIN_MTU, vti6 can carry IPv4 too

Steffen Klassert (1):
      xfrm: Fix warning in xfrm6_tunnel_net_exit.

 include/net/xfrm.h      |  1 +
 net/ipv6/ip6_vti.c      |  4 ++--
 net/ipv6/xfrm6_tunnel.c |  3 +++
 net/key/af_key.c        | 45 +++++++++++++++++++++++++++++++++++----------
 net/xfrm/xfrm_state.c   |  6 ++++++
 5 files changed, 47 insertions(+), 12 deletions(-)

^ permalink raw reply

* Re: WARNING in kernfs_add_one
From: Johannes Berg @ 2018-05-07  8:43 UTC (permalink / raw)
  To: Greg KH, linux-wireless, Eric Dumazet
  Cc: netdev, syzbot, linux-kernel, syzkaller-bugs, tj
In-Reply-To: <20180505220721.GA10829@kroah.com>

On Sat, 2018-05-05 at 15:07 -0700, Greg KH wrote:

> > > > syzbot found the following crash on:

Maybe it should learn to differentiate warnings, if it's going to set
panic_on_warn :-)

I get why, but still, at least differentiating in the emails wouldn't be
bad.

> > > > kernfs: ns required in 'ieee80211' for 'phy3'

Huh. What does that even mean?

> > > > RIP: 0010:kernfs_add_one+0x406/0x4d0 fs/kernfs/dir.c:758
> > > > RSP: 0018:ffff8801ca9eece0 EFLAGS: 00010286
> > > > RAX: 000000000000002d RBX: ffffffff87d5cee0 RCX: ffffffff8160ba7d
> > > > RDX: 0000000000000000 RSI: ffffffff81610731 RDI: ffff8801ca9ee840
> > > > RBP: ffff8801ca9eed20 R08: ffff8801d9538500 R09: 0000000000000006
> > > > R10: ffff8801d9538500 R11: 0000000000000000 R12: ffff8801ad1cb6c0
> > > > R13: ffffffff885da640 R14: 0000000000000020 R15: 0000000000000000
> > > >  kernfs_create_link+0x112/0x180 fs/kernfs/symlink.c:41
> > > >  sysfs_do_create_link_sd.isra.2+0x90/0x130 fs/sysfs/symlink.c:43
> > > >  sysfs_do_create_link fs/sysfs/symlink.c:79 [inline]
> > > >  sysfs_create_link+0x65/0xc0 fs/sysfs/symlink.c:91
> > > >  device_add_class_symlinks drivers/base/core.c:1612 [inline]
> > > >  device_add+0x7a0/0x16d0 drivers/base/core.c:1810
> > > >  wiphy_register+0x178a/0x2430 net/wireless/core.c:806
> > > >  ieee80211_register_hw+0x13cd/0x35d0 net/mac80211/main.c:1047
> > > >  mac80211_hwsim_new_radio+0x1d9b/0x3410 drivers/net/wireless/mac80211_hwsim.c:2772
> > > >  hwsim_new_radio_nl+0x7a7/0xa60 drivers/net/wireless/mac80211_hwsim.c:3246
> > > >  genl_family_rcv_msg+0x889/0x1120 net/netlink/genetlink.c:599

Basically we're creating a new virtual radio, which in turn creates a
new device, which we have to register.

Something is going on with the context here that makes sysfs unhappy,
but TBH I have no idea what.

johannes

^ permalink raw reply

* Re: [PATCH] bpf: fix misaligned access for BPF_PROG_TYPE_PERF_EVENT program type on x86_32 platform
From: Daniel Borkmann @ 2018-05-07  8:25 UTC (permalink / raw)
  To: Wang YanQing, Alexei Starovoitov, ast, netdev, linux-kernel
In-Reply-To: <20180507072305.GA11275@udknight>

On 05/07/2018 09:23 AM, Wang YanQing wrote:
> On Sat, Apr 28, 2018 at 01:29:17PM +0800, Wang YanQing wrote:
>> On Sat, Apr 28, 2018 at 01:33:15AM +0200, Daniel Borkmann wrote:
>>> On 04/28/2018 12:48 AM, Alexei Starovoitov wrote:
>>>> On Thu, Apr 26, 2018 at 05:57:49PM +0800, Wang YanQing wrote:
>>>>> All the testcases for BPF_PROG_TYPE_PERF_EVENT program type in
>>>>> test_verifier(kselftest) report below errors on x86_32:
>>>>> "
>>>>> 172/p unpriv: spill/fill of different pointers ldx FAIL
>>>>> Unexpected error message!
>>>>> 0: (bf) r6 = r10
>>>>> 1: (07) r6 += -8
>>>>> 2: (15) if r1 == 0x0 goto pc+3
>>>>> R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
>>>>> 3: (bf) r2 = r10
>>>>> 4: (07) r2 += -76
>>>>> 5: (7b) *(u64 *)(r6 +0) = r2
>>>>> 6: (55) if r1 != 0x0 goto pc+1
>>>>> R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1 fp-8=fp
>>>>> 7: (7b) *(u64 *)(r6 +0) = r1
>>>>> 8: (79) r1 = *(u64 *)(r6 +0)
>>>>> 9: (79) r1 = *(u64 *)(r1 +68)
>>>>> invalid bpf_context access off=68 size=8
>>>>>
>>>>> 378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
>>>>> Failed to load prog 'Permission denied'!
>>>>> 0: (b7) r0 = 0
>>>>> 1: (71) r0 = *(u8 *)(r1 +68)
>>>>> invalid bpf_context access off=68 size=1
>>>>>
>>>>> 379/p check bpf_perf_event_data->sample_period half load permitted FAIL
>>>>> Failed to load prog 'Permission denied'!
>>>>> 0: (b7) r0 = 0
>>>>> 1: (69) r0 = *(u16 *)(r1 +68)
>>>>> invalid bpf_context access off=68 size=2
>>>>>
>>>>> 380/p check bpf_perf_event_data->sample_period word load permitted FAIL
>>>>> Failed to load prog 'Permission denied'!
>>>>> 0: (b7) r0 = 0
>>>>> 1: (61) r0 = *(u32 *)(r1 +68)
>>>>> invalid bpf_context access off=68 size=4
>>>>>
>>>>> 381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
>>>>> Failed to load prog 'Permission denied'!
>>>>> 0: (b7) r0 = 0
>>>>> 1: (79) r0 = *(u64 *)(r1 +68)
>>>>> invalid bpf_context access off=68 size=8
>>>>> "
>>>>>
>>>>> This patch fixes it. The fix isn't only necessary for x86_32; it will
>>>>> fix the same problem on other platforms too, if the size of their
>>>>> bpf_user_pt_regs_t isn't a multiple of 8.
>>>>>
>>>>> Signed-off-by: Wang YanQing <udknight@gmail.com>
>>>>> ---
>>>>>  Hi all!
>>>>>  After mainline accepts this patch, we need to submit a sync patch
>>>>>  to update tools/include/uapi/linux/bpf_perf_event.h.
>>>>>
>>>>>  Thanks.
>>>>>
>>>>>  include/uapi/linux/bpf_perf_event.h | 2 +-
>>>>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>>>>
>>>>> diff --git a/include/uapi/linux/bpf_perf_event.h b/include/uapi/linux/bpf_perf_event.h
>>>>> index eb1b9d2..ff4c092 100644
>>>>> --- a/include/uapi/linux/bpf_perf_event.h
>>>>> +++ b/include/uapi/linux/bpf_perf_event.h
>>>>> @@ -12,7 +12,7 @@
>>>>>  
>>>>>  struct bpf_perf_event_data {
>>>>>  	bpf_user_pt_regs_t regs;
>>>>> -	__u64 sample_period;
>>>>> +	__u64 sample_period __attribute__((aligned(8)));
>>>>
>>>> I don't think this is necessary.
>>>> IMO it's a bug in pe_prog_is_valid_access,
>>>> which should have allowed 8-byte access to the 4-byte aligned sample_period.
>>>> The access is rewritten by pe_prog_convert_ctx_access anyway, so there are
>>>> no alignment issues as far as I can see.
>>>
>>> Right, good point. Wang, could you give the below a test run:
>>>
>>> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
>>> index 56ba0f2..95b9142 100644
>>> --- a/kernel/trace/bpf_trace.c
>>> +++ b/kernel/trace/bpf_trace.c
>>> @@ -833,8 +833,14 @@ static bool pe_prog_is_valid_access(int off, int size, enum bpf_access_type type
>>>  		return false;
>>>  	if (type != BPF_READ)
>>>  		return false;
>>> -	if (off % size != 0)
>>> -		return false;
>>> +	if (off % size != 0) {
>>> +		if (sizeof(long) != 4)
>>> +			return false;
>>> +		if (size != 8)
>>> +			return false;
>>> +		if (off % size != 4)
>>> +			return false;
>>> +	}
>>>
>>>  	switch (off) {
>>>  	case bpf_ctx_range(struct bpf_perf_event_data, sample_period):
>> Hi all!
>>
>> I have tested this patch, but test_verifier reports the same errors
>> for the five testcases.
>>
>> The reason is they all failed to pass the test of bpf_ctx_narrow_access_ok.
>>
>> Thanks.
> Hi! Daniel Borkmann.
> 
> Do you have any plan to fix bpf_ctx_narrow_access_ok for these problems?

Yep, sorry for the delay, will get to it during this week.
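
For reference, the alignment gate from the diff quoted above can be
sketched in userspace roughly as follows. This is a hedged illustration,
not the kernel function: ctx_access_aligned() is a hypothetical name, and
long_size is passed in as a parameter in place of sizeof(long) so that
both 32-bit and 64-bit cases can be exercised. As reported in the thread,
relaxing this gate alone was not enough, since the accesses still fail
bpf_ctx_narrow_access_ok.

```c
#include <stdbool.h>
#include <stddef.h>

/* Accept naturally aligned accesses, plus 8-byte accesses at a 4-byte
 * aligned offset when long is 4 bytes wide (e.g. offset 68 on x86_32). */
bool ctx_access_aligned(int off, int size, size_t long_size)
{
	if (size <= 0)
		return false;
	if (off % size == 0)
		return true;

	return long_size == 4 && size == 8 && off % size == 4;
}
```

With offset 68 and size 8, the sketch accepts the access for a 4-byte
long and rejects it for an 8-byte long.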

Thanks,
Daniel

^ permalink raw reply

* Re: linux-next: manual merge of the tip tree with the bpf-next tree
From: Daniel Borkmann @ 2018-05-07  8:15 UTC (permalink / raw)
  To: Stephen Rothwell, Thomas Gleixner, Ingo Molnar, H. Peter Anvin,
	Peter Zijlstra, Alexei Starovoitov, Networking
  Cc: Linux-Next Mailing List, Linux Kernel Mailing List
In-Reply-To: <20180507141003.79783848@canb.auug.org.au>

On 05/07/2018 06:10 AM, Stephen Rothwell wrote:
> On Mon, 7 May 2018 12:09:09 +1000 Stephen Rothwell <sfr@canb.auug.org.au> wrote:
>>
>> Today's linux-next merge of the tip tree got a conflict in:
>>
>>   arch/x86/net/bpf_jit_comp.c
>>
>> between commit:
>>
>>   e782bdcf58c5 ("bpf, x64: remove ld_abs/ld_ind")
>>
>> from the bpf-next tree and commit:
>>
>>   5f26c50143f5 ("x86/bpf: Clean up non-standard comments, to make the code more readable")
>>
>> from the tip tree.
>>
>> I fixed it up (the former commit removed some code modified by the latter,
>> so I just removed it) and can carry the fix as necessary. This is now
>> fixed as far as linux-next is concerned, but any non trivial conflicts
>> should be mentioned to your upstream maintainer when your tree is
>> submitted for merging.  You may also want to consider cooperating with
>> the maintainer of the conflicting tree to minimise any particularly
>> complex conflicts.
> 
> Actually the tip tree commit has been added to the bpf-next tree as a
> different commit, so dropping it from the tip tree will clean this up.

Yep, it's been cherry-picked into bpf-next to avoid merge conflicts with
ongoing work.

^ permalink raw reply

* Re: [PATCH] netfilter: nf_queue: Replace conntrack entry
From: Dan Carpenter @ 2018-05-07  8:07 UTC (permalink / raw)
  To: kbuild, Kristian Evensen
  Cc: Kristian Evensen, netdev, Florian Westphal, linux-kernel,
	coreteam, netfilter-devel, kbuild-all, Jozsef Kadlecsik,
	David S. Miller, Pablo Neira Ayuso
In-Reply-To: <20180503140745.26588-1-kristian.evensen@gmail.com>

Hi Kristian,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on nf-next/master]
[also build test WARNING on v4.17-rc3 next-20180504]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Kristian-Evensen/netfilter-nf_queue-Replace-conntrack-entry/20180504-051218
base:   https://git.kernel.org/pub/scm/linux/kernel/git/pablo/nf-next.git master

smatch warnings:
net/netfilter/nfnetlink_queue.c:1141 nfqnl_recv_verdict_batch() warn: curly braces intended?

# https://github.com/0day-ci/linux/commit/8776e32a6c6e2ba0c6c8ce85e227672b81a1649d
git remote add linux-review https://github.com/0day-ci/linux
git remote update linux-review
git checkout 8776e32a6c6e2ba0c6c8ce85e227672b81a1649d
vim +1141 net/netfilter/nfnetlink_queue.c

8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1093  
7b8002a1 net/netfilter/nfnetlink_queue.c      Pablo Neira Ayuso 2015-12-15  1094  static int nfqnl_recv_verdict_batch(struct net *net, struct sock *ctnl,
7b8002a1 net/netfilter/nfnetlink_queue.c      Pablo Neira Ayuso 2015-12-15  1095  				    struct sk_buff *skb,
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1096  				    const struct nlmsghdr *nlh,
04ba724b net/netfilter/nfnetlink_queue.c      Pablo Neira Ayuso 2017-06-19  1097  			            const struct nlattr * const nfqa[],
04ba724b net/netfilter/nfnetlink_queue.c      Pablo Neira Ayuso 2017-06-19  1098  				    struct netlink_ext_ack *extack)
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1099  {
3da07c0c net/netfilter/nfnetlink_queue_core.c David S. Miller   2012-06-26  1100  	struct nfgenmsg *nfmsg = nlmsg_data(nlh);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1101  	struct nf_queue_entry *entry, *tmp;
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1102  	unsigned int verdict, maxid;
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1103  	struct nfqnl_msg_verdict_hdr *vhdr;
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1104  	struct nfqnl_instance *queue;
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1105  	LIST_HEAD(batch_list);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1106  	u16 queue_num = ntohs(nfmsg->res_id);
e8179610 net/netfilter/nfnetlink_queue_core.c Gao feng          2013-03-24  1107  	struct nfnl_queue_net *q = nfnl_queue_pernet(net);
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1108  	enum ip_conntrack_info ctinfo;
e8179610 net/netfilter/nfnetlink_queue_core.c Gao feng          2013-03-24  1109  
e8179610 net/netfilter/nfnetlink_queue_core.c Gao feng          2013-03-24  1110  	queue = verdict_instance_lookup(q, queue_num,
e8179610 net/netfilter/nfnetlink_queue_core.c Gao feng          2013-03-24  1111  					NETLINK_CB(skb).portid);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1112  	if (IS_ERR(queue))
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1113  		return PTR_ERR(queue);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1114  
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1115  	vhdr = verdicthdr_get(nfqa);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1116  	if (!vhdr)
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1117  		return -EINVAL;
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1118  
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1119  	verdict = ntohl(vhdr->verdict);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1120  	maxid = ntohl(vhdr->id);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1121  
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1122  	spin_lock_bh(&queue->lock);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1123  
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1124  	list_for_each_entry_safe(entry, tmp, &queue->queue_list, list) {
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1125  		if (nfq_id_after(entry->id, maxid))
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1126  			break;
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1127  		__dequeue_entry(queue, entry);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1128  		list_add_tail(&entry->list, &batch_list);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1129  	}
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1130  
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1131  	spin_unlock_bh(&queue->lock);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1132  
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1133  	if (list_empty(&batch_list))
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1134  		return -ENOENT;
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1135  
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1136  	list_for_each_entry_safe(entry, tmp, &batch_list, list) {
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1137  		if (nfqa[NFQA_MARK])
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1138  			entry->skb->mark = ntohl(nla_get_be32(nfqa[NFQA_MARK]));
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1139  
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1140  #if IS_ENABLED(CONFIG_NF_CONNTRACK)
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03 @1141  			nf_ct_get(entry->skb, &ctinfo);
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1142  
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1143  			if (ctinfo == IP_CT_NEW && verdict != NF_STOLEN &&
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1144  			    verdict != NF_DROP) {
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1145  				nfqnl_update_ct(net, entry->skb);
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1146  			}
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1147  #endif
8776e32a net/netfilter/nfnetlink_queue.c      Kristian Evensen  2018-05-03  1148  
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1149  		nf_reinject(entry, verdict);
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1150  	}
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1151  	return 0;
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1152  }
97d32cf9 net/netfilter/nfnetlink_queue.c      Florian Westphal  2011-07-19  1153  
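
For readers unfamiliar with this smatch check, here is a minimal sketch of
what "curly braces intended?" points at. In the listing, the
if (nfqa[NFQA_MARK]) at source line 1137 has no braces, so only the mark
assignment at line 1138 is conditional. The conntrack block at lines
1141-1146 is indented to the same depth as that assignment but still runs
for every entry. The names below (ct_update, as_parsed, as_indented,
calls_without_mark, check_sketch) are hypothetical stand-ins, not kernel
APIs, and the report does not say which behaviour the patch author intended.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical counter standing in for the conntrack work
 * (nf_ct_get() + nfqnl_update_ct() in the listing). */
static int ct_calls;

static void ct_update(void)
{
	ct_calls++;
}

/* How C parses the layout at source lines 1137-1146: without braces,
 * only the mark assignment belongs to the if. */
static unsigned int as_parsed(bool has_mark, unsigned int mark)
{
	unsigned int skb_mark = 0;

	if (has_mark)
		skb_mark = mark;
	ct_update();		/* unconditional */

	return skb_mark;
}

/* What the indentation in the listing suggests to a reader. */
static unsigned int as_indented(bool has_mark, unsigned int mark)
{
	unsigned int skb_mark = 0;

	if (has_mark) {
		skb_mark = mark;
		ct_update();	/* only when a mark is supplied */
	}

	return skb_mark;
}

/* Number of ct_update() calls made by fn() when no mark is supplied. */
static int calls_without_mark(unsigned int (*fn)(bool, unsigned int))
{
	ct_calls = 0;
	fn(false, 0);
	return ct_calls;
}

/* The two readings differ exactly when no mark attribute is present. */
static void check_sketch(void)
{
	assert(calls_without_mark(as_parsed) == 1);
	assert(calls_without_mark(as_indented) == 0);
}
```

If the conntrack update is meant to run for every entry, moving lines
1141-1146 out by one indentation level should be enough to resolve the
warning. If it is meant to depend on NFQA_MARK, the if needs braces.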

