Netdev List


* Re: [PATCH net-next 2/3] gso: limit udp gso to egress-only virtual devices
From: Willem de Bruijn @ 2018-05-14 23:12 UTC (permalink / raw)
  To: Network Development; +Cc: David Miller, Willem de Bruijn, Alexander Duyck
In-Reply-To: <20180514230747.118875-3-willemdebruijn.kernel@gmail.com>

On Mon, May 14, 2018 at 7:07 PM, Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
> From: Willem de Bruijn <willemb@google.com>
>
> Until the udp receive stack supports large packets (UDP GRO), GSO
> packets must not loop from the egress to the ingress path.
>
> Revert the change that added NETIF_F_GSO_UDP_L4 to various virtual
> devices through NETIF_F_GSO_ENCAP_ALL as this included devices that
> may loop packets, such as veth and macvlan.
>
> Instead add it to specific devices that forward to another device's
> egress path: bonding and team.
>
> Fixes: 83aa025f535f ("udp: add gso support to virtual devices")
> CC: Alexander Duyck <alexander.duyck@gmail.com>
> Signed-off-by: Willem de Bruijn <willemb@google.com>
> ---

> diff --git a/drivers/net/team/team.c b/drivers/net/team/team.c
> index 9dbd390ace34..c6a9f0cafea2 100644
> --- a/drivers/net/team/team.c
> +++ b/drivers/net/team/team.c
> @@ -1026,7 +1026,8 @@ static void __team_compute_features(struct team *team)
>         }
>
>         team->dev->vlan_features = vlan_features;
> -       team->dev->hw_enc_features = enc_features | NETIF_F_GSO_ENCAP_ALL;
> +       team->dev->hw_enc_features = enc_features | NETIF_F_GSO_ENCAP_ALL |
> +                                    NETIF_GSO_UDP_L4;

This has a typo: NETIF_GSO_UDP_L4 should be NETIF_F_GSO_UDP_L4. team.ko did
not build automatically for me, and I caught it with a full compile just too late.

Need to send a v2, sorry.


* Re: [net 1/1] net/mlx5: Fix build break when CONFIG_SMP=n
From: Randy Dunlap @ 2018-05-14 23:19 UTC (permalink / raw)
  To: Saeed Mahameed, David S. Miller; +Cc: netdev, Guenter Roeck, Thomas Gleixner
In-Reply-To: <20180514223810.21197-1-saeedm@mellanox.com>

On 05/14/2018 03:38 PM, Saeed Mahameed wrote:
> Avoid using the kernel's irq_descriptor and return IRQ vector affinity
> directly from the driver.
> 
> This fixes the following build break when CONFIG_SMP=n
> 
> include/linux/mlx5/driver.h: In function ‘mlx5_get_vector_affinity_hint’:
> include/linux/mlx5/driver.h:1299:13: error:
>         ‘struct irq_desc’ has no member named ‘affinity_hint’
> 
> Fixes: 6082d9c9c94a ("net/mlx5: Fix mlx5_get_vector_affinity function")
> Signed-off-by: Saeed Mahameed <saeedm@mellanox.com>
> CC: Randy Dunlap <rdunlap@infradead.org>
> CC: Guenter Roeck <linux@roeck-us.net>
> CC: Thomas Gleixner <tglx@linutronix.de>
> Tested-by: Israel Rukshin <israelr@mellanox.com>

Reported-by: kbuild test robot <lkp@intel.com>
Reported-by: Randy Dunlap <rdunlap@infradead.org>
Tested-by: Randy Dunlap <rdunlap@infradead.org>

Thanks.

> ---
> 
> For -stable v4.14
> 
>  include/linux/mlx5/driver.h | 12 +-----------
>  1 file changed, 1 insertion(+), 11 deletions(-)
> 
> diff --git a/include/linux/mlx5/driver.h b/include/linux/mlx5/driver.h
> index 2a156c5dfadd..d703774982ca 100644
> --- a/include/linux/mlx5/driver.h
> +++ b/include/linux/mlx5/driver.h
> @@ -1286,17 +1286,7 @@ enum {
>  static inline const struct cpumask *
>  mlx5_get_vector_affinity_hint(struct mlx5_core_dev *dev, int vector)
>  {
> -	struct irq_desc *desc;
> -	unsigned int irq;
> -	int eqn;
> -	int err;
> -
> -	err = mlx5_vector2eqn(dev, vector, &eqn, &irq);
> -	if (err)
> -		return NULL;
> -
> -	desc = irq_to_desc(irq);
> -	return desc->affinity_hint;
> +	return dev->priv.irq_info[vector].mask;
>  }
>  
>  #endif /* MLX5_DRIVER_H */
> 


-- 
~Randy


* Re: [PATCH v1 1/4] media: rc: introduce BPF_PROG_IR_DECODER
From: Randy Dunlap @ 2018-05-14 23:27 UTC (permalink / raw)
  To: Sean Young, linux-media, linux-kernel, Alexei Starovoitov,
	Mauro Carvalho Chehab, Daniel Borkmann, netdev, Matthias Reichl,
	Devin Heitmueller
In-Reply-To: <32a944171d5c48abf126259595b0088ce3122c91.1526331777.git.sean@mess.org>

On 05/14/2018 02:10 PM, Sean Young wrote:
> Add support for BPF_PROG_IR_DECODER. This type of BPF program can call

The Kconfig file below uses IR_BPF_DECODER instead of the symbol name above,

and then patch 3 uses a third name:
The context provided to a BPF_PROG_RAWIR_DECODER is a struct ir_raw_event;

> rc_keydown() to report decoded IR scancodes, or rc_repeat() to report
> that the last key should be repeated.
> 
> Signed-off-by: Sean Young <sean@mess.org>
> ---
>  drivers/media/rc/Kconfig          |  8 +++
>  drivers/media/rc/Makefile         |  1 +
>  drivers/media/rc/ir-bpf-decoder.c | 93 +++++++++++++++++++++++++++++++
>  include/linux/bpf_types.h         |  3 +
>  include/uapi/linux/bpf.h          | 16 +++++-
>  5 files changed, 120 insertions(+), 1 deletion(-)
>  create mode 100644 drivers/media/rc/ir-bpf-decoder.c
> 
> diff --git a/drivers/media/rc/Kconfig b/drivers/media/rc/Kconfig
> index eb2c3b6eca7f..10ad6167d87c 100644
> --- a/drivers/media/rc/Kconfig
> +++ b/drivers/media/rc/Kconfig
> @@ -120,6 +120,14 @@ config IR_IMON_DECODER
>  	   remote control and you would like to use it with a raw IR
>  	   receiver, or if you wish to use an encoder to transmit this IR.
>  
> +config IR_BPF_DECODER
> +	bool "Enable IR raw decoder using BPF"
> +	depends on BPF_SYSCALL
> +	depends on RC_CORE=y
> +	help
> +	   Enable this option to make it possible to load custom IR
> +	   decoders written in BPF.
> +
>  endif #RC_DECODERS
>  
>  menuconfig RC_DEVICES
> diff --git a/drivers/media/rc/Makefile b/drivers/media/rc/Makefile
> index 2e1c87066f6c..12e1118430d0 100644
> --- a/drivers/media/rc/Makefile
> +++ b/drivers/media/rc/Makefile
> @@ -5,6 +5,7 @@ obj-y += keymaps/
>  obj-$(CONFIG_RC_CORE) += rc-core.o
>  rc-core-y := rc-main.o rc-ir-raw.o
>  rc-core-$(CONFIG_LIRC) += lirc_dev.o
> +rc-core-$(CONFIG_IR_BPF_DECODER) += ir-bpf-decoder.o


-- 
~Randy


* Re: [PATCH net-next 3/3] udp: only use paged allocation with scatter-gather
From: Willem de Bruijn @ 2018-05-14 23:30 UTC (permalink / raw)
  To: Eric Dumazet; +Cc: Network Development, David Miller, Willem de Bruijn
In-Reply-To: <a629c4fa-3666-48c2-900f-9d04d9ecfcbc@gmail.com>

On Mon, May 14, 2018 at 7:12 PM, Eric Dumazet <eric.dumazet@gmail.com> wrote:
>
>
> On 05/14/2018 04:07 PM, Willem de Bruijn wrote:
>> From: Willem de Bruijn <willemb@google.com>
>>
>> Paged allocation stores most payload in skb frags. This helps udp gso
>> by avoiding copying from the gso skb to segment skb in skb_segment.
>>
>> But without scatter-gather, data must be linear, so do not use paged
>> mode unless NETIF_F_SG.
>>
>> Fixes: 15e36f5b8e98 ("udp: paged allocation with gso")
>> Reported-by: Sean Tranchetti <stranche@codeaurora.org>
>> Signed-off-by: Willem de Bruijn <willemb@google.com>
>> ---
>>  net/ipv4/ip_output.c  | 2 +-
>>  net/ipv6/ip6_output.c | 2 +-
>>  2 files changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/net/ipv4/ip_output.c b/net/ipv4/ip_output.c
>> index b5e21eb198d8..b38731d8a44f 100644
>> --- a/net/ipv4/ip_output.c
>> +++ b/net/ipv4/ip_output.c
>> @@ -884,7 +884,7 @@ static int __ip_append_data(struct sock *sk,
>>
>>       exthdrlen = !skb ? rt->dst.header_len : 0;
>>       mtu = cork->gso_size ? IP_MAX_MTU : cork->fragsize;
>> -     paged = !!cork->gso_size;
>> +     paged = cork->gso_size && (rt->dst.dev->features & NETIF_F_SG);
>>
>>       if (cork->tx_flags & SKBTX_ANY_SW_TSTAMP &&
>>           sk->sk_tsflags & SOF_TIMESTAMPING_OPT_ID)
>> diff --git a/net/ipv6/ip6_output.c b/net/ipv6/ip6_output.c
>> index 7f4493080df6..35a940b9f208 100644
>> --- a/net/ipv6/ip6_output.c
>> +++ b/net/ipv6/ip6_output.c
>> @@ -1262,7 +1262,7 @@ static int __ip6_append_data(struct sock *sk,
>>               dst_exthdrlen = rt->dst.header_len - rt->rt6i_nfheader_len;
>>       }
>>
>> -     paged = !!cork->gso_size;
>> +     paged = cork->gso_size && (rt->dst.dev->features & NETIF_F_SG);
>>       mtu = cork->gso_size ? IP6_MAX_MTU : cork->fragsize;
>>       orig_mtu = mtu;
>>
>>
>
> As I said, this won't help for stacked devices
>
> bonding might advertise NETIF_F_SG, but one slave might not.

I don't quite follow. The reported crash happens in the protocol layer,
because of this check. With pagedlen we have not allocated
sufficient space for the skb_put.

                if (!(rt->dst.dev->features&NETIF_F_SG)) {
                        unsigned int off;

                        off = skb->len;
                        if (getfrag(from, skb_put(skb, copy),
                                        offset, copy, off, skb) < 0) {
                                __skb_trim(skb, off);
                                err = -EFAULT;
                                goto error;
                        }
                } else {
                        int i = skb_shinfo(skb)->nr_frags;

Are you referring to a separate potential issue in the gso layer?
If a bonding device advertises SG, but a slave does not, then
skb_segment on the slave should build linear segs? I have not
tested that.
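
For readers following the thread, the condition the patch adds can be
sketched in userspace as below. This is only an illustration under stated
assumptions: struct cork_view, struct dev_view, use_paged_alloc() and
paged_alloc_examples() are hypothetical names, and the NETIF_F_SG value is a
stand-in rather than the kernel definition.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel feature bit; the value is illustrative only. */
#define NETIF_F_SG (UINT64_C(1) << 0)

/* Hypothetical, reduced views of the cork and device state. */
struct cork_view { unsigned int gso_size; };
struct dev_view  { uint64_t features; };

/* Paged allocation is used only for GSO sends on a scatter-gather device. */
static bool use_paged_alloc(const struct cork_view *cork,
			    const struct dev_view *dev)
{
	return cork->gso_size && (dev->features & NETIF_F_SG);
}

/* Small usage example covering the three interesting combinations. */
static void paged_alloc_examples(void)
{
	struct cork_view gso = { .gso_size = 1400 }, plain = { .gso_size = 0 };
	struct dev_view sg = { .features = NETIF_F_SG }, no_sg = { .features = 0 };

	assert(use_paged_alloc(&gso, &sg));     /* GSO + SG: paged */
	assert(!use_paged_alloc(&gso, &no_sg)); /* GSO without SG: linear */
	assert(!use_paged_alloc(&plain, &sg));  /* no GSO: linear */
}
```

The sketch only captures the decision made at allocation time. It says
nothing about device features changing afterwards or about stacked devices,
which is the separate concern raised in the quoted reply.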


* Re: [PATCH 01/14] net: sched: use rcu for action cookie update
From: kbuild test robot @ 2018-05-14 23:39 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: kbuild-all, netdev, davem, jhs, xiyou.wangcong, jiri, pablo,
	kadlec, fw, ast, daniel, edumazet, vladbu, keescook, linux-kernel,
	netfilter-devel, coreteam, kliteyn
In-Reply-To: <1526308035-12484-2-git-send-email-vladbu@mellanox.com>

Hi Vlad,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net/master]
[also build test WARNING on v4.17-rc5 next-20180514]
[cannot apply to net-next/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Vlad-Buslov/Modify-action-API-for-implementing-lockless-actions/20180515-025420
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

>> net/sched/act_api.c:71:15: sparse: incorrect type in initializer (different address spaces) @@    expected struct tc_cookie [noderef] <asn:4>*__ret @@    got [noderef] <asn:4>*__ret @@
   net/sched/act_api.c:71:15:    expected struct tc_cookie [noderef] <asn:4>*__ret
   net/sched/act_api.c:71:15:    got struct tc_cookie *new_cookie
>> net/sched/act_api.c:71:13: sparse: incorrect type in assignment (different address spaces) @@    expected struct tc_cookie *old @@    got struct tc_cookie [noderef] <struct tc_cookie *old @@
   net/sched/act_api.c:71:13:    expected struct tc_cookie *old
   net/sched/act_api.c:71:13:    got struct tc_cookie [noderef] <asn:4>*[assigned] __ret
>> net/sched/act_api.c:132:48: sparse: dereference of noderef expression

vim +71 net/sched/act_api.c

    65	
    66	static void tcf_set_action_cookie(struct tc_cookie __rcu **old_cookie,
    67					  struct tc_cookie *new_cookie)
    68	{
    69		struct tc_cookie *old;
    70	
  > 71		old = xchg(old_cookie, new_cookie);
    72		if (old)
    73			call_rcu(&old->rcu, tcf_free_cookie_rcu);
    74	}
    75	
    76	/* XXX: For standalone actions, we don't need a RCU grace period either, because
    77	 * actions are always connected to filters and filters are already destroyed in
    78	 * RCU callbacks, so after a RCU grace period actions are already disconnected
    79	 * from filters. Readers later can not find us.
    80	 */
    81	static void free_tcf(struct tc_action *p)
    82	{
    83		free_percpu(p->cpu_bstats);
    84		free_percpu(p->cpu_qstats);
    85	
    86		tcf_set_action_cookie(&p->act_cookie, NULL);
    87		if (p->goto_chain)
    88			tcf_action_goto_chain_fini(p);
    89	
    90		kfree(p);
    91	}
    92	
    93	static void tcf_idr_remove(struct tcf_idrinfo *idrinfo, struct tc_action *p)
    94	{
    95		spin_lock_bh(&idrinfo->lock);
    96		idr_remove(&idrinfo->action_idr, p->tcfa_index);
    97		spin_unlock_bh(&idrinfo->lock);
    98		gen_kill_estimator(&p->tcfa_rate_est);
    99		free_tcf(p);
   100	}
   101	
   102	int __tcf_idr_release(struct tc_action *p, bool bind, bool strict)
   103	{
   104		int ret = 0;
   105	
   106		ASSERT_RTNL();
   107	
   108		if (p) {
   109			if (bind)
   110				p->tcfa_bindcnt--;
   111			else if (strict && p->tcfa_bindcnt > 0)
   112				return -EPERM;
   113	
   114			p->tcfa_refcnt--;
   115			if (p->tcfa_bindcnt <= 0 && p->tcfa_refcnt <= 0) {
   116				if (p->ops->cleanup)
   117					p->ops->cleanup(p);
   118				tcf_idr_remove(p->idrinfo, p);
   119				ret = ACT_P_DELETED;
   120			}
   121		}
   122	
   123		return ret;
   124	}
   125	EXPORT_SYMBOL(__tcf_idr_release);
   126	
   127	static size_t tcf_action_shared_attrs_size(const struct tc_action *act)
   128	{
   129		u32 cookie_len = 0;
   130	
   131		if (act->act_cookie)
 > 132			cookie_len = nla_total_size(act->act_cookie->len);
   133	
   134		return  nla_total_size(0) /* action number nested */
   135			+ nla_total_size(IFNAMSIZ) /* TCA_ACT_KIND */
   136			+ cookie_len /* TCA_ACT_COOKIE */
   137			+ nla_total_size(0) /* TCA_ACT_STATS nested */
   138			/* TCA_STATS_BASIC */
   139			+ nla_total_size_64bit(sizeof(struct gnet_stats_basic))
   140			/* TCA_STATS_QUEUE */
   141			+ nla_total_size_64bit(sizeof(struct gnet_stats_queue))
   142			+ nla_total_size(0) /* TCA_OPTIONS nested */
   143			+ nla_total_size(sizeof(struct tcf_t)); /* TCA_GACT_TM */
   144	}
   145	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


* Re: [PATCH net-next 3/3] udp: only use paged allocation with scatter-gather
From: Eric Dumazet @ 2018-05-14 23:45 UTC (permalink / raw)
  To: Willem de Bruijn, Eric Dumazet
  Cc: Network Development, David Miller, Willem de Bruijn
In-Reply-To: <CAF=yD-KYer3RV6hB+-5LYt6VgL3LA6OpgbCBzdmnGrCvGF=ySQ@mail.gmail.com>



On 05/14/2018 04:30 PM, Willem de Bruijn wrote:

> I don't quite follow. The reported crash happens in the protocol layer,
> because of this check. With pagedlen we have not allocated
> sufficient space for the skb_put.
> 
>                 if (!(rt->dst.dev->features&NETIF_F_SG)) {
>                         unsigned int off;
> 
>                         off = skb->len;
>                         if (getfrag(from, skb_put(skb, copy),
>                                         offset, copy, off, skb) < 0) {
>                                 __skb_trim(skb, off);
>                                 err = -EFAULT;
>                                 goto error;
>                         }
>                 } else {
>                         int i = skb_shinfo(skb)->nr_frags;
> 
> Are you referring to a separate potential issue in the gso layer?
> If a bonding device advertises SG, but a slave does not, then
> skb_segment on the slave should build linear segs? I have not
> tested that.

Given that the device attribute could change under us, we need to not
crash, even if initially we thought NETIF_F_SG was available.

Unless you want to hold RTNL in UDP xmit :)

Ideally, GSO should be always on, as we did for TCP.

Otherwise, I can guarantee syzkaller will hit this again.


* [PATCH net-next] erspan: set bso bit based on mirrored packet's len
From: William Tu @ 2018-05-14 23:54 UTC (permalink / raw)
  To: netdev

Before the patch, the erspan BSO bit (Bad/Short/Oversized) is not
handled.  BSO has 4 possible values:
  00 --> Good frame with no error, or unknown integrity
  11 --> Payload is a Bad Frame with CRC or Alignment Error
  01 --> Payload is a Short Frame
  10 --> Payload is an Oversized Frame

Based on the short/oversized definitions in RFC1757, the patch sets
the bso bit based on the mirrored packet's size.

Reported-by: Xiaoyan Jin <xiaoyanj@vmware.com>
Signed-off-by: William Tu <u9012063@gmail.com>
---
 include/net/erspan.h | 25 +++++++++++++++++++++++++
 1 file changed, 25 insertions(+)

diff --git a/include/net/erspan.h b/include/net/erspan.h
index d044aa60cc76..5eb95f78ad45 100644
--- a/include/net/erspan.h
+++ b/include/net/erspan.h
@@ -219,6 +219,30 @@ static inline __be32 erspan_get_timestamp(void)
 	return htonl((u32)h_usecs);
 }
 
+/* ERSPAN BSO (Bad/Short/Oversized)
+ *   00b --> Good frame with no error, or unknown integrity
+ *   01b --> Payload is a Short Frame
+ *   10b --> Payload is an Oversized Frame
+ *   11b --> Payload is a Bad Frame with CRC or Alignment Error
+ */
+enum erspan_bso {
+	BSO_NOERROR,
+	BSO_SHORT,
+	BSO_OVERSIZED,
+	BSO_BAD,
+};
+
+static inline u8 erspan_detect_bso(struct sk_buff *skb)
+{
+	if (skb->len < ETH_ZLEN)
+		return BSO_SHORT;
+
+	if (skb->len > ETH_FRAME_LEN)
+		return BSO_OVERSIZED;
+
+	return BSO_NOERROR;
+}
+
 static inline void erspan_build_header_v2(struct sk_buff *skb,
 					  u32 id, u8 direction, u16 hwid,
 					  bool truncate, bool is_ipv4)
@@ -248,6 +272,7 @@ static inline void erspan_build_header_v2(struct sk_buff *skb,
 		vlan_tci = ntohs(qp->tci);
 	}
 
+	bso = erspan_detect_bso(skb);
 	skb_push(skb, sizeof(*ershdr) + ERSPAN_V2_MDSIZE);
 	ershdr = (struct erspan_base_hdr *)skb->data;
 	memset(ershdr, 0, sizeof(*ershdr) + ERSPAN_V2_MDSIZE);
-- 
2.7.4
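
As a quick way to see which lengths map to which BSO value, the
length-only classification in erspan_detect_bso() can be mirrored in
userspace roughly as follows. This is a sketch, not the kernel code:
classify_bso() and bso_examples() are hypothetical names, it takes a plain
length instead of an skb, and the two constants are redefined locally with
the values from <linux/if_ether.h>.

```c
#include <assert.h>
#include <stddef.h>

/* Local copies of the kernel constants from <linux/if_ether.h>. */
#define ETH_ZLEN      60    /* minimum frame length, without FCS */
#define ETH_FRAME_LEN 1514  /* maximum frame length, without FCS */

enum erspan_bso {
	BSO_NOERROR,   /* 00b */
	BSO_SHORT,     /* 01b */
	BSO_OVERSIZED, /* 10b */
	BSO_BAD,       /* 11b, never produced by a length check */
};

/* Hypothetical helper: classify a mirrored frame by its length only. */
static enum erspan_bso classify_bso(size_t len)
{
	if (len < ETH_ZLEN)
		return BSO_SHORT;
	if (len > ETH_FRAME_LEN)
		return BSO_OVERSIZED;
	return BSO_NOERROR;
}

/* Boundary cases: both limits themselves count as good frames. */
static void bso_examples(void)
{
	assert(classify_bso(59) == BSO_SHORT);
	assert(classify_bso(60) == BSO_NOERROR);
	assert(classify_bso(1514) == BSO_NOERROR);
	assert(classify_bso(1515) == BSO_OVERSIZED);
}
```

Note that BSO_BAD (CRC or alignment error) cannot be derived from the
length alone, so a length-based check never reports it.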


* [PATCH bpf-next] selftests/bpf: make sure build-id is on
From: Alexei Starovoitov @ 2018-05-15  0:11 UTC (permalink / raw)
  To: David S . Miller; +Cc: daniel, songliubraving, netdev

--build-id may not be a default linker config.
Make sure it's used when linking urandom_read test program.
Otherwise test_stacktrace_build_id[_nmi] tests will fail.

Signed-off-by: Alexei Starovoitov <ast@kernel.org>
---
 tools/testing/selftests/bpf/Makefile | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 438d4f93875b..133ebc68cbe4 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -19,7 +19,7 @@ all: $(TEST_CUSTOM_PROGS)
 $(TEST_CUSTOM_PROGS): urandom_read
 
 urandom_read: urandom_read.c
-	$(CC) -o $(TEST_CUSTOM_PROGS) -static $<
+	$(CC) -o $(TEST_CUSTOM_PROGS) -static $< -Wl,--build-id
 
 # Order correspond to 'make run_tests' order
 TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \
-- 
2.9.5


* Re: [PATCH bpf-next] selftests/bpf: make sure build-id is on
From: Y Song @ 2018-05-15  0:37 UTC (permalink / raw)
  To: Alexei Starovoitov
  Cc: David S . Miller, Daniel Borkmann, songliubraving, netdev
In-Reply-To: <20180515001129.3557608-1-ast@kernel.org>

On Mon, May 14, 2018 at 5:11 PM, Alexei Starovoitov <ast@kernel.org> wrote:
> --build-id may not be a default linker config.
> Make sure it's used when linking urandom_read test program.
> Otherwise test_stacktrace_build_id[_nmi] tests will fail.
>
> Signed-off-by: Alexei Starovoitov <ast@kernel.org>

Tested and the change looks good.
Acked-by: Yonghong Song <yhs@fb.com>

> ---
>  tools/testing/selftests/bpf/Makefile | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index 438d4f93875b..133ebc68cbe4 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -19,7 +19,7 @@ all: $(TEST_CUSTOM_PROGS)
>  $(TEST_CUSTOM_PROGS): urandom_read
>
>  urandom_read: urandom_read.c
> -       $(CC) -o $(TEST_CUSTOM_PROGS) -static $<
> +       $(CC) -o $(TEST_CUSTOM_PROGS) -static $< -Wl,--build-id
>
>  # Order correspond to 'make run_tests' order
>  TEST_GEN_PROGS = test_verifier test_tag test_maps test_lru_map test_lpm_map test_progs \
> --
> 2.9.5
>


* Re: [PATCH v3] kvmalloc: always use vmalloc if CONFIG_DEBUG_SG
From: Joonsoo Kim @ 2018-05-15  1:13 UTC (permalink / raw)
  To: Mikulas Patocka
  Cc: Matthew Wilcox, Michal Hocko, David Miller, Andrew Morton,
	linux-mm, eric.dumazet, edumazet, netdev, linux-kernel, mst,
	jasowang, virtualization, dm-devel, Vlastimil Babka,
	Christoph Lameter, Pekka Enberg, David Rientjes
In-Reply-To: <alpine.LRH.2.02.1804241428120.8296@file01.intranet.prod.int.rdu2.redhat.com>

Hello, Mikulas.

On Tue, Apr 24, 2018 at 02:41:47PM -0400, Mikulas Patocka wrote:
> 
> 
> On Tue, 24 Apr 2018, Matthew Wilcox wrote:
> 
> > On Tue, Apr 24, 2018 at 08:29:14AM -0400, Mikulas Patocka wrote:
> > > 
> > > 
> > > On Mon, 23 Apr 2018, Matthew Wilcox wrote:
> > > 
> > > > On Mon, Apr 23, 2018 at 08:06:16PM -0400, Mikulas Patocka wrote:
> > > > > Some bugs (such as buffer overflows) are better detected
> > > > > with kmalloc code, so we must test the kmalloc path too.
> > > > 
> > > > Well now, this brings up another item for the collective TODO list --
> > > > implement redzone checks for vmalloc.  Unless this is something already
> > > > taken care of by kasan or similar.
> > > 
> > > The kmalloc overflow testing is also not ideal - it rounds the size up to 
> > > the next slab size and detects buffer overflows only at this boundary.
> > > 
> > > Some time ago, I made a "kmalloc guard" patch that places a magic number
> > > immediately after the requested size - so that it can detect overflows at
> > > a byte boundary
> > > ( https://www.redhat.com/archives/dm-devel/2014-September/msg00018.html )
> > > 
> > > That patch found a bug in crypto code:
> > > ( http://lkml.iu.edu/hypermail/linux/kernel/1409.1/02325.html )
> > 
> > Is it still worth doing this, now we have kasan?
> 
> The kmalloc guard has much lower overhead than kasan.

I skimmed your code, and it requires rebuilding the kernel.
I think that if a rebuild is required, just as with KASAN,
using KASAN is better since it has far better coverage for
detecting bugs.

However, I think that if the redzone can be set up tightly
without a rebuild, it would be worth implementing. I have an idea to
implement it only for SLUB. Could I try it? (I'm asking this
because I'm inspired by the above patch.) :)
Or do you want to try it?

Thanks.
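
For context, the "magic number right after the requested size" idea
discussed above can be illustrated in userspace with malloc(). This is a
minimal sketch under stated assumptions, not the kmalloc guard patch itself:
guard_alloc(), guard_intact(), guard_example() and GUARD_MAGIC are
hypothetical names, the magic value is arbitrary, and the caller has to pass
the requested size back in, whereas an allocator would track it internally.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_MAGIC UINT64_C(0x5a5aa5a5deadbeef) /* arbitrary illustrative value */

/* Allocate size bytes plus a magic word placed directly after them. */
static void *guard_alloc(size_t size)
{
	const uint64_t magic = GUARD_MAGIC;
	unsigned char *p = malloc(size + sizeof(magic));

	if (!p)
		return NULL;
	/* memcpy because the guard may sit at an unaligned offset */
	memcpy(p + size, &magic, sizeof(magic));
	return p;
}

/* Return 1 if the guard is intact, 0 if something wrote past size bytes. */
static int guard_intact(const void *ptr, size_t size)
{
	uint64_t magic;

	memcpy(&magic, (const unsigned char *)ptr + size, sizeof(magic));
	return magic == GUARD_MAGIC;
}

/* A one-byte overflow is caught, with no rounding to the slab size. */
static void guard_example(void)
{
	unsigned char *buf = guard_alloc(10);

	assert(buf);
	memset(buf, 0, 10);             /* stays within the requested size */
	assert(guard_intact(buf, 10));
	buf[10] = 0;                    /* writes one byte past the request */
	assert(!guard_intact(buf, 10));
	free(buf);
}
```

Because the guard starts at the first byte past the requested size, the
check catches overflows that a check at the rounded-up slab size would
miss.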


* Re: [kbuild-all] [PATCH net-next 2/2] sctp: add sctp_make_op_error_limited and reuse inner functions
From: Ye Xiaolong @ 2018-05-15  1:23 UTC (permalink / raw)
  To: Marcelo Ricardo Leitner
  Cc: kbuild test robot, Xin Long, Neil Horman, netdev, Vlad Yasevich,
	linux-sctp, kbuild-all
In-Reply-To: <20180514121605.GB5105@localhost.localdomain>

On 05/14, Marcelo Ricardo Leitner wrote:
>On Mon, May 14, 2018 at 07:47:20PM +0800, Ye Xiaolong wrote:
>> On 05/14, Marcelo Ricardo Leitner wrote:
>> >On Mon, May 14, 2018 at 03:40:53PM +0800, Ye Xiaolong wrote:
>> >> >> config: x86_64-randconfig-x006-201817 (attached as .config)
>> >> >> compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
>> >> >> reproduce:
>> >> >>         # save the attached .config to linux build tree
>> >> >>         make ARCH=x86_64
>> >> >>
>> >> >> All errors (new ones prefixed by >>):
>> >> >>
>> >> >>    net//sctp/sm_make_chunk.c: In function 'sctp_make_op_error_limited':
>> >> >> >> net//sctp/sm_make_chunk.c:1260:9: error: implicit declaration of function 'sctp_mtu_payload'; did you mean 'sctp_do_peeloff'? [-Werror=implicit-function-declaration]
>> >> >>      size = sctp_mtu_payload(sp, size, sizeof(struct sctp_errhdr));
>> >> >>             ^~~~~~~~~~~~~~~~
>> >> >>             sctp_do_peeloff
>> >> >>    cc1: some warnings being treated as errors
>> >> >
>> >> >Seems the test didn't pick up the MTU refactor patchset yet.
>> >>
>> >> Do you mean your patchset requires the MTU refactor patchset as a prerequisite?
>> >
>> >Yes.
>>
>> Then it is recommended to use the '--base' option of git format-patch. It records
>> the base tree info in the first patch or cover letter, and the 0day bot will apply your
>> patchset to the right base according to it.
>
>Nice. I wasn't aware of it. Thanks.
>
>Considering that the MTU refactor patchset was already applied on
>net-next when the bot did the test, why should I have to specify the
>base?

Could you share the subjects or commits of the MTU refactor patchset? I'll
double-check what went wrong.

Thanks,
Xiaolong
>
>  Marcelo


* Re: [PATCH 05/14] net: sched: always take reference to action
From: kbuild test robot @ 2018-05-15  1:38 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: kbuild-all, netdev, davem, jhs, xiyou.wangcong, jiri, pablo,
	kadlec, fw, ast, daniel, edumazet, vladbu, keescook, linux-kernel,
	netfilter-devel, coreteam, kliteyn
In-Reply-To: <1526308035-12484-6-git-send-email-vladbu@mellanox.com>

Hi Vlad,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on net/master]
[also build test WARNING on v4.17-rc5 next-20180514]
[cannot apply to net-next/master]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Vlad-Buslov/Modify-action-API-for-implementing-lockless-actions/20180515-025420
reproduce:
        # apt-get install sparse
        make ARCH=x86_64 allmodconfig
        make C=1 CF=-D__CHECK_ENDIAN__


sparse warnings: (new ones prefixed by >>)

   net/sched/act_api.c:71:15: sparse: incorrect type in initializer (different address spaces) @@    expected struct tc_cookie [noderef] <asn:4>*__ret @@    got [noderef] <asn:4>*__ret @@
   net/sched/act_api.c:71:15:    expected struct tc_cookie [noderef] <asn:4>*__ret
   net/sched/act_api.c:71:15:    got struct tc_cookie *new_cookie
   net/sched/act_api.c:71:13: sparse: incorrect type in assignment (different address spaces) @@    expected struct tc_cookie *old @@    got struct tc_cookie [noderef] <struct tc_cookie *old @@
   net/sched/act_api.c:71:13:    expected struct tc_cookie *old
   net/sched/act_api.c:71:13:    got struct tc_cookie [noderef] <asn:4>*[assigned] __ret
>> net/sched/act_api.c:287:6: sparse: symbol '__tcf_idr_check' was not declared. Should it be static?
   net/sched/act_api.c:144:48: sparse: dereference of noderef expression

Please review and possibly fold the followup patch.

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation


* [RFC PATCH] net: sched: __tcf_idr_check() can be static
From: kbuild test robot @ 2018-05-15  1:38 UTC (permalink / raw)
  To: Vlad Buslov
  Cc: kbuild-all, netdev, davem, jhs, xiyou.wangcong, jiri, pablo,
	kadlec, fw, ast, daniel, edumazet, vladbu, keescook, linux-kernel,
	netfilter-devel, coreteam, kliteyn
In-Reply-To: <1526308035-12484-6-git-send-email-vladbu@mellanox.com>


Fixes: 446adedb5339 ("net: sched: always take reference to action")
Signed-off-by: Fengguang Wu <fengguang.wu@intel.com>
---
 act_api.c |    4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/sched/act_api.c b/net/sched/act_api.c
index 9459cce..27e80cf 100644
--- a/net/sched/act_api.c
+++ b/net/sched/act_api.c
@@ -284,8 +284,8 @@ int tcf_generic_walker(struct tc_action_net *tn, struct sk_buff *skb,
 }
 EXPORT_SYMBOL(tcf_generic_walker);
 
-bool __tcf_idr_check(struct tc_action_net *tn, u32 index, struct tc_action **a,
-		     int bind)
+static bool __tcf_idr_check(struct tc_action_net *tn, u32 index, struct tc_action **a,
+			    int bind)
 {
 	struct tcf_idrinfo *idrinfo = tn->idrinfo;
 	struct tc_action *p;


* Re: [PATCH v4 1/1] drivers core: multi-threading device shutdown
From: Pavel Tatashin @ 2018-05-15  1:53 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Steven Sistare, Daniel Jordan, Linux Kernel Mailing List,
	Kirsher, Jeffrey T, intel-wired-lan, netdev, Greg Kroah-Hartman,
	alexander.duyck, tobin
In-Reply-To: <CAHp75VeopFMFHq=rJCUcAw02kznJ5+0vLWr5500DsGtERiX6cw@mail.gmail.com>

Hi Andy,

Thank you for your comments. I will send an updated patch soon. My replies are below:

On 05/14/2018 04:04 PM, Andy Shevchenko wrote:

> Can we still preserve an order here? (Yes, even if the entire list is
> not fully ordered)
> In the context I see it would go before netdevice.h.

Sure, I will move kthread.h.

>> +static struct device *
>> +device_get_child_by_index(struct device *parent, int index)
>> +{
>> +       struct klist_iter i;
>> +       struct device *dev = NULL, *d;
>> +       int child_index = 0;
>> +
>> +       if (!parent->p || index < 0)
>> +               return NULL;
>> +
>> +       klist_iter_init(&parent->p->klist_children, &i);
>> +       while ((d = next_device(&i))) {
>> +               if (child_index == index) {
>> +                       dev = d;
>> +                       break;
>> +               }
>> +               child_index++;
>> +       }
>> +       klist_iter_exit(&i);
>> +
>> +       return dev;
>> +}
> 
> This can be implemented as a subfunction to device_find_child(), can't it be?

Yes, but searching the list for an index through a function-pointer call would be very inefficient.

> 
>> +/**
> 
> Hmm... Why it's marked as kernel doc while it's just a plain comment?
> Same applies to the rest of similar comments.

Fixed this, thanks!

> 
>> +               for (i = 0; i < children_count; i++) {
>> +                       if (device_shutdown_serial) {
>> +                               device_shutdown_child_task(&tdata);
>> +                       } else {
>> +                               kthread_run(device_shutdown_child_task,
>> +                                           &tdata, "device_shutdown.%s",
>> +                                           dev_name(dev));
>> +                       }
>> +               }
> 
> Can't we just use device_for_each_child() instead?

No, at least not without doing some memory allocation. Note that in this loop we are not traversing the children; instead, we are starting as many threads as there are children, and each thread finds a child to work on. Otherwise we would have to pass the child pointer as an argument, and we would need to keep that argument in some memory.

Pavel
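
One way to let each thread find its own child without allocating a
per-thread argument is a shared counter that threads claim indices from.
The sketch below illustrates that general idea in userspace C11. It is an
assumption about how such a scheme could look, not the code from the patch:
struct shutdown_tdata, its fields, claim_child_index() and claim_example()
are all hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical shared state: one instance is handed to every worker. */
struct shutdown_tdata {
	atomic_int next_index;  /* next unclaimed child index */
	int children_count;     /* total number of children */
};

/* Claim the next child index, or return -1 once all are taken. */
static int claim_child_index(struct shutdown_tdata *t)
{
	int idx = atomic_fetch_add(&t->next_index, 1);

	return idx < t->children_count ? idx : -1;
}

/* Single-threaded walk-through of the claiming order. */
static void claim_example(void)
{
	struct shutdown_tdata t = { .children_count = 2 };

	atomic_init(&t.next_index, 0);
	assert(claim_child_index(&t) == 0);
	assert(claim_child_index(&t) == 1);
	assert(claim_child_index(&t) == -1); /* nothing left to claim */
}
```

Each worker would then look up the child at the claimed index, which is
the role device_get_child_by_index() plays in the quoted code.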


* Re: [PATCH v1 3/4] media: rc bpf: move ir_raw_event to uapi
From: kbuild test robot @ 2018-05-15  1:59 UTC (permalink / raw)
  To: Sean Young
  Cc: kbuild-all, linux-media, linux-kernel, Alexei Starovoitov,
	Mauro Carvalho Chehab, Daniel Borkmann, netdev, Matthias Reichl,
	Devin Heitmueller
In-Reply-To: <6ecdbd01b8c42c8784f2235c1e5109dac3dd86a5.1526331777.git.sean@mess.org>

[-- Attachment #1: Type: text/plain, Size: 870 bytes --]

Hi Sean,

I love your patch! Perhaps something to improve:

[auto build test WARNING on linus/master]
[also build test WARNING on v4.17-rc5]
[cannot apply to next-20180514]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Sean-Young/media-rc-introduce-BPF_PROG_IR_DECODER/20180515-093234
config: i386-tinyconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All warnings (new ones prefixed by >>):

>> ./usr/include/linux/bpf_rcdev.h:13: found __[us]{8,16,32,64} type without #include <linux/types.h>

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 6302 bytes --]


* Re: [PATCH bpf-next v2 8/8] bpf: add ld64 imm test cases
From: Y Song @ 2018-05-15  2:18 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, netdev
In-Reply-To: <20180514212234.2661-9-daniel@iogearbox.net>

On Mon, May 14, 2018 at 2:22 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> Add test cases where we combine semi-random imm values, mainly for testing
> JITs when they have different encoding options for 64 bit immediates in
> order to reduce resulting image size.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>

Acked-by: Yonghong Song <yhs@fb.com>

^ permalink raw reply

* Re: [PATCH bpf-next v2 0/8] Minor follow-up cleanups in BPF JITs and optimized imm emission
From: Alexei Starovoitov @ 2018-05-15  2:19 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: netdev
In-Reply-To: <20180514212234.2661-1-daniel@iogearbox.net>

On Mon, May 14, 2018 at 11:22:26PM +0200, Daniel Borkmann wrote:
> This series follows up mostly with some minor cleanups on top
> of 'Move ld_abs/ld_ind to native BPF' as well as implements better
> 32/64 bit immediate load into register and saves tail call init on
> cBPF for the arm64 JIT. Last but not least we add a couple of test
> cases. For details please see individual patches. Thanks!
> 
> v1 -> v2:
>   - Minor fix in i64_i16_blocks() to remove 24 shift.
>   - Added last two patches.
>   - Added Acks from prior round.

Applied, thanks.

^ permalink raw reply

* Re: [PATCH v1 2/4] media: bpf: allow raw IR decoder bpf programs to be used
From: kbuild test robot @ 2018-05-15  2:34 UTC (permalink / raw)
  To: Sean Young
  Cc: kbuild-all, linux-media, linux-kernel, Alexei Starovoitov,
	Mauro Carvalho Chehab, Daniel Borkmann, netdev, Matthias Reichl,
	Devin Heitmueller
In-Reply-To: <cd3a5e27ef4122fab90daae2af6031982df77282.1526331777.git.sean@mess.org>

[-- Attachment #1: Type: text/plain, Size: 1224 bytes --]

Hi Sean,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.17-rc5]
[cannot apply to next-20180514]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url:    https://github.com/0day-ci/linux/commits/Sean-Young/media-rc-introduce-BPF_PROG_IR_DECODER/20180515-093234
config: i386-allmodconfig (attached as .config)
compiler: gcc-7 (Debian 7.3.0-16) 7.3.0
reproduce:
        # save the attached .config to linux build tree
        make ARCH=i386 

All errors (new ones prefixed by >>):

>> drivers/media/rc/lirc_dev.c:32:10: fatal error: linux/bpf-rcdev.h: No such file or directory
    #include <linux/bpf-rcdev.h>
             ^~~~~~~~~~~~~~~~~~~
   compilation terminated.
--
>> kernel/bpf/syscall.c:30:10: fatal error: linux/bpf-rcdev.h: No such file or directory
    #include <linux/bpf-rcdev.h>
             ^~~~~~~~~~~~~~~~~~~
   compilation terminated.

vim +32 drivers/media/rc/lirc_dev.c

    31	
  > 32	#include <linux/bpf-rcdev.h>
    33	

---
0-DAY kernel test infrastructure                Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all                   Intel Corporation

[-- Attachment #2: .config.gz --]
[-- Type: application/gzip, Size: 62940 bytes --]

^ permalink raw reply

* Re: [RFC bpf-next 04/11] bpf: Add PTR_TO_SOCKET verifier type
From: Alexei Starovoitov @ 2018-05-15  2:37 UTC (permalink / raw)
  To: Joe Stringer; +Cc: daniel, netdev, ast, john.fastabend, kafai
In-Reply-To: <20180509210709.7201-5-joe@wand.net.nz>

On Wed, May 09, 2018 at 02:07:02PM -0700, Joe Stringer wrote:
> Teach the verifier a little bit about a new type of pointer, a
> PTR_TO_SOCKET. This pointer type is accessed from BPF through the
> 'struct bpf_sock' structure.
> 
> Signed-off-by: Joe Stringer <joe@wand.net.nz>
> ---
>  include/linux/bpf.h          | 19 +++++++++-
>  include/linux/bpf_verifier.h |  2 ++
>  kernel/bpf/verifier.c        | 86 ++++++++++++++++++++++++++++++++++++++------
>  net/core/filter.c            | 30 +++++++++-------
>  4 files changed, 114 insertions(+), 23 deletions(-)

Ack for patches 1-3. In this one, a few nits:

> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index a38e474bf7ee..a03b4b0edcb6 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -136,7 +136,7 @@ enum bpf_arg_type {
>  	/* the following constraints used to prototype bpf_memcmp() and other
>  	 * functions that access data on eBPF program stack
>  	 */
> -	ARG_PTR_TO_MEM,		/* pointer to valid memory (stack, packet, map value) */
> +	ARG_PTR_TO_MEM,		/* pointer to valid memory (stack, packet, map value, socket) */

I don't see where in this patch this change happens...

>  	ARG_PTR_TO_MEM_OR_NULL, /* pointer to valid memory or NULL */
>  	ARG_PTR_TO_UNINIT_MEM,	/* pointer to memory does not need to be initialized,
>  				 * helper function must fill all bytes or clear
> @@ -148,6 +148,7 @@ enum bpf_arg_type {
>  
>  	ARG_PTR_TO_CTX,		/* pointer to context */
>  	ARG_ANYTHING,		/* any (initialized) argument is ok */
> +	ARG_PTR_TO_SOCKET,	/* pointer to bpf_sock */
>  };
>  
>  /* type of values returned from helper functions */
> @@ -155,6 +156,7 @@ enum bpf_return_type {
>  	RET_INTEGER,			/* function returns integer */
>  	RET_VOID,			/* function doesn't return anything */
>  	RET_PTR_TO_MAP_VALUE_OR_NULL,	/* returns a pointer to map elem value or NULL */
> +	RET_PTR_TO_SOCKET_OR_NULL,	/* returns a pointer to a socket or NULL */
>  };
>  
>  /* eBPF function prototype used by verifier to allow BPF_CALLs from eBPF programs
> @@ -205,6 +207,8 @@ enum bpf_reg_type {
>  	PTR_TO_PACKET_META,	 /* skb->data - meta_len */
>  	PTR_TO_PACKET,		 /* reg points to skb->data */
>  	PTR_TO_PACKET_END,	 /* skb->data + headlen */
> +	PTR_TO_SOCKET,		 /* reg points to struct bpf_sock */
> +	PTR_TO_SOCKET_OR_NULL,	 /* reg points to struct bpf_sock or NULL */
>  };
>  
>  /* The information passed from prog-specific *_is_valid_access
> @@ -326,6 +330,11 @@ const struct bpf_func_proto *bpf_get_trace_printk_proto(void);
>  
>  typedef unsigned long (*bpf_ctx_copy_t)(void *dst, const void *src,
>  					unsigned long off, unsigned long len);
> +typedef u32 (*bpf_convert_ctx_access_t)(enum bpf_access_type type,
> +					const struct bpf_insn *src,
> +					struct bpf_insn *dst,
> +					struct bpf_prog *prog,
> +					u32 *target_size);
>  
>  u64 bpf_event_output(struct bpf_map *map, u64 flags, void *meta, u64 meta_size,
>  		     void *ctx, u64 ctx_size, bpf_ctx_copy_t ctx_copy);
> @@ -729,4 +738,12 @@ extern const struct bpf_func_proto bpf_sock_map_update_proto;
>  void bpf_user_rnd_init_once(void);
>  u64 bpf_user_rnd_u32(u64 r1, u64 r2, u64 r3, u64 r4, u64 r5);
>  
> +bool bpf_sock_is_valid_access(int off, int size, enum bpf_access_type type,
> +			      struct bpf_insn_access_aux *info);
> +u32 bpf_sock_convert_ctx_access(enum bpf_access_type type,
> +			        const struct bpf_insn *si,
> +			        struct bpf_insn *insn_buf,
> +			        struct bpf_prog *prog,
> +			        u32 *target_size);
> +
>  #endif /* _LINUX_BPF_H */
> diff --git a/include/linux/bpf_verifier.h b/include/linux/bpf_verifier.h
> index a613b52ce939..9dcd87f1d322 100644
> --- a/include/linux/bpf_verifier.h
> +++ b/include/linux/bpf_verifier.h
> @@ -57,6 +57,8 @@ struct bpf_reg_state {
>  	 * offset, so they can share range knowledge.
>  	 * For PTR_TO_MAP_VALUE_OR_NULL this is used to share which map value we
>  	 * came from, when one is tested for != NULL.
> +	 * For PTR_TO_SOCKET this is used to share which pointers retain the
> +	 * same reference to the socket, to determine proper reference freeing.
>  	 */
>  	u32 id;
>  	/* Ordering of fields matters.  See states_equal() */
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 1b31b805dea4..d38c7c1e9da6 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -80,8 +80,8 @@ static const struct bpf_verifier_ops * const bpf_verifier_ops[] = {
>   * (like pointer plus pointer becomes SCALAR_VALUE type)
>   *
>   * When verifier sees load or store instructions the type of base register
> - * can be: PTR_TO_MAP_VALUE, PTR_TO_CTX, PTR_TO_STACK. These are three pointer
> - * types recognized by check_mem_access() function.
> + * can be: PTR_TO_MAP_VALUE, PTR_TO_CTX, PTR_TO_STACK, PTR_TO_SOCKET. These are
> + * four pointer types recognized by check_mem_access() function.
>   *
>   * PTR_TO_MAP_VALUE means that this register is pointing to 'map element value'
>   * and the range of [ptr, ptr + map's value_size) is accessible.
> @@ -244,6 +244,8 @@ static const char * const reg_type_str[] = {
>  	[PTR_TO_PACKET]		= "pkt",
>  	[PTR_TO_PACKET_META]	= "pkt_meta",
>  	[PTR_TO_PACKET_END]	= "pkt_end",
> +	[PTR_TO_SOCKET]		= "sock",
> +	[PTR_TO_SOCKET_OR_NULL] = "sock_or_null",
>  };
>  
>  static void print_liveness(struct bpf_verifier_env *env,
> @@ -977,6 +979,8 @@ static bool is_spillable_regtype(enum bpf_reg_type type)
>  	case PTR_TO_PACKET_META:
>  	case PTR_TO_PACKET_END:
>  	case CONST_PTR_TO_MAP:
> +	case PTR_TO_SOCKET:
> +	case PTR_TO_SOCKET_OR_NULL:
>  		return true;
>  	default:
>  		return false;
> @@ -1360,6 +1364,28 @@ static int check_ctx_access(struct bpf_verifier_env *env, int insn_idx, int off,
>  	return -EACCES;
>  }
>  
> +static int check_sock_access(struct bpf_verifier_env *env, u32 regno, int off,
> +			     int size, enum bpf_access_type t)
> +{
> +	struct bpf_reg_state *regs = cur_regs(env);
> +	struct bpf_reg_state *reg = &regs[regno];
> +	struct bpf_insn_access_aux info;
> +
> +	if (reg->smin_value < 0) {
> +		verbose(env, "R%d min value is negative, either use unsigned index or do a if (index >=0) check.\n",
> +			regno);
> +		return -EACCES;
> +	}
> +
> +	if (!bpf_sock_is_valid_access(off, size, t, &info)) {
> +		verbose(env, "invalid bpf_sock_ops access off=%d size=%d\n",
> +			off, size);
> +		return -EACCES;
> +	}
> +
> +	return 0;
> +}
> +
>  static bool __is_pointer_value(bool allow_ptr_leaks,
>  			       const struct bpf_reg_state *reg)
>  {
> @@ -1475,6 +1501,9 @@ static int check_ptr_alignment(struct bpf_verifier_env *env,
>  		 */
>  		strict = true;
>  		break;
> +	case PTR_TO_SOCKET:
> +		pointer_desc = "sock ";
> +		break;
>  	default:
>  		break;
>  	}
> @@ -1723,6 +1752,16 @@ static int check_mem_access(struct bpf_verifier_env *env, int insn_idx, u32 regn
>  		err = check_packet_access(env, regno, off, size, false);
>  		if (!err && t == BPF_READ && value_regno >= 0)
>  			mark_reg_unknown(env, regs, value_regno);
> +
> +	} else if (reg->type == PTR_TO_SOCKET) {
> +		if (t == BPF_WRITE) {
> +			verbose(env, "cannot write into socket\n");
> +			return -EACCES;
> +		}
> +		err = check_sock_access(env, regno, off, size, t);
> +		if (!err && t == BPF_READ && value_regno >= 0)

t == BPF_READ check is unnecessary, since BPF_WRITE was already rejected above.

> +			mark_reg_unknown(env, regs, value_regno);
> +
>  	} else {
>  		verbose(env, "R%d invalid mem access '%s'\n", regno,
>  			reg_type_str[reg->type]);
> @@ -1941,6 +1980,10 @@ static int check_func_arg(struct bpf_verifier_env *env, u32 regno,
>  		expected_type = PTR_TO_CTX;
>  		if (type != expected_type)
>  			goto err_type;
> +	} else if (arg_type == ARG_PTR_TO_SOCKET) {
> +		expected_type = PTR_TO_SOCKET;
> +		if (type != expected_type)
> +			goto err_type;
>  	} else if (arg_type_is_mem_ptr(arg_type)) {
>  		expected_type = PTR_TO_STACK;
>  		/* One exception here. In case function allows for NULL to be
> @@ -2477,6 +2520,10 @@ static int check_helper_call(struct bpf_verifier_env *env, int func_id, int insn
>  			insn_aux->map_ptr = meta.map_ptr;
>  		else if (insn_aux->map_ptr != meta.map_ptr)
>  			insn_aux->map_ptr = BPF_MAP_PTR_POISON;
> +	} else if (fn->ret_type == RET_PTR_TO_SOCKET_OR_NULL) {
> +		mark_reg_known_zero(env, regs, BPF_REG_0);
> +		regs[BPF_REG_0].type = PTR_TO_SOCKET_OR_NULL;
> +		regs[BPF_REG_0].id = ++env->id_gen;
>  	} else {
>  		verbose(env, "unknown return type %d of func %s#%d\n",
>  			fn->ret_type, func_id_name(func_id), func_id);
> @@ -2614,6 +2661,8 @@ static int adjust_ptr_min_max_vals(struct bpf_verifier_env *env,
>  		return -EACCES;
>  	case CONST_PTR_TO_MAP:
>  	case PTR_TO_PACKET_END:
> +	case PTR_TO_SOCKET:
> +	case PTR_TO_SOCKET_OR_NULL:
>  		verbose(env, "R%d pointer arithmetic on %s prohibited\n",
>  			dst, reg_type_str[ptr_reg->type]);
>  		return -EACCES;
> @@ -3559,6 +3608,8 @@ static void mark_ptr_or_null_reg(struct bpf_reg_state *reg, u32 id,
>  			} else {
>  				reg->type = PTR_TO_MAP_VALUE;
>  			}
> +		} else if (reg->type == PTR_TO_SOCKET_OR_NULL) {
> +			reg->type = PTR_TO_SOCKET;
>  		}
>  		/* We don't need id from this point onwards anymore, thus we
>  		 * should better reset it, so that state pruning has chances
> @@ -4333,6 +4384,8 @@ static bool regsafe(struct bpf_reg_state *rold, struct bpf_reg_state *rcur,
>  	case PTR_TO_CTX:
>  	case CONST_PTR_TO_MAP:
>  	case PTR_TO_PACKET_END:
> +	case PTR_TO_SOCKET:
> +	case PTR_TO_SOCKET_OR_NULL:
>  		/* Only valid matches are exact, which memcmp() above
>  		 * would have accepted
>  		 */
> @@ -5188,10 +5241,14 @@ static void sanitize_dead_code(struct bpf_verifier_env *env)
>  	}
>  }
>  
> -/* convert load instructions that access fields of 'struct __sk_buff'
> - * into sequence of instructions that access fields of 'struct sk_buff'
> +/* convert load instructions that access fields of a context type into a
> + * sequence of instructions that access fields of the underlying structure:
> + *     struct __sk_buff    -> struct sk_buff
> + *     struct bpf_sock_ops -> struct sock
>   */
> -static int convert_ctx_accesses(struct bpf_verifier_env *env)
> +static int convert_ctx_accesses(struct bpf_verifier_env *env,
> +				bpf_convert_ctx_access_t convert_ctx_access,
> +				enum bpf_reg_type ctx_type)
>  {
>  	const struct bpf_verifier_ops *ops = env->ops;
>  	int i, cnt, size, ctx_field_size, delta = 0;
> @@ -5218,12 +5275,14 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
>  		}
>  	}
>  
> -	if (!ops->convert_ctx_access || bpf_prog_is_dev_bound(env->prog->aux))
> +	if (!convert_ctx_access || bpf_prog_is_dev_bound(env->prog->aux))
>  		return 0;
>  
>  	insn = env->prog->insnsi + delta;
>  
>  	for (i = 0; i < insn_cnt; i++, insn++) {
> +		enum bpf_reg_type ptr_type;
> +
>  		if (insn->code == (BPF_LDX | BPF_MEM | BPF_B) ||
>  		    insn->code == (BPF_LDX | BPF_MEM | BPF_H) ||
>  		    insn->code == (BPF_LDX | BPF_MEM | BPF_W) ||
> @@ -5237,7 +5296,8 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
>  		else
>  			continue;
>  
> -		if (env->insn_aux_data[i + delta].ptr_type != PTR_TO_CTX)
> +		ptr_type = env->insn_aux_data[i + delta].ptr_type;
> +		if (ptr_type != ctx_type)
>  			continue;
>  
>  		ctx_field_size = env->insn_aux_data[i + delta].ctx_field_size;
> @@ -5269,8 +5329,8 @@ static int convert_ctx_accesses(struct bpf_verifier_env *env)
>  		}
>  
>  		target_size = 0;
> -		cnt = ops->convert_ctx_access(type, insn, insn_buf, env->prog,
> -					      &target_size);
> +		cnt = convert_ctx_access(type, insn, insn_buf, env->prog,
> +					 &target_size);
>  		if (cnt == 0 || cnt >= ARRAY_SIZE(insn_buf) ||
>  		    (ctx_field_size && !target_size)) {
>  			verbose(env, "bpf verifier is misconfigured\n");
> @@ -5785,7 +5845,13 @@ int bpf_check(struct bpf_prog **prog, union bpf_attr *attr)
>  
>  	if (ret == 0)
>  		/* program is valid, convert *(u32*)(ctx + off) accesses */
> -		ret = convert_ctx_accesses(env);
> +		ret = convert_ctx_accesses(env, env->ops->convert_ctx_access,
> +					   PTR_TO_CTX);
> +
> +	if (ret == 0)
> +		/* Convert *(u32*)(sock_ops + off) accesses */
> +		ret = convert_ctx_accesses(env, bpf_sock_convert_ctx_access,
> +					   PTR_TO_SOCKET);

Overall looks great.
Only this part is missing for PTR_TO_SOCKET:
     } else if (dst_reg_type != *prev_dst_type &&
                (dst_reg_type == PTR_TO_CTX ||
                 *prev_dst_type == PTR_TO_CTX)) {
             verbose(env, "same insn cannot be used with different pointers\n");
             return -EINVAL;
similar logic has to be added.
Otherwise the following will be accepted:

R1 = sock_ptr
goto X;
...
R1 = some_other_valid_ptr;
goto X;
...

R2 = *(u32 *)(R1 + 0);
this will be rewritten for first branch,
but it's wrong for second.
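
To illustrate the check described above, here is a minimal standalone sketch. It assumes the existing PTR_TO_CTX test is simply widened to every pointer type whose loads get rewritten after verification. The enum is a trimmed-down stand-in for enum bpf_reg_type, and reg_type_is_rewritten() and reg_type_mismatch() are hypothetical helper names, not functions from the patch or from verifier.c.

```c
#include <stdbool.h>

/* Hypothetical, trimmed-down stand-in for the kernel's enum bpf_reg_type. */
enum reg_type {
	NOT_INIT = 0,
	SCALAR_VALUE,
	PTR_TO_CTX,
	PTR_TO_MAP_VALUE,
	PTR_TO_STACK,
	PTR_TO_SOCKET,
};

/* Pointer types whose loads are rewritten once verification is done. */
static bool reg_type_is_rewritten(enum reg_type type)
{
	return type == PTR_TO_CTX || type == PTR_TO_SOCKET;
}

/*
 * Return true when the same load insn is reached with two different
 * pointer types and at least one of them needs a rewrite. In that case
 * a single rewrite cannot be correct for both paths, so the program
 * has to be rejected. prev == NOT_INIT means this is the first visit.
 */
static bool reg_type_mismatch(enum reg_type prev, enum reg_type cur)
{
	if (prev == NOT_INIT)
		return false;

	return prev != cur &&
	       (reg_type_is_rewritten(prev) || reg_type_is_rewritten(cur));
}
```

With this shape, the "R1 = sock_ptr" / "R1 = some_other_valid_ptr" example is rejected, while two paths that both reach the load with the same pointer type are still accepted.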

^ permalink raw reply

* Re: [PATCH net-next] net: stmmac: Add Jose Abreu as co-maintainer
From: David Miller @ 2018-05-15  2:40 UTC (permalink / raw)
  To: Jose.Abreu; +Cc: netdev, Joao.Pinto, alexandre.torgue, peppe.cavallaro
In-Reply-To: <06350075c76ee1c13f84c94346444507e9770ed2.1526290012.git.joabreu@synopsys.com>

From: Jose Abreu <Jose.Abreu@synopsys.com>
Date: Mon, 14 May 2018 10:29:56 +0100

> I'm offering to be a co-maintainer for stmmac driver.
> 
> As per discussion with Alexandre, I will arranje to get STM32 boards to
> test patches in GMAC version 3.x and 4.1. I also have HW to test GMAC
> version 5.
> 
> Looking forward to contribute to net-dev!
> 
> Signed-off-by: Jose Abreu <joabreu@synopsys.com>

Applied with commit message typo fixed.

^ permalink raw reply

* Re: [PATCH net] cxgb4: Correct ntuple mask validation for hash filters
From: David Miller @ 2018-05-15  2:43 UTC (permalink / raw)
  To: ganeshgr
  Cc: netdev, nirranjan, indranil, venkatesh, rahul.lakkireddy, kumaras
In-Reply-To: <1526295454-12650-1-git-send-email-ganeshgr@chelsio.com>

From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Mon, 14 May 2018 16:27:34 +0530

> From: Kumar Sanghvi <kumaras@chelsio.com>
> 
> Earlier code of doing bitwise AND with field width bits was wrong.
> Instead, simplify code to calculate ntuple_mask based on supplied
> fields and then compare with mask configured in hw - which is the
> correct and simpler way to validate ntuple mask.
> 
> Fixes: 3eb8b62d5a26 ("cxgb4: add support to create hash-filters via tc-flower offload")
> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH net 1/2] vmxnet3: set the DMA mask before the first DMA map operation
From: David Miller @ 2018-05-15  2:44 UTC (permalink / raw)
  To: hpreg; +Cc: netdev, doshir, pv-drivers, linux-kernel
In-Reply-To: <1526300907-79831-1-git-send-email-hpreg@vmware.com>

From: <hpreg@vmware.com>
Date: Mon, 14 May 2018 08:28:26 -0400

> The DMA mask must be set before, not after, the first DMA map operation, or
> the first DMA map operation could in theory fail on some systems.
> 
> Fixes: b0eb57cb97e78 ("VMXNET3: Add support for virtual IOMMU")
> Signed-off-by: Regis Duchesne <hpreg@vmware.com>
> Acked-by: Ronak Doshi <doshir@vmware.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [PATCH net 2/2] vmxnet3: use DMA memory barriers where required
From: David Miller @ 2018-05-15  2:45 UTC (permalink / raw)
  To: hpreg; +Cc: netdev, doshir, pv-drivers, linux-kernel
In-Reply-To: <1526300090-78546-1-git-send-email-hpreg@vmware.com>

From: <hpreg@vmware.com>
Date: Mon, 14 May 2018 08:14:49 -0400

> The gen bits must be read first from (resp. written last to) DMA memory.
> The proper way to enforce this on Linux is to call dma_rmb() (resp.
> dma_wmb()).
> 
> Signed-off-by: Regis Duchesne <hpreg@vmware.com>
> Acked-by: Ronak Doshi <doshir@vmware.com>

Applied and queued up for -stable.
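
The ordering rule in the commit message can be modeled in userspace as a rough sketch. This is not the vmxnet3 code: struct desc, desc_publish() and desc_consume() are hypothetical names, and C11 release/acquire fences are used only as stand-ins for dma_wmb() and dma_rmb(), which are kernel primitives and are not available here.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical descriptor: one payload field plus an ownership (gen) word. */
struct desc {
	uint32_t len;
	_Atomic uint32_t gen;
};

/* Producer side: fill the payload, then write the gen value last. */
static void desc_publish(struct desc *d, uint32_t len, uint32_t gen)
{
	d->len = len;
	/* Stand-in for dma_wmb(): payload stores are ordered before gen. */
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&d->gen, gen, memory_order_relaxed);
}

/* Consumer side: read the gen value first, then the payload. */
static bool desc_consume(struct desc *d, uint32_t expected_gen, uint32_t *len)
{
	if (atomic_load_explicit(&d->gen, memory_order_relaxed) != expected_gen)
		return false;	/* descriptor not handed over yet */

	/* Stand-in for dma_rmb(): gen load is ordered before payload loads. */
	atomic_thread_fence(memory_order_acquire);
	*len = d->len;
	return true;
}
```

The point of the sketch is only the placement of the barriers: the write barrier sits between the payload stores and the gen store, and the read barrier sits between the gen load and the payload loads.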

^ permalink raw reply

* Re: [PATCH net-next] cxgb4: add tc flower match support for tunnel VNI
From: David Miller @ 2018-05-15  2:50 UTC (permalink / raw)
  To: ganeshgr; +Cc: netdev, nirranjan, indranil, venkatesh, kumaras
In-Reply-To: <1526300481-19115-1-git-send-email-ganeshgr@chelsio.com>

From: Ganesh Goudar <ganeshgr@chelsio.com>
Date: Mon, 14 May 2018 17:51:21 +0530

> From: Kumar Sanghvi <kumaras@chelsio.com>
> 
> Adds support for matching flows based on tunnel VNI value.
> Introduces fw APIs for allocating/removing MPS entries related
> to encapsulation. And uses the same while adding/deleting filters
> for offloading flows based on tunnel VNI match.
> 
> Signed-off-by: Kumar Sanghvi <kumaras@chelsio.com>
> Signed-off-by: Ganesh Goudar <ganeshgr@chelsio.com>

Applied, thank you.

^ permalink raw reply

* [PATCH RFC net-next 0/7] net/ipv6: Fix route append and replace use cases
From: David Ahern @ 2018-05-15  2:51 UTC (permalink / raw)
  To: netdev; +Cc: Thomas.Winter, idosch, sharpd, roopa, David Ahern

This patch set fixes a few append and replace use cases for IPv6 and
adds test cases that codify the expectations of how append and replace
are expected to work. In particular it allows a multipath route to have
a dev-only nexthop, something Thomas tried to accomplish with commit
edd7ceb78296 ("ipv6: Allow non-gateway ECMP for IPv6") which had to be
reverted because of breakage, and to replace an existing FIB entry
with a reject route.

There are a number of inconsistent and surprising aspects to the Linux
API for adding, deleting, replacing and changing FIB entries. For example,
with IPv4 NLM_F_APPEND means insert the route after any existing entries
with the same key (prefix + priority + TOS for IPv4) and NLM_F_CREATE
without the append flag inserts the new route before any existing entries.

IPv6 on the other hand attempts to guess whether a new route should be
appended to an existing one, possibly creating a multipath route, or to
add a new entry after any existing ones. This applies to both the 'append'
(NLM_F_CREATE + NLM_F_APPEND) and 'prepend' (NLM_F_CREATE only) cases
meaning for IPv6 the NLM_F_APPEND is basically ignored. This guessing
whether the route should be added to a multipath route (gateway routes)
or inserted after existing entries (non-gateway based routes) means a
multipath route cannot have a dev-only nexthop (potentially required in
some cases - tunnels or VRF route leaking for example) and route 'replace'
is a bit ad hoc, treating gateway based routes and dev-only / reject routes
differently.

This has led to frustration with developers working on routing suites
such as FRR where workarounds such as delete and add are needed.

After this patch set there are 2 differences between IPv4 and IPv6:
1. 'ip ro prepend' = NLM_F_CREATE only
   IPv4 adds the new route before any existing ones
   IPv6 adds the new route after any existing ones

2. 'ip ro append' = NLM_F_CREATE|NLM_F_APPEND
   IPv4 adds the new route after any existing ones
   IPv6 adds the nexthop to existing routes converting to multipath

For the former, there are cases where we want same prefix routes added
after existing ones (e.g., multicast, prefix routes for macvlan when used
for virtual router redundancy). Requiring the APPEND flag to add a new
route to an existing one helps here but is a slight change in behavior
since prepend with gateway routes now creates a separate entry.

For the latter IPv6 behavior is preferred - appending a route for the same
prefix and metric to make a multipath route, so really IPv4 not allowing an
existing route to be updated is the limiter. This will be fixed when
nexthops become separate objects - a future patch set.
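
The two remaining differences listed above can be summarized as a small decision function. This is a sketch of the described semantics only, not code from the series: route_placement(), enum family and enum placement are hypothetical names, and the flag values are redefined locally (matching NLM_F_CREATE and NLM_F_APPEND in <linux/netlink.h>) so the sketch builds without kernel headers.

```c
/* Same values as NLM_F_CREATE / NLM_F_APPEND in <linux/netlink.h>. */
#define SK_NLM_F_CREATE 0x400
#define SK_NLM_F_APPEND 0x800

enum family { FAM_IPV4, FAM_IPV6 };

enum placement {
	PLACE_NOT_MODELED,	/* request without NLM_F_CREATE */
	PLACE_BEFORE_EXISTING,	/* new entry ahead of same-key routes */
	PLACE_AFTER_EXISTING,	/* new entry behind same-key routes */
	PLACE_MERGE_MULTIPATH,	/* nexthop added to the existing route */
};

/* Where a new same-key route ends up, per the list in the cover letter. */
static enum placement route_placement(enum family fam, unsigned int flags)
{
	if (!(flags & SK_NLM_F_CREATE))
		return PLACE_NOT_MODELED;

	if (flags & SK_NLM_F_APPEND)	/* 'ip ro append' */
		return fam == FAM_IPV4 ? PLACE_AFTER_EXISTING
				       : PLACE_MERGE_MULTIPATH;

	/* 'ip ro prepend': NLM_F_CREATE only */
	return fam == FAM_IPV4 ? PLACE_BEFORE_EXISTING
			       : PLACE_AFTER_EXISTING;
}
```

Replace and change requests are deliberately left out of the sketch, since the list above only covers the prepend and append cases.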

Thank you to Thomas and Ido for testing earlier versions of this set, and
to Ido for providing an update to the mlxsw driver.

David Ahern (7):
  mlxsw: spectrum_router: Add support for route append
  net/ipv6: Simplify appending route into multipath route
  selftests: fib_tests: Add success-fail counts
  selftests: fib_tests: Add command line options
  selftests: fib_tests: Add option to pause after each test
  selftests: fib_tests: Add ipv6 route add append replace tests
  selftests: fib_tests: Add ipv4 route add append replace tests

 .../net/ethernet/mellanox/mlxsw/spectrum_router.c  |   2 +
 include/net/ip6_route.h                            |   6 -
 net/ipv6/ip6_fib.c                                 | 157 +++--
 net/ipv6/route.c                                   |   3 +-
 tools/testing/selftests/net/fib_tests.sh           | 673 ++++++++++++++++++++-
 5 files changed, 737 insertions(+), 104 deletions(-)

-- 
2.11.0

^ permalink raw reply
