Netdev List
 help / color / mirror / Atom feed
* pull-request: bpf-next 2018-10-21
From: Daniel Borkmann @ 2018-10-21 19:24 UTC (permalink / raw)
  To: davem; +Cc: daniel, ast, netdev

Hi David,

The following pull-request contains BPF updates for your *net-next* tree.

The main changes are:

1) Implement two new kind of BPF maps, that is, queue and stack
   map along with new peek, push and pop operations, from Mauricio.

2) Add support for MSG_PEEK flag when redirecting into an ingress
   psock sk_msg queue, and add a new helper bpf_msg_push_data() for
   insert data into the message, from John.

3) Allow for BPF programs of type BPF_PROG_TYPE_CGROUP_SKB to use
   direct packet access for __skb_buff, from Song.

4) Use more lightweight barriers for walking perf ring buffer for
   libbpf and perf tool as well. Also, various fixes and improvements
   from verifier side, from Daniel.

5) Add per-symbol visibility for DSO in libbpf and hide by default
   global symbols such as netlink related functions, from Andrey.

6) Two improvements to nfp's BPF offload to check vNIC capabilities
   in case prog is shared with multiple vNICs and to protect against
   mis-initializing atomic counters, from Jakub.

7) Fix for bpftool to use 4 context mode for the nfp disassembler,
   also from Jakub.

8) Fix a return value comparison in test_libbpf.sh and add several
   bpftool improvements in bash completion, documentation of bpf fs
   restrictions and batch mode summary print, from Quentin.

9) Fix a file resource leak in BPF selftest's load_kallsyms()
   helper, from Peng.

10) Fix an unused variable warning in map_lookup_and_delete_elem(),
    from Alexei.

11) Fix bpf_skb_adjust_room() signature in BPF UAPI helper doc,
    from Nicolas.

12) Add missing executables to .gitignore in BPF selftests, from Anders.

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

Thanks a lot!

----------------------------------------------------------------

The following changes since commit 2c59f06cc0442862d589c36bd2f29667f96c35e7:

  Merge branch 'net-Kernel-side-filtering-for-route-dumps' (2018-10-16 00:14:18 -0700)

are available in the git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git 

for you to fetch changes up to fe8ecccc10b3adc071de05ca7af728ca1a4ac9aa:

  tools: bpftool: fix completion for "bpftool map update" (2018-10-21 20:45:14 +0200)

----------------------------------------------------------------
Alexei Starovoitov (6):
      Merge branch 'nfp-improve-bpf-offload'
      Merge branch 'queue_stack_maps'
      Merge branch 'improve_perf_barriers'
      Merge branch 'cg_skb_direct_pkt_access'
      bpf: remove unused variable
      Merge branch 'misc-improvements'

Anders Roxell (1):
      selftests/bpf: add missing executables to .gitignore

Andrey Ignatov (1):
      libbpf: Per-symbol visibility for DSO

Daniel Borkmann (11):
      bpf, tls: add tls header to tools infrastructure
      Merge branch 'bpf-sk-msg-peek'
      tools, perf: add and use optimized ring_buffer_{read_head, write_tail} helpers
      bpf, libbpf: use correct barriers in perf ring buffer walk
      Merge branch 'bpf-msg-push-data'
      ulp: remove uid and user_visible members
      bpf, verifier: fix register type dump in xadd and st
      bpf, verifier: reject xadd on flow key memory
      bpf, verifier: remove unneeded flow key in check_helper_mem_access
      bpf, verifier: avoid retpoline for map push/pop/peek operation
      bpf, libbpf: simplify and cleanup perf ring buffer walk

Jakub Kicinski (3):
      nfp: bpf: protect against mis-initializing atomic counters
      nfp: bpf: double check vNIC capabilities after object sharing
      tools: bpftool: use 4 context mode for the NFP disasm

John Fastabend (8):
      bpf: sockmap, fix skmsg recvmsg handler to track size correctly
      bpf: skmsg, improve sk_msg_used_element to work in cork context
      bpf: sockmap, support for msg_peek in sk_msg with redirect ingress
      bpf: sockmap, add msg_peek tests to test_sockmap
      bpf: skmsg, fix psock create on existing kcm/tls port
      bpf: sk_msg program helper bpf_msg_push_data
      bpf: libbpf support for msg_push_data
      bpf: test_sockmap add options to use msg_push_data

Mauricio Vasquez B (7):
      bpf: rename stack trace map operations
      bpf/syscall: allow key to be null in map functions
      bpf/verifier: add ARG_PTR_TO_UNINIT_MAP_VALUE
      bpf: add queue and stack maps
      bpf: add MAP_LOOKUP_AND_DELETE_ELEM syscall
      Sync uapi/bpf.h to tools/include
      selftests/bpf: add test cases for queue and stack maps

Nicolas Dichtel (1):
      bpf: fix doc of bpf_skb_adjust_room() in uapi

Peng Hao (1):
      selftests/bpf: fix file resource leak in load_kallsyms

Quentin Monnet (4):
      selftests/bpf: fix return value comparison for tests in test_libbpf.sh
      tools: bpftool: document restriction on '.' in names to pin in bpffs
      tools: bpftool: print nb of cmds to stdout (not stderr) for batch mode
      tools: bpftool: fix completion for "bpftool map update"

Song Liu (2):
      bpf: add cg_skb_is_valid_access for BPF_PROG_TYPE_CGROUP_SKB
      bpf: add tests for direct packet access from CGROUP_SKB

 drivers/net/ethernet/netronome/nfp/bpf/main.h      |  10 +-
 drivers/net/ethernet/netronome/nfp/bpf/offload.c   |  32 ++-
 drivers/net/ethernet/netronome/nfp/bpf/verifier.c  |  69 ++++-
 include/linux/bpf.h                                |   7 +
 include/linux/bpf_types.h                          |   4 +-
 include/linux/filter.h                             |  21 ++
 include/linux/skmsg.h                              |  42 ++-
 include/net/tcp.h                                  |   9 +-
 include/uapi/linux/bpf.h                           |  50 +++-
 kernel/bpf/Makefile                                |   2 +-
 kernel/bpf/cgroup.c                                |   6 +
 kernel/bpf/core.c                                  |   3 +
 kernel/bpf/helpers.c                               |  43 +++
 kernel/bpf/queue_stack_maps.c                      | 288 +++++++++++++++++++++
 kernel/bpf/stackmap.c                              |   2 +-
 kernel/bpf/syscall.c                               |  91 ++++++-
 kernel/bpf/verifier.c                              |  85 +++++-
 net/bpf/test_run.c                                 |  15 ++
 net/core/filter.c                                  | 176 ++++++++++++-
 net/core/sock_map.c                                |  11 +-
 net/ipv4/tcp_bpf.c                                 |  41 ++-
 net/tls/tls_main.c                                 |   2 -
 net/tls/tls_sw.c                                   |   3 +-
 tools/arch/arm64/include/asm/barrier.h             |  70 +++++
 tools/arch/ia64/include/asm/barrier.h              |  13 +
 tools/arch/powerpc/include/asm/barrier.h           |  16 ++
 tools/arch/s390/include/asm/barrier.h              |  13 +
 tools/arch/sparc/include/asm/barrier_64.h          |  13 +
 tools/arch/x86/include/asm/barrier.h               |  14 +
 tools/bpf/bpftool/Documentation/bpftool-map.rst    |   4 +-
 tools/bpf/bpftool/Documentation/bpftool-prog.rst   |   8 +-
 tools/bpf/bpftool/bash-completion/bpftool          |   2 +-
 tools/bpf/bpftool/common.c                         |   5 +-
 tools/bpf/bpftool/jit_disasm.c                     |   4 +-
 tools/bpf/bpftool/main.c                           |   3 +-
 tools/bpf/bpftool/main.h                           |   6 +-
 tools/bpf/bpftool/map_perf_ring.c                  |  10 +-
 tools/bpf/bpftool/prog.c                           |  14 +-
 tools/include/asm/barrier.h                        |  35 +++
 tools/include/linux/ring_buffer.h                  |  73 ++++++
 tools/include/uapi/linux/bpf.h                     |  50 +++-
 tools/include/uapi/linux/tls.h                     |  78 ++++++
 tools/lib/bpf/Makefile                             |   1 +
 tools/lib/bpf/bpf.c                                |  12 +
 tools/lib/bpf/bpf.h                                | 120 +++++----
 tools/lib/bpf/btf.h                                |  22 +-
 tools/lib/bpf/libbpf.c                             |  75 +++---
 tools/lib/bpf/libbpf.h                             | 191 +++++++-------
 tools/perf/util/mmap.h                             |  15 +-
 tools/testing/selftests/bpf/.gitignore             |   2 +
 tools/testing/selftests/bpf/Makefile               |   5 +-
 tools/testing/selftests/bpf/bpf_helpers.h          |   9 +
 tools/testing/selftests/bpf/test_libbpf.sh         |   2 +-
 tools/testing/selftests/bpf/test_maps.c            | 122 +++++++++
 tools/testing/selftests/bpf/test_progs.c           |  99 +++++++
 tools/testing/selftests/bpf/test_queue_map.c       |   4 +
 tools/testing/selftests/bpf/test_queue_stack_map.h |  59 +++++
 tools/testing/selftests/bpf/test_sockmap.c         | 238 ++++++++++++-----
 tools/testing/selftests/bpf/test_sockmap_kern.h    |  97 +++++--
 tools/testing/selftests/bpf/test_stack_map.c       |   4 +
 tools/testing/selftests/bpf/test_verifier.c        | 181 ++++++++++++-
 tools/testing/selftests/bpf/trace_helpers.c        |   8 +-
 62 files changed, 2315 insertions(+), 394 deletions(-)
 create mode 100644 kernel/bpf/queue_stack_maps.c
 create mode 100644 tools/include/linux/ring_buffer.h
 create mode 100644 tools/include/uapi/linux/tls.h
 create mode 100644 tools/testing/selftests/bpf/test_queue_map.c
 create mode 100644 tools/testing/selftests/bpf/test_queue_stack_map.h
 create mode 100644 tools/testing/selftests/bpf/test_stack_map.c

^ permalink raw reply

* bpf-next is CLOSED
From: Daniel Borkmann @ 2018-10-21 19:27 UTC (permalink / raw)
  To: netdev; +Cc: ast

Given the merge window is opening shortly, please only submit bug
fixes to bpf tree at this time, thank you!

^ permalink raw reply

* Re: [PATCH 0/9] net: simplify getting .driver_data
From: David Miller @ 2018-10-22  4:11 UTC (permalink / raw)
  To: wsa+renesas
  Cc: linux-kernel, linux-renesas-soc, linux-arm-kernel, linux-omap,
	netdev
In-Reply-To: <20181021200021.1693-1-wsa+renesas@sang-engineering.com>

From: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date: Sun, 21 Oct 2018 22:00:11 +0200

> I got tired of fixing this in Renesas drivers manually, so I took the big
> hammer. Remove this cumbersome code pattern which got copy-pasted too much
> already:
> 
> -	struct platform_device *pdev = to_platform_device(dev);
> -	struct ep93xx_keypad *keypad = platform_get_drvdata(pdev);
> +	struct ep93xx_keypad *keypad = dev_get_drvdata(dev);
> 
> A branch, tested by buildbot, can be found here:
> 
> git://git.kernel.org/pub/scm/linux/kernel/git/wsa/linux.git coccinelle/get_drvdata
> 
> I have been asked if it couldn't be done for dev_set_drvdata as well. I checked
> it and did not find one occasion where it could be simplified like this. Not
> much of a surprise because driver_data is usually set in probe() functions
> which access struct platform_device in many other ways.
> 
> I am open for other comments, suggestions, too, of course.
> 
> Here is the cocci-script I created:
 ...

Series applied to net-next, thanks.

^ permalink raw reply

* [PATCH 2/9] net: dsa: qca8k: simplify getting .driver_data
From: Wolfram Sang @ 2018-10-21 20:00 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-renesas-soc, Wolfram Sang, Andrew Lunn, Vivien Didelot,
	Florian Fainelli, David S. Miller, netdev
In-Reply-To: <20181021200021.1693-1-wsa+renesas@sang-engineering.com>

We should get 'driver_data' from 'struct device' directly. Going via
platform_device is an unneeded step back and forth.

Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>
---

Build tested only. buildbot is happy.

 drivers/net/dsa/qca8k.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/dsa/qca8k.c b/drivers/net/dsa/qca8k.c
index cdcde7f8e0b2..7e97e620bd44 100644
--- a/drivers/net/dsa/qca8k.c
+++ b/drivers/net/dsa/qca8k.c
@@ -955,8 +955,7 @@ qca8k_set_pm(struct qca8k_priv *priv, int enable)
 
 static int qca8k_suspend(struct device *dev)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct qca8k_priv *priv = platform_get_drvdata(pdev);
+	struct qca8k_priv *priv = dev_get_drvdata(dev);
 
 	qca8k_set_pm(priv, 0);
 
@@ -965,8 +964,7 @@ static int qca8k_suspend(struct device *dev)
 
 static int qca8k_resume(struct device *dev)
 {
-	struct platform_device *pdev = to_platform_device(dev);
-	struct qca8k_priv *priv = platform_get_drvdata(pdev);
+	struct qca8k_priv *priv = dev_get_drvdata(dev);
 
 	qca8k_set_pm(priv, 1);
 
-- 
2.19.0

^ permalink raw reply related

* Re: [RFC PATCH v2 00/10] udp: implement GRO support
From: Willem de Bruijn @ 2018-10-21 20:05 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Network Development, Willem de Bruijn, steffen.klassert
In-Reply-To: <cover.1539957909.git.pabeni@redhat.com>

On Fri, Oct 19, 2018 at 10:30 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> This series implements GRO support for UDP sockets, as the RX counterpart
> of commit bec1f6f69736 ("udp: generate gso with UDP_SEGMENT").
> The core functionality is implemented by the second patch, introducing a new
> sockopt to enable UDP_GRO, while patch 3 implements support for passing the
> segment size to the user space via a new cmsg.
> UDP GRO performs a socket lookup for each ingress packets and aggregate datagram
> directed to UDP GRO enabled sockets with constant l4 tuple.
>
> UDP GRO packets can land on non GRO-enabled sockets, e.g. due to iptables NAT
> rules, and that could potentially confuse existing applications.

Good catch.

> The solution adopted here is to de-segment the GRO packet before enqueuing
> as needed. Since we must cope with packet reinsertion after de-segmentation,
> the relevant code is factored-out in ipv4 and ipv6 specific helpers and exposed
> to UDP usage.
>
> While the current code can probably be improved, this safeguard ,implemented in
> the patches 4-7, allows future enachements to enable UDP GSO offload on more
> virtual devices eventually even on forwarded packets.
>
> The last 4 for patches implement some performance and functional self-tests,
> re-using the existing udpgso infrastructure. The problematic scenario described
> above is explicitly tested.

This looks awesome! Impressive testing, too.

A few comments in the individual patches, mostly minor.

^ permalink raw reply

* Re: [RFC PATCH v2 02/10] udp: implement GRO for plain UDP sockets.
From: Willem de Bruijn @ 2018-10-21 20:06 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Network Development, Willem de Bruijn, steffen.klassert
In-Reply-To: <3fa3822651e29b8484d598b10ae61b0efde6b14f.1539957909.git.pabeni@redhat.com>

On Fri, Oct 19, 2018 at 10:30 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> This is the RX counterpart of commit bec1f6f69736 ("udp: generate gso
> with UDP_SEGMENT"). When UDP_GRO is enabled, such socket is also
> eligible for GRO in the rx path: UDP segments directed to such socket
> are assembled into a larger GSO_UDP_L4 packet.
>
> The core UDP GRO support is enabled with setsockopt(UDP_GRO).
>
> Initial benchmark numbers:
>
> Before:
> udp rx:   1079 MB/s   769065 calls/s
>
> After:
> udp rx:   1466 MB/s    24877 calls/s
>
>
> This change introduces a side effect in respect to UDP tunnels:
> after a UDP tunnel creation, now the kernel performs a lookup per ingress
> UDP packet, while before such lookup happened only if the ingress packet
> carried a valid internal header csum.
>
> v1 -> v2:
>  - use a new option to enable UDP GRO
>  - use static keys to protect the UDP GRO socket lookup
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  include/linux/udp.h      |   3 +-
>  include/uapi/linux/udp.h |   1 +
>  net/ipv4/udp.c           |   7 +++
>  net/ipv4/udp_offload.c   | 109 +++++++++++++++++++++++++++++++--------
>  net/ipv6/udp_offload.c   |   6 +--
>  5 files changed, 98 insertions(+), 28 deletions(-)
>
> diff --git a/include/linux/udp.h b/include/linux/udp.h
> index a4dafff407fb..f613b329852e 100644
> --- a/include/linux/udp.h
> +++ b/include/linux/udp.h
> @@ -50,11 +50,12 @@ struct udp_sock {
>         __u8             encap_type;    /* Is this an Encapsulation socket? */
>         unsigned char    no_check6_tx:1,/* Send zero UDP6 checksums on TX? */
>                          no_check6_rx:1,/* Allow zero UDP6 checksums on RX? */
> -                        encap_enabled:1; /* This socket enabled encap
> +                        encap_enabled:1, /* This socket enabled encap
>                                            * processing; UDP tunnels and
>                                            * different encapsulation layer set
>                                            * this
>                                            */
> +                        gro_enabled:1; /* Can accept GRO packets */
>
>         /*
>          * Following member retains the information to create a UDP header
>          * when the socket is uncorked.
> diff --git a/include/uapi/linux/udp.h b/include/uapi/linux/udp.h
> index 09502de447f5..30baccb6c9c4 100644
> --- a/include/uapi/linux/udp.h
> +++ b/include/uapi/linux/udp.h
> @@ -33,6 +33,7 @@ struct udphdr {
>  #define UDP_NO_CHECK6_TX 101   /* Disable sending checksum for UDP6X */
>  #define UDP_NO_CHECK6_RX 102   /* Disable accpeting checksum for UDP6 */
>  #define UDP_SEGMENT    103     /* Set GSO segmentation size */
> +#define UDP_GRO                104     /* This socket can receive UDP GRO packets */
>
>  /* UDP encapsulation types */
>  #define UDP_ENCAP_ESPINUDP_NON_IKE     1 /* draft-ietf-ipsec-nat-t-ike-00/01 */
> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 9fcb5374e166..3c277378814f 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -115,6 +115,7 @@
>  #include "udp_impl.h"
>  #include <net/sock_reuseport.h>
>  #include <net/addrconf.h>
> +#include <net/udp_tunnel.h>
>
>  struct udp_table udp_table __read_mostly;
>  EXPORT_SYMBOL(udp_table);
> @@ -2459,6 +2460,12 @@ int udp_lib_setsockopt(struct sock *sk, int level, int optname,
>                 up->gso_size = val;
>                 break;
>
> +       case UDP_GRO:
> +               if (valbool)
> +                       udp_tunnel_encap_enable(sk->sk_socket);
> +               up->gro_enabled = valbool;

The socket lock is not held here, so multiple updates to
up->gro_enabled and the up->encap_enabled and the static branch can
race. Syzkaller is adept at generating those.

> +#define UDO_GRO_CNT_MAX 64
> +static struct sk_buff *udp_gro_receive_segment(struct list_head *head,
> +                                              struct sk_buff *skb)
> +{
> +       struct udphdr *uh = udp_hdr(skb);
> +       struct sk_buff *pp = NULL;
> +       struct udphdr *uh2;
> +       struct sk_buff *p;
> +
> +       /* requires non zero csum, for simmetry with GSO */

symmetry

^ permalink raw reply

* Re: [RFC PATCH v2 03/10] udp: add support for UDP_GRO cmsg
From: Willem de Bruijn @ 2018-10-21 20:07 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Network Development, Willem de Bruijn, steffen.klassert
In-Reply-To: <a8112a7fbbfc39d5b59e5ace0d3c1409824f9261.1539957909.git.pabeni@redhat.com>

On Fri, Oct 19, 2018 at 10:30 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> When UDP GRO is enabled, the UDP_GRO cmsg will carry the ingress
> datagram size. User-space can use such info to compute the original
> packets layout.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> CHECK: should we use a separate setsockopt to explicitly enable
> gso_size cmsg reception? So that user space can enable UDP_GRO and
> fetch cmsg without forcefully receiving GRO related info.

A user can avoid the message by not passing control data. Though in
most practical cases it seems unsafe to do so, anyway, as the path MTU
can be lower than the expected device MTU.

> diff --git a/net/ipv4/udp.c b/net/ipv4/udp.c
> index 3c277378814f..2331ac9de954 100644
> --- a/net/ipv4/udp.c
> +++ b/net/ipv4/udp.c
> @@ -1714,6 +1714,10 @@ int udp_recvmsg(struct sock *sk, struct msghdr *msg, size_t len, int noblock,
>                 memset(sin->sin_zero, 0, sizeof(sin->sin_zero));
>                 *addr_len = sizeof(*sin);
>         }
> +
> +       if (udp_sk(sk)->gro_enabled)
> +               udp_cmsg_recv(msg, sk, skb);
> +

Perhaps we can avoid adding a branch by setting a bit in
inet->cmsg_flags for gso_size to let the below branch handle the cmsg
processing.

I'd still set that as part of the UDP_GRO setsockopt. Though if you
insist it could be a value 2 instead of 1, effectively allowing the
above mentioned opt-out.

>         if (inet->cmsg_flags)
>                 ip_cmsg_recv_offset(msg, sk, skb, sizeof(struct udphdr), off);
>

^ permalink raw reply

* Re: [RFC PATCH v2 06/10] udp: cope with UDP GRO packet misdirection
From: Willem de Bruijn @ 2018-10-21 20:08 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Network Development, Willem de Bruijn, steffen.klassert
In-Reply-To: <63e4ceb238db122d3d831f0809285243701b2284.1539957909.git.pabeni@redhat.com>

On Fri, Oct 19, 2018 at 10:31 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> In some scenarios, the GRO engine can assemble an UDP GRO packet
> that ultimately lands on a non GRO-enabled socket.
> This patch tries to address the issue explicitly checking for the UDP
> socket features before enqueuing the packet, and eventually segmenting
> the unexpected GRO packet, as needed.
>
> We must also cope with re-insertion requests: after segmentation the
> UDP code calls the helper introduced by the previous patches, as needed.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---

> +static inline struct sk_buff *udp_rcv_segment(struct sock *sk,
> +                                             struct sk_buff *skb)
> +{
> +       struct sk_buff *segs;
> +
> +       /* the GSO CB lays after the UDP one, no need to save and restore any
> +        * CB fragment, just initialize it
> +        */
> +       segs = __skb_gso_segment(skb, NETIF_F_SG, false);
> +       if (unlikely(IS_ERR(segs)))
> +               kfree_skb(skb);
> +       else if (segs)
> +               consume_skb(skb);
> +       return segs;
> +}
> +
> +
> +void ip_protocol_deliver_rcu(struct net *net, struct sk_buff *skb, int proto);
> +
> +static int udp_queue_rcv_skb(struct sock *sk, struct sk_buff *skb)
> +{
> +       struct sk_buff *next, *segs;
> +       int ret;
> +
> +       if (likely(!udp_unexpected_gso(sk, skb)))
> +               return udp_queue_rcv_one_skb(sk, skb);
> +
> +       BUILD_BUG_ON(sizeof(struct udp_skb_cb) > SKB_SGO_CB_OFFSET);
> +       __skb_push(skb, -skb_mac_offset(skb));
> +       segs = udp_rcv_segment(sk, skb);
> +       for (skb = segs; skb; skb = next) {

need to check IS_ERR(segs) again?

^ permalink raw reply

* Re: [RFC PATCH v2 07/10] selftests: add GRO support to udp bench rx program
From: Willem de Bruijn @ 2018-10-21 20:08 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Network Development, Willem de Bruijn, steffen.klassert
In-Reply-To: <883aae234bed25efdef98907507aa73b26054a38.1539957909.git.pabeni@redhat.com>

On Fri, Oct 19, 2018 at 10:31 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> And fix a couple of buglets (port option processing,
> clean termination on SIGINT). This is preparatory work
> for GRO tests.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  tools/testing/selftests/net/udpgso_bench_rx.c | 37 +++++++++++++++----
>  1 file changed, 30 insertions(+), 7 deletions(-)
>
> diff --git a/tools/testing/selftests/net/udpgso_bench_rx.c b/tools/testing/selftests/net/udpgso_bench_rx.c

> @@ -167,10 +177,10 @@ static void do_verify_udp(const char *data, int len)
>  /* Flush all outstanding datagrams. Verify first few bytes of each. */
>  static void do_flush_udp(int fd)
>  {
> -       static char rbuf[ETH_DATA_LEN];
> +       static char rbuf[65535];

we can use ETH_MAX_MTU.

^ permalink raw reply

* Re: [RFC PATCH v2 08/10] selftests: conditionally enable XDP support in udpgso_bench_rx
From: Willem de Bruijn @ 2018-10-21 20:09 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Network Development, Willem de Bruijn, steffen.klassert
In-Reply-To: <76b1b20399a7d5c9d4249f97383bc711e9416f1e.1539957909.git.pabeni@redhat.com>

On Fri, Oct 19, 2018 at 10:31 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> XDP support will be used by a later patch to test the GRO path
> in a net namespace, leveraging the veth XDP implementation.
> To avoid breaking existing setup, XDP support is conditionally
> enabled and build only if llc is locally available.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
> diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
> index 256d82d5fa87..176459b7c4d6 100644
> --- a/tools/testing/selftests/net/Makefile
> +++ b/tools/testing/selftests/net/Makefile
> @@ -16,8 +16,77 @@ TEST_GEN_PROGS = reuseport_bpf reuseport_bpf_cpu reuseport_bpf_numa
>  TEST_GEN_PROGS += reuseport_dualstack reuseaddr_conflict tls
>
>  KSFT_KHDR_INSTALL := 1
> +
> +# Allows pointing LLC/CLANG to a LLVM backend with bpf support, redefine on cmdline:
> +#  make samples/bpf/ LLC=~/git/llvm/build/bin/llc CLANG=~/git/llvm/build/bin/clang
> +LLC ?= llc
> +CLANG ?= clang
> +LLVM_OBJCOPY ?= llvm-objcopy
> +BTF_PAHOLE ?= pahole
> +HAS_LLC := $(shell which $(LLC) 2>/dev/null)
> +
> +# conditional enable testes requiring llc
> +ifneq (, $(HAS_LLC))
> +TEST_GEN_FILES += xdp_dummy.o
> +endif
> +
>  include ../lib.mk
>
> +ifneq (, $(HAS_LLC))
> +
> +# Detect that we're cross compiling and use the cross compiler
> +ifdef CROSS_COMPILE
> +CLANG_ARCH_ARGS = -target $(ARCH)
> +endif
> +
> +PROBE := $(shell $(LLC) -march=bpf -mcpu=probe -filetype=null /dev/null 2>&1)
> +
> +# Let newer LLVM versions transparently probe the kernel for availability
> +# of full BPF instruction set.
> +ifeq ($(PROBE),)
> +  CPU ?= probe
> +else
> +  CPU ?= generic
> +endif
> +
> +SRC_PATH := $(abspath ../../../..)
> +LIB_PATH := $(SRC_PATH)/tools/lib
> +XDP_CFLAGS := -D SUPPORT_XDP=1 -I$(LIB_PATH)
> +LIBBPF = $(LIB_PATH)/bpf/libbpf.a
> +BTF_LLC_PROBE := $(shell $(LLC) -march=bpf -mattr=help 2>&1 | grep dwarfris)
> +BTF_PAHOLE_PROBE := $(shell $(BTF_PAHOLE) --help 2>&1 | grep BTF)
> +BTF_OBJCOPY_PROBE := $(shell $(LLVM_OBJCOPY) --help 2>&1 | grep -i 'usage.*llvm')
> +CLANG_SYS_INCLUDES := $(shell $(CLANG) -v -E - </dev/null 2>&1 \
> +        | sed -n '/<...> search starts here:/,/End of search list./{ s| \(/.*\)|-idirafter \1|p }')
> +CLANG_FLAGS = -I. -I$(SRC_PATH)/include -I../bpf/ \
> +             $(CLANG_SYS_INCLUDES) -Wno-compare-distinct-pointer-types
> +
> +ifneq ($(and $(BTF_LLC_PROBE),$(BTF_PAHOLE_PROBE),$(BTF_OBJCOPY_PROBE)),)
> +       CLANG_CFLAGS += -g
> +       LLC_FLAGS += -mattr=dwarfris
> +       DWARF2BTF = y
> +endif
> +
> +$(LIBBPF): FORCE
> +# Fix up variables inherited from Kbuild that tools/ build system won't like
> +       $(MAKE) -C $(dir $@) RM='rm -rf' LDFLAGS= srctree=$(SRC_PATH) O= $(nodir $@)
> +

This is a lot of XDP specific code. Not for this patchset, per se, but
would be nice if we can reuse the logic in selftests/bpf for all this.

> --- a/tools/testing/selftests/net/udpgso_bench_rx.c
> @@ -227,6 +234,13 @@ static void parse_opts(int argc, char **argv)
>                         cfg_verify = true;
>                         cfg_read_all = true;
>                         break;
> +#ifdef SUPPORT_XDP
> +               case 'x':
> +                       cfg_xdp_iface = if_nametoindex(optarg);
> +                       if (!cfg_xdp_iface)
> +                               error(1, errno, "unknown interface %s", optarg);
> +                       break;
> +#endif

nit: needs to be added to getopt string in this patch.

^ permalink raw reply

* Re: [RFC PATCH v2 10/10] selftests: add functionals test for UDP GRO
From: Willem de Bruijn @ 2018-10-21 20:09 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: Network Development, Willem de Bruijn, steffen.klassert
In-Reply-To: <846157138dd61db611c5df7942c0e9f76bbc5a0e.1539957909.git.pabeni@redhat.com>

On Fri, Oct 19, 2018 at 10:31 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> Extends the existing udp programs to allow checking for proper
> GRO aggregation/GSO size, and run the tests via a shell script, using
> a veth pair with XDP program attached to trigger the GRO code path.
>
> Signed-off-by: Paolo Abeni <pabeni@redhat.com>
> ---
>  tools/testing/selftests/net/Makefile          |   2 +-
>  tools/testing/selftests/net/udpgro.sh         | 144 ++++++++++++++++++
>  tools/testing/selftests/net/udpgro_bench.sh   |   8 +-
>  tools/testing/selftests/net/udpgso_bench.sh   |   2 +-
>  tools/testing/selftests/net/udpgso_bench_rx.c | 125 +++++++++++++--
>  tools/testing/selftests/net/udpgso_bench_tx.c |  22 ++-
>  6 files changed, 281 insertions(+), 22 deletions(-)
>  create mode 100755 tools/testing/selftests/net/udpgro.sh
>

> diff --git a/tools/testing/selftests/net/udpgro.sh b/tools/testing/selftests/net/udpgro.sh

> +       run_test "no GRO chk cmsg" "${ipv4_args} -M 10 -s 1400" "-4 -n 10 -l 1400 -S -1"
> +       run_test "no GRO chk cmsg" "${ipv6_args} -M 10 -s 1400" "-n 10 -l 1400 -S -1"

why expected segment size -1 in these two?

> diff --git a/tools/testing/selftests/net/udpgso_bench_tx.c b/tools/testing/selftests/net/udpgso_bench_tx.c
>  static void usage(const char *filepath)
>  {
> -       error(1, 0, "Usage: %s [-46cmStuz] [-C cpu] [-D dst ip] [-l secs] [-p port] [-s sendsize]",
> +       error(1, 0, "Usage: %s [-46cmtuz] [-C cpu] [-D dst ip] [-l secs] [-m messagenr] [-p port] [-s sendsize] [-S gsosize]",
>                     filepath);

missing -M

^ permalink raw reply

* Re: [PATCH bpf-next] selftests/bpf: enable (uncomment) all tests in test_libbpf.sh
From: Jesper Dangaard Brouer @ 2018-10-21 21:04 UTC (permalink / raw)
  To: Quentin Monnet
  Cc: Alexei Starovoitov, Daniel Borkmann, netdev, oss-drivers, brouer
In-Reply-To: <1540136228-11360-1-git-send-email-quentin.monnet@netronome.com>

On Sun, 21 Oct 2018 16:37:08 +0100
Quentin Monnet <quentin.monnet@netronome.com> wrote:

> 2018-10-21 11:57 UTC+0200 ~ Jesper Dangaard Brouer <brouer@redhat.com>
> > On Sat, 20 Oct 2018 23:00:24 +0100
> > Quentin Monnet <quentin.monnet@netronome.com> wrote:
> >   
> 
> [...]
> 
> >> --- a/tools/testing/selftests/bpf/test_libbpf.sh
> >> +++ b/tools/testing/selftests/bpf/test_libbpf.sh
> >> @@ -33,17 +33,11 @@ trap exit_handler 0 2 3 6 9
> >>   
> >>   libbpf_open_file test_l4lb.o
> >>   
> >> -# TODO: fix libbpf to load noinline functions
> >> -# [warning] libbpf: incorrect bpf_call opcode
> >> -#libbpf_open_file test_l4lb_noinline.o
> >> +libbpf_open_file test_l4lb_noinline.o
> >>   
> >> -# TODO: fix test_xdp_meta.c to load with libbpf
> >> -# [warning] libbpf: test_xdp_meta.o doesn't provide kernel version
> >> -#libbpf_open_file test_xdp_meta.o
> >> +libbpf_open_file test_xdp_meta.o
> >>   
> >> -# TODO: fix libbpf to handle .eh_frame
> >> -# [warning] libbpf: relocation failed: no section(10)
> >> -#libbpf_open_file ../../../../samples/bpf/tracex3_kern.o
> >> +libbpf_open_file ../../../../samples/bpf/tracex3_kern.o  
> > 
> > I don't like the ../../../../samples/bpf/ reference (even-through I
> > added this TODO), as the kselftests AFAIK support installing the
> > selftests and then this tests will fail.
> > Maybe we can find another example kern.o file?
> > (which isn't compiled with -target bpf)  
> 
> Hi Jesper, yeah maybe making the test rely on something from samples/bpf
> instead of just the selftests/bpf directory is not a good idea. But
> there is no program compiled without the "-target-bpf" in that
> directory. What we could do is explicitly compile one without the flag
> in the Makefile, as in the patch below, but I am not sure that doing so
> is acceptable?

I think it makes sense to have a test program compiled without the
"-target-bpf", as that will happen for users.  And I guess we can add
some more specific test that are related to "-target-bpf".

> Or should tests for libbpf have a directory of their own,
> with another Makefile?

Hmm, I'm not sure about that idea.

I did plan by naming the test "libbpf_open_file", what we add more
libbpf_ prefixed tests to the test_libbpf.sh script, which should
cover more aspects of the _base_ libbpf functionality.

> Another question regarding the test with test_xdp_meta.o: does the fix I
> suggested (setting a version in the .C file) makes sense, or did you
> leave this test for testing someday that libbpf would be able to open
> even programs that do not set a version (in which case this is still not
> the case if program type is not provided, and in fact my fix ruins
> everything? :s).

Well, yes.  I was hinting if we should relax the version requirement
for e.g. XDP BPF progs.

> ---
>  tools/testing/selftests/bpf/Makefile        | 10 ++++++++++
>  tools/testing/selftests/bpf/test_libbpf.sh  | 14 +++++---------
>  tools/testing/selftests/bpf/test_xdp_meta.c |  2 ++
>  3 files changed, 17 insertions(+), 9 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
> index e39dfb4e7970..ecd79b7fb107 100644
> --- a/tools/testing/selftests/bpf/Makefile
> +++ b/tools/testing/selftests/bpf/Makefile
> @@ -135,6 +135,16 @@ endif
>  endif
>  endif
>  
> +# Have one program compiled without "-target bpf" to test whether libbpf loads
> +# it successfully
> +$(OUTPUT)/test_xdp.o: test_xdp.c
> +	$(CLANG) $(CLANG_FLAGS) \
> +		-O2 -emit-llvm -c $< -o - | \
> +	$(LLC) -march=bpf -mcpu=$(CPU) $(LLC_FLAGS) -filetype=obj -o $@
> +ifeq ($(DWARF2BTF),y)
> +	$(BTF_PAHOLE) -J $@
> +endif
> +
>  $(OUTPUT)/%.o: %.c
>  	$(CLANG) $(CLANG_FLAGS) \
>  		 -O2 -target bpf -emit-llvm -c $< -o - |      \
> diff --git a/tools/testing/selftests/bpf/test_libbpf.sh b/tools/testing/selftests/bpf/test_libbpf.sh
> index 156d89f1edcc..b45962a44243 100755
> --- a/tools/testing/selftests/bpf/test_libbpf.sh
> +++ b/tools/testing/selftests/bpf/test_libbpf.sh
> @@ -33,17 +33,13 @@ trap exit_handler 0 2 3 6 9
>  
>  libbpf_open_file test_l4lb.o
>  
> -# TODO: fix libbpf to load noinline functions
> -# [warning] libbpf: incorrect bpf_call opcode
> -#libbpf_open_file test_l4lb_noinline.o
> +# Load a program with BPF-to-BPF calls
> +libbpf_open_file test_l4lb_noinline.o
>  
> -# TODO: fix test_xdp_meta.c to load with libbpf
> -# [warning] libbpf: test_xdp_meta.o doesn't provide kernel version
> -#libbpf_open_file test_xdp_meta.o
> +libbpf_open_file test_xdp_meta.o
>  
> -# TODO: fix libbpf to handle .eh_frame
> -# [warning] libbpf: relocation failed: no section(10)
> -#libbpf_open_file ../../../../samples/bpf/tracex3_kern.o
> +# Load a program compiled without the "-target bpf" flag
> +libbpf_open_file test_xdp.o
>  
>  # Success
>  exit 0
> diff --git a/tools/testing/selftests/bpf/test_xdp_meta.c b/tools/testing/selftests/bpf/test_xdp_meta.c
> index 8d0182650653..2f42de66e2bb 100644
> --- a/tools/testing/selftests/bpf/test_xdp_meta.c
> +++ b/tools/testing/selftests/bpf/test_xdp_meta.c
> @@ -8,6 +8,8 @@
>  #define round_up(x, y) ((((x) - 1) | __round_mask(x, y)) + 1)
>  #define ctx_ptr(ctx, mem) (void *)(unsigned long)ctx->mem
>  
> +int _version SEC("version") = 1;
> +
>  SEC("t")
>  int ing_cls(struct __sk_buff *ctx)
>  {



-- 
Best regards,
  Jesper Dangaard Brouer
  MSc.CS, Principal Kernel Engineer at Red Hat
  LinkedIn: http://www.linkedin.com/in/brouer

^ permalink raw reply

* Re: CRC errors between mvneta and macb
From: Richard Genoud @ 2018-10-22  6:51 UTC (permalink / raw)
  To: linux-kernel, Richard Genoud
  Cc: Thomas Petazzoni, Antoine Tenart, Gregory CLEMENT,
	Yelena Krivosheev, Maxime Chevallier, Nicolas Ferre, netdev
In-Reply-To: <20181019154439.GC24045@1wt.eu>

Le 19/10/2018 à 17:44, Willy Tarreau a écrit :
> On Fri, Oct 19, 2018 at 05:15:03PM +0200, Richard Genoud wrote:
>> When there's a CRC error, the TXCLK has its polarity inverted...
>> That's a clue !
>>
>> But this TXCLK (25MHz) is not used on the g35-ek.
>> Only the REFCLK/XT2 (50MHz) is used to synchronise the PHY and the macb.
>> So I guess that the TXCLK has a role to play to generate TX+/TX-
> 
> Well, just a stupid idea, maybe when this signal is inverted, the TX+/TX-
> are desynchronized by half a clock and are not always properly interpreted
> on the other side ?
> 
> Willy
> 

I must admit that I'm not familiar with the PHY internals, I'll have to
dig into that.

Richard.

^ permalink raw reply

* RE: SMSC9730 Autosuspend/Resume Questions
From: Nisar.Sayed @ 2018-10-22  7:03 UTC (permalink / raw)
  To: frieder.schrempf, steve.glendinning, UNGLinuxDriver, davem,
	netdev
  Cc: oneukum, linux-usb, linux-kernel
In-Reply-To: <69e14429-a4d9-dd5c-5e70-c22086490cc0@exceet.de>

Hi Frieder,

> Hi,
> 
> I recently tested a board with SMSC9730 connected via USB HSIC to an
> i.MX6S SOC. I used these patches on top of v4.14-rc8 for the USB HSIC
> support: [1].
> 
> When I turned on autosuspend, the smsc95xx stopped in the middle of the
> suspending routine and /sys/bus/usb/devices/usb3/3-
> 1/power/runtime_status reported "suspending"
> forever.
> 
> With some debug logs I found out, that the last line of code that was
> executed is [2], after that I didn't get any further messages.
> 
> Then I applied the following diff and the suspend/resume started to work
> reliably:
> 
> @@ -1382,10 +1385,11 @@ static int smsc95xx_link_ok_nopm(struct usbnet
> *dev)
>          ret = smsc95xx_mdio_read_nopm(dev->net, mii->phy_id, MII_BMSR);
>          if (ret < 0)
>                  return ret;
> -
> +       /*
>          ret = smsc95xx_mdio_read_nopm(dev->net, mii->phy_id, MII_BMSR);
>          if (ret < 0)
>                  return ret;
> +       */
> 
>          return !!(ret & BMSR_LSTATUS);
>   }
> 
> So it seems like the dummy read, that is commented with "first, a dummy
> read, needed to latch some MII phys" causes problems in my setup. Do you
> have an idea what could be the reason? Is this a proper fix?
>
 
Link status bit is latch low register and needs to be read twice. You may get wrong Link status with this fix.

- Nisar

^ permalink raw reply

* Re: SMSC9730 Autosuspend/Resume Questions
From: Frieder Schrempf @ 2018-10-22  7:14 UTC (permalink / raw)
  To: Nisar.Sayed, steve.glendinning, UNGLinuxDriver, davem, netdev
  Cc: oneukum, linux-usb, linux-kernel
In-Reply-To: <CE371C1263339941885964188A0225FA44571A@CHN-SV-EXMX03.mchp-main.com>

Hi Nisar,

On 22.10.18 09:03, Nisar.Sayed@microchip.com wrote:
> Hi Frieder,
> 
>> Hi,
>>
>> I recently tested a board with SMSC9730 connected via USB HSIC to an
>> i.MX6S SOC. I used these patches on top of v4.14-rc8 for the USB HSIC
>> support: [1].
>>
>> When I turned on autosuspend, the smsc95xx stopped in the middle of the
>> suspending routine and /sys/bus/usb/devices/usb3/3-
>> 1/power/runtime_status reported "suspending"
>> forever.
>>
>> With some debug logs I found out, that the last line of code that was
>> executed is [2], after that I didn't get any further messages.
>>
>> Then I applied the following diff and the suspend/resume started to work
>> reliably:
>>
>> @@ -1382,10 +1385,11 @@ static int smsc95xx_link_ok_nopm(struct usbnet
>> *dev)
>>           ret = smsc95xx_mdio_read_nopm(dev->net, mii->phy_id, MII_BMSR);
>>           if (ret < 0)
>>                   return ret;
>> -
>> +       /*
>>           ret = smsc95xx_mdio_read_nopm(dev->net, mii->phy_id, MII_BMSR);
>>           if (ret < 0)
>>                   return ret;
>> +       */
>>
>>           return !!(ret & BMSR_LSTATUS);
>>    }
>>
>> So it seems like the dummy read, that is commented with "first, a dummy
>> read, needed to latch some MII phys" causes problems in my setup. Do you
>> have an idea what could be the reason? Is this a proper fix?
>>
>   
> Link status bit is latch low register and needs to be read twice. You may get wrong Link status with this fix.

Ok, thanks for the information. I guess I'll have to do some more 
debugging to find out why the second read fails in my case.

Frieder

^ permalink raw reply

* Re: [PATCH net] r8169: fix NAPI handling under high load
From: Jonathan Woithe @ 2018-10-21 23:07 UTC (permalink / raw)
  To: Heiner Kallweit
  Cc: Francois Romieu, Holger Hoffstätte, David Miller,
	Realtek linux nic maintainers, netdev@vger.kernel.org
In-Reply-To: <20181018061554.GF2487@marvin.atrad.com.au>

On Fri, 19 Oct 2018 17:59:21 +1030, Jonathan Woithe wrote:
> On 10/18/18 08:15, Jonathan Woithe wrote:
> > On Thu, Oct 18, 2018 at 08:03:32AM +0200, Heiner Kallweit wrote:
> > > Proposed fix is here:
> > > https://patchwork.ozlabs.org/patch/985014/
> > > Would be good if you could test it. Thanks!
> > 
> > I should be able to do so tomorrow.  Which kernel would you like me to apply
> > the patch to?
>
> It turns out I couldn't compile 4.18.15 conveniently ...
> I could compile 4.14 though - I had previously confirmed that the problem
> was still seen with that kernel.  I therefore applied the fix to 4.14 (which
> was trivial) and started a test with that.  The system has been running
> without exhibiting the effects of the bug for over five hours.  With an
> unpatched 4.14, symptoms were typically seen within 30 minutes.
> 
> I will leave the system to operate over the weekend to be sure, but at this
> stage it seems likely that the patch will resolve this long-standing
> difficulty that we've experienced.  I will report back on Monday.

Our test system has now been running for just under 3 days (69 hours) and
there has not been any incidence of the problem with this patch applied to
the 4.14 r8169 driver.  Without the patch, multiple problems are logged
within 30 minutes.

Given this, I would conclude that this patch fixes the problem for us. 
Although belated (since the patch has already been accepted):

  Tested-by: Jonathan Woithe <jwoithe@atrad.com.au>

Thank you very much for your work which lead to the patch!  It means we can
finally provide an upgrade path for systems in the field which are equipped
with the r8169 hardware.

For reference, the problem being tested on our systems is the one discussed
in the "r8169 regression: UDP packets dropped intermittantly" thread on this
mailing list.

Regards
  jonathan

^ permalink raw reply

* RE: [PATCH] bonding: avoid repeated display of same link status change
From: Manish Kumar Singh @ 2018-10-22  7:29 UTC (permalink / raw)
  To: Eric Dumazet, netdev
  Cc: Jay Vosburgh, Veaceslav Falico, Andy Gospodarek, David S. Miller,
	linux-kernel
In-Reply-To: <dd819430-a67e-4507-b03f-d9d40528e74d@default>



> -----Original Message-----
> From: Manish Kumar Singh
> Sent: 24 सितम्बर 2018 12:36
> To: Eric Dumazet; netdev@vger.kernel.org
> Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> kernel@vger.kernel.org
> Subject: RE: [PATCH] bonding: avoid repeated display of same link status
> change
> 
> 
> 
> > -----Original Message-----
> > From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > Sent: 18 सितम्बर 2018 19:30
> > To: Manish Kumar Singh; Eric Dumazet; netdev@vger.kernel.org
> > Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller; linux-
> > kernel@vger.kernel.org
> > Subject: Re: [PATCH] bonding: avoid repeated display of same link status
> > change
> >
> >
> >
> > On 09/17/2018 10:05 PM, Manish Kumar Singh wrote:
> > >
> > >
> > >> -----Original Message-----
> > >> From: Eric Dumazet [mailto:eric.dumazet@gmail.com]
> > >> Sent: 17 सितम्बर 2018 20:08
> > >> To: Manish Kumar Singh; netdev@vger.kernel.org
> > >> Cc: Jay Vosburgh; Veaceslav Falico; Andy Gospodarek; David S. Miller;
> > linux-
> > >> kernel@vger.kernel.org
> > >> Subject: Re: [PATCH] bonding: avoid repeated display of same link status
> > >> change
> > >>
> > >>
> > >>
> > >> On 09/17/2018 12:20 AM, mk.singh@oracle.com wrote:
> > >>> From: Manish Kumar Singh <mk.singh@oracle.com>
> > >>>
> > >>> When link status change needs to be committed and rtnl lock couldn't
> be
> > >>> taken, avoid redisplay of same link status change message.
> > >>>
> > >>> Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
> > >>> ---
> > >>>  drivers/net/bonding/bond_main.c | 6 ++++--
> > >>>  include/net/bonding.h           | 1 +
> > >>>  2 files changed, 5 insertions(+), 2 deletions(-)
> > >>>
> > >>> diff --git a/drivers/net/bonding/bond_main.c
> > >> b/drivers/net/bonding/bond_main.c
> > >>> index 217b790d22ed..fb4e3aff1677 100644
> > >>> --- a/drivers/net/bonding/bond_main.c
> > >>> +++ b/drivers/net/bonding/bond_main.c
> > >>> @@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct
> bonding
> > >> *bond)
> > >>>  			bond_propose_link_state(slave, BOND_LINK_FAIL);
> > >>>  			commit++;
> > >>>  			slave->delay = bond->params.downdelay;
> > >>> -			if (slave->delay) {
> > >>> +			if (slave->delay && !bond->rtnl_needed) {
> > >>>  				netdev_info(bond->dev, "link status down
> > for
> > >> %sinterface %s, disabling it in %d ms\n",
> > >>>  					    (BOND_MODE(bond) ==
> > >>>  					     BOND_MODE_ACTIVEBACKUP) ?
> > >>> @@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct
> bonding
> > >> *bond)
> > >>>  			commit++;
> > >>>  			slave->delay = bond->params.updelay;
> > >>>
> > >>> -			if (slave->delay) {
> > >>> +			if (slave->delay && !bond->rtnl_needed) {
> > >>>  				netdev_info(bond->dev, "link status up for
> > >> interface %s, enabling it in %d ms\n",
> > >>>  					    slave->dev->name,
> > >>>  					    ignore_updelay ? 0 :
> > >>> @@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct
> > >> work_struct *work)
> > >>>  		if (!rtnl_trylock()) {
> > >>>  			delay = 1;
> > >>>  			should_notify_peers = false;
> > >>> +			bond->rtnl_needed = true;
> > >>
> > >> How can you set a shared variable with no synchronization ?
> > > Thanks Eric for reviewing the patch. rtnl_needed is not a shared variable,
> it
> > is part of bonding structure, that is one per bonding driver instance. There
> > can't be two parallel instances of bond_miimon_inspect for a
> single  bonding
> > driver instance at any given point of time. and only bond_miimon_inspect
> > updates it. That’s why I think there is no need of any synchronization here.
> > >
> > >
> >
> > If rtnl_trylock() can not grab RTNL,
> > there is no way the current thread can set the  variable without a race, if
> the
> > word including rtnl_needed is shared by other fields in the structure.
> >
> > Your patch adds a subtle possibility of future bugs, even if it runs fine
> today.
> >
> > Do not pave the way for future bugs, make your code robust, please.
> 
> Thankyou Eric, we are making the changes and will repost the patch after
> testing it.
> -Manish

Please review the updated patch, I have changed the rtnl_needed to an atomic variable, to avoid any race condition:

From: Manish Kumar Singh <mk.singh@oracle.com>

When link status change needs to be committed and rtnl lock couldn't be taken, avoid redisplay of same link status change message.

Signed-off-by: Manish Kumar Singh <mk.singh@oracle.com>
---
 drivers/net/bonding/bond_main.c | 6 ++++--
 include/net/bonding.h           | 1 +
 2 files changed, 5 insertions(+), 2 deletions(-)

diff --git a/drivers/net/bonding/bond_main.c b/drivers/net/bonding/bond_main.c index 217b790d22ed..fac5350bf19c 100644
--- a/drivers/net/bonding/bond_main.c
+++ b/drivers/net/bonding/bond_main.c
@@ -2087,7 +2087,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			bond_propose_link_state(slave, BOND_LINK_FAIL);
 			commit++;
 			slave->delay = bond->params.downdelay;
-			if (slave->delay) {
+			if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status down for %sinterface %s, disabling it in %d ms\n",
 					    (BOND_MODE(bond) ==
 					     BOND_MODE_ACTIVEBACKUP) ?
@@ -2127,7 +2127,7 @@ static int bond_miimon_inspect(struct bonding *bond)
 			commit++;
 			slave->delay = bond->params.updelay;
 
-			if (slave->delay) {
+			if (slave->delay && !atomic_read(&bond->rtnl_needed)) {
 				netdev_info(bond->dev, "link status up for interface %s, enabling it in %d ms\n",
 					    slave->dev->name,
 					    ignore_updelay ? 0 :
@@ -2301,9 +2301,11 @@ static void bond_mii_monitor(struct work_struct *work)
 		if (!rtnl_trylock()) {
 			delay = 1;
 			should_notify_peers = false;
+			atomic_set(&bond->rtnl_needed, 1);
 			goto re_arm;
 		}
 
+		atomic_set(&bond->rtnl_needed, 0);
 		bond_for_each_slave(bond, slave, iter) {
 			bond_commit_link_state(slave, BOND_SLAVE_NOTIFY_LATER);
 		}
diff --git a/include/net/bonding.h b/include/net/bonding.h index 808f1d167349..ffc1219f7a07 100644
--- a/include/net/bonding.h
+++ b/include/net/bonding.h
@@ -234,6 +234,7 @@ struct bonding {
 	struct	 dentry *debug_dir;
 #endif /* CONFIG_DEBUG_FS */
 	struct rtnl_link_stats64 bond_stats;
+	atomic_t rtnl_needed;
 };
 
 #define bond_slave_get_rcu(dev) \
--
2.14.1



Thanks,
Manish

> >
> >
> >
> >
> >
> >
> >

^ permalink raw reply

* Re: [PATCH 1/1] net: dsa: legacy: simplify getting .driver_data
From: Andrew Lunn @ 2018-10-21 23:13 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: linux-kernel, linux-renesas-soc, Vivien Didelot, Florian Fainelli,
	David S. Miller, netdev
In-Reply-To: <20181021200105.2414-2-wsa+renesas@sang-engineering.com>

On Sun, Oct 21, 2018 at 10:01:05PM +0200, Wolfram Sang wrote:
> We should get 'driver_data' from 'struct device' directly. Going via
> platform_device is an unneeded step back and forth.
> 
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH 9/9] net: phy: mdio-mux-bcm-iproc: simplify getting .driver_data
From: Andrew Lunn @ 2018-10-21 23:14 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: linux-kernel, linux-renesas-soc, Florian Fainelli,
	David S. Miller, Ray Jui, Scott Branden, Jon Mason,
	bcm-kernel-feedback-list, netdev, linux-arm-kernel
In-Reply-To: <20181021200021.1693-10-wsa+renesas@sang-engineering.com>

On Sun, Oct 21, 2018 at 10:00:20PM +0200, Wolfram Sang wrote:
> We should get 'driver_data' from 'struct device' directly. Going via
> platform_device is an unneeded step back and forth.
> 
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: [PATCH 2/9] net: dsa: qca8k: simplify getting .driver_data
From: Andrew Lunn @ 2018-10-21 23:15 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: linux-kernel, linux-renesas-soc, Vivien Didelot, Florian Fainelli,
	David S. Miller, netdev
In-Reply-To: <20181021200021.1693-3-wsa+renesas@sang-engineering.com>

On Sun, Oct 21, 2018 at 10:00:13PM +0200, Wolfram Sang wrote:
> We should get 'driver_data' from 'struct device' directly. Going via
> platform_device is an unneeded step back and forth.
> 
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

Reviewed-by: Andrew Lunn <andrew@lunn.ch>

    Andrew

^ permalink raw reply

* Re: r8169 regression: UDP packets dropped intermittantly
From: Jonathan Woithe @ 2018-10-22  0:01 UTC (permalink / raw)
  To: netdev; +Cc: linux-kernel
In-Reply-To: <20180115065658.GI16419@marvin.atrad.com.au>

On Mon, Jan 15, 2018 at 05:26:59PM +1030, Jonathan Woithe wrote:
> Is there any more information that can be provided (or tests done) to assist
> in tracking this problem down?  Based on the tests done in December it seems
> that the problem only affects specific RTL-8169 variants, with most being
> ok.  Is it a case that we simply need to accept that for the greater good
> commit da78dbff2e05630921c551dbbc70a4b7981a8fff has permanently broken
> Netgear GA311 [1] network cards with respect to these UDP packets and that
> nothing can be done?

For future reference, commit 6b839b6cf9eada30b086effb51e5d6076bafc761
("r8169: fix NAPI handling under high load") appears to have fixed the
regression documented by this thread.  Thanks to Heiner Kallweit for the
work which lead to this solution.

Regards
  jonathan

> [1] Or perhaps any using the RTL8110s variant.

^ permalink raw reply

* RE: [PATCH] net: ethernet:fec: Consistently use SPEED_ prefix
From: Andy Duan @ 2018-10-22  2:41 UTC (permalink / raw)
  To: Andrew Lunn, David Miller
  Cc: clabbe@baylibre.com, netdev, Florian Fainelli,
	hkallweit1@gmail.com
In-Reply-To: <1540068491-14760-1-git-send-email-andrew@lunn.ch>

From: netdev-owner@vger.kernel.org <netdev-owner@vger.kernel.org>
> All other calls to phy_set_max_speed() use the SPEED_ prefix. Make the
> FEC driver follow this common pattern. This makes no different to
> generated code since SPEED_1000 is 1000, and SPEED_100 is 100.
> 

Please also add more information that was introduced by commit: 58056c1e1b0e ("net: ethernet: Use phy_set_max_speed() to limit advertised speed ")

Andy
> Reported-by: Corentin Labbe <clabbe@baylibre.com>
> Signed-off-by: Andrew Lunn <andrew@lunn.ch>
> ---
>  drivers/net/ethernet/freescale/fec_main.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/freescale/fec_main.c
> b/drivers/net/ethernet/freescale/fec_main.c
> index 6db69ba30dcd..b067eaf8b792 100644
> --- a/drivers/net/ethernet/freescale/fec_main.c
> +++ b/drivers/net/ethernet/freescale/fec_main.c
> @@ -1948,7 +1948,7 @@ static int fec_enet_mii_probe(struct
> net_device *ndev)
> 
>  	/* mask with MAC supported features */
>  	if (fep->quirks & FEC_QUIRK_HAS_GBIT) {
> -		phy_set_max_speed(phy_dev, 1000);
> +		phy_set_max_speed(phy_dev, SPEED_1000);
>  		phy_remove_link_mode(phy_dev,
>  				     ETHTOOL_LINK_MODE_1000baseT_Half_BIT);
>  #if !defined(CONFIG_M5272)
> @@ -1956,7 +1956,7 @@ static int fec_enet_mii_probe(struct
> net_device *ndev)  #endif
>  	}
>  	else
> -		phy_set_max_speed(phy_dev, 100);
> +		phy_set_max_speed(phy_dev, SPEED_100);
> 
>  	fep->link = 0;
>  	fep->full_duplex = 0;
> --
> 2.19.0

^ permalink raw reply

* [PATCH 0/2] Mark expected switch fall-throughs and fix missing break
From: Gustavo A. R. Silva @ 2018-10-22 11:49 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jes Sorensen, Kalle Valo, linux-wireless, David S. Miller, netdev,
	Gustavo A. R. Silva

In preparation to enabling -Wimplicit-fallthrough, this patchset aims
to mark multiple switch cases where we are expecting to fall through.

Also, the second patch in this series fixes a missing break in switch.

Thanks

Gustavo A. R. Silva (2):
  rtl8xxxu: Mark expected switch fall-throughs
  rtl8xxxu: Fix missing break in switch

 drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 5 +++++
 1 file changed, 5 insertions(+)

-- 
2.7.4

^ permalink raw reply

* [PATCH 1/2] rtl8xxxu: Mark expected switch fall-throughs
From: Gustavo A. R. Silva @ 2018-10-22 11:50 UTC (permalink / raw)
  To: linux-kernel
  Cc: Jes Sorensen, Kalle Valo, linux-wireless, David S. Miller, netdev,
	Gustavo A. R. Silva
In-Reply-To: <cover.1540208577.git.gustavo@embeddedor.com>

In preparation to enabling -Wimplicit-fallthrough, mark switch cases
where we are expecting to fall through.

Addresses-Coverity-ID: 1357355 ("Missing break in switch")
Addresses-Coverity-ID: 1357378 ("Missing break in switch")
Signed-off-by: Gustavo A. R. Silva <gustavo@embeddedor.com>
---
 drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
index 56040b1..c6b0686 100644
--- a/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
+++ b/drivers/net/wireless/realtek/rtl8xxxu/rtl8xxxu_core.c
@@ -1153,6 +1153,7 @@ void rtl8xxxu_gen1_config_channel(struct ieee80211_hw *hw)
 	switch (hw->conf.chandef.width) {
 	case NL80211_CHAN_WIDTH_20_NOHT:
 		ht = false;
+		/* fall through */
 	case NL80211_CHAN_WIDTH_20:
 		opmode |= BW_OPMODE_20MHZ;
 		rtl8xxxu_write8(priv, REG_BW_OPMODE, opmode);
@@ -1280,6 +1281,7 @@ void rtl8xxxu_gen2_config_channel(struct ieee80211_hw *hw)
 	switch (hw->conf.chandef.width) {
 	case NL80211_CHAN_WIDTH_20_NOHT:
 		ht = false;
+		/* fall through */
 	case NL80211_CHAN_WIDTH_20:
 		rf_mode_bw |= WMAC_TRXPTCL_CTL_BW_20;
 		subchannel = 0;
@@ -1748,9 +1750,11 @@ static int rtl8xxxu_identify_chip(struct rtl8xxxu_priv *priv)
 		case 3:
 			priv->ep_tx_low_queue = 1;
 			priv->ep_tx_count++;
+			/* fall through */
 		case 2:
 			priv->ep_tx_normal_queue = 1;
 			priv->ep_tx_count++;
+			/* fall through */
 		case 1:
 			priv->ep_tx_high_queue = 1;
 			priv->ep_tx_count++;
-- 
2.7.4

^ permalink raw reply related

* Re: pull-request: bpf-next 2018-10-21
From: David Miller @ 2018-10-22  4:12 UTC (permalink / raw)
  To: daniel; +Cc: ast, netdev
In-Reply-To: <20181021192426.22776-1-daniel@iogearbox.net>

From: Daniel Borkmann <daniel@iogearbox.net>
Date: Sun, 21 Oct 2018 21:24:26 +0200

> The following pull-request contains BPF updates for your *net-next* tree.
> 
> The main changes are:
 ...
> Please consider pulling these changes from:
> 
>   git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git

Pulled, thanks Daniel.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox