Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [PATCH bpf-next v2 0/7] bpf/flow_dissector: support input flags
From: Petar Penkov @ 2019-07-25 17:05 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S . Miller, Alexei Starovoitov,
	Daniel Borkmann, Song Liu, Willem de Bruijn
In-Reply-To: <20190725153342.3571-1-sdf@google.com>

Thanks! For the series:

Acked-by: Petar Penkov <ppenkov@google.com>

On Thu, Jul 25, 2019 at 8:33 AM Stanislav Fomichev <sdf@google.com> wrote:
>
> C flow dissector supports input flags that tell it to customize parsing
> by either stopping early or trying to parse as deep as possible.
> BPF flow dissector always parses as deep as possible which is sub-optimal.
> Pass input flags to the BPF flow dissector as well so it can make the same
> decisions.
>
> Series outline:
> * remove unused FLOW_DISSECTOR_F_STOP_AT_L3 flag
> * export FLOW_DISSECTOR_F_XXX flags as uapi and pass them to BPF
>   flow dissector
> * add documentation for the export flags
> * support input flags in BPF_PROG_TEST_RUN via ctx_{in,out}
> * sync uapi to tools
> * support FLOW_DISSECTOR_F_PARSE_1ST_FRAG in selftest
> * support FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL in kernel and selftest
> * support FLOW_DISSECTOR_F_STOP_AT_ENCAP in selftest
>
> Pros:
> * makes BPF flow dissector faster by avoiding burning extra cycles
> * existing BPF progs continue to work by ignoring the flags and always
>   parsing as deep as possible
>
> Cons:
> * new UAPI which we need to support (OTOH, if we need to deprecate some
>   flags, we can just stop setting them upon calling BPF programs)
>
> Some numbers (with .repeat = 4000000 in test_flow_dissector):
>         test_flow_dissector:PASS:ipv4-frag 35 nsec
>         test_flow_dissector:PASS:ipv4-frag 35 nsec
>         test_flow_dissector:PASS:ipv4-no-frag 32 nsec
>         test_flow_dissector:PASS:ipv4-no-frag 32 nsec
>
>         test_flow_dissector:PASS:ipv6-frag 39 nsec
>         test_flow_dissector:PASS:ipv6-frag 39 nsec
>         test_flow_dissector:PASS:ipv6-no-frag 36 nsec
>         test_flow_dissector:PASS:ipv6-no-frag 36 nsec
>
>         test_flow_dissector:PASS:ipv6-flow-label 36 nsec
>         test_flow_dissector:PASS:ipv6-flow-label 36 nsec
>         test_flow_dissector:PASS:ipv6-no-flow-label 33 nsec
>         test_flow_dissector:PASS:ipv6-no-flow-label 33 nsec
>
>         test_flow_dissector:PASS:ipip-encap 38 nsec
>         test_flow_dissector:PASS:ipip-encap 38 nsec
>         test_flow_dissector:PASS:ipip-no-encap 32 nsec
>         test_flow_dissector:PASS:ipip-no-encap 32 nsec
>
> The improvement is around 10%, but it's in a tight cache-hot
> BPF_PROG_TEST_RUN loop.
>
> Cc: Song Liu <songliubraving@fb.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Petar Penkov <ppenkov@google.com>
>
> Stanislav Fomichev (7):
>   bpf/flow_dissector: pass input flags to BPF flow dissector program
>   bpf/flow_dissector: document flags
>   bpf/flow_dissector: support flags in BPF_PROG_TEST_RUN
>   tools/bpf: sync bpf_flow_keys flags
>   selftests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG
>   bpf/flow_dissector: support ipv6 flow_label and
>     FLOW_DISSECTOR_F_STOP_AT_FLOW_LABEL
>   selftests/bpf: support FLOW_DISSECTOR_F_STOP_AT_ENCAP
>
>  Documentation/bpf/prog_flow_dissector.rst     |  18 ++
>  include/linux/skbuff.h                        |   2 +-
>  include/net/flow_dissector.h                  |   4 -
>  include/uapi/linux/bpf.h                      |   6 +
>  net/bpf/test_run.c                            |  39 ++-
>  net/core/flow_dissector.c                     |  14 +-
>  tools/include/uapi/linux/bpf.h                |   6 +
>  .../selftests/bpf/prog_tests/flow_dissector.c | 242 +++++++++++++++++-
>  tools/testing/selftests/bpf/progs/bpf_flow.c  |  46 +++-
>  9 files changed, 359 insertions(+), 18 deletions(-)
>
> --
> 2.22.0.657.g960e92d24f-goog

^ permalink raw reply

* Re: [PATCH v2 bpf] libbpf: silence GCC8 warning about string truncation
From: Alexei Starovoitov @ 2019-07-25 17:14 UTC (permalink / raw)
  To: Andrii Nakryiko
  Cc: bpf, Network Development, Alexei Starovoitov, Daniel Borkmann,
	Andrii Nakryiko, Kernel Team, Magnus Karlsson
In-Reply-To: <20190724214753.1816451-1-andriin@fb.com>

On Wed, Jul 24, 2019 at 10:46 PM Andrii Nakryiko <andriin@fb.com> wrote:
>
> Despite a proper NULL-termination after strncpy(..., ..., IFNAMSIZ - 1),
> GCC8 still complains about *expected* string truncation:
>
>   xsk.c:330:2: error: 'strncpy' output may be truncated copying 15 bytes
>   from a string of length 15 [-Werror=stringop-truncation]
>     strncpy(ifr.ifr_name, xsk->ifname, IFNAMSIZ - 1);
>
> This patch gets rid of the issue altogether by using memcpy instead.
> There is no performance regression, as strncpy will still copy and fill
> all of the bytes anyway.
>
> v1->v2:
> - rebase against bpf tree.
>
> Cc: Magnus Karlsson <magnus.karlsson@intel.com>
> Signed-off-by: Andrii Nakryiko <andriin@fb.com>
> Acked-by: Magnus Karlsson <magnus.karlsson@intel.com>
> Acked-by: Song Liu <songliubraving@fb.com>

Applied. Thanks

^ permalink raw reply

* Re: [PATCH bpf-next v3 00/11] XDP unaligned chunk placement support
From: Jonathan Lemon @ 2019-07-25 17:30 UTC (permalink / raw)
  To: Richardson, Bruce
  Cc: Laatz, Kevin, netdev, ast, daniel, Topel, Bjorn, Karlsson, Magnus,
	jakub.kicinski, saeedm, maximmi, stephen, Loftus, Ciara, bpf,
	intel-wired-lan
In-Reply-To: <59AF69C657FD0841A61C55336867B5B07EDB5C3F@IRSMSX103.ger.corp.intel.com>



On 25 Jul 2019, at 8:56, Richardson, Bruce wrote:

>> -----Original Message-----
>> From: Jonathan Lemon [mailto:jonathan.lemon@gmail.com]
>> Sent: Thursday, July 25, 2019 4:39 PM
>> To: Laatz, Kevin <kevin.laatz@intel.com>
>> Cc: netdev@vger.kernel.org; ast@kernel.org; daniel@iogearbox.net; Topel,
>> Bjorn <bjorn.topel@intel.com>; Karlsson, Magnus
>> <magnus.karlsson@intel.com>; jakub.kicinski@netronome.com;
>> saeedm@mellanox.com; maximmi@mellanox.com; stephen@networkplumber.org;
>> Richardson, Bruce <bruce.richardson@intel.com>; Loftus, Ciara
>> <ciara.loftus@intel.com>; bpf@vger.kernel.org; intel-wired-
>> lan@lists.osuosl.org
>> Subject: Re: [PATCH bpf-next v3 00/11] XDP unaligned chunk placement
>> support
>>
>>
>>
>> On 23 Jul 2019, at 22:10, Kevin Laatz wrote:
>>
>>> This patch set adds the ability to use unaligned chunks in the XDP umem.
>>>
>>> Currently, all chunk addresses passed to the umem are masked to be
>>> chunk size aligned (max is PAGE_SIZE). This limits where we can place
>>> chunks within the umem as well as limiting the packet sizes that are
>> supported.
>>>
>>> The changes in this patch set removes these restrictions, allowing XDP
>>> to be more flexible in where it can place a chunk within a umem. By
>>> relaxing where the chunks can be placed, it allows us to use an
>>> arbitrary buffer size and place that wherever we have a free address
>>> in the umem. These changes add the ability to support arbitrary frame
>>> sizes up to 4k
>>> (PAGE_SIZE) and make it easy to integrate with other existing
>>> frameworks that have their own memory management systems, such as DPDK.
>>> In DPDK, for example, there is already support for AF_XDP with zero-
>> copy.
>>> However, with this patch set the integration will be much more seamless.
>>> You can find the DPDK AF_XDP driver at:
>>> https://git.dpdk.org/dpdk/tree/drivers/net/af_xdp
>>>
>>> Since we are now dealing with arbitrary frame sizes, we need also need
>>> to update how we pass around addresses. Currently, the addresses can
>>> simply be masked to 2k to get back to the original address. This
>>> becomes less trivial when using frame sizes that are not a 'power of
>>> 2' size. This patch set modifies the Rx/Tx descriptor format to use
>>> the upper 16-bits of the addr field for an offset value, leaving the
>>> lower 48-bits for the address (this leaves us with 256 Terabytes,
>>> which should be enough!). We only need to use the upper 16-bits to store
>> the offset when running in unaligned mode.
>>> Rather than adding the offset (headroom etc) to the address, we will
>>> store it in the upper 16-bits of the address field. This way, we can
>>> easily add the offset to the address where we need it, using some bit
>>> manipulation and addition, and we can also easily get the original
>>> address wherever we need it (for example in i40e_zca_fr-- ee) by
>>> simply masking to get the lower 48-bits of the address field.
>>
>> I wonder if it would be better to break backwards compatibility here and
>> say that a handle is going to change from [addr] to [base | offset], or
>> even [index | offset], where address = (index * chunk size) + offset, and
>> then use accessor macros to manipulate the queue entries.
>>
>> This way, the XDP hotpath can adjust the handle with simple arithmetic,
>> bypassing the "if (unaligned)", check, as it changes the offset directly.
>>
>> Using a chunk index instead of a base address is safer, otherwise it is
>> too easy to corrupt things.
>> --
>
> The trouble with using a chunk index is that it assumes that all chunks are
> contiguous, which is not always going to be the case. For example, for
> userspace apps the easiest way to get memory that is IOVA/physically
> contiguous is to use hugepages, but even then we still need to skip space
> when crossing a 2MB barrier.
>
> Specifically in this example case, with a 3k buffer size and 2MB hugepages,
> we'd get 666 buffers on a single page, but then waste a few KB before
> starting the 667th buffer at byte 0 on the second 2MB page.
> This is the setup used in DPDK, for instance, when we allocate memory for
> use in buffer pools.
>
> Therefore, I think it's better to just have the kernel sanity checking the
> request for safety and leave userspace the freedom to decide where in its
> memory area it wants to place the buffers.

That makes sense, thanks.  My concern is that this makes it difficult for
the kernel to validate the buffer start address, but perhaps that isn't a
concern.

I still believe it would simplify things if the format was [base | offset],
any opinion on that?
-- 
Jonathan

^ permalink raw reply

* Re: TSN - tc usage for a tbs setup
From: Vinicius Costa Gomes @ 2019-07-25 17:30 UTC (permalink / raw)
  To: Stéphane Ancelot, netdev
In-Reply-To: <43c8c7bd-f281-a4dc-badc-9672aaccbd6a@numalliance.com>

Hi,

Stéphane Ancelot <sancelot@numalliance.com> writes:

> Hi,
>
> I am trying to setup my network queue for offline time based configuration.
>
> initial setup is :
>
> tc qdisc show dev eth1:
>
> qdisc mq 0: root
>
> qdisc pfifo_fast 0: parent :1 bands 3 priomap  1 2 2 2 1 2 0 0 1 1 1 1 1 
> 1 1 1
>
>
> I won't need pfifo , I have to send one frame at a precise xmit time 
> (high prio), and then maybe some other frames (with low priority)
>
>
> I want to setup offload time based  xmit.
>
> /sbin/tc qdisc add dev eth1 root handle 100:1 etf delta 100000 clockid 
> CLOCK_REALTIME offload
>

Because the common (expected?) use case for ETF is using it on a system
that is running ptp4l (for example), and so, has the NIC PHC clock using
the TAI clock reference, we only accept the clockid to be CLOCK_TAI.
(Perhaps you are using an old version of iproute2, because a clearer
message should have been printed together with the error as well, anyway
there should be something in dmesg too)

That said, when I need to run some experiments with ETF, and do not care
about having the PHC clock is sync with anything else, I use phc2sys to
force the TAI offset to be zero. Something like this:

$ phc2sys -c $IFACE -s CLOCK_REALTIME -O 0 -m

And install ETF as "usual", something like this:

$ tc qdisc add dev $IFACE root handle 100:1 etf delta 100000 clockid CLOCK_TAI offload

Hope this helps.

Cheers,
--
Vinicius

^ permalink raw reply

* pull-request: bpf 2019-07-25
From: Alexei Starovoitov @ 2019-07-25 17:35 UTC (permalink / raw)
  To: davem; +Cc: daniel, netdev, bpf, kernel-team

Hi David,

The following pull-request contains BPF updates for your *net* tree.

The main changes are:

1) fix segfault in libbpf, from Andrii.

2) fix gso_segs access, from Eric.

3) tls/sockmap fixes, from Jakub and John.

Please consider pulling these changes from:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git

Thanks a lot!

----------------------------------------------------------------

The following changes since commit 8d650cdedaabb33e85e9b7c517c0c71fcecc1de9:

  tcp: fix tcp_set_congestion_control() use from bpf hook (2019-07-18 20:33:48 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf.git 

for you to fetch changes up to cb8ffde5694ae5fffb456eae932aac442aa3a207:

  libbpf: silence GCC8 warning about string truncation (2019-07-25 10:13:31 -0700)

----------------------------------------------------------------
Alexei Starovoitov (1):
      Merge branch 'fix-gso_segs'

Andrii Nakryiko (3):
      libbpf: fix SIGSEGV when BTF loading fails, but .BTF.ext exists
      libbpf: sanitize VAR to conservative 1-byte INT
      libbpf: silence GCC8 warning about string truncation

Arnaldo Carvalho de Melo (2):
      libbpf: Fix endianness macro usage for some compilers
      libbpf: Avoid designated initializers for unnamed union members

Daniel Borkmann (1):
      Merge branch 'bpf-sockmap-tls-fixes'

Eric Dumazet (2):
      bpf: fix access to skb_shared_info->gso_segs
      selftests/bpf: add another gso_segs access

Ilya Leoshkevich (2):
      selftests/bpf: fix sendmsg6_prog on s390
      bpf: fix narrower loads on s390

Ilya Maximets (1):
      libbpf: fix using uninitialized ioctl results

Jakub Kicinski (7):
      net/tls: don't arm strparser immediately in tls_set_sw_offload()
      net/tls: don't call tls_sk_proto_close for hw record offload
      selftests/tls: add a test for ULP but no keys
      selftests/tls: test error codes around TLS ULP installation
      selftests/tls: add a bidirectional test
      selftests/tls: close the socket with open record
      selftests/tls: add shutdown tests

John Fastabend (7):
      net/tls: remove close callback sock unlock/lock around TX work flush
      net/tls: remove sock unlock/lock around strp_done()
      net/tls: fix transition through disconnect with close
      bpf: sockmap, sock_map_delete needs to use xchg
      bpf: sockmap, synchronize_rcu before free'ing map
      bpf: sockmap, only create entry if ulp is not already enabled
      bpf: sockmap/tls, close can race with map free

 Documentation/networking/tls-offload.rst          |   6 +
 include/linux/filter.h                            |  13 ++
 include/linux/skmsg.h                             |   8 +-
 include/net/tcp.h                                 |   3 +
 include/net/tls.h                                 |  15 +-
 kernel/bpf/verifier.c                             |   4 +-
 net/core/filter.c                                 |   6 +-
 net/core/skmsg.c                                  |   4 +-
 net/core/sock_map.c                               |  19 ++-
 net/ipv4/tcp_ulp.c                                |  13 ++
 net/tls/tls_main.c                                | 142 ++++++++++++----
 net/tls/tls_sw.c                                  |  83 ++++++---
 tools/lib/bpf/btf.c                               |   5 +-
 tools/lib/bpf/libbpf.c                            |  34 ++--
 tools/lib/bpf/xsk.c                               |  11 +-
 tools/testing/selftests/bpf/progs/sendmsg6_prog.c |   3 +-
 tools/testing/selftests/bpf/verifier/ctx_skb.c    |  11 ++
 tools/testing/selftests/net/tls.c                 | 194 ++++++++++++++++++++++
 18 files changed, 480 insertions(+), 94 deletions(-)

^ permalink raw reply

* Re: [PATCH bpf-next v2 5/7] selftests/bpf: support FLOW_DISSECTOR_F_PARSE_1ST_FRAG
From: Song Liu @ 2019-07-25 17:15 UTC (permalink / raw)
  To: Stanislav Fomichev
  Cc: Networking, bpf, David S. Miller, ast@kernel.org,
	daniel@iogearbox.net, Willem de Bruijn, Petar Penkov
In-Reply-To: <20190725153342.3571-6-sdf@google.com>



> On Jul 25, 2019, at 8:33 AM, Stanislav Fomichev <sdf@google.com> wrote:
> 
> bpf_flow.c: exit early unless FLOW_DISSECTOR_F_PARSE_1ST_FRAG is passed
> in flags. Also, set ip_proto earlier, this makes sure we have correct
> value with fragmented packets.
> 
> Add selftest cases to test ipv4/ipv6 fragments and skip eth_get_headlen
> tests that don't have FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag.
> 
> eth_get_headlen calls flow dissector with
> FLOW_DISSECTOR_F_PARSE_1ST_FRAG flag so we can't run tests that
> have different set of input flags against it.
> 
> v2:
> * sefltests -> selftests (Willem de Bruijn)
> * Reword a comment about eth_get_headlen flags (Song Liu)
> 
> Acked-by: Willem de Bruijn <willemb@google.com>
> Cc: Song Liu <songliubraving@fb.com>
> Cc: Willem de Bruijn <willemb@google.com>
> Cc: Petar Penkov <ppenkov@google.com>
> Signed-off-by: Stanislav Fomichev <sdf@google.com>

Acked-by: Song Liu <songliubraving@fb.com>


> ---
> .../selftests/bpf/prog_tests/flow_dissector.c | 132 +++++++++++++++++-
> tools/testing/selftests/bpf/progs/bpf_flow.c  |  28 +++-
> 2 files changed, 153 insertions(+), 7 deletions(-)
> 
> diff --git a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> index c938283ac232..f93a115db650 100644
> --- a/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> +++ b/tools/testing/selftests/bpf/prog_tests/flow_dissector.c
> @@ -5,6 +5,10 @@
> #include <linux/if_tun.h>
> #include <sys/uio.h>
> 
> +#ifndef IP_MF
> +#define IP_MF 0x2000
> +#endif
> +
> #define CHECK_FLOW_KEYS(desc, got, expected)				\
> 	CHECK_ATTR(memcmp(&got, &expected, sizeof(got)) != 0,		\
> 	      desc,							\
> @@ -49,6 +53,18 @@ struct ipv6_pkt {
> 	struct tcphdr tcp;
> } __packed;
> 
> +struct ipv6_frag_pkt {
> +	struct ethhdr eth;
> +	struct ipv6hdr iph;
> +	struct frag_hdr {
> +		__u8 nexthdr;
> +		__u8 reserved;
> +		__be16 frag_off;
> +		__be32 identification;
> +	} ipf;
> +	struct tcphdr tcp;
> +} __packed;
> +
> struct dvlan_ipv6_pkt {
> 	struct ethhdr eth;
> 	__u16 vlan_tci;
> @@ -65,9 +81,11 @@ struct test {
> 		struct ipv4_pkt ipv4;
> 		struct svlan_ipv4_pkt svlan_ipv4;
> 		struct ipv6_pkt ipv6;
> +		struct ipv6_frag_pkt ipv6_frag;
> 		struct dvlan_ipv6_pkt dvlan_ipv6;
> 	} pkt;
> 	struct bpf_flow_keys keys;
> +	__u32 flags;
> };
> 
> #define VLAN_HLEN	4
> @@ -143,6 +161,102 @@ struct test tests[] = {
> 			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
> 		},
> 	},
> +	{
> +		.name = "ipv4-frag",
> +		.pkt.ipv4 = {
> +			.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> +			.iph.ihl = 5,
> +			.iph.protocol = IPPROTO_TCP,
> +			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> +			.iph.frag_off = __bpf_constant_htons(IP_MF),
> +			.tcp.doff = 5,
> +			.tcp.source = 80,
> +			.tcp.dest = 8080,
> +		},
> +		.keys = {
> +			.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +			.nhoff = ETH_HLEN,
> +			.thoff = ETH_HLEN + sizeof(struct iphdr),
> +			.addr_proto = ETH_P_IP,
> +			.ip_proto = IPPROTO_TCP,
> +			.n_proto = __bpf_constant_htons(ETH_P_IP),
> +			.is_frag = true,
> +			.is_first_frag = true,
> +			.sport = 80,
> +			.dport = 8080,
> +		},
> +		.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +	},
> +	{
> +		.name = "ipv4-no-frag",
> +		.pkt.ipv4 = {
> +			.eth.h_proto = __bpf_constant_htons(ETH_P_IP),
> +			.iph.ihl = 5,
> +			.iph.protocol = IPPROTO_TCP,
> +			.iph.tot_len = __bpf_constant_htons(MAGIC_BYTES),
> +			.iph.frag_off = __bpf_constant_htons(IP_MF),
> +			.tcp.doff = 5,
> +			.tcp.source = 80,
> +			.tcp.dest = 8080,
> +		},
> +		.keys = {
> +			.nhoff = ETH_HLEN,
> +			.thoff = ETH_HLEN + sizeof(struct iphdr),
> +			.addr_proto = ETH_P_IP,
> +			.ip_proto = IPPROTO_TCP,
> +			.n_proto = __bpf_constant_htons(ETH_P_IP),
> +			.is_frag = true,
> +			.is_first_frag = true,
> +		},
> +	},
> +	{
> +		.name = "ipv6-frag",
> +		.pkt.ipv6_frag = {
> +			.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> +			.iph.nexthdr = IPPROTO_FRAGMENT,
> +			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> +			.ipf.nexthdr = IPPROTO_TCP,
> +			.tcp.doff = 5,
> +			.tcp.source = 80,
> +			.tcp.dest = 8080,
> +		},
> +		.keys = {
> +			.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +			.nhoff = ETH_HLEN,
> +			.thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
> +				sizeof(struct frag_hdr),
> +			.addr_proto = ETH_P_IPV6,
> +			.ip_proto = IPPROTO_TCP,
> +			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
> +			.is_frag = true,
> +			.is_first_frag = true,
> +			.sport = 80,
> +			.dport = 8080,
> +		},
> +		.flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG,
> +	},
> +	{
> +		.name = "ipv6-no-frag",
> +		.pkt.ipv6_frag = {
> +			.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
> +			.iph.nexthdr = IPPROTO_FRAGMENT,
> +			.iph.payload_len = __bpf_constant_htons(MAGIC_BYTES),
> +			.ipf.nexthdr = IPPROTO_TCP,
> +			.tcp.doff = 5,
> +			.tcp.source = 80,
> +			.tcp.dest = 8080,
> +		},
> +		.keys = {
> +			.nhoff = ETH_HLEN,
> +			.thoff = ETH_HLEN + sizeof(struct ipv6hdr) +
> +				sizeof(struct frag_hdr),
> +			.addr_proto = ETH_P_IPV6,
> +			.ip_proto = IPPROTO_TCP,
> +			.n_proto = __bpf_constant_htons(ETH_P_IPV6),
> +			.is_frag = true,
> +			.is_first_frag = true,
> +		},
> +	},
> };
> 
> static int create_tap(const char *ifname)
> @@ -225,6 +339,13 @@ void test_flow_dissector(void)
> 			.data_size_in = sizeof(tests[i].pkt),
> 			.data_out = &flow_keys,
> 		};
> +		static struct bpf_flow_keys ctx = {};
> +
> +		if (tests[i].flags) {
> +			tattr.ctx_in = &ctx;
> +			tattr.ctx_size_in = sizeof(ctx);
> +			ctx.flags = tests[i].flags;
> +		}
> 
> 		err = bpf_prog_test_run_xattr(&tattr);
> 		CHECK_ATTR(tattr.data_size_out != sizeof(flow_keys) ||
> @@ -251,10 +372,19 @@ void test_flow_dissector(void)
> 	CHECK(err, "ifup", "err %d errno %d\n", err, errno);
> 
> 	for (i = 0; i < ARRAY_SIZE(tests); i++) {
> -		struct bpf_flow_keys flow_keys = {};
> +		/* Keep in sync with 'flags' from eth_get_headlen. */
> +		__u32 eth_get_headlen_flags = FLOW_DISSECTOR_F_PARSE_1ST_FRAG;
> 		struct bpf_prog_test_run_attr tattr = {};
> +		struct bpf_flow_keys flow_keys = {};
> 		__u32 key = 0;
> 
> +		/* For skb-less case we can't pass input flags; run
> +		 * only the tests that have a matching set of flags.
> +		 */
> +
> +		if (tests[i].flags != eth_get_headlen_flags)
> +			continue;
> +
> 		err = tx_tap(tap_fd, &tests[i].pkt, sizeof(tests[i].pkt));
> 		CHECK(err < 0, "tx_tap", "err %d errno %d\n", err, errno);
> 
> diff --git a/tools/testing/selftests/bpf/progs/bpf_flow.c b/tools/testing/selftests/bpf/progs/bpf_flow.c
> index 5ae485a6af3f..0eabe5e57944 100644
> --- a/tools/testing/selftests/bpf/progs/bpf_flow.c
> +++ b/tools/testing/selftests/bpf/progs/bpf_flow.c
> @@ -153,7 +153,6 @@ static __always_inline int parse_ip_proto(struct __sk_buff *skb, __u8 proto)
> 	struct tcphdr *tcp, _tcp;
> 	struct udphdr *udp, _udp;
> 
> -	keys->ip_proto = proto;
> 	switch (proto) {
> 	case IPPROTO_ICMP:
> 		icmp = bpf_flow_dissect_get_header(skb, sizeof(*icmp), &_icmp);
> @@ -231,7 +230,6 @@ static __always_inline int parse_ipv6_proto(struct __sk_buff *skb, __u8 nexthdr)
> {
> 	struct bpf_flow_keys *keys = skb->flow_keys;
> 
> -	keys->ip_proto = nexthdr;
> 	switch (nexthdr) {
> 	case IPPROTO_HOPOPTS:
> 	case IPPROTO_DSTOPTS:
> @@ -266,6 +264,7 @@ PROG(IP)(struct __sk_buff *skb)
> 	keys->addr_proto = ETH_P_IP;
> 	keys->ipv4_src = iph->saddr;
> 	keys->ipv4_dst = iph->daddr;
> +	keys->ip_proto = iph->protocol;
> 
> 	keys->thoff += iph->ihl << 2;
> 	if (data + keys->thoff > data_end)
> @@ -273,13 +272,19 @@ PROG(IP)(struct __sk_buff *skb)
> 
> 	if (iph->frag_off & bpf_htons(IP_MF | IP_OFFSET)) {
> 		keys->is_frag = true;
> -		if (iph->frag_off & bpf_htons(IP_OFFSET))
> +		if (iph->frag_off & bpf_htons(IP_OFFSET)) {
> 			/* From second fragment on, packets do not have headers
> 			 * we can parse.
> 			 */
> 			done = true;
> -		else
> +		} else {
> 			keys->is_first_frag = true;
> +			/* No need to parse fragmented packet unless
> +			 * explicitly asked for.
> +			 */
> +			if (!(keys->flags & FLOW_DISSECTOR_F_PARSE_1ST_FRAG))
> +				done = true;
> +		}
> 	}
> 
> 	if (done)
> @@ -301,6 +306,7 @@ PROG(IPV6)(struct __sk_buff *skb)
> 	memcpy(&keys->ipv6_src, &ip6h->saddr, 2*sizeof(ip6h->saddr));
> 
> 	keys->thoff += sizeof(struct ipv6hdr);
> +	keys->ip_proto = ip6h->nexthdr;
> 
> 	return parse_ipv6_proto(skb, ip6h->nexthdr);
> }
> @@ -317,7 +323,8 @@ PROG(IPV6OP)(struct __sk_buff *skb)
> 	/* hlen is in 8-octets and does not include the first 8 bytes
> 	 * of the header
> 	 */
> -	skb->flow_keys->thoff += (1 + ip6h->hdrlen) << 3;
> +	keys->thoff += (1 + ip6h->hdrlen) << 3;
> +	keys->ip_proto = ip6h->nexthdr;
> 
> 	return parse_ipv6_proto(skb, ip6h->nexthdr);
> }
> @@ -333,9 +340,18 @@ PROG(IPV6FR)(struct __sk_buff *skb)
> 
> 	keys->thoff += sizeof(*fragh);
> 	keys->is_frag = true;
> -	if (!(fragh->frag_off & bpf_htons(IP6_OFFSET)))
> +	keys->ip_proto = fragh->nexthdr;
> +
> +	if (!(fragh->frag_off & bpf_htons(IP6_OFFSET))) {
> 		keys->is_first_frag = true;
> 
> +		/* No need to parse fragmented packet unless
> +		 * explicitly asked for.
> +		 */
> +		if (!(keys->flags & FLOW_DISSECTOR_F_PARSE_1ST_FRAG))
> +			return export_flow_keys(keys, BPF_OK);
> +	}
> +
> 	return parse_ipv6_proto(skb, fragh->nexthdr);
> }
> 
> -- 
> 2.22.0.657.g960e92d24f-goog
> 


^ permalink raw reply

* [PATCH 0/3] ath10k: Clean up regulator and clock handling
From: Bjorn Andersson @ 2019-07-25 17:47 UTC (permalink / raw)
  To: Kalle Valo, Govind Singh
  Cc: David S. Miller, ath10k, linux-wireless, netdev, linux-kernel,
	linux-arm-msm

The first patch in this series removes the regulator_set_voltage() of a fixed
voltate, as fixed regulator constraints should be specified on a board level
and on certain boards - such as the Lenovo Yoga C630 - the voltage specified
for the 3.3V regulator is outside the given range.

The following two patches cleans up regulator and clock usage by using the bulk
API provided by the two frameworks.

Bjorn Andersson (3):
  ath10k: snoc: skip regulator operations
  ath10k: Use standard regulator bulk API in snoc
  ath10k: Use standard bulk clock API in snoc

 drivers/net/wireless/ath/ath10k/snoc.c | 324 ++++---------------------
 drivers/net/wireless/ath/ath10k/snoc.h |  26 +-
 2 files changed, 48 insertions(+), 302 deletions(-)

-- 
2.18.0

^ permalink raw reply

* [PATCH 3/3] ath10k: Use standard bulk clock API in snoc
From: Bjorn Andersson @ 2019-07-25 17:47 UTC (permalink / raw)
  To: Kalle Valo, Govind Singh
  Cc: David S. Miller, ath10k, linux-wireless, netdev, linux-kernel,
	linux-arm-msm
In-Reply-To: <20190725174755.23432-1-bjorn.andersson@linaro.org>

No frequency is currently specified for the single clock defined in the
snoc driver, so the clock wrappers reimplements the standard bulk API
provided by the clock framework. Change to this.

The single clock defined is marked as optional so this version of the
get API is used, but might need to be reconsidered in the future.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
---
 drivers/net/wireless/ath/ath10k/snoc.c | 125 ++++---------------------
 drivers/net/wireless/ath/ath10k/snoc.h |  11 +--
 2 files changed, 21 insertions(+), 115 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index 1c9ff7e53e2f..80ce68c0f75e 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -43,8 +43,8 @@ static const char * const ath10k_regulators[] = {
 	"vdd-3.3-ch0",
 };
 
-static struct ath10k_clk_info clk_cfg[] = {
-	{NULL, "cxo_ref_clk_pin", 0, false},
+static const char * const ath10k_clocks[] = {
+	"cxo_ref_clk_pin",
 };
 
 static void ath10k_snoc_htc_tx_cb(struct ath10k_ce_pipe *ce_state);
@@ -1346,104 +1346,6 @@ static void ath10k_snoc_release_resource(struct ath10k *ar)
 		ath10k_ce_free_pipe(ar, i);
 }
 
-static int ath10k_get_clk_info(struct ath10k *ar, struct device *dev,
-			       struct ath10k_clk_info *clk_info)
-{
-	struct clk *handle;
-	int ret = 0;
-
-	handle = devm_clk_get(dev, clk_info->name);
-	if (IS_ERR(handle)) {
-		ret = PTR_ERR(handle);
-		if (clk_info->required) {
-			ath10k_err(ar, "snoc clock %s isn't available: %d\n",
-				   clk_info->name, ret);
-			return ret;
-		}
-		ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc ignoring clock %s: %d\n",
-			   clk_info->name,
-			   ret);
-		return 0;
-	}
-
-	ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc clock %s freq %u\n",
-		   clk_info->name, clk_info->freq);
-
-	clk_info->handle = handle;
-
-	return ret;
-}
-
-static int ath10k_snoc_clk_init(struct ath10k *ar)
-{
-	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
-	struct ath10k_clk_info *clk_info;
-	int ret = 0;
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(clk_cfg); i++) {
-		clk_info = &ar_snoc->clk[i];
-
-		if (!clk_info->handle)
-			continue;
-
-		ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc clock %s being enabled\n",
-			   clk_info->name);
-
-		if (clk_info->freq) {
-			ret = clk_set_rate(clk_info->handle, clk_info->freq);
-
-			if (ret) {
-				ath10k_err(ar, "failed to set clock %s freq %u\n",
-					   clk_info->name, clk_info->freq);
-				goto err_clock_config;
-			}
-		}
-
-		ret = clk_prepare_enable(clk_info->handle);
-		if (ret) {
-			ath10k_err(ar, "failed to enable clock %s\n",
-				   clk_info->name);
-			goto err_clock_config;
-		}
-	}
-
-	return 0;
-
-err_clock_config:
-	for (i = i - 1; i >= 0; i--) {
-		clk_info = &ar_snoc->clk[i];
-
-		if (!clk_info->handle)
-			continue;
-
-		clk_disable_unprepare(clk_info->handle);
-	}
-
-	return ret;
-}
-
-static int ath10k_snoc_clk_deinit(struct ath10k *ar)
-{
-	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
-	struct ath10k_clk_info *clk_info;
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(clk_cfg); i++) {
-		clk_info = &ar_snoc->clk[i];
-
-		if (!clk_info->handle)
-			continue;
-
-		ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc clock %s being disabled\n",
-			   clk_info->name);
-
-		clk_disable_unprepare(clk_info->handle);
-	}
-
-	return 0;
-}
-
 static int ath10k_hw_power_on(struct ath10k *ar)
 {
 	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
@@ -1455,7 +1357,7 @@ static int ath10k_hw_power_on(struct ath10k *ar)
 	if (ret)
 		return ret;
 
-	ret = ath10k_snoc_clk_init(ar);
+	ret = clk_bulk_prepare_enable(ar_snoc->num_clks, ar_snoc->clks);
 	if (ret)
 		goto vreg_off;
 
@@ -1472,7 +1374,7 @@ static int ath10k_hw_power_off(struct ath10k *ar)
 
 	ath10k_dbg(ar, ATH10K_DBG_SNOC, "soc power off\n");
 
-	ath10k_snoc_clk_deinit(ar);
+	clk_bulk_disable_unprepare(ar_snoc->num_clks, ar_snoc->clks);
 
 	return regulator_bulk_disable(ar_snoc->num_vregs, ar_snoc->vregs);
 }
@@ -1560,13 +1462,22 @@ static int ath10k_snoc_probe(struct platform_device *pdev)
 	if (ret < 0)
 		goto err_free_irq;
 
-	ar_snoc->clk = clk_cfg;
-	for (i = 0; i < ARRAY_SIZE(clk_cfg); i++) {
-		ret = ath10k_get_clk_info(ar, dev, &ar_snoc->clk[i]);
-		if (ret)
-			goto err_free_irq;
+	ar_snoc->num_clks = ARRAY_SIZE(ath10k_clocks);
+	ar_snoc->clks = devm_kcalloc(&pdev->dev, ar_snoc->num_clks,
+				     sizeof(*ar_snoc->clks), GFP_KERNEL);
+	if (!ar_snoc->clks) {
+		ret = -ENOMEM;
+		goto err_free_irq;
 	}
 
+	for (i = 0; i < ar_snoc->num_clks; i++)
+		ar_snoc->clks[i].id = ath10k_clocks[i];
+
+	ret = devm_clk_bulk_get_optional(&pdev->dev, ar_snoc->num_clks,
+					 ar_snoc->clks);
+	if (ret)
+		goto err_free_irq;
+
 	ret = ath10k_hw_power_on(ar);
 	if (ret) {
 		ath10k_err(ar, "failed to power on device: %d\n", ret);
diff --git a/drivers/net/wireless/ath/ath10k/snoc.h b/drivers/net/wireless/ath/ath10k/snoc.h
index 3965ddf66d74..d2449a3b4a8f 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.h
+++ b/drivers/net/wireless/ath/ath10k/snoc.h
@@ -42,13 +42,6 @@ struct ath10k_snoc_ce_irq {
 	u32 irq_line;
 };
 
-struct ath10k_clk_info {
-	struct clk *handle;
-	const char *name;
-	u32 freq;
-	bool required;
-};
-
 enum ath10k_snoc_flags {
 	ATH10K_SNOC_FLAG_REGISTERED,
 	ATH10K_SNOC_FLAG_UNREGISTERING,
@@ -56,6 +49,7 @@ enum ath10k_snoc_flags {
 	ATH10K_SNOC_FLAG_8BIT_HOST_CAP_QUIRK,
 };
 
+struct clk_bulk_data;
 struct regulator_bulk_data;
 
 struct ath10k_snoc {
@@ -71,7 +65,8 @@ struct ath10k_snoc {
 	struct timer_list rx_post_retry;
 	struct regulator_bulk_data *vregs;
 	size_t num_vregs;
-	struct ath10k_clk_info *clk;
+	struct clk_bulk_data *clks;
+	size_t num_clks;
 	struct ath10k_qmi *qmi;
 	unsigned long flags;
 };
-- 
2.18.0


^ permalink raw reply related

* [PATCH 1/3] ath10k: snoc: skip regulator operations
From: Bjorn Andersson @ 2019-07-25 17:47 UTC (permalink / raw)
  To: Kalle Valo, Govind Singh
  Cc: David S. Miller, ath10k, linux-wireless, netdev, linux-kernel,
	linux-arm-msm
In-Reply-To: <20190725174755.23432-1-bjorn.andersson@linaro.org>

The regulator operations is trying to set a voltage to a fixed value, by
giving some wiggle room. But some board designs specifies regulator
voltages outside this limited range. One such example is the Lenovo Yoga
C630, with vdd-3.3-ch0 in particular specified at 3.1V.

But consumers with fixed voltage requirements should just rely on the
board configuration to provide the power at the required level, so this
code should be removed.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
---

This patch is required for Lenovo Yoga C630 to succeed in power on ath10k, it
can be merged independently of the two following cleanup patches.

 drivers/net/wireless/ath/ath10k/snoc.c | 27 ++++++--------------------
 drivers/net/wireless/ath/ath10k/snoc.h |  2 --
 2 files changed, 6 insertions(+), 23 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index fc15a0037f0e..3d93ab09a298 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -37,10 +37,10 @@ static char *const ce_name[] = {
 };
 
 static struct ath10k_vreg_info vreg_cfg[] = {
-	{NULL, "vdd-0.8-cx-mx", 800000, 850000, 0, 0, false},
-	{NULL, "vdd-1.8-xo", 1800000, 1850000, 0, 0, false},
-	{NULL, "vdd-1.3-rfa", 1300000, 1350000, 0, 0, false},
-	{NULL, "vdd-3.3-ch0", 3300000, 3350000, 0, 0, false},
+	{NULL, "vdd-0.8-cx-mx", 0, 0, false},
+	{NULL, "vdd-1.8-xo", 0, 0, false},
+	{NULL, "vdd-1.3-rfa", 0, 0, false},
+	{NULL, "vdd-3.3-ch0", 0, 0, false},
 };
 
 static struct ath10k_clk_info clk_cfg[] = {
@@ -1377,9 +1377,8 @@ static int ath10k_get_vreg_info(struct ath10k *ar, struct device *dev,
 
 done:
 	ath10k_dbg(ar, ATH10K_DBG_SNOC,
-		   "snog vreg %s min_v %u max_v %u load_ua %u settle_delay %lu\n",
-		   vreg_info->name, vreg_info->min_v, vreg_info->max_v,
-		   vreg_info->load_ua, vreg_info->settle_delay);
+		   "snog vreg %s load_ua %u settle_delay %lu\n",
+		   vreg_info->name, vreg_info->load_ua, vreg_info->settle_delay);
 
 	return 0;
 }
@@ -1420,15 +1419,6 @@ static int __ath10k_snoc_vreg_on(struct ath10k *ar,
 	ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc regulator %s being enabled\n",
 		   vreg_info->name);
 
-	ret = regulator_set_voltage(vreg_info->reg, vreg_info->min_v,
-				    vreg_info->max_v);
-	if (ret) {
-		ath10k_err(ar,
-			   "failed to set regulator %s voltage-min: %d voltage-max: %d\n",
-			   vreg_info->name, vreg_info->min_v, vreg_info->max_v);
-		return ret;
-	}
-
 	if (vreg_info->load_ua) {
 		ret = regulator_set_load(vreg_info->reg, vreg_info->load_ua);
 		if (ret < 0) {
@@ -1453,7 +1443,6 @@ static int __ath10k_snoc_vreg_on(struct ath10k *ar,
 err_enable:
 	regulator_set_load(vreg_info->reg, 0);
 err_set_load:
-	regulator_set_voltage(vreg_info->reg, 0, vreg_info->max_v);
 
 	return ret;
 }
@@ -1475,10 +1464,6 @@ static int __ath10k_snoc_vreg_off(struct ath10k *ar,
 	if (ret < 0)
 		ath10k_err(ar, "failed to set load %s\n", vreg_info->name);
 
-	ret = regulator_set_voltage(vreg_info->reg, 0, vreg_info->max_v);
-	if (ret)
-		ath10k_err(ar, "failed to set voltage %s\n", vreg_info->name);
-
 	return ret;
 }
 
diff --git a/drivers/net/wireless/ath/ath10k/snoc.h b/drivers/net/wireless/ath/ath10k/snoc.h
index 9db823e46314..1bf7bd4ef2a3 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.h
+++ b/drivers/net/wireless/ath/ath10k/snoc.h
@@ -45,8 +45,6 @@ struct ath10k_snoc_ce_irq {
 struct ath10k_vreg_info {
 	struct regulator *reg;
 	const char *name;
-	u32 min_v;
-	u32 max_v;
 	u32 load_ua;
 	unsigned long settle_delay;
 	bool required;
-- 
2.18.0


^ permalink raw reply related

* [PATCH 2/3] ath10k: Use standard regulator bulk API in snoc
From: Bjorn Andersson @ 2019-07-25 17:47 UTC (permalink / raw)
  To: Kalle Valo, Govind Singh
  Cc: David S. Miller, ath10k, linux-wireless, netdev, linux-kernel,
	linux-arm-msm
In-Reply-To: <20190725174755.23432-1-bjorn.andersson@linaro.org>

The regulator_get_optional() exists for cases where the driver needs do
behave differently depending on some regulator supply being present or
not, as we don't use this we can use the standard regulator_get() and
rely on its handling of unspecified regulators.

While the driver currently doesn't specify any loads the regulator
framework was updated last year to only account for load of enabled
regulators, so should the need appear it's better to apply load numbers
during initialization that dynamically.

With this the regulator wrappers have been reduced the become identical
to the standard bulk API provided by the regulator framework, so use
these instead of rolling our own.

Signed-off-by: Bjorn Andersson <bjorn.andersson@linaro.org>
---
 drivers/net/wireless/ath/ath10k/snoc.c | 184 ++++---------------------
 drivers/net/wireless/ath/ath10k/snoc.h |  13 +-
 2 files changed, 27 insertions(+), 170 deletions(-)

diff --git a/drivers/net/wireless/ath/ath10k/snoc.c b/drivers/net/wireless/ath/ath10k/snoc.c
index 3d93ab09a298..1c9ff7e53e2f 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.c
+++ b/drivers/net/wireless/ath/ath10k/snoc.c
@@ -36,11 +36,11 @@ static char *const ce_name[] = {
 	"WLAN_CE_11",
 };
 
-static struct ath10k_vreg_info vreg_cfg[] = {
-	{NULL, "vdd-0.8-cx-mx", 0, 0, false},
-	{NULL, "vdd-1.8-xo", 0, 0, false},
-	{NULL, "vdd-1.3-rfa", 0, 0, false},
-	{NULL, "vdd-3.3-ch0", 0, 0, false},
+static const char * const ath10k_regulators[] = {
+	"vdd-0.8-cx-mx",
+	"vdd-1.8-xo",
+	"vdd-1.3-rfa",
+	"vdd-3.3-ch0",
 };
 
 static struct ath10k_clk_info clk_cfg[] = {
@@ -1346,43 +1346,6 @@ static void ath10k_snoc_release_resource(struct ath10k *ar)
 		ath10k_ce_free_pipe(ar, i);
 }
 
-static int ath10k_get_vreg_info(struct ath10k *ar, struct device *dev,
-				struct ath10k_vreg_info *vreg_info)
-{
-	struct regulator *reg;
-	int ret = 0;
-
-	reg = devm_regulator_get_optional(dev, vreg_info->name);
-
-	if (IS_ERR(reg)) {
-		ret = PTR_ERR(reg);
-
-		if (ret  == -EPROBE_DEFER) {
-			ath10k_err(ar, "EPROBE_DEFER for regulator: %s\n",
-				   vreg_info->name);
-			return ret;
-		}
-		if (vreg_info->required) {
-			ath10k_err(ar, "Regulator %s doesn't exist: %d\n",
-				   vreg_info->name, ret);
-			return ret;
-		}
-		ath10k_dbg(ar, ATH10K_DBG_SNOC,
-			   "Optional regulator %s doesn't exist: %d\n",
-			   vreg_info->name, ret);
-		goto done;
-	}
-
-	vreg_info->reg = reg;
-
-done:
-	ath10k_dbg(ar, ATH10K_DBG_SNOC,
-		   "snog vreg %s load_ua %u settle_delay %lu\n",
-		   vreg_info->name, vreg_info->load_ua, vreg_info->settle_delay);
-
-	return 0;
-}
-
 static int ath10k_get_clk_info(struct ath10k *ar, struct device *dev,
 			       struct ath10k_clk_info *clk_info)
 {
@@ -1411,114 +1374,6 @@ static int ath10k_get_clk_info(struct ath10k *ar, struct device *dev,
 	return ret;
 }
 
-static int __ath10k_snoc_vreg_on(struct ath10k *ar,
-				 struct ath10k_vreg_info *vreg_info)
-{
-	int ret;
-
-	ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc regulator %s being enabled\n",
-		   vreg_info->name);
-
-	if (vreg_info->load_ua) {
-		ret = regulator_set_load(vreg_info->reg, vreg_info->load_ua);
-		if (ret < 0) {
-			ath10k_err(ar, "failed to set regulator %s load: %d\n",
-				   vreg_info->name, vreg_info->load_ua);
-			goto err_set_load;
-		}
-	}
-
-	ret = regulator_enable(vreg_info->reg);
-	if (ret) {
-		ath10k_err(ar, "failed to enable regulator %s\n",
-			   vreg_info->name);
-		goto err_enable;
-	}
-
-	if (vreg_info->settle_delay)
-		udelay(vreg_info->settle_delay);
-
-	return 0;
-
-err_enable:
-	regulator_set_load(vreg_info->reg, 0);
-err_set_load:
-
-	return ret;
-}
-
-static int __ath10k_snoc_vreg_off(struct ath10k *ar,
-				  struct ath10k_vreg_info *vreg_info)
-{
-	int ret;
-
-	ath10k_dbg(ar, ATH10K_DBG_SNOC, "snoc regulator %s being disabled\n",
-		   vreg_info->name);
-
-	ret = regulator_disable(vreg_info->reg);
-	if (ret)
-		ath10k_err(ar, "failed to disable regulator %s\n",
-			   vreg_info->name);
-
-	ret = regulator_set_load(vreg_info->reg, 0);
-	if (ret < 0)
-		ath10k_err(ar, "failed to set load %s\n", vreg_info->name);
-
-	return ret;
-}
-
-static int ath10k_snoc_vreg_on(struct ath10k *ar)
-{
-	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
-	struct ath10k_vreg_info *vreg_info;
-	int ret = 0;
-	int i;
-
-	for (i = 0; i < ARRAY_SIZE(vreg_cfg); i++) {
-		vreg_info = &ar_snoc->vreg[i];
-
-		if (!vreg_info->reg)
-			continue;
-
-		ret = __ath10k_snoc_vreg_on(ar, vreg_info);
-		if (ret)
-			goto err_reg_config;
-	}
-
-	return 0;
-
-err_reg_config:
-	for (i = i - 1; i >= 0; i--) {
-		vreg_info = &ar_snoc->vreg[i];
-
-		if (!vreg_info->reg)
-			continue;
-
-		__ath10k_snoc_vreg_off(ar, vreg_info);
-	}
-
-	return ret;
-}
-
-static int ath10k_snoc_vreg_off(struct ath10k *ar)
-{
-	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
-	struct ath10k_vreg_info *vreg_info;
-	int ret = 0;
-	int i;
-
-	for (i = ARRAY_SIZE(vreg_cfg) - 1; i >= 0; i--) {
-		vreg_info = &ar_snoc->vreg[i];
-
-		if (!vreg_info->reg)
-			continue;
-
-		ret = __ath10k_snoc_vreg_off(ar, vreg_info);
-	}
-
-	return ret;
-}
-
 static int ath10k_snoc_clk_init(struct ath10k *ar)
 {
 	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
@@ -1591,11 +1446,12 @@ static int ath10k_snoc_clk_deinit(struct ath10k *ar)
 
 static int ath10k_hw_power_on(struct ath10k *ar)
 {
+	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
 	int ret;
 
 	ath10k_dbg(ar, ATH10K_DBG_SNOC, "soc power on\n");
 
-	ret = ath10k_snoc_vreg_on(ar);
+	ret = regulator_bulk_enable(ar_snoc->num_vregs, ar_snoc->vregs);
 	if (ret)
 		return ret;
 
@@ -1606,21 +1462,19 @@ static int ath10k_hw_power_on(struct ath10k *ar)
 	return ret;
 
 vreg_off:
-	ath10k_snoc_vreg_off(ar);
+	regulator_bulk_disable(ar_snoc->num_vregs, ar_snoc->vregs);
 	return ret;
 }
 
 static int ath10k_hw_power_off(struct ath10k *ar)
 {
-	int ret;
+	struct ath10k_snoc *ar_snoc = ath10k_snoc_priv(ar);
 
 	ath10k_dbg(ar, ATH10K_DBG_SNOC, "soc power off\n");
 
 	ath10k_snoc_clk_deinit(ar);
 
-	ret = ath10k_snoc_vreg_off(ar);
-
-	return ret;
+	return regulator_bulk_disable(ar_snoc->num_vregs, ar_snoc->vregs);
 }
 
 static const struct of_device_id ath10k_snoc_dt_match[] = {
@@ -1691,12 +1545,20 @@ static int ath10k_snoc_probe(struct platform_device *pdev)
 		goto err_release_resource;
 	}
 
-	ar_snoc->vreg = vreg_cfg;
-	for (i = 0; i < ARRAY_SIZE(vreg_cfg); i++) {
-		ret = ath10k_get_vreg_info(ar, dev, &ar_snoc->vreg[i]);
-		if (ret)
-			goto err_free_irq;
+	ar_snoc->num_vregs = ARRAY_SIZE(ath10k_regulators);
+	ar_snoc->vregs = devm_kcalloc(&pdev->dev, ar_snoc->num_vregs,
+				      sizeof(*ar_snoc->vregs), GFP_KERNEL);
+	if (!ar_snoc->vregs) {
+		ret = -ENOMEM;
+		goto err_free_irq;
 	}
+	for (i = 0; i < ar_snoc->num_vregs; i++)
+		ar_snoc->vregs[i].supply = ath10k_regulators[i];
+
+	ret = devm_regulator_bulk_get(&pdev->dev, ar_snoc->num_vregs,
+				      ar_snoc->vregs);
+	if (ret < 0)
+		goto err_free_irq;
 
 	ar_snoc->clk = clk_cfg;
 	for (i = 0; i < ARRAY_SIZE(clk_cfg); i++) {
diff --git a/drivers/net/wireless/ath/ath10k/snoc.h b/drivers/net/wireless/ath/ath10k/snoc.h
index 1bf7bd4ef2a3..3965ddf66d74 100644
--- a/drivers/net/wireless/ath/ath10k/snoc.h
+++ b/drivers/net/wireless/ath/ath10k/snoc.h
@@ -42,14 +42,6 @@ struct ath10k_snoc_ce_irq {
 	u32 irq_line;
 };
 
-struct ath10k_vreg_info {
-	struct regulator *reg;
-	const char *name;
-	u32 load_ua;
-	unsigned long settle_delay;
-	bool required;
-};
-
 struct ath10k_clk_info {
 	struct clk *handle;
 	const char *name;
@@ -64,6 +56,8 @@ enum ath10k_snoc_flags {
 	ATH10K_SNOC_FLAG_8BIT_HOST_CAP_QUIRK,
 };
 
+struct regulator_bulk_data;
+
 struct ath10k_snoc {
 	struct platform_device *dev;
 	struct ath10k *ar;
@@ -75,7 +69,8 @@ struct ath10k_snoc {
 	struct ath10k_snoc_ce_irq ce_irqs[CE_COUNT_MAX];
 	struct ath10k_ce ce;
 	struct timer_list rx_post_retry;
-	struct ath10k_vreg_info *vreg;
+	struct regulator_bulk_data *vregs;
+	size_t num_vregs;
 	struct ath10k_clk_info *clk;
 	struct ath10k_qmi *qmi;
 	unsigned long flags;
-- 
2.18.0


^ permalink raw reply related

* Re: [RFC] dt-bindings: net: phy: Add subnode for LED configuration
From: Matthias Kaehlcke @ 2019-07-25 17:52 UTC (permalink / raw)
  To: Andrew Lunn
  Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
	Heiner Kallweit, netdev, devicetree, linux-kernel,
	Douglas Anderson
In-Reply-To: <20190724180430.GB28488@lunn.ch>

Hi Andrew,

On Wed, Jul 24, 2019 at 08:04:30PM +0200, Andrew Lunn wrote:
> On Mon, Jul 22, 2019 at 03:37:41PM -0700, Matthias Kaehlcke wrote:
> > The LED behavior of some Ethernet PHYs is configurable. Add an
> > optional 'leds' subnode with a child node for each LED to be
> > configured. The binding aims to be compatible with the common
> > LED binding (see devicetree/bindings/leds/common.txt).
> > 
> > A LED can be configured to be 'on' when a link with a certain speed
> > is active, or to blink on RX/TX activity. For the configuration to
> > be effective it needs to be supported by the hardware and the
> > corresponding PHY driver.
> > 
> > Suggested-by: Andrew Lunn <andrew@lunn.ch>
> > Signed-off-by: Matthias Kaehlcke <mka@chromium.org>
> > ---
> > This RFC is a follow up of the discussion on "[PATCH v2 6/7]
> > dt-bindings: net: realtek: Add property to configure LED mode"
> > (https://lore.kernel.org/patchwork/patch/1097185/).
> > 
> > For now posting as RFC to get a basic agreement on the bindings
> > before proceding with the implementation in phylib and a specific
> > driver.
> > ---
> >  Documentation/devicetree/bindings/net/phy.txt | 33 +++++++++++++++++++
> >  1 file changed, 33 insertions(+)
> > 
> > diff --git a/Documentation/devicetree/bindings/net/phy.txt b/Documentation/devicetree/bindings/net/phy.txt
> > index 9b9e5b1765dd..ad495d3abbbb 100644
> > --- a/Documentation/devicetree/bindings/net/phy.txt
> > +++ b/Documentation/devicetree/bindings/net/phy.txt
> > @@ -46,6 +46,25 @@ Optional Properties:
> >    Mark the corresponding energy efficient ethernet mode as broken and
> >    request the ethernet to stop advertising it.
> >  
> > +- leds: A sub-node which is a container of only LED nodes. Each child
> > +    node represents a PHY LED.
> > +
> > +  Required properties for LED child nodes:
> > +  - reg: The ID number of the LED, typically corresponds to a hardware ID.
> > +
> > +  Optional properties for child nodes:
> > +  - label: The label for this LED. If omitted, the label is taken from the node
> > +    name (excluding the unit address). It has to uniquely identify a device,
> > +    i.e. no other LED class device can be assigned the same label.
> 
> Hi Matthias
> 
> I've thought about label a bit more. 
> 
> > +			label = "ethphy0:left:green";
> 
> We need to be careful with names here. systemd etc renames
> interfaces. ethphy0 could in fact be connected to enp3s0, or eth0
> might get renamed to eth1, etc. So i think we should avoid things like
> ethphy0.

Agreed, this could be problematic.

> Also, i'm not sure we actually need a label, at least not to
> start with.Do we have any way to expose it to the user?

As of now I don't plan to expose the label to userspace by the PHY
driver/framework itself.

From my side we can omit the label for now and add it later if needed.

> If we do ever make it part of the generic LED framework, we can then
> use the label. At that point, i would probably combine the label with
> the interface name in a dynamic way to avoid issues like this.

Sounds good.

Thanks

Matthias

^ permalink raw reply

* Re: [PATCH v6 rdma-next 1/6] RDMA/core: Create mmap database and cookie helper functions
From: Jason Gunthorpe @ 2019-07-25 17:55 UTC (permalink / raw)
  To: Michal Kalderon, Kamal Heib
  Cc: ariel.elior, dledford, galpress, linux-rdma, davem, netdev
In-Reply-To: <20190709141735.19193-2-michal.kalderon@marvell.com>

On Tue, Jul 09, 2019 at 05:17:30PM +0300, Michal Kalderon wrote:
> Create some common API's for adding entries to a xa_mmap.
> Searching for an entry and freeing one.
> 
> The code was copied from the efa driver almost as is, just renamed
> function to be generic and not efa specific.
> 
> Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
>  drivers/infiniband/core/device.c      |   1 +
>  drivers/infiniband/core/rdma_core.c   |   1 +
>  drivers/infiniband/core/uverbs_cmd.c  |   1 +
>  drivers/infiniband/core/uverbs_main.c | 135 ++++++++++++++++++++++++++++++++++
>  include/rdma/ib_verbs.h               |  46 ++++++++++++
>  5 files changed, 184 insertions(+)
> 
> diff --git a/drivers/infiniband/core/device.c b/drivers/infiniband/core/device.c
> index 8a6ccb936dfe..a830c2c5d691 100644
> +++ b/drivers/infiniband/core/device.c
> @@ -2521,6 +2521,7 @@ void ib_set_device_ops(struct ib_device *dev, const struct ib_device_ops *ops)
>  	SET_DEVICE_OP(dev_ops, map_mr_sg_pi);
>  	SET_DEVICE_OP(dev_ops, map_phys_fmr);
>  	SET_DEVICE_OP(dev_ops, mmap);
> +	SET_DEVICE_OP(dev_ops, mmap_free);
>  	SET_DEVICE_OP(dev_ops, modify_ah);
>  	SET_DEVICE_OP(dev_ops, modify_cq);
>  	SET_DEVICE_OP(dev_ops, modify_device);
> diff --git a/drivers/infiniband/core/rdma_core.c b/drivers/infiniband/core/rdma_core.c
> index ccf4d069c25c..1ed01b02401f 100644
> +++ b/drivers/infiniband/core/rdma_core.c
> @@ -816,6 +816,7 @@ static void ufile_destroy_ucontext(struct ib_uverbs_file *ufile,
>  
>  	rdma_restrack_del(&ucontext->res);
>  
> +	rdma_user_mmap_entries_remove_free(ucontext);
>  	ib_dev->ops.dealloc_ucontext(ucontext);
>  	kfree(ucontext);
>  
> diff --git a/drivers/infiniband/core/uverbs_cmd.c b/drivers/infiniband/core/uverbs_cmd.c
> index 7ddd0e5bc6b3..44c0600245e4 100644
> +++ b/drivers/infiniband/core/uverbs_cmd.c
> @@ -254,6 +254,7 @@ static int ib_uverbs_get_context(struct uverbs_attr_bundle *attrs)
>  
>  	mutex_init(&ucontext->per_mm_list_lock);
>  	INIT_LIST_HEAD(&ucontext->per_mm_list);
> +	xa_init(&ucontext->mmap_xa);
>  
>  	ret = get_unused_fd_flags(O_CLOEXEC);
>  	if (ret < 0)
> diff --git a/drivers/infiniband/core/uverbs_main.c b/drivers/infiniband/core/uverbs_main.c
> index 11c13c1381cf..4b909d7b97de 100644
> +++ b/drivers/infiniband/core/uverbs_main.c
> @@ -965,6 +965,141 @@ int rdma_user_mmap_io(struct ib_ucontext *ucontext, struct vm_area_struct *vma,
>  }
>  EXPORT_SYMBOL(rdma_user_mmap_io);
>  
> +static inline u64
> +rdma_user_mmap_get_key(const struct rdma_user_mmap_entry *entry)
> +{
> +	return (u64)entry->mmap_page << PAGE_SHIFT;
> +}
> +
> +/**
> + * rdma_user_mmap_entry_get() - Get an entry from the mmap_xa.
> + *
> + * @ucontext: associated user context.
> + * @key: The key received from rdma_user_mmap_entry_insert which
> + *     is provided by user as the address to map.
> + * @len: The length the user wants to map
> + *
> + * This function is called when a user tries to mmap a key it
> + * initially received from the driver. They key was created by
> + * the function rdma_user_mmap_entry_insert.
> + *
> + * Return an entry if exists or NULL if there is no match.
> + */
> +struct rdma_user_mmap_entry *
> +rdma_user_mmap_entry_get(struct ib_ucontext *ucontext, u64 key, u64 len)
> +{
> +	struct rdma_user_mmap_entry *entry;
> +	u64 mmap_page;
> +
> +	mmap_page = key >> PAGE_SHIFT;
> +	if (mmap_page > U32_MAX)
> +		return NULL;
> +
> +	entry = xa_load(&ucontext->mmap_xa, mmap_page);
> +	if (!entry || entry->length != len)
> +		return NULL;
> +
> +	ibdev_dbg(ucontext->device,
> +		  "mmap: obj[0x%p] key[%#llx] addr[%#llx] len[%#llx] removed\n",
> +		  entry->obj, key, entry->address, entry->length);
> +
> +	return entry;
> +}
> +EXPORT_SYMBOL(rdma_user_mmap_entry_get);

It is a mistake we keep making, and maybe the war is hopelessly lost
now, but functions called from a driver should not be part of the
ib_uverbs module - ideally uverbs is an optional module. They should
be in ib_core.

Maybe put this in ib_core_uverbs.c ?

Kamal, you've been tackling various cleanups, maybe making ib_uverbs
unloadable again is something you'd be keen on?

> +/**
> + * rdma_user_mmap_entry_insert() - Allocate and insert an entry to the mmap_xa.
> + *
> + * @ucontext: associated user context.
> + * @obj: opaque driver object that will be stored in the entry.
> + * @address: The address that will be mmapped to the user
> + * @length: Length of the address that will be mmapped
> + * @mmap_flag: opaque driver flags related to the address (For
> + *           example could be used for cachability)
> + *
> + * This function should be called by drivers that use the rdma_user_mmap
> + * interface for handling user mmapped addresses. The database is handled in
> + * the core and helper functions are provided to insert entries into the
> + * database and extract entries when the user call mmap with the given key.
> + * The function returns a unique key that should be provided to user, the user
> + * will use the key to map the given address.
> + *
> + * Note this locking scheme cannot support removal of entries,
> + * except during ucontext destruction when the core code
> + * guarentees no concurrency.
> + *
> + * Return: unique key or RDMA_USER_MMAP_INVALID if entry was not added.
> + */
> +u64 rdma_user_mmap_entry_insert(struct ib_ucontext *ucontext, void *obj,
> +				u64 address, u64 length, u8 mmap_flag)
> +{
> +	struct rdma_user_mmap_entry *entry;
> +	u32 next_mmap_page;
> +	int err;
> +
> +	entry = kzalloc(sizeof(*entry), GFP_KERNEL);
> +	if (!entry)
> +		return RDMA_USER_MMAP_INVALID;
> +
> +	entry->obj = obj;
> +	entry->address = address;
> +	entry->length = length;
> +	entry->mmap_flag = mmap_flag;
> +
> +	xa_lock(&ucontext->mmap_xa);
> +	if (check_add_overflow(ucontext->mmap_xa_page,
> +			       (u32)(length >> PAGE_SHIFT),

Should this be divide round up ?

> +			       &next_mmap_page))
> +		goto err_unlock;

I still don't like that this algorithm latches into a permanent
failure when the xa_page wraps.

It seems worth spending a bit more time here to tidy this.. Keep using
the mmap_xa_page scheme, but instead do something like

alloc_cyclic_range():

while () {
   // Find first empty element in a cyclic way
   xa_page_first = mmap_xa_page;
   xa_find(xa, &xa_page_first, U32_MAX, XA_FREE_MARK)

   // Is there a enough room to have the range?
   if (check_add_overflow(xa_page_first, npages, &xa_page_end)) {
      mmap_xa_page = 0;
      continue;
   }

   // See if the element before intersects 
   elm = xa_find(xa, &zero, xa_page_end, 0);
   if (elm && intersects(xa_page_first, xa_page_last, elm->first, elm->last)) {
      mmap_xa_page = elm->last + 1;
      continue
   }
  
   // xa_page_first -> xa_page_end should now be free
   xa_insert(xa, xa_page_start, entry);
   mmap_xa_page = xa_page_end + 1;
   return xa_page_start;
}

Approximately, please check it.

> @@ -2199,6 +2201,17 @@ struct iw_cm_conn_param;
>  
>  #define DECLARE_RDMA_OBJ_SIZE(ib_struct) size_t size_##ib_struct
>  
> +#define RDMA_USER_MMAP_FLAG_SHIFT 56
> +#define RDMA_USER_MMAP_PAGE_MASK GENMASK(EFA_MMAP_FLAG_SHIFT - 1, 0)
> +#define RDMA_USER_MMAP_INVALID U64_MAX
> +struct rdma_user_mmap_entry {
> +	void *obj;
> +	u64 address;
> +	u64 length;
> +	u32 mmap_page;
> +	u8 mmap_flag;
> +};
> +
>  /**
>   * struct ib_device_ops - InfiniBand device operations
>   * This structure defines all the InfiniBand device operations, providers will
> @@ -2311,6 +2324,19 @@ struct ib_device_ops {
>  			      struct ib_udata *udata);
>  	void (*dealloc_ucontext)(struct ib_ucontext *context);
>  	int (*mmap)(struct ib_ucontext *context, struct vm_area_struct *vma);
> +	/**
> +	 * Memory that is mapped to the user can only be freed once the
> +	 * ucontext of the application is destroyed. This is for
> +	 * security reasons where we don't want an application to have a
> +	 * mapping to phyiscal memory that is freed and allocated to
> +	 * another application. For this reason, all the entries are
> +	 * stored in ucontext and once ucontext is freed mmap_free is
> +	 * called on each of the entries. They type of the memory that

They -> the

> +	 * was mapped may differ between entries and is opaque to the
> +	 * rdma_user_mmap interface. Therefore needs to be implemented
> +	 * by the driver in mmap_free.
> +	 */
> +	void (*mmap_free)(struct rdma_user_mmap_entry *entry);
>  	void (*disassociate_ucontext)(struct ib_ucontext *ibcontext);
>  	int (*alloc_pd)(struct ib_pd *pd, struct ib_udata *udata);
>  	void (*dealloc_pd)(struct ib_pd *pd, struct ib_udata *udata);
> @@ -2709,6 +2735,11 @@ void ib_set_device_ops(struct ib_device *device,
>  #if IS_ENABLED(CONFIG_INFINIBAND_USER_ACCESS)
>  int rdma_user_mmap_io(struct ib_ucontext *ucontext, struct vm_area_struct *vma,
>  		      unsigned long pfn, unsigned long size, pgprot_t prot);
> +u64 rdma_user_mmap_entry_insert(struct ib_ucontext *ucontext, void *obj,
> +				u64 address, u64 length, u8 mmap_flag);
> +struct rdma_user_mmap_entry *
> +rdma_user_mmap_entry_get(struct ib_ucontext *ucontext, u64 key, u64 len);
> +void rdma_user_mmap_entries_remove_free(struct ib_ucontext
> *ucontext);

Should remove_free should be in the core-priv header?

Jason

^ permalink raw reply

* Re: [PATCH v6 rdma-next 5/6] RDMA/qedr: Add doorbell overflow recovery support
From: Jason Gunthorpe @ 2019-07-25 18:01 UTC (permalink / raw)
  To: Michal Kalderon
  Cc: ariel.elior, dledford, galpress, linux-rdma, davem, netdev
In-Reply-To: <20190709141735.19193-6-michal.kalderon@marvell.com>

On Tue, Jul 09, 2019 at 05:17:34PM +0300, Michal Kalderon wrote:

> +static int qedr_init_user_db_rec(struct ib_udata *udata,
> +				 struct qedr_dev *dev, struct qedr_userq *q,
> +				 bool requires_db_rec)
> +{
> +	struct qedr_ucontext *uctx =
> +		rdma_udata_to_drv_context(udata, struct qedr_ucontext,
> +					  ibucontext);
> +
> +	/* Aborting for non doorbell userqueue (SRQ) or non-supporting lib */
> +	if (requires_db_rec == 0 || !uctx->db_rec)
> +		return 0;
> +
> +	/* Allocate a page for doorbell recovery, add to mmap ) */
> +	q->db_rec_data = (void *)get_zeroed_page(GFP_KERNEL);

I now think this needs to be GFP_USER and our other drivers have a bug
here as well..

Jason

^ permalink raw reply

* Re: [PATCH v6 rdma-next 0/6] RDMA/qedr: Use the doorbell overflow recovery mechanism for RDMA
From: Jason Gunthorpe @ 2019-07-25 18:01 UTC (permalink / raw)
  To: Michal Kalderon
  Cc: ariel.elior, dledford, galpress, linux-rdma, davem, netdev,
	Bernard Metzler
In-Reply-To: <20190709141735.19193-1-michal.kalderon@marvell.com>

On Tue, Jul 09, 2019 at 05:17:29PM +0300, Michal Kalderon wrote:
> This patch series uses the doorbell overflow recovery mechanism
> introduced in
> commit 36907cd5cd72 ("qed: Add doorbell overflow recovery mechanism")
> for rdma ( RoCE and iWARP )
> 
> The first three patches modify the core code to contain helper
> functions for managing mmap_xa inserting, getting and freeing
> entries. The code was taken almost as is from the efa driver.
> There is still an open discussion on whether we should take
> this even further and make the entire mmap generic. Until a
> decision is made, I only created the database API and modified
> the efa and qedr driver to use it. The doorbell recovery code will be based
> on the common code.
> 
> Efa driver was compile tested only.
> 
> rdma-core pull request #493
> 
> Changes from V5:
> - Switch between driver dealloc_ucontext and mmap_entries_remove.
> - No need to verify the key after using the key to load an entry from
>   the mmap_xa.
> - Change mmap_free api to pass an 'entry' object.
> - Add documentation for mmap_free and for newly exported functions.
> - Fix some extra/missing line breaks.

Lets do SIW now as well, it has the same xa scheme copied from EFA

Thanks,
Jason

^ permalink raw reply

* Re: [PATCH v6 rdma-next 4/6] qed*: Change dpi_addr to be denoted with __iomem
From: Jason Gunthorpe @ 2019-07-25 18:06 UTC (permalink / raw)
  To: Michal Kalderon
  Cc: ariel.elior, dledford, galpress, linux-rdma, davem, netdev
In-Reply-To: <20190709141735.19193-5-michal.kalderon@marvell.com>

On Tue, Jul 09, 2019 at 05:17:33PM +0300, Michal Kalderon wrote:
> Several casts were required around dpi_addr parameter in qed_rdma_if.h
> This is an address on the doorbell bar and should therefore be marked
> with __iomem.
> 
> Reported-by: Jason Gunthorpe <jgg@mellanox.com>
> Signed-off-by: Ariel Elior <ariel.elior@marvell.com>
> Signed-off-by: Michal Kalderon <michal.kalderon@marvell.com>
>  drivers/infiniband/hw/qedr/main.c          | 2 +-
>  drivers/infiniband/hw/qedr/qedr.h          | 2 +-
>  drivers/net/ethernet/qlogic/qed/qed_rdma.c | 5 ++---
>  include/linux/qed/qed_rdma_if.h            | 2 +-
>  4 files changed, 5 insertions(+), 6 deletions(-)

More lines are RDMA than net, so this patch applied to for-next

Thanks,
Jason

^ permalink raw reply

* [PATCH net] net: qualcomm: rmnet: Fix incorrect UL checksum offload logic
From: Subash Abhinov Kasiviswanathan @ 2019-07-25 18:07 UTC (permalink / raw)
  To: davem, netdev; +Cc: Subash Abhinov Kasiviswanathan, Sean Tranchetti

The udp_ip4_ind bit is set only for IPv4 UDP non-fragmented packets
so that the hardware can flip the checksum to 0xFFFF if the computed
checksum is 0 per RFC768.

However, this bit had to be set for IPv6 UDP non fragmented packets
as well per hardware requirements. Otherwise, IPv6 UDP packets
with computed checksum as 0 were transmitted by hardware and were
dropped in the network.

In addition to setting this bit for IPv6 UDP, the field is also
appropriately renamed to udp_ind as part of this change.

Fixes: 5eb5f8608ef1 ("net: qualcomm: rmnet: Add support for TX checksum offload")
Cc: Sean Tranchetti <stranche@codeaurora.org>
Signed-off-by: Subash Abhinov Kasiviswanathan <subashab@codeaurora.org>
---
 drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c | 13 +++++++++----
 include/linux/if_rmnet.h                             |  4 ++--
 2 files changed, 11 insertions(+), 6 deletions(-)

diff --git a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
index 6018992..21d3816 100644
--- a/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
+++ b/drivers/net/ethernet/qualcomm/rmnet/rmnet_map_data.c
@@ -206,9 +206,9 @@ static void rmnet_map_complement_ipv4_txporthdr_csum_field(void *iphdr)
 	ul_header->csum_insert_offset = skb->csum_offset;
 	ul_header->csum_enabled = 1;
 	if (ip4h->protocol == IPPROTO_UDP)
-		ul_header->udp_ip4_ind = 1;
+		ul_header->udp_ind = 1;
 	else
-		ul_header->udp_ip4_ind = 0;
+		ul_header->udp_ind = 0;
 
 	/* Changing remaining fields to network order */
 	hdr++;
@@ -239,6 +239,7 @@ static void rmnet_map_complement_ipv6_txporthdr_csum_field(void *ip6hdr)
 			      struct rmnet_map_ul_csum_header *ul_header,
 			      struct sk_buff *skb)
 {
+	struct ipv6hdr *ip6h = (struct ipv6hdr *)ip6hdr;
 	__be16 *hdr = (__be16 *)ul_header, offset;
 
 	offset = htons((__force u16)(skb_transport_header(skb) -
@@ -246,7 +247,11 @@ static void rmnet_map_complement_ipv6_txporthdr_csum_field(void *ip6hdr)
 	ul_header->csum_start_offset = offset;
 	ul_header->csum_insert_offset = skb->csum_offset;
 	ul_header->csum_enabled = 1;
-	ul_header->udp_ip4_ind = 0;
+
+	if (ip6h->nexthdr == IPPROTO_UDP)
+		ul_header->udp_ind = 1;
+	else
+		ul_header->udp_ind = 0;
 
 	/* Changing remaining fields to network order */
 	hdr++;
@@ -419,7 +424,7 @@ void rmnet_map_checksum_uplink_packet(struct sk_buff *skb,
 	ul_header->csum_start_offset = 0;
 	ul_header->csum_insert_offset = 0;
 	ul_header->csum_enabled = 0;
-	ul_header->udp_ip4_ind = 0;
+	ul_header->udp_ind = 0;
 
 	priv->stats.csum_sw++;
 }
diff --git a/include/linux/if_rmnet.h b/include/linux/if_rmnet.h
index b4f5403..9661416 100644
--- a/include/linux/if_rmnet.h
+++ b/include/linux/if_rmnet.h
@@ -41,11 +41,11 @@ struct rmnet_map_ul_csum_header {
 	__be16 csum_start_offset;
 #if defined(__LITTLE_ENDIAN_BITFIELD)
 	u16 csum_insert_offset:14;
-	u16 udp_ip4_ind:1;
+	u16 udp_ind:1;
 	u16 csum_enabled:1;
 #elif defined (__BIG_ENDIAN_BITFIELD)
 	u16 csum_enabled:1;
-	u16 udp_ip4_ind:1;
+	u16 udp_ind:1;
 	u16 csum_insert_offset:14;
 #else
 #error	"Please fix <asm/byteorder.h>"
-- 
1.9.1


^ permalink raw reply related

* Re: [PATCH] net: sfc: falcon: convert to i2c_new_dummy_device
From: David Miller @ 2019-07-25 18:19 UTC (permalink / raw)
  To: wsa+renesas
  Cc: linux-i2c, linux-net-drivers, ecree, mhabets, netdev,
	linux-kernel
In-Reply-To: <20190722172635.4535-1-wsa+renesas@sang-engineering.com>

From: Wolfram Sang <wsa+renesas@sang-engineering.com>
Date: Mon, 22 Jul 2019 19:26:35 +0200

> Move from i2c_new_dummy() to i2c_new_dummy_device(). So, we now get an
> ERRPTR which we use in error handling.
> 
> Signed-off-by: Wolfram Sang <wsa+renesas@sang-engineering.com>

Applied.

^ permalink raw reply

* Re: [PATCH] sis900: add support for ethtool --eeprom-dump
From: Andrew Lunn @ 2019-07-25 18:20 UTC (permalink / raw)
  To: Sergej Benilov; +Cc: venza, netdev
In-Reply-To: <CAC9-QvATLW0uCzGpeY1kLXs5BBsfNBF_BKCnCz+38_f+STJhog@mail.gmail.com>

On Thu, Jul 25, 2019 at 06:41:41PM +0200, Sergej Benilov wrote:
> On Thu, 25 Jul 2019 at 18:25, Andrew Lunn <andrew@lunn.ch> wrote:
> >
> > > +static int sis900_read_eeprom(struct net_device *net_dev, u8 *buf)
> > > +{
> > > +     struct sis900_private *sis_priv = netdev_priv(net_dev);
> > > +     void __iomem *ioaddr = sis_priv->ioaddr;
> > > +     int wait, ret = -EAGAIN;
> > > +     u16 signature;
> > > +     u16 *ebuf = (u16 *)buf;
> > > +     int i;
> > > +
> > > +     if (sis_priv->chipset_rev == SIS96x_900_REV) {
> > > +             sw32(mear, EEREQ);
> > > +             for (wait = 0; wait < 2000; wait++) {
> > > +                     if (sr32(mear) & EEGNT) {
> > > +                             /* read 16 bits, and index by 16 bits */
> > > +                             for (i = 0; i < sis_priv->eeprom_size / 2; i++)
> > > +                                     ebuf[i] = (u16)read_eeprom(ioaddr, i);
> > > +                     ret = 0;
> > > +                     break;
> > > +                     }
> > > +             udelay(1);
> > > +             }
> > > +     sw32(mear, EEDONE);
> >
> > The indentation looks all messed up here.
> 
> This has passed ./scripts/checkpatch.pl, as you had suggested for the
> previous patch.

checkpatch just checks for things like tabs vs space. 

I would expect the indentation to be more like:


     	if (sis_priv->chipset_rev == SIS96x_900_REV) {
             	sw32(mear, EEREQ);
		for (wait = 0; wait < 2000; wait++) {
			if (sr32(mear) & EEGNT) {
				/* read 16 bits, and index by 16 bits */
				for (i = 0; i < sis_priv->eeprom_size / 2; i++)
					ebuf[i] = (u16)read_eeprom(ioaddr, i);
				ret = 0;
				break;
			}
			udelay(1);
		}
		sw32(mear, EEDONE);
	} else {
		signature = (u16)read_eeprom(ioaddr, EEPROMSignature);
		if (signature != 0xffff && signature != 0x0000) {
			/* read 16 bits, and index by 16 bits */
			for (i = 0; i < sis_priv->eeprom_size / 2; i++)
				ebuf[i] = (u16)read_eeprom(ioaddr, i);
			ret = 0;
		}
	}
	return ret;

> > Why do you not put the data directly into data and avoid this memory
> > allocation, and memcpy?
> 
> Because EEPROM data from 'eeprom->offset' offset and of 'eeprom->len'
> length only is expected to be returned in 'data'.

O.K.

	Andrew

^ permalink raw reply

* Re: [PATCH] net-next: ag71xx: Rearrange ag711xx struct to remove holes
From: David Miller @ 2019-07-25 18:21 UTC (permalink / raw)
  To: rosenp; +Cc: netdev
In-Reply-To: <20190723034309.16492-1-rosenp@gmail.com>

From: Rosen Penev <rosenp@gmail.com>
Date: Mon, 22 Jul 2019 20:43:09 -0700

> Removed ____cacheline_aligned attribute to ring structs. This actually
> causes holes in the ag71xx struc as well as lower performance.

If you are legitimizing a change because of performance you must provide
detailed performance results that support this reason.

I'm not applying this patch until you respin it with the required
information.

Thank you.

^ permalink raw reply

* Re: [Patch net] ife: error out when nla attributes are empty
From: David Miller @ 2019-07-25 18:22 UTC (permalink / raw)
  To: xiyou.wangcong; +Cc: netdev, syzbot+fbb5b288c9cb6a2eeac4, jhs, jiri
In-Reply-To: <20190723044300.16143-1-xiyou.wangcong@gmail.com>

From: Cong Wang <xiyou.wangcong@gmail.com>
Date: Mon, 22 Jul 2019 21:43:00 -0700

> act_ife at least requires TCA_IFE_PARMS, so we have to bail out
> when there is no attribute passed in.
> 
> Reported-by: syzbot+fbb5b288c9cb6a2eeac4@syzkaller.appspotmail.com
> Fixes: ef6980b6becb ("introduce IFE action")
> Cc: Jamal Hadi Salim <jhs@mojatatu.com>
> Cc: Jiri Pirko <jiri@resnulli.us>
> Signed-off-by: Cong Wang <xiyou.wangcong@gmail.com>

Applied and queued up for -stable.

^ permalink raw reply

* Re: [RFC] dt-bindings: net: phy: Add subnode for LED configuration
From: Andrew Lunn @ 2019-07-25 18:34 UTC (permalink / raw)
  To: Matthias Kaehlcke
  Cc: David S . Miller, Rob Herring, Mark Rutland, Florian Fainelli,
	Heiner Kallweit, netdev, devicetree, linux-kernel,
	Douglas Anderson
In-Reply-To: <20190725175258.GE250418@google.com>

> As of now I don't plan to expose the label to userspace by the PHY
> driver/framework itself.

Great.

With that change, i think this proposed binding is O.K.

     Andrew

^ permalink raw reply

* Re: [PATCH net 0/2] DIM fixes for 5.3
From: David Miller @ 2019-07-25 18:34 UTC (permalink / raw)
  To: leon; +Cc: leonro, dledford, jgg, linux-rdma, talgi, yaminf, saeedm, netdev
In-Reply-To: <20190723072248.6844-1-leon@kernel.org>

From: Leon Romanovsky <leon@kernel.org>
Date: Tue, 23 Jul 2019 10:22:46 +0300

> Those two fixes for recently merged DIM patches, both exposed through
> RDMa DIM usage.

Series applied.

^ permalink raw reply

* Re: [PATCH net-next 0/2] mlxsw: Two small updates
From: David Miller @ 2019-07-25 18:36 UTC (permalink / raw)
  To: idosch; +Cc: netdev, jiri, amitc, mlxsw, idosch
In-Reply-To: <20190723075742.29029-1-idosch@idosch.org>

From: Ido Schimmel <idosch@idosch.org>
Date: Tue, 23 Jul 2019 10:57:40 +0300

> From: Ido Schimmel <idosch@mellanox.com>
> 
> Patch #1, from Amit, exposes the size of the key-value database (KVD)
> where different entries (e.g., routes, neighbours) are stored in the
> device. This allows users to understand how many entries can be
> offloaded and is also useful for writing scale tests.
> 
> Patch #2 increases the number of IPv6 nexthop groups mlxsw can offload.
> The problem and solution are explained in detail in the commit message.

Series applied.

^ permalink raw reply

* Re: [PATCH] ptp: ptp_dte: remove redundant dev_err message
From: David Miller @ 2019-07-25 18:38 UTC (permalink / raw)
  To: dingxiang; +Cc: richardcochran, netdev, linux-kernel
In-Reply-To: <1563872045-3692-1-git-send-email-dingxiang@cmss.chinamobile.com>

From: Ding Xiang <dingxiang@cmss.chinamobile.com>
Date: Tue, 23 Jul 2019 16:54:05 +0800

> devm_ioremap_resource already contains error message, so remove
> the redundant dev_err message
> 
> Signed-off-by: Ding Xiang <dingxiang@cmss.chinamobile.com>

Applied to net-next.

^ permalink raw reply

* Re: [PATCH v2] tun: mark small packets as owned by the tap sock
From: David Miller @ 2019-07-25 18:39 UTC (permalink / raw)
  To: abauvin; +Cc: stephen, jasowang, netdev
In-Reply-To: <20190723142301.39568-1-abauvin@scaleway.com>

From: Alexis Bauvin <abauvin@scaleway.com>
Date: Tue, 23 Jul 2019 16:23:01 +0200

> - v1 -> v2: Move skb_set_owner_w to __tun_build_skb to reduce patch size
> 
> Small packets going out of a tap device go through an optimized code
> path that uses build_skb() rather than sock_alloc_send_pskb(). The
> latter calls skb_set_owner_w(), but the small packet code path does not.
> 
> The net effect is that small packets are not owned by the userland
> application's socket (e.g. QEMU), while large packets are.
> This can be seen with a TCP session, where packets are not owned when
> the window size is small enough (around PAGE_SIZE), while they are once
> the window grows (note that this requires the host to support virtio
> tso for the guest to offload segmentation).
> All this leads to inconsistent behaviour in the kernel, especially on
> netfilter modules that uses sk->socket (e.g. xt_owner).
> 
> Signed-off-by: Alexis Bauvin <abauvin@scaleway.com>
> Fixes: 66ccbc9c87c2 ("tap: use build_skb() for small packet")

Applied and queued up for -stable, thanks.

^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox