Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH bpf-next 07/11] bpf: make sure to clear unused fields in tunnel/xfrm state fetch
From: Song Liu @ 2018-05-30 17:15 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, Networking
In-Reply-To: <20180528004344.3606-8-daniel@iogearbox.net>

On Sun, May 27, 2018 at 5:43 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> Since the remaining bits are not filled in struct bpf_tunnel_key
> resp. struct bpf_xfrm_state and originate from uninitialized stack
> space, we should make sure to clear them before handing control
> back to the program.
>
> Also add a padding element to struct bpf_xfrm_state for future use
> similar as we have in struct bpf_tunnel_key and clear it as well.
>
>   struct bpf_xfrm_state {
>       __u32                      reqid;            /*     0     4 */
>       __u32                      spi;              /*     4     4 */
>       __u16                      family;           /*     8     2 */
>
>       /* XXX 2 bytes hole, try to pack */
>
>       union {
>           __u32              remote_ipv4;          /*           4 */
>           __u32              remote_ipv6[4];       /*          16 */
>       };                                           /*    12    16 */
>
>       /* size: 28, cachelines: 1, members: 4 */
>       /* sum members: 26, holes: 1, sum holes: 2 */
>       /* last cacheline: 28 bytes */
>   };
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  include/uapi/linux/bpf.h | 3 ++-
>  net/core/filter.c        | 6 ++++++
>  2 files changed, 8 insertions(+), 1 deletion(-)
>
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index e2853aa..7108711 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2214,7 +2214,7 @@ struct bpf_tunnel_key {
>         };
>         __u8 tunnel_tos;
>         __u8 tunnel_ttl;
> -       __u16 tunnel_ext;
> +       __u16 tunnel_ext;       /* Padding, future use. */
>         __u32 tunnel_label;
>  };
>
> @@ -2225,6 +2225,7 @@ struct bpf_xfrm_state {
>         __u32 reqid;
>         __u32 spi;      /* Stored in network byte order */
>         __u16 family;
> +       __u16 ext;      /* Padding, future use. */
>         union {
>                 __u32 remote_ipv4;      /* Stored in network byte order */
>                 __u32 remote_ipv6[4];   /* Stored in network byte order */
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 717c740..5ceb5e6 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -3445,6 +3445,7 @@ BPF_CALL_4(bpf_skb_get_tunnel_key, struct sk_buff *, skb, struct bpf_tunnel_key
>         to->tunnel_id = be64_to_cpu(info->key.tun_id);
>         to->tunnel_tos = info->key.tos;
>         to->tunnel_ttl = info->key.ttl;
> +       to->tunnel_ext = 0;
>
>         if (flags & BPF_F_TUNINFO_IPV6) {
>                 memcpy(to->remote_ipv6, &info->key.u.ipv6.src,
> @@ -3452,6 +3453,8 @@ BPF_CALL_4(bpf_skb_get_tunnel_key, struct sk_buff *, skb, struct bpf_tunnel_key
>                 to->tunnel_label = be32_to_cpu(info->key.label);
>         } else {
>                 to->remote_ipv4 = be32_to_cpu(info->key.u.ipv4.src);
> +               memset(&to->remote_ipv6[1], 0, sizeof(__u32) * 3);
> +               to->tunnel_label = 0;
>         }
>
>         if (unlikely(size != sizeof(struct bpf_tunnel_key)))
> @@ -4047,11 +4050,14 @@ BPF_CALL_5(bpf_skb_get_xfrm_state, struct sk_buff *, skb, u32, index,
>         to->reqid = x->props.reqid;
>         to->spi = x->id.spi;
>         to->family = x->props.family;
> +       to->ext = 0;
> +
>         if (to->family == AF_INET6) {
>                 memcpy(to->remote_ipv6, x->props.saddr.a6,
>                        sizeof(to->remote_ipv6));
>         } else {
>                 to->remote_ipv4 = x->props.saddr.a4;
> +               memset(&to->remote_ipv6[1], 0, sizeof(__u32) * 3);
>         }
>
>         return 0;
> --
> 2.9.5
>

^ permalink raw reply

* Re: [PATCH bpf-next 05/11] bpf: avoid retpoline for lookup/update/delete calls on maps
From: Song Liu @ 2018-05-30 17:06 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, Networking
In-Reply-To: <20180528004344.3606-6-daniel@iogearbox.net>

On Sun, May 27, 2018 at 5:43 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> While some of the BPF map lookup helpers provide a ->map_gen_lookup()
> callback for inlining the map lookup altogether it is not available
> for every map, so the remaining ones have to call bpf_map_lookup_elem()
> helper which does a dispatch to map->ops->map_lookup_elem(). In
> times of retpolines, this will control and trap speculative execution
> rather than letting it do its work for the indirect call and will
> therefore cause a slowdown. Likewise, bpf_map_update_elem() and
> bpf_map_delete_elem() do not have an inlined version and need to call
> into their map->ops->map_update_elem() resp. map->ops->map_delete_elem()
> handlers.
>
> Before:
>
>   # bpftool p d x i 1
>     0: (bf) r2 = r10
>     1: (07) r2 += -8
>     2: (7a) *(u64 *)(r2 +0) = 0
>     3: (18) r1 = map[id:1]
>     5: (85) call __htab_map_lookup_elem#232656
>     6: (15) if r0 == 0x0 goto pc+4
>     7: (71) r1 = *(u8 *)(r0 +35)
>     8: (55) if r1 != 0x0 goto pc+1
>     9: (72) *(u8 *)(r0 +35) = 1
>    10: (07) r0 += 56
>    11: (15) if r0 == 0x0 goto pc+4
>    12: (bf) r2 = r0
>    13: (18) r1 = map[id:1]
>    15: (85) call bpf_map_delete_elem#215008  <-- indirect call via
>    16: (95) exit                                 helper
>
> After:
>
>   # bpftool p d x i 1
>     0: (bf) r2 = r10
>     1: (07) r2 += -8
>     2: (7a) *(u64 *)(r2 +0) = 0
>     3: (18) r1 = map[id:1]
>     5: (85) call __htab_map_lookup_elem#233328
>     6: (15) if r0 == 0x0 goto pc+4
>     7: (71) r1 = *(u8 *)(r0 +35)
>     8: (55) if r1 != 0x0 goto pc+1
>     9: (72) *(u8 *)(r0 +35) = 1
>    10: (07) r0 += 56
>    11: (15) if r0 == 0x0 goto pc+4
>    12: (bf) r2 = r0
>    13: (18) r1 = map[id:1]
>    15: (85) call htab_lru_map_delete_elem#238240  <-- direct call
>    16: (95) exit
>
> In all three lookup/update/delete cases however we can use the actual
> address of the map callback directly if we find that there's only a
> single path with a map pointer leading to the helper call, meaning
> when the map pointer has not been poisoned from verifier side.
> Example code can be seen above for the delete case.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  include/linux/filter.h |  3 +++
>  kernel/bpf/hashtab.c   | 12 ++++++---
>  kernel/bpf/verifier.c  | 67 +++++++++++++++++++++++++++++++++++++-------------
>  3 files changed, 62 insertions(+), 20 deletions(-)
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index b443f70..d407ede 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -301,6 +301,9 @@ struct xdp_buff;
>
>  /* Function call */
>
> +#define BPF_CAST_CALL(x)                                       \
> +               ((u64 (*)(u64, u64, u64, u64, u64))(x))
> +
>  #define BPF_EMIT_CALL(FUNC)                                    \
>         ((struct bpf_insn) {                                    \
>                 .code  = BPF_JMP | BPF_CALL,                    \
> diff --git a/kernel/bpf/hashtab.c b/kernel/bpf/hashtab.c
> index b76828f..3ca2198 100644
> --- a/kernel/bpf/hashtab.c
> +++ b/kernel/bpf/hashtab.c
> @@ -503,7 +503,9 @@ static u32 htab_map_gen_lookup(struct bpf_map *map, struct bpf_insn *insn_buf)
>         struct bpf_insn *insn = insn_buf;
>         const int ret = BPF_REG_0;
>
> -       *insn++ = BPF_EMIT_CALL((u64 (*)(u64, u64, u64, u64, u64))__htab_map_lookup_elem);
> +       BUILD_BUG_ON(!__same_type(&__htab_map_lookup_elem,
> +                    (void *(*)(struct bpf_map *map, void *key))NULL));
> +       *insn++ = BPF_EMIT_CALL(BPF_CAST_CALL(__htab_map_lookup_elem));
>         *insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 1);
>         *insn++ = BPF_ALU64_IMM(BPF_ADD, ret,
>                                 offsetof(struct htab_elem, key) +
> @@ -530,7 +532,9 @@ static u32 htab_lru_map_gen_lookup(struct bpf_map *map,
>         const int ret = BPF_REG_0;
>         const int ref_reg = BPF_REG_1;
>
> -       *insn++ = BPF_EMIT_CALL((u64 (*)(u64, u64, u64, u64, u64))__htab_map_lookup_elem);
> +       BUILD_BUG_ON(!__same_type(&__htab_map_lookup_elem,
> +                    (void *(*)(struct bpf_map *map, void *key))NULL));
> +       *insn++ = BPF_EMIT_CALL(BPF_CAST_CALL(__htab_map_lookup_elem));
>         *insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 4);
>         *insn++ = BPF_LDX_MEM(BPF_B, ref_reg, ret,
>                               offsetof(struct htab_elem, lru_node) +
> @@ -1369,7 +1373,9 @@ static u32 htab_of_map_gen_lookup(struct bpf_map *map,
>         struct bpf_insn *insn = insn_buf;
>         const int ret = BPF_REG_0;
>
> -       *insn++ = BPF_EMIT_CALL((u64 (*)(u64, u64, u64, u64, u64))__htab_map_lookup_elem);
> +       BUILD_BUG_ON(!__same_type(&__htab_map_lookup_elem,
> +                    (void *(*)(struct bpf_map *map, void *key))NULL));
> +       *insn++ = BPF_EMIT_CALL(BPF_CAST_CALL(__htab_map_lookup_elem));
>         *insn++ = BPF_JMP_IMM(BPF_JEQ, ret, 0, 2);
>         *insn++ = BPF_ALU64_IMM(BPF_ADD, ret,
>                                 offsetof(struct htab_elem, key) +
> diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
> index 4f4786e..5684b15 100644
> --- a/kernel/bpf/verifier.c
> +++ b/kernel/bpf/verifier.c
> @@ -2421,8 +2421,11 @@ record_func_map(struct bpf_verifier_env *env, struct bpf_call_arg_meta *meta,
>         struct bpf_insn_aux_data *aux = &env->insn_aux_data[insn_idx];
>
>         if (func_id != BPF_FUNC_tail_call &&
> -           func_id != BPF_FUNC_map_lookup_elem)
> +           func_id != BPF_FUNC_map_lookup_elem &&
> +           func_id != BPF_FUNC_map_update_elem &&
> +           func_id != BPF_FUNC_map_delete_elem)
>                 return 0;
> +
>         if (meta->map_ptr == NULL) {
>                 verbose(env, "kernel subsystem misconfigured verifier\n");
>                 return -EINVAL;
> @@ -5586,6 +5589,7 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
>         struct bpf_insn *insn = prog->insnsi;
>         const struct bpf_func_proto *fn;
>         const int insn_cnt = prog->len;
> +       const struct bpf_map_ops *ops;
>         struct bpf_insn_aux_data *aux;
>         struct bpf_insn insn_buf[16];
>         struct bpf_prog *new_prog;
> @@ -5715,10 +5719,13 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
>                 }
>
>                 /* BPF_EMIT_CALL() assumptions in some of the map_gen_lookup
> -                * handlers are currently limited to 64 bit only.
> +                * and other inlining handlers are currently limited to 64 bit
> +                * only.
>                  */
>                 if (prog->jit_requested && BITS_PER_LONG == 64 &&
> -                   insn->imm == BPF_FUNC_map_lookup_elem) {
> +                   (insn->imm == BPF_FUNC_map_lookup_elem ||
> +                    insn->imm == BPF_FUNC_map_update_elem ||
> +                    insn->imm == BPF_FUNC_map_delete_elem)) {
>                         aux = &env->insn_aux_data[i + delta];
>                         if (bpf_map_ptr_poisoned(aux))
>                                 goto patch_call_imm;
> @@ -5727,23 +5734,49 @@ static int fixup_bpf_calls(struct bpf_verifier_env *env)
>                         if (!map_ptr->ops->map_gen_lookup)
>                                 goto patch_call_imm;
>
> -                       cnt = map_ptr->ops->map_gen_lookup(map_ptr, insn_buf);
> -                       if (cnt == 0 || cnt >= ARRAY_SIZE(insn_buf)) {
> -                               verbose(env, "bpf verifier is misconfigured\n");
> -                               return -EINVAL;
> -                       }
> +                       ops = map_ptr->ops;
> +                       if (insn->imm == BPF_FUNC_map_lookup_elem &&
> +                           ops->map_gen_lookup) {
> +                               cnt = ops->map_gen_lookup(map_ptr, insn_buf);
> +                               if (cnt == 0 || cnt >= ARRAY_SIZE(insn_buf)) {
> +                                       verbose(env, "bpf verifier is misconfigured\n");
> +                                       return -EINVAL;
> +                               }
>
> -                       new_prog = bpf_patch_insn_data(env, i + delta, insn_buf,
> -                                                      cnt);
> -                       if (!new_prog)
> -                               return -ENOMEM;
> +                               new_prog = bpf_patch_insn_data(env, i + delta,
> +                                                              insn_buf, cnt);
> +                               if (!new_prog)
> +                                       return -ENOMEM;
>
> -                       delta += cnt - 1;
> +                               delta    += cnt - 1;
> +                               env->prog = prog = new_prog;
> +                               insn      = new_prog->insnsi + i + delta;
> +                               continue;
> +                       }
>
> -                       /* keep walking new program and skip insns we just inserted */
> -                       env->prog = prog = new_prog;
> -                       insn      = new_prog->insnsi + i + delta;
> -                       continue;
> +                       BUILD_BUG_ON(!__same_type(ops->map_lookup_elem,
> +                                    (void *(*)(struct bpf_map *map, void *key))NULL));
> +                       BUILD_BUG_ON(!__same_type(ops->map_delete_elem,
> +                                    (int (*)(struct bpf_map *map, void *key))NULL));
> +                       BUILD_BUG_ON(!__same_type(ops->map_update_elem,
> +                                    (int (*)(struct bpf_map *map, void *key, void *value,
> +                                             u64 flags))NULL));
> +                       switch (insn->imm) {
> +                       case BPF_FUNC_map_lookup_elem:
> +                               insn->imm = BPF_CAST_CALL(ops->map_lookup_elem) -
> +                                           __bpf_call_base;
> +                               continue;
> +                       case BPF_FUNC_map_update_elem:
> +                               insn->imm = BPF_CAST_CALL(ops->map_update_elem) -
> +                                           __bpf_call_base;
> +                               continue;
> +                       case BPF_FUNC_map_delete_elem:
> +                               insn->imm = BPF_CAST_CALL(ops->map_delete_elem) -
> +                                           __bpf_call_base;
> +                               continue;
> +                       }
> +
> +                       goto patch_call_imm;
>                 }
>
>                 if (insn->imm == BPF_FUNC_redirect_map) {
> --
> 2.9.5
>

^ permalink raw reply

* [PATCH iproute2 net-next] iproute: ip route get support for sport, dport and ipproto match
From: Roopa Prabhu @ 2018-05-30 17:06 UTC (permalink / raw)
  To: dsahern; +Cc: netdev

From: Roopa Prabhu <roopa@cumulusnetworks.com>

Signed-off-by: Roopa Prabhu <roopa@cumulusnetworks.com>
---
 ip/iproute.c           | 26 +++++++++++++++++++++++++-
 man/man8/ip-route.8.in | 20 +++++++++++++++++++-
 2 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/ip/iproute.c b/ip/iproute.c
index 56dd9f2..ef04d27 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -69,7 +69,8 @@ static void usage(void)
 		"                            [ from ADDRESS iif STRING ]\n"
 		"                            [ oif STRING ] [ tos TOS ]\n"
 		"                            [ mark NUMBER ] [ vrf NAME ]\n"
-		"                            [ uid NUMBER ]\n"
+		"                            [ uid NUMBER ] [ ipproto PROTOCOL ]\n"
+		"                            [ sport NUMBER ] [ dport NUMBER ]\n"
 		"       ip route { add | del | change | append | replace } ROUTE\n"
 		"SELECTOR := [ root PREFIX ] [ match PREFIX ] [ exact PREFIX ]\n"
 		"            [ table TABLE_ID ] [ vrf NAME ] [ proto RTPROTO ]\n"
@@ -1994,6 +1995,29 @@ static int iproute_get(int argc, char **argv)
 				req.r.rtm_family = addr.family;
 			addattr_l(&req.n, sizeof(req), RTA_NEWDST,
 				  &addr.data, addr.bytelen);
+		} else if (matches(*argv, "sport") == 0) {
+			__be16 sport;
+
+			NEXT_ARG();
+			if (get_be16(&sport, *argv, 0))
+				invarg("invalid sport\n", *argv);
+			addattr16(&req.n, sizeof(req), RTA_SPORT, sport);
+		} else if (matches(*argv, "dport") == 0) {
+			__be16 dport;
+
+			NEXT_ARG();
+			if (get_be16(&dport, *argv, 0))
+				invarg("invalid dport\n", *argv);
+			addattr16(&req.n, sizeof(req), RTA_DPORT, dport);
+		} else if (matches(*argv, "ipproto") == 0) {
+			int ipproto;
+
+			NEXT_ARG();
+			ipproto = inet_proto_a2n(*argv);
+			if (ipproto < 0)
+				invarg("Invalid \"ipproto\" value\n",
+				       *argv);
+			addattr8(&req.n, sizeof(req), RTA_IP_PROTO, ipproto);
 		} else {
 			inet_prefix addr;
 
diff --git a/man/man8/ip-route.8.in b/man/man8/ip-route.8.in
index b28f3d2..b21a847 100644
--- a/man/man8/ip-route.8.in
+++ b/man/man8/ip-route.8.in
@@ -38,7 +38,13 @@ ip-route \- routing table management
 .B  tos
 .IR TOS " ] [ "
 .B  vrf
-.IR NAME " ] "
+.IR NAME " ] [ "
+.B  ipproto
+.IR PROTOCOL " ] [ "
+.B  sport
+.IR NUMBER " ] [ "
+.B  dport
+.IR NUMBER " ] "
 
 .ti -8
 .BR "ip route" " { " add " | " del " | " change " | " append " | "\
@@ -1045,6 +1051,18 @@ the firewall mark
 force the vrf device on which this packet will be routed.
 
 .TP
+.BI ipproto " PROTOCOL"
+ip protocol as seen by the route lookup
+
+.TP
+.BI sport " NUMBER"
+source port as seen by the route lookup
+
+.TP
+.BI dport " NUMBER"
+destination port as seen by the route lookup
+
+.TP
 .B connected
 if no source address
 .RB "(option " from ")"
-- 
2.1.4

^ permalink raw reply related

* Re: [PATCH bpf-next 09/11] bpf: fix context access in tracing progs on 32 bit archs
From: Song Liu @ 2018-05-30 16:46 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, Networking
In-Reply-To: <20180528004344.3606-10-daniel@iogearbox.net>

On Sun, May 27, 2018 at 5:43 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> Wang reported that all the testcases for BPF_PROG_TYPE_PERF_EVENT
> program type in test_verifier report the following errors on x86_32:
>
>   172/p unpriv: spill/fill of different pointers ldx FAIL
>   Unexpected error message!
>   0: (bf) r6 = r10
>   1: (07) r6 += -8
>   2: (15) if r1 == 0x0 goto pc+3
>   R1=ctx(id=0,off=0,imm=0) R6=fp-8,call_-1 R10=fp0,call_-1
>   3: (bf) r2 = r10
>   4: (07) r2 += -76
>   5: (7b) *(u64 *)(r6 +0) = r2
>   6: (55) if r1 != 0x0 goto pc+1
>   R1=ctx(id=0,off=0,imm=0) R2=fp-76,call_-1 R6=fp-8,call_-1 R10=fp0,call_-1 fp-8=fp
>   7: (7b) *(u64 *)(r6 +0) = r1
>   8: (79) r1 = *(u64 *)(r6 +0)
>   9: (79) r1 = *(u64 *)(r1 +68)
>   invalid bpf_context access off=68 size=8
>
>   378/p check bpf_perf_event_data->sample_period byte load permitted FAIL
>   Failed to load prog 'Permission denied'!
>   0: (b7) r0 = 0
>   1: (71) r0 = *(u8 *)(r1 +68)
>   invalid bpf_context access off=68 size=1
>
>   379/p check bpf_perf_event_data->sample_period half load permitted FAIL
>   Failed to load prog 'Permission denied'!
>   0: (b7) r0 = 0
>   1: (69) r0 = *(u16 *)(r1 +68)
>   invalid bpf_context access off=68 size=2
>
>   380/p check bpf_perf_event_data->sample_period word load permitted FAIL
>   Failed to load prog 'Permission denied'!
>   0: (b7) r0 = 0
>   1: (61) r0 = *(u32 *)(r1 +68)
>   invalid bpf_context access off=68 size=4
>
>   381/p check bpf_perf_event_data->sample_period dword load permitted FAIL
>   Failed to load prog 'Permission denied'!
>   0: (b7) r0 = 0
>   1: (79) r0 = *(u64 *)(r1 +68)
>   invalid bpf_context access off=68 size=8
>
> Reason is that struct pt_regs on x86_32 doesn't fully align to 8 byte
> boundary due to its size of 68 bytes.
>
> Therefore, bpf_ctx_narrow_access_ok() will then bail out saying that
> off & (size_default - 1) which is 68 & 7 doesn't cleanly align in the
> case of sample_period access from struct bpf_perf_event_data, hence
> verifier wrongly thinks we might be doing an unaligned access here.
> Therefore adjust this down to machine size and check the offset for
> narrow access on that basis.
>
> We also need to fix pe_prog_is_valid_access(), since we hit the check
> for off % size != 0 (e.g. 68 % 8 -> 4) in the first and last test.
>
> Reported-by: Wang YanQing <udknight@gmail.com>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>
> ---
>  include/linux/filter.h   | 30 ++++++++++++++++++++++++------
>  kernel/trace/bpf_trace.c | 10 ++++++++--
>  2 files changed, 32 insertions(+), 8 deletions(-)
>
> diff --git a/include/linux/filter.h b/include/linux/filter.h
> index d407ede..89903d2 100644
> --- a/include/linux/filter.h
> +++ b/include/linux/filter.h
> @@ -639,16 +639,34 @@ static inline bool bpf_prog_was_classic(const struct bpf_prog *prog)
>         return prog->type == BPF_PROG_TYPE_UNSPEC;
>  }
>
> -static inline bool
> -bpf_ctx_narrow_access_ok(u32 off, u32 size, const u32 size_default)
> +static inline u32 bpf_ctx_off_adjust_machine(u32 size)
> +{
> +       const u32 size_machine = sizeof(unsigned long);
> +
> +       if (size > size_machine && size % size_machine == 0)
> +               size = size_machine;

Not sure whether I understand this correctly. I guess we only need:
    if (size % size_machine == 0)
               size = size_machine;

Or, is this function equivalent to
    if (size == 8 && size_machine == 4)
         size = 4;

If this is the case, maybe we can make bpf_ctx_narrow_align_ok()
simpler?

Thanks,
Song

> +
> +       return size;
> +}
> +
> +static inline bool bpf_ctx_narrow_align_ok(u32 off, u32 size_access,
> +                                          u32 size_default)
>  {
> -       bool off_ok;
> +       size_default = bpf_ctx_off_adjust_machine(size_default);
> +       size_access  = bpf_ctx_off_adjust_machine(size_access);
> +
>  #ifdef __LITTLE_ENDIAN
> -       off_ok = (off & (size_default - 1)) == 0;
> +       return (off & (size_default - 1)) == 0;
>  #else
> -       off_ok = (off & (size_default - 1)) + size == size_default;
> +       return (off & (size_default - 1)) + size_access == size_default;
>  #endif
> -       return off_ok && size <= size_default && (size & (size - 1)) == 0;
> +}
> +
> +static inline bool
> +bpf_ctx_narrow_access_ok(u32 off, u32 size, u32 size_default)
> +{
> +       return bpf_ctx_narrow_align_ok(off, size, size_default) &&
> +              size <= size_default && (size & (size - 1)) == 0;
>  }
>
>  #define bpf_classic_proglen(fprog) (fprog->len * sizeof(fprog->filter[0]))
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 81fdf2f..7269530 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -880,8 +880,14 @@ static bool pe_prog_is_valid_access(int off, int size, enum bpf_access_type type
>                 return false;
>         if (type != BPF_READ)
>                 return false;
> -       if (off % size != 0)
> -               return false;
> +       if (off % size != 0) {
> +               if (sizeof(unsigned long) != 4)
> +                       return false;
> +               if (size != 8)
> +                       return false;
> +               if (off % size != 4)
> +                       return false;
> +       }
>
>         switch (off) {
>         case bpf_ctx_range(struct bpf_perf_event_data, sample_period):
> --
> 2.9.5
>

^ permalink raw reply

* [PATCH] samples/bpf: Add xdp_sample_pkts example
From: Toke Høiland-Jørgensen @ 2018-05-30 16:45 UTC (permalink / raw)
  To: netdev; +Cc: Toke Høiland-Jørgensen

This adds an example program showing how to sample packets from XDP using
the perf event buffer. The example userspace program just prints the
ethernet header for every packet sampled.

Most of the userspace code is borrowed from other examples, most notably
trace_output.

Note that the example only works when everything runs on CPU0; so
suitable smp_affinity needs to be set on the device. Some drivers seem
to reset smp_affinity when loading an XDP program, so it may be
necessary to change it after starting the example userspace program.

Signed-off-by: Toke Høiland-Jørgensen <toke@toke.dk>
---
 samples/bpf/Makefile               |   4 +
 samples/bpf/xdp_sample_pkts_kern.c |  48 ++++++++++++
 samples/bpf/xdp_sample_pkts_user.c | 147 +++++++++++++++++++++++++++++++++++++
 3 files changed, 199 insertions(+)
 create mode 100644 samples/bpf/xdp_sample_pkts_kern.c
 create mode 100644 samples/bpf/xdp_sample_pkts_user.c

diff --git a/samples/bpf/Makefile b/samples/bpf/Makefile
index 1303af1..6f0c6d2 100644
--- a/samples/bpf/Makefile
+++ b/samples/bpf/Makefile
@@ -52,6 +52,7 @@ hostprogs-y += xdp_adjust_tail
 hostprogs-y += xdpsock
 hostprogs-y += xdp_fwd
 hostprogs-y += task_fd_query
+hostprogs-y += xdp_sample_pkts
 
 # Libbpf dependencies
 LIBBPF = $(TOOLS_PATH)/lib/bpf/libbpf.a
@@ -107,6 +108,7 @@ xdp_adjust_tail-objs := xdp_adjust_tail_user.o
 xdpsock-objs := bpf_load.o xdpsock_user.o
 xdp_fwd-objs := bpf_load.o xdp_fwd_user.o
 task_fd_query-objs := bpf_load.o task_fd_query_user.o $(TRACE_HELPERS)
+xdp_sample_pkts-objs := bpf_load.o xdp_sample_pkts_user.o $(TRACE_HELPERS)
 
 # Tell kbuild to always build the programs
 always := $(hostprogs-y)
@@ -163,6 +165,7 @@ always += xdp_adjust_tail_kern.o
 always += xdpsock_kern.o
 always += xdp_fwd_kern.o
 always += task_fd_query_kern.o
+always += xdp_sample_pkts_kern.o
 
 HOSTCFLAGS += -I$(objtree)/usr/include
 HOSTCFLAGS += -I$(srctree)/tools/lib/
@@ -179,6 +182,7 @@ HOSTCFLAGS_spintest_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_trace_event_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_sampleip_user.o += -I$(srctree)/tools/lib/bpf/
 HOSTCFLAGS_task_fd_query_user.o += -I$(srctree)/tools/lib/bpf/
+HOSTCFLAGS_xdp_sample_pkts_user.o += -I$(srctree)/tools/lib/bpf/
 
 HOST_LOADLIBES		+= $(LIBBPF) -lelf
 HOSTLOADLIBES_tracex4		+= -lrt
diff --git a/samples/bpf/xdp_sample_pkts_kern.c b/samples/bpf/xdp_sample_pkts_kern.c
new file mode 100644
index 0000000..c58183a
--- /dev/null
+++ b/samples/bpf/xdp_sample_pkts_kern.c
@@ -0,0 +1,48 @@
+#include <linux/ptrace.h>
+#include <linux/version.h>
+#include <uapi/linux/bpf.h>
+#include "bpf_helpers.h"
+
+#define SAMPLE_SIZE 64ul
+
+struct bpf_map_def SEC("maps") my_map = {
+	.type = BPF_MAP_TYPE_PERF_EVENT_ARRAY,
+	.key_size = sizeof(int),
+	.value_size = sizeof(u32),
+	.max_entries = 2,
+};
+
+SEC("xdp_sample")
+int xdp_sample_prog(struct xdp_md *ctx)
+{
+	void *data_end = (void *)(long)ctx->data_end;
+	void *data = (void *)(long)ctx->data;
+
+        /* Metadata will be in the perf event before the packet data. */
+	struct S {
+		u16 cookie;
+		u16 pkt_len;
+	} __attribute__((packed)) metadata;
+
+	if (data + SAMPLE_SIZE < data_end) {
+		/* The XDP perf_event_output handler will use the upper 32 bits
+		 * of the flags argument as a number of bytes to include of the
+		 * packet payload in the event data. If the size is too big, the
+		 * call to bpf_perf_event_output will fail and return -EFAULT.
+		 *
+		 * See bpf_xdp_event_output in net/core/filter.c.
+		 */
+		u64 flags = SAMPLE_SIZE << 32;
+
+		metadata.cookie = 0xdead;
+		metadata.pkt_len = (u16)(data_end - data);
+
+		bpf_perf_event_output(ctx, &my_map, flags,
+				      &metadata, sizeof(metadata));
+	}
+
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";
+u32 _version SEC("version") = LINUX_VERSION_CODE;
diff --git a/samples/bpf/xdp_sample_pkts_user.c b/samples/bpf/xdp_sample_pkts_user.c
new file mode 100644
index 0000000..f996917
--- /dev/null
+++ b/samples/bpf/xdp_sample_pkts_user.c
@@ -0,0 +1,147 @@
+/* This program is free software; you can redistribute it and/or
+ * modify it under the terms of version 2 of the GNU General Public
+ * License as published by the Free Software Foundation.
+ */
+#include <stdio.h>
+#include <unistd.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <fcntl.h>
+#include <poll.h>
+#include <linux/perf_event.h>
+#include <linux/bpf.h>
+#include <net/if.h>
+#include <errno.h>
+#include <assert.h>
+#include <sys/syscall.h>
+#include <sys/ioctl.h>
+#include <sys/mman.h>
+#include <time.h>
+#include <signal.h>
+#include <libbpf.h>
+#include "bpf_load.h"
+#include "bpf_util.h"
+#include <bpf/bpf.h>
+
+#include "perf-sys.h"
+#include "trace_helpers.h"
+
+static int pmu_fd, if_idx = 0;
+static char *if_name;
+
+static int do_attach(int idx, int fd, const char *name)
+{
+	int err;
+
+	err = bpf_set_link_xdp_fd(idx, fd, 0);
+	if (err < 0)
+		printf("ERROR: failed to attach program to %s\n", name);
+
+	return err;
+}
+
+static int do_detach(int idx, const char *name)
+{
+	int err;
+
+	err = bpf_set_link_xdp_fd(idx, -1, 0);
+	if (err < 0)
+		printf("ERROR: failed to detach program from %s\n", name);
+
+	return err;
+}
+
+#define SAMPLE_SIZE 64
+
+static int print_bpf_output(void *data, int size)
+{
+	struct {
+		__u16 cookie;
+		__u16 pkt_len;
+		__u8  pkt_data[SAMPLE_SIZE];
+	} __attribute__((packed)) *e = data;
+	int i;
+
+	if (e->cookie != 0xdead) {
+		printf("BUG cookie %x sized %d\n",
+		       e->cookie, size);
+		return LIBBPF_PERF_EVENT_ERROR;
+	}
+
+	printf("Pkt len: %-5d bytes. Ethernet hdr: ", e->pkt_len);
+	for (i = 0; i < 14 && i < e->pkt_len; i++)
+		printf("%02x ", e->pkt_data[i]);
+	printf("\n");
+
+	return LIBBPF_PERF_EVENT_CONT;
+}
+
+static void test_bpf_perf_event(void)
+{
+	struct perf_event_attr attr = {
+		.sample_type = PERF_SAMPLE_RAW,
+		.type = PERF_TYPE_SOFTWARE,
+		.config = PERF_COUNT_SW_BPF_OUTPUT,
+	};
+	int key = 0;
+
+	pmu_fd = sys_perf_event_open(&attr, -1/*pid*/, 0/*cpu*/, -1/*group_fd*/, 0);
+
+	assert(pmu_fd >= 0);
+	assert(bpf_map_update_elem(map_fd[0], &key, &pmu_fd, BPF_ANY) == 0);
+	ioctl(pmu_fd, PERF_EVENT_IOC_ENABLE, 0);
+}
+
+static void sig_handler(int signo)
+{
+	do_detach(if_idx, if_name);
+	exit(0);
+}
+
+int main(int argc, char **argv)
+{
+	char filename[256];
+	int ret, err;
+
+	if (argc < 2) {
+		printf("Usage: %s <ifname>\n", argv[0]);
+		return 1;
+	}
+
+	snprintf(filename, sizeof(filename), "%s_kern.o", argv[0]);
+
+	if (load_bpf_file(filename)) {
+		printf("%s", bpf_log_buf);
+		return 1;
+	}
+
+	if_idx = if_nametoindex(argv[1]);
+	if (!if_idx)
+		if_idx = strtoul(argv[1], NULL, 0);
+
+	if (!if_idx) {
+		fprintf(stderr, "Invalid ifname\n");
+		return 1;
+	}
+	if_name = argv[1];
+	err = do_attach(if_idx, prog_fd[0], argv[1]);
+	if (err)
+		return err;
+
+	if (signal(SIGINT, sig_handler) ||
+	    signal(SIGHUP, sig_handler) ||
+	    signal(SIGTERM, sig_handler)) {
+		perror("signal");
+		return 1;
+	}
+
+	test_bpf_perf_event();
+
+	if (perf_event_mmap(pmu_fd) < 0)
+		return 1;
+
+	ret = perf_event_poller(pmu_fd, print_bpf_output);
+	kill(0, SIGINT);
+	return ret;
+}
-- 
2.7.4

^ permalink raw reply related

* Re: [PATCH net-next] net: bridge: Lock before br_fdb_find()
From: David Miller @ 2018-05-30 16:42 UTC (permalink / raw)
  To: petrm; +Cc: netdev, bridge
In-Reply-To: <ff84c00d882dc646a7d69dcbdfe8417b4c0bdf91.1527522008.git.petrm@mellanox.com>

From: Petr Machata <petrm@mellanox.com>
Date: Mon, 28 May 2018 17:44:16 +0200

> Callers of br_fdb_find() need to hold the hash lock, which
> br_fdb_find_port() doesn't do. Add the missing lock/unlock
> pair.
> 
> Signed-off-by: Petr Machata <petrm@mellanox.com>

If all of the these uses of br_fdb_find_port() are safe, then it
should use the RCU fdb lookup variant.

So I basically agree with Stephen that this locking doesn't make any
sense.

The lock is needed when you are going to add or delete an FDB entry.

Here we are doing a lookup and returning a device pointer via the FDB
entry found in the lookup.

The RTNL assertion assures that the device returned won't disappear.

If the device can disappear, the spinlock added by this patch doesn't
change that at all.

^ permalink raw reply

* Re: [PATCH v2 0/5] build warnings cleanup
From: Herbert Xu @ 2018-05-30 16:28 UTC (permalink / raw)
  To: Atul Gupta; +Cc: linux-crypto, gustavo, dan.carpenter, netdev, davem
In-Reply-To: <1527435922-6727-1-git-send-email-atul.gupta@chelsio.com>

On Sun, May 27, 2018 at 09:15:17PM +0530, Atul Gupta wrote:
> Build warnings cleanup reported for
> - using only 128b key
> - wait for buffer in sendmsg/sendpage
> - check for null before using skb
> - free rspq_skb_cache in error path
> - indentation
> 
> v2:
>   Added bug report description for 0002
>   Incorported comments from Dan Carpenter
> 
> Atul Gupta (5):
>   crypto:chtls: key len correction
>   crypto: chtls: wait for memory sendmsg, sendpage
>   crypto: chtls: dereference null variable
>   crypto: chtls: kbuild warnings
>   crypto: chtls: free beyond end rspq_skb_cache
> 
>  drivers/crypto/chelsio/chtls/chtls.h      |   1 +
>  drivers/crypto/chelsio/chtls/chtls_hw.c   |   6 +-
>  drivers/crypto/chelsio/chtls/chtls_io.c   | 104 +++++++++++++++++++++++++++---
>  drivers/crypto/chelsio/chtls/chtls_main.c |   3 +-
>  4 files changed, 98 insertions(+), 16 deletions(-)

All applied.  Thanks.
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply

* Re: [PATCH mlx5-next 2/2] net/mlx5: Add FPGA QP error event
From: Andrew Lunn @ 2018-05-30 16:21 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Jason Gunthorpe, netdev@vger.kernel.org, Ilan Tayari,
	linux-rdma@vger.kernel.org, Leon Romanovsky, Adi Nissim
In-Reply-To: <da8acf8e4a2c976fd3ce4f8e17a969b3464fd263.camel@mellanox.com>

On Wed, May 30, 2018 at 03:14:20PM +0000, Saeed Mahameed wrote:
> On Wed, 2018-05-30 at 03:07 +0200, Andrew Lunn wrote:
> > On Tue, May 29, 2018 at 05:19:54PM -0700, Saeed Mahameed wrote:
> > > From: Ilan Tayari <ilant@mellanox.com>
> > > 
> > > The FPGA QP event fires whenever a QP on the FPGA trasitions
> > > to the error state.
> > 
> > FPGA i know, field programmable gate array. Could you offer some clue
> > as to what QP means?
> > 
> 
> Hi Andre, QP "Queue Pair" is well known rdma concept, and widely used
> in mlx drivers, it is used as in the driver as a ring buffer to
> communicate with the FPGA device.

O.K. Thanks.

It is hard to get the right balance between assuming people know
everything, and know nothing. But you can help teach people these
terms i commit messages:

      The FPGA queue pair event fires whenever a QP on the FPGA
      transitions to the error state.

   Andrew

^ permalink raw reply

* Re: [PATCH RESEND rdma-next] net/mlx5: Use flow counter pointer as input to the query function
From: Saeed Mahameed @ 2018-05-30 16:17 UTC (permalink / raw)
  To: Jason Gunthorpe, Or Gerlitz
  Cc: netdev@vger.kernel.org, Leon Romanovsky,
	linux-rdma@vger.kernel.org, Raed Salem
In-Reply-To: <1527662674-16546-1-git-send-email-ogerlitz@mellanox.com>

On Wed, 2018-05-30 at 09:44 +0300, Or Gerlitz wrote:
> This allows to un-expose the details of struct mlx5_fc and keep
> it internal to the core driver as it used to be.
> 
> Signed-off-by: Or Gerlitz <ogerlitz@mellanox.com>
> ---
> 
> Jason,
> 
> As you asked, I am sending a fixup in case you intend to apply
> V2 of the flow counter series [1], if there's going to be V3,
> Leon, please apply it from the begining.
> 
> Fixed Jason's address @ my git aliases, he's with MLNX by now..
> 
> Or.
> 
> [1] https://marc.info/?l=linux-netdev&m=152759937829994&w=2
> 
>  drivers/infiniband/hw/mlx5/main.c                  |  2 +-
>  drivers/net/ethernet/mellanox/mlx5/core/eswitch.c  | 15 ++++++----
> ----
>  drivers/net/ethernet/mellanox/mlx5/core/fs_core.h  | 22
> +++++++++++++++++---
>  .../net/ethernet/mellanox/mlx5/core/fs_counters.c  |  4 ++--
>  include/linux/mlx5/fs.h                            | 24 ++++------


I like this patch, this should go into mlx5-next tree though, along
with "net/mlx5: Export flow counter related API"

> ------------
>  5 files changed, 32 insertions(+), 35 deletions(-)
> 
> diff --git a/drivers/infiniband/hw/mlx5/main.c
> b/drivers/infiniband/hw/mlx5/main.c
> index ac99125..4b09dcd 100644
> --- a/drivers/infiniband/hw/mlx5/main.c
> +++ b/drivers/infiniband/hw/mlx5/main.c
> @@ -3151,7 +3151,7 @@ static int read_flow_counters(struct ib_device
> *ibdev,
>  	struct mlx5_fc *fc = read_attr->hw_cntrs_hndl;
>  	struct mlx5_ib_dev *dev = to_mdev(ibdev);
>  
> -	return mlx5_fc_query(dev->mdev, fc->id,
> +	return mlx5_fc_query(dev->mdev, fc,
>  			     &read_attr->out[IB_COUNTER_PACKETS],
>  			     &read_attr->out[IB_COUNTER_BYTES]);
>  }
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
> b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
> index 6cab1dd..f63dfbc 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/eswitch.c
> @@ -2104,21 +2104,18 @@ static int
> mlx5_eswitch_query_vport_drop_stats(struct mlx5_core_dev *dev,
>  	struct mlx5_vport *vport = &esw->vports[vport_idx];
>  	u64 rx_discard_vport_down, tx_discard_vport_down;
>  	u64 bytes = 0;
> -	u16 idx = 0;
>  	int err = 0;
>  
>  	if (!vport->enabled || esw->mode != SRIOV_LEGACY)
>  		return 0;
>  
> -	if (vport->egress.drop_counter) {
> -		idx = vport->egress.drop_counter->id;
> -		mlx5_fc_query(dev, idx, &stats->rx_dropped, &bytes);
> -	}
> +	if (vport->egress.drop_counter)
> +		mlx5_fc_query(dev, vport->egress.drop_counter,
> +			      &stats->rx_dropped, &bytes);
>  
> -	if (vport->ingress.drop_counter) {
> -		idx = vport->ingress.drop_counter->id;
> -		mlx5_fc_query(dev, idx, &stats->tx_dropped, &bytes);
> -	}
> +	if (vport->ingress.drop_counter)
> +		mlx5_fc_query(dev, vport->ingress.drop_counter,
> +			      &stats->tx_dropped, &bytes);
>  
>  	if (!MLX5_CAP_GEN(dev, receive_discard_vport_down) &&
>  	    !MLX5_CAP_GEN(dev, transmit_discard_vport_down))
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
> b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
> index 40992ae..0211d77 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_core.h
> @@ -131,6 +131,25 @@ struct mlx5_flow_table {
>  	struct rhltable			fgs_hash;
>  };
>  
> +struct mlx5_fc_cache {
> +	u64 packets;
> +	u64 bytes;
> +	u64 lastuse;
> +};
> +
> +struct mlx5_fc {
> +	struct rb_node node;
> +	struct list_head list;
> +
> +	u64 lastpackets;
> +	u64 lastbytes;
> +
> +	u32 id;
> +	bool deleted;
> +	bool aging;
> +	struct mlx5_fc_cache cache ____cacheline_aligned_in_smp;
> +};
> +
>  struct mlx5_ft_underlay_qp {
>  	struct list_head list;
>  	u32 qpn;
> @@ -210,9 +229,6 @@ void mlx5_fc_queue_stats_work(struct
> mlx5_core_dev *dev,
>  			      unsigned long delay);
>  void mlx5_fc_update_sampling_interval(struct mlx5_core_dev *dev,
>  				      unsigned long interval);
> -int mlx5_fc_query(struct mlx5_core_dev *dev, u16 id,
> -		  u64 *packets, u64 *bytes);
> -
>  int mlx5_init_fs(struct mlx5_core_dev *dev);
>  void mlx5_cleanup_fs(struct mlx5_core_dev *dev);
>  
> diff --git a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
> b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
> index 10f4078..58af6be 100644
> --- a/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
> +++ b/drivers/net/ethernet/mellanox/mlx5/core/fs_counters.c
> @@ -314,10 +314,10 @@ void mlx5_cleanup_fc_stats(struct mlx5_core_dev
> *dev)
>  	}
>  }
>  
> -int mlx5_fc_query(struct mlx5_core_dev *dev, u16 id,
> +int mlx5_fc_query(struct mlx5_core_dev *dev, struct mlx5_fc
> *counter,
>  		  u64 *packets, u64 *bytes)
>  {
> -	return mlx5_cmd_fc_query(dev, id, packets, bytes);
> +	return mlx5_cmd_fc_query(dev, counter->id, packets, bytes);
>  }
>  EXPORT_SYMBOL(mlx5_fc_query);
>  
> diff --git a/include/linux/mlx5/fs.h b/include/linux/mlx5/fs.h
> index 4612e0a..ef2f3bf 100644
> --- a/include/linux/mlx5/fs.h
> +++ b/include/linux/mlx5/fs.h
> @@ -185,30 +185,14 @@ int mlx5_modify_rule_destination(struct
> mlx5_flow_handle *handler,
>  struct mlx5_fc *mlx5_flow_rule_counter(struct mlx5_flow_handle
> *handler);
>  struct mlx5_fc *mlx5_fc_create(struct mlx5_core_dev *dev, bool
> aging);
>  void mlx5_fc_destroy(struct mlx5_core_dev *dev, struct mlx5_fc
> *counter);
> +
> +struct mlx5_fc *counter;
> +
>  void mlx5_fc_query_cached(struct mlx5_fc *counter,
>  			  u64 *bytes, u64 *packets, u64 *lastuse);
> -int mlx5_fc_query(struct mlx5_core_dev *dev, u16 id,
> +int mlx5_fc_query(struct mlx5_core_dev *dev, struct mlx5_fc
> *counter,
>  		  u64 *packets, u64 *bytes);
>  
> -struct mlx5_fc_cache {
> -	u64 packets;
> -	u64 bytes;
> -	u64 lastuse;
> -};
> -
> -struct mlx5_fc {
> -	struct rb_node node;
> -	struct list_head list;
> -
> -	u64 lastpackets;
> -	u64 lastbytes;
> -
> -	u32 id;
> -	bool deleted;
> -	bool aging;
> -	struct mlx5_fc_cache cache ____cacheline_aligned_in_smp;
> -};
> -
>  int mlx5_fs_add_rx_underlay_qpn(struct mlx5_core_dev *dev, u32
> underlay_qpn);
>  int mlx5_fs_remove_rx_underlay_qpn(struct mlx5_core_dev *dev, u32
> underlay_qpn);
>  

^ permalink raw reply

* Re: [PATCH mlx5-next 1/2] net/mlx5: Add temperature warning event to log
From: Andrew Lunn @ 2018-05-30 16:17 UTC (permalink / raw)
  To: Saeed Mahameed
  Cc: Jason Gunthorpe, netdev@vger.kernel.org, Ilan Tayari,
	linux-rdma@vger.kernel.org, Leon Romanovsky, Adi Nissim
In-Reply-To: <994d317b8c28c197ec4f8b7db67bcd045cb0284e.camel@mellanox.com>

> Hi Andrew, yes the temperature is available by other means, this patch
> is needed for alert information reasons in order to know which internal
> sensors triggered the alarm.
> We are working in parallel to expose temperature sensor to hwmon, but
> this is still WIP.
> 
> 
> Is it ok to have both ?

Hi Saeed

Ideally no. hwmon has mechanisms for setting alarm thresholds, and
indicating the thresholds have been exceeded. There are also ways to
tie this to thermal zones, so the system can react on overheating,
slow down the CPU, drop voltages, ramp up fans, etc. hwmon should be
your primary interface, not dmesg.

But if you are stuck doing things in the wrong order, i guess it is
O.K. I don't think dmesg is a Binary API, so you can remove it later.

     Andrew

^ permalink raw reply

* Re: [PATCH bpf-next 04/11] bpf: show prog and map id in fdinfo
From: Song Liu @ 2018-05-30 16:15 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Jesper Dangaard Brouer, Networking
In-Reply-To: <c7a3a2bb-fe79-66cb-159e-b5680f53910f@iogearbox.net>

On Tue, May 29, 2018 at 12:55 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> On 05/29/2018 07:27 PM, Jesper Dangaard Brouer wrote:
>> On Mon, 28 May 2018 02:43:37 +0200
>> Daniel Borkmann <daniel@iogearbox.net> wrote:
>>
>>> Its trivial and straight forward to expose it for scripts that can
>>> then use it along with bpftool in order to inspect an individual
>>> application's used maps and progs. Right now we dump some basic
>>> information in the fdinfo file but with the help of the map/prog
>>> id full introspection becomes possible now.
>>>
>>> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
>>> Acked-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

>>
>> AFAICR iproute uses this proc fdinfo, for pinned maps.  Have you tested
>> if this change is handled gracefully by tc ?
>
> Yep, it works just fine, I also tested it before submission.

^ permalink raw reply

* Re: [PATCH bpf-next 10/11] bpf: sync bpf uapi header with tools
From: Song Liu @ 2018-05-30 16:10 UTC (permalink / raw)
  To: Daniel Borkmann; +Cc: Alexei Starovoitov, Networking
In-Reply-To: <20180528004344.3606-11-daniel@iogearbox.net>

On Sun, May 27, 2018 at 5:43 PM, Daniel Borkmann <daniel@iogearbox.net> wrote:
> Pull in recent changes from include/uapi/linux/bpf.h.
>
> Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
> Acked-by: Alexei Starovoitov <ast@kernel.org>

Acked-by: Song Liu <songliubraving@fb.com>

> ---
>  tools/include/uapi/linux/bpf.h | 20 ++++++++++++++++++--
>  1 file changed, 18 insertions(+), 2 deletions(-)
>
> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index 9b8c6e3..7108711 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -2004,6 +2004,20 @@ union bpf_attr {
>   *             direct packet access.
>   *     Return
>   *             0 on success, or a negative error in case of failure.
> + *
> + * uint64_t bpf_skb_cgroup_id(struct sk_buff *skb)
> + *     Description
> + *             Return the cgroup v2 id of the socket associated with the *skb*.
> + *             This is roughly similar to the **bpf_get_cgroup_classid**\ ()
> + *             helper for cgroup v1 by providing a tag resp. identifier that
> + *             can be matched on or used for map lookups e.g. to implement
> + *             policy. The cgroup v2 id of a given path in the hierarchy is
> + *             exposed in user space through the f_handle API in order to get
> + *             to the same 64-bit id.
> + *
> + *             This helper can be used on TC egress path, but not on ingress.
> + *     Return
> + *             The id is returned or 0 in case the id could not be retrieved.
>   */
>  #define __BPF_FUNC_MAPPER(FN)          \
>         FN(unspec),                     \
> @@ -2082,7 +2096,8 @@ union bpf_attr {
>         FN(lwt_push_encap),             \
>         FN(lwt_seg6_store_bytes),       \
>         FN(lwt_seg6_adjust_srh),        \
> -       FN(lwt_seg6_action),
> +       FN(lwt_seg6_action),            \
> +       FN(skb_cgroup_id),
>
>  /* integer value in 'imm' field of BPF_CALL instruction selects which helper
>   * function eBPF program intends to call
> @@ -2199,7 +2214,7 @@ struct bpf_tunnel_key {
>         };
>         __u8 tunnel_tos;
>         __u8 tunnel_ttl;
> -       __u16 tunnel_ext;
> +       __u16 tunnel_ext;       /* Padding, future use. */
>         __u32 tunnel_label;
>  };
>
> @@ -2210,6 +2225,7 @@ struct bpf_xfrm_state {
>         __u32 reqid;
>         __u32 spi;      /* Stored in network byte order */
>         __u16 family;
> +       __u16 ext;      /* Padding, future use. */
>         union {
>                 __u32 remote_ipv4;      /* Stored in network byte order */
>                 __u32 remote_ipv6[4];   /* Stored in network byte order */
> --
> 2.9.5
>

^ permalink raw reply

* Re: [RFC net-next 0/4] net: sched: support replay of filter offload when binding to block
From: Or Gerlitz @ 2018-05-30 15:59 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: John Hurley, Linux Netdev List, Jiri Pirko, Samudrala, Sridhar,
	oss-drivers, Rabie Loulou
In-Reply-To: <20180528130202.16981241@cakuba>

On Mon, May 28, 2018 at 11:02 PM, Jakub Kicinski
<jakub.kicinski@netronome.com> wrote:
> On Mon, 28 May 2018 13:48:28 +0300, Or Gerlitz wrote:
>> On Fri, May 25, 2018 at 5:25 AM, Jakub Kicinski wrote:
>> > This series from John adds the ability to replay filter offload requests
>> > when new offload callback is being registered on a TC block.  This is most
>> > likely to take place for shared blocks today, when a block which already
>> > has rules is bound to another interface.  Prior to this patch set if any
>> > of the rules were offloaded the block bind would fail.
>>
>> Can you elaborate a little further here? is this something that you are planning
>> to use for the uplink LAG use-case? AFAIU if we apply share-block to nfp as
>> things are prior to this patch, it would work, so there's a case where
>> it doesn't and this is now handled with the series?
>
> Just looking at things as they stand today, no bond/forward looking
> plans - nfp "supports" shared blocks by registering multiple callbacks
> to the block.  There are two problems:
>
> (a) one can't install a second callback if some rules are already
>     offloaded because of:
>
>         /* At this point, playback of previous block cb calls is not supported,
>          * so forbid to register to block which already has some offloaded
>          * filters present.
>          */
>         if (tcf_block_offload_in_use(block))
>                 return ERR_PTR(-EOPNOTSUPP);
>
>     in __tcf_block_cb_register(), so block sharing has to be set up
>     before any rules are added.
>
> (b) when block is unshared filters are not removed today and driver
>     would have to sweep its rule table, as John notes.  It's not a big
>     deal but this series fixes it nicely in the core, too.

OK, thanks for these two point clarifications, much helpful.

> Looking forward there are two things we can use shared blocks for: we
> can try to teach user space to share ingress blocks on all legs of bonds
> instead of trying to propagate the rules from the bond down in the
> kernel, which is more tricky to get right.  We will need reliable
> replay for that, because we want new links to be able to join and leave
> the bond when rules are already present.

> Second use case, which is more far fetched, is trying to discover and
> register callbacks for blocks of tunnel devices directly, and avoid the
> egdev infrastructure...

> We should discuss the above further, but regardless, I think this
> patchset is quite a nice addition on it's own.  Would you agree?

yes, it sounds good, but I need to look deeper, a bit behind on that :(

Or.

^ permalink raw reply

* Re: [PATCH net-next v4 7/8] net: bridge: Notify about bridge VLANs
From: Nikolay Aleksandrov @ 2018-05-30 15:58 UTC (permalink / raw)
  To: Petr Machata, netdev, devel, bridge
  Cc: jiri, idosch, davem, razvan.stefanescu, gregkh, stephen, andrew,
	vivien.didelot, f.fainelli, dan.carpenter
In-Reply-To: <583d583ad158363411fde87dbf6709024714e498.1527641426.git.petrm@mellanox.com>

On 30/05/18 04:00, Petr Machata wrote:
> A driver might need to react to changes in settings of brentry VLANs.
> Therefore send switchdev port notifications for these as well. Reuse
> SWITCHDEV_OBJ_ID_PORT_VLAN for this purpose. Listeners should use
> netif_is_bridge_master() on orig_dev to determine whether the
> notification is about a bridge port or a bridge.
> 
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> ---
>  net/bridge/br_vlan.c | 28 +++++++++++++++++++++++++---
>  1 file changed, 25 insertions(+), 3 deletions(-)
> 

LGTM,
Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

^ permalink raw reply

* Re: [PATCH mlx5-next v2 11/13] IB/mlx5: Add flow counters binding support
From: Jason Gunthorpe @ 2018-05-30 15:35 UTC (permalink / raw)
  To: Yishai Hadas
  Cc: Leon Romanovsky, Doug Ledford, Leon Romanovsky, RDMA mailing list,
	Boris Pismenny, Matan Barak, Raed Salem, Yishai Hadas,
	Saeed Mahameed, linux-netdev, Alex Rosenbaum
In-Reply-To: <0dec7cc6-5715-4513-d55b-c53271c4fbee@dev.mellanox.co.il>

On Wed, May 30, 2018 at 06:24:00PM +0300, Yishai Hadas wrote:
> On 5/30/2018 6:06 PM, Jason Gunthorpe wrote:
> >On Wed, May 30, 2018 at 03:31:34PM +0300, Yishai Hadas wrote:
> >>On 5/29/2018 10:56 PM, Jason Gunthorpe wrote:
> >>>On Tue, May 29, 2018 at 04:09:15PM +0300, Leon Romanovsky wrote:
> >>>>diff --git a/include/uapi/rdma/mlx5-abi.h b/include/uapi/rdma/mlx5-abi.h
> >>>>index 508ea8c82da7..ef3f430a7050 100644
> >>>>+++ b/include/uapi/rdma/mlx5-abi.h
> >>>>@@ -443,4 +443,18 @@ enum {
> >>>>  enum {
> >>>>  	MLX5_IB_CLOCK_INFO_V1              = 0,
> >>>>  };
> >>>>+
> >>>>+struct mlx5_ib_flow_counters_data {
> >>>>+	__aligned_u64   counters_data;
> >>>>+	__u32   ncounters;
> >>>>+	__u32   reserved;
> >>>>+};
> >>>>+
> >>>>+struct mlx5_ib_create_flow {
> >>>>+	__u32   ncounters_data;
> >>>>+	__u32   reserved;
> >>>>+	/* Following are counters data based on ncounters_data */
> >>>>+	struct mlx5_ib_flow_counters_data data[];
> >>>>+};
> >>>>+
> >>>>  #endif /* MLX5_ABI_USER_H */
> >>>
> >>>This uapi thing still needs to be fixed as I pointed out before.
> >>
> >>In V3 we can go with below, no change in memory layout but it can clarify
> >>the code/usage.
> >>
> >>struct mlx5_ib_flow_counters_desc {
> >>         __u32   description;
> >>         __u32   index;
> >>};
> >>
> >>struct mlx5_ib_flow_counters_data {
> >>         RDMA_UAPI_PTR(struct mlx5_ib_flow_counters_desc *, counters_data);
> >>         __u32   ncounters;
> >>         __u32   reserved;
> >>};
> >
> >OK, this is what I asked for originally..
> >
> >>struct mlx5_ib_create_flow {
> >>         __u32   ncounters_data;
> >>         __u32   reserved;
> >>         /* Following are counters data based on ncounters_data */
> >>         struct mlx5_ib_flow_counters_data data[];
> >>
> >>
> >>>I still can't figure out why this should be a 2d array.
> >>
> >>This comes to support the future case of multiple counters objects/specs
> >>passed with the same flow. There is a need to differentiate mapping data for
> >>each counters object and that is done via the 'ncounters_data' field and the
> >>2d array.
> >
> >This still doesn't make any sense to me. How are these multiple
> >counters objects/specs going to be identified?
> >
> >Basically, what does the array index for data[] mean? Should it match
> >the spec index from the main verb or something?
> >
> 
> Each entry in the data[] should match a corresponding counter object that
> was pointed by a counters spec upon the flow creation.

What if there are a mixture of specs, some with counters and some
without?

The index is just matching the index of the spec? That makes sense.

> >This is a good place for a comment, since the intention is completely
> >opaque here.
> 
> Sure, we'll add comment to clarify the above.

Sure, can leave the flex array then too

Jason

^ permalink raw reply

* [PATCH] b53: Add brcm5389 support
From: Damien Thébault @ 2018-05-30 15:33 UTC (permalink / raw)
  To: vivien.didelot@savoirfairelinux.com, f.fainelli@gmail.com,
	andrew@lunn.ch
  Cc: davem@davemloft.net, netdev@vger.kernel.org

This patch adds support for the BCM5389 switch connected through MDIO.

Signed-off-by: Damien Thébault <damien.thebault@vitec.com>
---
 drivers/net/dsa/b53/b53_common.c | 13 +++++++++++++
 drivers/net/dsa/b53/b53_mdio.c   |  5 ++++-
 drivers/net/dsa/b53/b53_priv.h   |  1 +
 3 files changed, 18 insertions(+), 1 deletion(-)

diff --git a/drivers/net/dsa/b53/b53_common.c b/drivers/net/dsa/b53/b53_common.c
index 78616787f2a3..3da5fca77cbd 100644
--- a/drivers/net/dsa/b53/b53_common.c
+++ b/drivers/net/dsa/b53/b53_common.c
@@ -1711,6 +1711,18 @@ static const struct b53_chip_data b53_switch_chips[] = {
 		.cpu_port = B53_CPU_PORT_25,
 		.duplex_reg = B53_DUPLEX_STAT_FE,
 	},
+	{
+		.chip_id = BCM5389_DEVICE_ID,
+		.dev_name = "BCM5389",
+		.vlans = 4096,
+		.enabled_ports = 0x1f,
+		.arl_entries = 4,
+		.cpu_port = B53_CPU_PORT,
+		.vta_regs = B53_VTA_REGS,
+		.duplex_reg = B53_DUPLEX_STAT_GE,
+		.jumbo_pm_reg = B53_JUMBO_PORT_MASK,
+		.jumbo_size_reg = B53_JUMBO_MAX_SIZE,
+	},
 	{
 		.chip_id = BCM5395_DEVICE_ID,
 		.dev_name = "BCM5395",
@@ -2034,6 +2046,7 @@ int b53_switch_detect(struct b53_device *dev)
 		else
 			dev->chip_id = BCM5365_DEVICE_ID;
 		break;
+	case BCM5389_DEVICE_ID:
 	case BCM5395_DEVICE_ID:
 	case BCM5397_DEVICE_ID:
 	case BCM5398_DEVICE_ID:
diff --git a/drivers/net/dsa/b53/b53_mdio.c b/drivers/net/dsa/b53/b53_mdio.c
index fa7556f5d4fb..a533a90e3904 100644
--- a/drivers/net/dsa/b53/b53_mdio.c
+++ b/drivers/net/dsa/b53/b53_mdio.c
@@ -285,6 +285,7 @@ static const struct b53_io_ops b53_mdio_ops = {
 #define B53_BRCM_OUI_1	0x0143bc00
 #define B53_BRCM_OUI_2	0x03625c00
 #define B53_BRCM_OUI_3	0x00406000
+#define B53_BRCM_OUI_4	0x01410c00
 
 static int b53_mdio_probe(struct mdio_device *mdiodev)
 {
@@ -311,7 +312,8 @@ static int b53_mdio_probe(struct mdio_device *mdiodev)
 	 */
 	if ((phy_id & 0xfffffc00) != B53_BRCM_OUI_1 &&
 	    (phy_id & 0xfffffc00) != B53_BRCM_OUI_2 &&
-	    (phy_id & 0xfffffc00) != B53_BRCM_OUI_3) {
+	    (phy_id & 0xfffffc00) != B53_BRCM_OUI_3 &&
+	    (phy_id & 0xfffffc00) != B53_BRCM_OUI_4) {
 		dev_err(&mdiodev->dev, "Unsupported device: 0x%08x\n", phy_id);
 		return -ENODEV;
 	}
@@ -360,6 +362,7 @@ static const struct of_device_id b53_of_match[] = {
 	{ .compatible = "brcm,bcm53125" },
 	{ .compatible = "brcm,bcm53128" },
 	{ .compatible = "brcm,bcm5365" },
+	{ .compatible = "brcm,bcm5389" },
 	{ .compatible = "brcm,bcm5395" },
 	{ .compatible = "brcm,bcm5397" },
 	{ .compatible = "brcm,bcm5398" },
diff --git a/drivers/net/dsa/b53/b53_priv.h b/drivers/net/dsa/b53/b53_priv.h
index 1187ebd79287..3b57f47d0e79 100644
--- a/drivers/net/dsa/b53/b53_priv.h
+++ b/drivers/net/dsa/b53/b53_priv.h
@@ -48,6 +48,7 @@ struct b53_io_ops {
 enum {
 	BCM5325_DEVICE_ID = 0x25,
 	BCM5365_DEVICE_ID = 0x65,
+	BCM5389_DEVICE_ID = 0x89,
 	BCM5395_DEVICE_ID = 0x95,
 	BCM5397_DEVICE_ID = 0x97,
 	BCM5398_DEVICE_ID = 0x98,
-- 
2.17.0

^ permalink raw reply related

* [PATCH v2 iproute2-next] ip route: print RTA_CACHEINFO if it exists
From: dsahern @ 2018-05-30 15:30 UTC (permalink / raw)
  To: netdev, stephen; +Cc: David Ahern

From: David Ahern <dsahern@gmail.com>

RTA_CACHEINFO can be sent for non-cloned routes. If the attribute is
present print it. Allows route dumps to print expires times for example
which can exist on FIB entries.

Signed-off-by: David Ahern <dsahern@gmail.com>
---
v2
- leave print_cache_flags under r->rtm_flags & RTM_F_CLONED check

 ip/iproute.c | 13 +++++--------
 1 file changed, 5 insertions(+), 8 deletions(-)

diff --git a/ip/iproute.c b/ip/iproute.c
index 56dd9f25e38e..254d7abd2abf 100644
--- a/ip/iproute.c
+++ b/ip/iproute.c
@@ -899,17 +899,14 @@ int print_route(const struct sockaddr_nl *who, struct nlmsghdr *n, void *arg)
 			   rta_getattr_u32(tb[RTA_UID]));
 
 	if (r->rtm_family == AF_INET) {
-		if (r->rtm_flags & RTM_F_CLONED) {
+		if (r->rtm_flags & RTM_F_CLONED)
 			print_cache_flags(fp, r->rtm_flags);
 
-			if (tb[RTA_CACHEINFO])
-				print_rta_cacheinfo(fp, RTA_DATA(tb[RTA_CACHEINFO]));
-		}
+		if (tb[RTA_CACHEINFO])
+			print_rta_cacheinfo(fp, RTA_DATA(tb[RTA_CACHEINFO]));
 	} else if (r->rtm_family == AF_INET6) {
-		if (r->rtm_flags & RTM_F_CLONED) {
-			if (tb[RTA_CACHEINFO])
-				print_rta_cacheinfo(fp, RTA_DATA(tb[RTA_CACHEINFO]));
-		}
+		if (tb[RTA_CACHEINFO])
+			print_rta_cacheinfo(fp, RTA_DATA(tb[RTA_CACHEINFO]));
 	}
 
 	if (tb[RTA_METRICS])
-- 
2.11.0

^ permalink raw reply related

* Re: [PATCH mlx5-next v2 11/13] IB/mlx5: Add flow counters binding support
From: Yishai Hadas @ 2018-05-30 15:24 UTC (permalink / raw)
  To: Jason Gunthorpe
  Cc: Leon Romanovsky, Doug Ledford, Leon Romanovsky, RDMA mailing list,
	Boris Pismenny, Matan Barak, Raed Salem, Yishai Hadas,
	Saeed Mahameed, linux-netdev, Alex Rosenbaum
In-Reply-To: <20180530150608.GA30754@ziepe.ca>

On 5/30/2018 6:06 PM, Jason Gunthorpe wrote:
> On Wed, May 30, 2018 at 03:31:34PM +0300, Yishai Hadas wrote:
>> On 5/29/2018 10:56 PM, Jason Gunthorpe wrote:
>>> On Tue, May 29, 2018 at 04:09:15PM +0300, Leon Romanovsky wrote:
>>>> diff --git a/include/uapi/rdma/mlx5-abi.h b/include/uapi/rdma/mlx5-abi.h
>>>> index 508ea8c82da7..ef3f430a7050 100644
>>>> +++ b/include/uapi/rdma/mlx5-abi.h
>>>> @@ -443,4 +443,18 @@ enum {
>>>>   enum {
>>>>   	MLX5_IB_CLOCK_INFO_V1              = 0,
>>>>   };
>>>> +
>>>> +struct mlx5_ib_flow_counters_data {
>>>> +	__aligned_u64   counters_data;
>>>> +	__u32   ncounters;
>>>> +	__u32   reserved;
>>>> +};
>>>> +
>>>> +struct mlx5_ib_create_flow {
>>>> +	__u32   ncounters_data;
>>>> +	__u32   reserved;
>>>> +	/* Following are counters data based on ncounters_data */
>>>> +	struct mlx5_ib_flow_counters_data data[];
>>>> +};
>>>> +
>>>>   #endif /* MLX5_ABI_USER_H */
>>>
>>> This uapi thing still needs to be fixed as I pointed out before.
>>
>> In V3 we can go with below, no change in memory layout but it can clarify
>> the code/usage.
>>
>> struct mlx5_ib_flow_counters_desc {
>>          __u32   description;
>>          __u32   index;
>> };
>>
>> struct mlx5_ib_flow_counters_data {
>>          RDMA_UAPI_PTR(struct mlx5_ib_flow_counters_desc *, counters_data);
>>          __u32   ncounters;
>>          __u32   reserved;
>> };
> 
> OK, this is what I asked for originally..
> 
>> struct mlx5_ib_create_flow {
>>          __u32   ncounters_data;
>>          __u32   reserved;
>>          /* Following are counters data based on ncounters_data */
>>          struct mlx5_ib_flow_counters_data data[];
>>
>>
>>> I still can't figure out why this should be a 2d array.
>>
>> This comes to support the future case of multiple counters objects/specs
>> passed with the same flow. There is a need to differentiate mapping data for
>> each counters object and that is done via the 'ncounters_data' field and the
>> 2d array.
> 
> This still doesn't make any sense to me. How are these multiple
> counters objects/specs going to be identified?
> 
> Basically, what does the array index for data[] mean? Should it match
> the spec index from the main verb or something?
> 

Each entry in the data[] should match a corresponding counter object 
that was pointed by a counters spec upon the flow creation.

> This is a good place for a comment, since the intention is completely
> opaque here.

Sure, we'll add comment to clarify the above.

^ permalink raw reply

* Re: [PATCH iproute2-next] ipaddress: Add support for address metric
From: David Ahern @ 2018-05-30 15:22 UTC (permalink / raw)
  To: dsahern, netdev; +Cc: roopa
In-Reply-To: <20180527151000.30488-9-dsahern@kernel.org>

On 5/27/18 9:10 AM, dsahern@kernel.org wrote:
> From: David Ahern <dsahern@gmail.com>
> 
> Add support for IFA_RT_PRIORITY using the same keywords as iproute for
> RTA_PRIORITY.
> 
> Signed-off-by: David Ahern <dsahern@gmail.com>
> ---
>  include/uapi/linux/if_addr.h |  1 +
>  ip/ipaddress.c               | 15 ++++++++++++++-
>  man/man8/ip-address.8.in     |  6 ++++++
>  3 files changed, 21 insertions(+), 1 deletion(-)

applied to iproute2-next.

^ permalink raw reply

* Re: [PATCH] [net-next, wrong] make BPFILTER_UMH depend on X86
From: Alexei Starovoitov @ 2018-05-30 15:17 UTC (permalink / raw)
  To: Arnd Bergmann
  Cc: David S. Miller, Alexei Starovoitov, Masahiro Yamada,
	linux-kbuild, netdev, linux-kernel
In-Reply-To: <20180528153222.2037158-1-arnd@arndb.de>

On Mon, May 28, 2018 at 05:31:01PM +0200, Arnd Bergmann wrote:
> When build testing across architectures, I run into a build error on
> all targets other than X86:
> 
> gcc-8.1.0-nolibc/arm-linux-gnueabi/bin/arm-linux-gnueabi-objdump: net/bpfilter/bpfilter_umh: File format not recognized
> gcc-8.1.0-nolibc/arm-linux-gnueabi/bin/arm-linux-gnueabi-objcopy:net/bpfilter/bpfilter_umh.o: Invalid bfd target
> 
> The problem is that 'hostprogs' get built with 'gcc' rather than
> '$(CROSS_COMPILE)gcc', and my default gcc (as most people's) targets x86.
> 
> To work around it, adding an X86 dependency gets randconfigs building
> again on my box.
> 
> Clearly, this is not a good solution, since it should actually work fine
> when building native kernels on other architectures but that is now
> disabled, while cross building an x86 kernel on another host is still
> broken after my patch.
> 
> What we probably want here is to try out if the compiler is able to build
> executables for the target architecture and not build the helper otherwise,
> at least when compile-testing. No idea how to do that though.
> 
> Link: http://www.kernel.org/pub/tools/crosstool/
> Cc: Masahiro Yamada <yamada.masahiro@socionext.com>
> Cc: linux-kbuild@vger.kernel.org
> Signed-off-by: Arnd Bergmann <arnd@arndb.de>
> ---
>  net/bpfilter/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/net/bpfilter/Kconfig b/net/bpfilter/Kconfig
> index 60725c5f79db..61cc4fcbb4d0 100644
> --- a/net/bpfilter/Kconfig
> +++ b/net/bpfilter/Kconfig
> @@ -9,6 +9,7 @@ menuconfig BPFILTER
>  if BPFILTER
>  config BPFILTER_UMH
>  	tristate "bpfilter kernel module with user mode helper"
> +	depends on X86 # actually depends on native builds

depends on X86 will break it on arm.
I think the better short term fix would be to test that HOSTCC == CC
It doesn't have to be the same compiler. HOSTCC's arch == kernel ARCH
Not sure how to hack makefile to do that.
Long term we need to get rid of HOSTCC dependency.

^ permalink raw reply

* Re: [PATCH net-next v4 2/8] net: bridge: Extract br_vlan_add_existing()
From: Nikolay Aleksandrov @ 2018-05-30 15:15 UTC (permalink / raw)
  To: Petr Machata, netdev, devel, bridge
  Cc: f.fainelli, andrew, gregkh, vivien.didelot, idosch, jiri,
	razvan.stefanescu, davem, dan.carpenter
In-Reply-To: <3f7f1f0580acedf385d993304d53d370a60410c2.1527641426.git.petrm@mellanox.com>

On 30/05/18 03:56, Petr Machata wrote:
> Extract the code that deals with adding a preexisting VLAN to bridge CPU
> port to a separate function. A follow-up patch introduces a need to roll
> back operations in this block due to an error, and this split will make
> the error-handling code clearer.
> 
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> ---
>  net/bridge/br_vlan.c | 55 +++++++++++++++++++++++++++++++---------------------
>  1 file changed, 33 insertions(+), 22 deletions(-)
> 

Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

^ permalink raw reply

* Re: [PATCH mlx5-next 2/2] net/mlx5: Add FPGA QP error event
From: Saeed Mahameed @ 2018-05-30 15:14 UTC (permalink / raw)
  To: andrew@lunn.ch
  Cc: Jason Gunthorpe, netdev@vger.kernel.org, Ilan Tayari,
	linux-rdma@vger.kernel.org, Leon Romanovsky, Adi Nissim
In-Reply-To: <20180530010710.GC30239@lunn.ch>

On Wed, 2018-05-30 at 03:07 +0200, Andrew Lunn wrote:
> On Tue, May 29, 2018 at 05:19:54PM -0700, Saeed Mahameed wrote:
> > From: Ilan Tayari <ilant@mellanox.com>
> > 
> > The FPGA QP event fires whenever a QP on the FPGA trasitions
> > to the error state.
> 
> FPGA i know, field programmable gate array. Could you offer some clue
> as to what QP means?
> 

Hi Andre, QP "Queue Pair" is well known rdma concept, and widely used
in mlx drivers, it is used as in the driver as a ring buffer to
communicate with the FPGA device.
 
>    Thanks
> 	Andrew

^ permalink raw reply

* Re: [PATCH net-next v4 1/8] net: bridge: Extract boilerplate around switchdev_port_obj_*()
From: Nikolay Aleksandrov @ 2018-05-30 15:12 UTC (permalink / raw)
  To: Petr Machata, netdev, devel, bridge
  Cc: f.fainelli, andrew, gregkh, vivien.didelot, idosch, jiri,
	razvan.stefanescu, davem, dan.carpenter
In-Reply-To: <7b6f3bdc759168227bbff78173837d5fb5560528.1527641426.git.petrm@mellanox.com>

On 30/05/18 03:56, Petr Machata wrote:
> A call to switchdev_port_obj_add() or switchdev_port_obj_del() involves
> initializing a struct switchdev_obj_port_vlan, a piece of code that
> repeats on each call site almost verbatim. While in the current codebase
> there is just one duplicated add call, the follow-up patches add more of
> both add and del calls.
> 
> Thus to remove the duplication, extract the repetition into named
> functions and reuse.
> 
> Signed-off-by: Petr Machata <petrm@mellanox.com>
> Reviewed-by: Vivien Didelot <vivien.didelot@savoirfairelinux.com>
> ---
>  net/bridge/br_private.h   | 13 +++++++++++++
>  net/bridge/br_switchdev.c | 25 +++++++++++++++++++++++++
>  net/bridge/br_vlan.c      | 26 +++-----------------------
>  3 files changed, 41 insertions(+), 23 deletions(-)
> 

Reviewed-by: Nikolay Aleksandrov <nikolay@cumulusnetworks.com>

^ permalink raw reply

* [PATCH][next] bpf: devmap: remove redundant assignment of dev = dev
From: Colin King @ 2018-05-30 15:09 UTC (permalink / raw)
  To: Alexei Starovoitov, Daniel Borkmann, netdev; +Cc: kernel-janitors, linux-kernel

From: Colin Ian King <colin.king@canonical.com>

The assignment dev = dev is redundant and should be removed.

Detected by CoverityScan, CID#1469486 ("Evaluation order violation")

Signed-off-by: Colin Ian King <colin.king@canonical.com>
---
 kernel/bpf/devmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/bpf/devmap.c b/kernel/bpf/devmap.c
index ae16d0c373ef..1fe3fe60508a 100644
--- a/kernel/bpf/devmap.c
+++ b/kernel/bpf/devmap.c
@@ -352,7 +352,7 @@ int dev_map_enqueue(struct bpf_dtab_netdev *dst, struct xdp_buff *xdp,
 static void *dev_map_lookup_elem(struct bpf_map *map, void *key)
 {
 	struct bpf_dtab_netdev *obj = __dev_map_lookup_elem(map, *(u32 *)key);
-	struct net_device *dev = dev = obj ? obj->dev : NULL;
+	struct net_device *dev = obj ? obj->dev : NULL;
 
 	return dev ? &dev->ifindex : NULL;
 }
-- 
2.17.0

^ permalink raw reply related

* Re: [PATCH mlx5-next 1/2] net/mlx5: Add temperature warning event to log
From: Saeed Mahameed @ 2018-05-30 15:08 UTC (permalink / raw)
  To: andrew@lunn.ch
  Cc: Jason Gunthorpe, netdev@vger.kernel.org, Ilan Tayari,
	linux-rdma@vger.kernel.org, Leon Romanovsky, Adi Nissim
In-Reply-To: <20180530010404.GB30239@lunn.ch>

On Wed, 2018-05-30 at 03:04 +0200, Andrew Lunn wrote:
> On Tue, May 29, 2018 at 05:19:53PM -0700, Saeed Mahameed wrote:
> > From: Ilan Tayari <ilant@mellanox.com>
> > 
> > Temperature warning event is sent by FW to indicate high
> > temperature
> > as detected by one of the sensors on the board.
> > Add handling of this event by writing the numbers of the alert
> > sensors
> > to the kernel log.
> 
> Hi Saaed
> 
> Is the temperature itself available? If so, it would be better to
> expose this as a hwmon device per temperature sensor.
> 

Hi Andrew, yes the temperature is available by other means, this patch
is needed for alert information reasons in order to know which internal
sensors triggered the alarm.
We are working in parallel to expose temperature sensor to hwmon, but
this is still WIP.


Is it ok to have both ?

>        Andrew

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox