Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH v2] seg6_iptunnel: Refactor seg6_lwt_headroom out of uapi header
From: David Miller @ 2020-07-31 21:25 UTC (permalink / raw)
  To: ioanaruxandra.stancioi
  Cc: david.lebrun, kuznet, yoshfuji, netdev, linux-kernel, elver,
	glider, stancioi
In-Reply-To: <20200731074032.293456-1-ioanaruxandra.stancioi@gmail.com>

From: Ioana-Ruxandra Stancioi <ioanaruxandra.stancioi@gmail.com>
Date: Fri, 31 Jul 2020 07:40:32 +0000

> --- a/net/ipv6/seg6_iptunnel.c
> +++ b/net/ipv6/seg6_iptunnel.c
> @@ -27,6 +27,23 @@
>  #include <net/seg6_hmac.h>
>  #endif
>  
> +static inline size_t seg6_lwt_headroom(struct seg6_iptunnel_encap *tuninfo)
> +{

Please remove the inline tag when you move the function here, we do
not use the inline keyword in foo.c files.

Thank you.

^ permalink raw reply

* Re: [RFC PATCH bpf-next 2/3] bpf: Add helper to do forwarding lookups in kernel FDB table
From: Daniel Borkmann @ 2020-07-31 21:12 UTC (permalink / raw)
  To: Yoshiki Komachi, David S. Miller, Alexei Starovoitov,
	Jesper Dangaard Brouer, John Fastabend, Jakub Kicinski,
	Martin KaFai Lau, Song Liu, Yonghong Song, Andrii Nakryiko,
	KP Singh, Roopa Prabhu, Nikolay Aleksandrov, David Ahern
  Cc: netdev, bridge, bpf
In-Reply-To: <1596170660-5582-3-git-send-email-komachi.yoshiki@gmail.com>

On 7/31/20 6:44 AM, Yoshiki Komachi wrote:
> This patch adds a new bpf helper to access FDB in the kernel tables
> from XDP programs. The helper enables us to find the destination port
> of master bridge in XDP layer with high speed. If an entry in the
> tables is successfully found, egress device index will be returned.
> 
> In cases of failure, packets will be dropped or forwarded to upper
> networking stack in the kernel by XDP programs. Multicast and broadcast
> packets are currently not supported. Thus, these will need to be
> passed to upper layer on the basis of XDP_PASS action.
> 
> The API uses destination MAC and VLAN ID as keys, so XDP programs
> need to extract these from forwarded packets.
> 
> Signed-off-by: Yoshiki Komachi <komachi.yoshiki@gmail.com>

Few initial comments below:

> ---
>   include/uapi/linux/bpf.h       | 28 +++++++++++++++++++++
>   net/core/filter.c              | 45 ++++++++++++++++++++++++++++++++++
>   scripts/bpf_helpers_doc.py     |  1 +
>   tools/include/uapi/linux/bpf.h | 28 +++++++++++++++++++++
>   4 files changed, 102 insertions(+)
> 
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index 54d0c886e3ba..f2e729dd1721 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -2149,6 +2149,22 @@ union bpf_attr {
>    *		* > 0 one of **BPF_FIB_LKUP_RET_** codes explaining why the
>    *		  packet is not forwarded or needs assist from full stack
>    *
> + * long bpf_fdb_lookup(void *ctx, struct bpf_fdb_lookup *params, int plen, u32 flags)
> + *	Description
> + *		Do FDB lookup in kernel tables using parameters in *params*.
> + *		If lookup is successful (ie., FDB lookup finds a destination entry),
> + *		ifindex is set to the egress device index from the FDB lookup.
> + *		Both multicast and broadcast packets are currently unsupported
> + *		in XDP layer.
> + *
> + *		*plen* argument is the size of the passed **struct bpf_fdb_lookup**.
> + *		*ctx* is only **struct xdp_md** for XDP programs.
> + *
> + *     Return
> + *		* < 0 if any input argument is invalid
> + *		*   0 on success (destination port is found)
> + *		* > 0 on failure (there is no entry)
> + *
>    * long bpf_sock_hash_update(struct bpf_sock_ops *skops, struct bpf_map *map, void *key, u64 flags)
>    *	Description
>    *		Add an entry to, or update a sockhash *map* referencing sockets.
> @@ -3449,6 +3465,7 @@ union bpf_attr {
>   	FN(get_stack),			\
>   	FN(skb_load_bytes_relative),	\
>   	FN(fib_lookup),			\
> +	FN(fdb_lookup),			\

This breaks UAPI. Needs to be added to the very end of the list.

>   	FN(sock_hash_update),		\
>   	FN(msg_redirect_hash),		\
>   	FN(sk_redirect_hash),		\
> @@ -4328,6 +4345,17 @@ struct bpf_fib_lookup {
>   	__u8	dmac[6];     /* ETH_ALEN */
>   };
>   
> +enum {
> +	BPF_FDB_LKUP_RET_SUCCESS,      /* lookup successful */
> +	BPF_FDB_LKUP_RET_NOENT,        /* entry is not found */
> +};
> +
> +struct bpf_fdb_lookup {
> +	unsigned char addr[6];     /* ETH_ALEN */
> +	__u16 vlan_id;
> +	__u32 ifindex;
> +};
> +
>   enum bpf_task_fd_type {
>   	BPF_FD_TYPE_RAW_TRACEPOINT,	/* tp name */
>   	BPF_FD_TYPE_TRACEPOINT,		/* tp name */
> diff --git a/net/core/filter.c b/net/core/filter.c
> index 654c346b7d91..68800d1b8cd5 100644
> --- a/net/core/filter.c
> +++ b/net/core/filter.c
> @@ -45,6 +45,7 @@
>   #include <linux/filter.h>
>   #include <linux/ratelimit.h>
>   #include <linux/seccomp.h>
> +#include <linux/if_bridge.h>
>   #include <linux/if_vlan.h>
>   #include <linux/bpf.h>
>   #include <linux/btf.h>
> @@ -5084,6 +5085,46 @@ static const struct bpf_func_proto bpf_skb_fib_lookup_proto = {
>   	.arg4_type	= ARG_ANYTHING,
>   };
>   
> +#if IS_ENABLED(CONFIG_BRIDGE)
> +BPF_CALL_4(bpf_xdp_fdb_lookup, struct xdp_buff *, ctx,
> +	   struct bpf_fdb_lookup *, params, int, plen, u32, flags)
> +{
> +	struct net_device *src, *dst;
> +	struct net *net;
> +
> +	if (plen < sizeof(*params))
> +		return -EINVAL;

Given flags are not used, this needs to reject anything invalid otherwise
you're not able to extend it in future.

> +	net = dev_net(ctx->rxq->dev);
> +
> +	if (is_multicast_ether_addr(params->addr) ||
> +	    is_broadcast_ether_addr(params->addr))
> +		return BPF_FDB_LKUP_RET_NOENT;
> +
> +	src = dev_get_by_index_rcu(net, params->ifindex);
> +	if (unlikely(!src))
> +		return -ENODEV;
> +
> +	dst = br_fdb_find_port_xdp(src, params->addr, params->vlan_id);
> +	if (dst) {
> +		params->ifindex = dst->ifindex;
> +		return BPF_FDB_LKUP_RET_SUCCESS;
> +	}

Currently the helper description says nothing that this is /only/ limited to
bridges. I think it would be better to also name the helper bpf_br_fdb_lookup()
as well if so to avoid any confusion.

> +	return BPF_FDB_LKUP_RET_NOENT;
> +}
> +
> +static const struct bpf_func_proto bpf_xdp_fdb_lookup_proto = {
> +	.func		= bpf_xdp_fdb_lookup,
> +	.gpl_only	= true,
> +	.ret_type	= RET_INTEGER,
> +	.arg1_type      = ARG_PTR_TO_CTX,
> +	.arg2_type      = ARG_PTR_TO_MEM,
> +	.arg3_type      = ARG_CONST_SIZE,
> +	.arg4_type	= ARG_ANYTHING,
> +};
> +#endif

This should also have a tc pendant (similar as done in routing lookup helper)
in case native XDP is not available. This will be useful for those that have
the same code compilable for both tc/XDP.

>   #if IS_ENABLED(CONFIG_IPV6_SEG6_BPF)
>   static int bpf_push_seg6_encap(struct sk_buff *skb, u32 type, void *hdr, u32 len)
>   {
> @@ -6477,6 +6518,10 @@ xdp_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
>   		return &bpf_xdp_adjust_tail_proto;
>   	case BPF_FUNC_fib_lookup:
>   		return &bpf_xdp_fib_lookup_proto;
> +#if IS_ENABLED(CONFIG_BRIDGE)
> +	case BPF_FUNC_fdb_lookup:
> +		return &bpf_xdp_fdb_lookup_proto;
> +#endif
>   #ifdef CONFIG_INET
>   	case BPF_FUNC_sk_lookup_udp:
>   		return &bpf_xdp_sk_lookup_udp_proto;

^ permalink raw reply

* [PATCH net-next] cxgb4: fix extracting IP addresses in TC-FLOWER rules
From: Rahul Lakkireddy @ 2020-07-31 20:53 UTC (permalink / raw)
  To: netdev; +Cc: davem, vishal, dt

commit c8729cac2a11 ("cxgb4: add ethtool n-tuple filter insertion")
has removed checking control key for determining IP address types
for TC-FLOWER rules, which causes all the rules being inserted to
hardware to become IPv6 rule type always. So, add back the check
to select the correct IP address type to extract and hence fix the
correct rule type being inserted to hardware.

Also, ethtool_rx_flow_key doesn't have any control key and instead
directly sets the IPv4/IPv6 address keys. So, explicitly set the
IP address type for ethtool n-tuple filters to reuse the same code.

Fixes: c8729cac2a11 ("cxgb4: add ethtool n-tuple filter insertion")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
---
 .../ethernet/chelsio/cxgb4/cxgb4_tc_flower.c    | 17 +++++++++++++++--
 1 file changed, 15 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
index 6251444ffa58..f642c1b475c4 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_tc_flower.c
@@ -80,6 +80,19 @@ static void cxgb4_process_flow_match(struct net_device *dev,
 				     struct flow_rule *rule,
 				     struct ch_filter_specification *fs)
 {
+	u16 addr_type = 0;
+
+	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_CONTROL)) {
+		struct flow_match_control match;
+
+		flow_rule_match_control(rule, &match);
+		addr_type = match.key->addr_type;
+	} else if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IPV4_ADDRS)) {
+		addr_type = FLOW_DISSECTOR_KEY_IPV4_ADDRS;
+	} else if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IPV6_ADDRS)) {
+		addr_type = FLOW_DISSECTOR_KEY_IPV6_ADDRS;
+	}
+
 	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_BASIC)) {
 		struct flow_match_basic match;
 		u16 ethtype_key, ethtype_mask;
@@ -102,7 +115,7 @@ static void cxgb4_process_flow_match(struct net_device *dev,
 		fs->mask.proto = match.mask->ip_proto;
 	}
 
-	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IPV4_ADDRS)) {
+	if (addr_type == FLOW_DISSECTOR_KEY_IPV4_ADDRS) {
 		struct flow_match_ipv4_addrs match;
 
 		flow_rule_match_ipv4_addrs(rule, &match);
@@ -117,7 +130,7 @@ static void cxgb4_process_flow_match(struct net_device *dev,
 		memcpy(&fs->nat_fip[0], &match.key->src, sizeof(match.key->src));
 	}
 
-	if (flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_IPV6_ADDRS)) {
+	if (addr_type == FLOW_DISSECTOR_KEY_IPV6_ADDRS) {
 		struct flow_match_ipv6_addrs match;
 
 		flow_rule_match_ipv6_addrs(rule, &match);
-- 
2.24.0


^ permalink raw reply related

* [PATCH net-next] cxgb4: fix check for running offline ethtool selftest
From: Rahul Lakkireddy @ 2020-07-31 20:50 UTC (permalink / raw)
  To: netdev; +Cc: davem, vishal, dt

The flag indicating the selftest to run is a bitmask. So, fix the
check. Also, the selftests will fail if adapter initialization has
not been completed yet. So, add appropriate check and bail sooner.

Fixes: 7235ffae3d2c ("cxgb4: add loopback ethtool self-test")
Signed-off-by: Rahul Lakkireddy <rahul.lakkireddy@chelsio.com>
---
 drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
index 12ef9ddd1e54..9f3173f86eed 100644
--- a/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
+++ b/drivers/net/ethernet/chelsio/cxgb4/cxgb4_ethtool.c
@@ -31,7 +31,7 @@ enum cxgb4_ethtool_tests {
 };
 
 static const char cxgb4_selftest_strings[CXGB4_ETHTOOL_MAX_TEST][ETH_GSTRING_LEN] = {
-	"Loop back test",
+	"Loop back test (offline)",
 };
 
 static const char * const flash_region_strings[] = {
@@ -2095,12 +2095,13 @@ static void cxgb4_self_test(struct net_device *netdev,
 
 	memset(data, 0, sizeof(u64) * CXGB4_ETHTOOL_MAX_TEST);
 
-	if (!(adap->flags & CXGB4_FW_OK)) {
+	if (!(adap->flags & CXGB4_FULL_INIT_DONE) ||
+	    !(adap->flags & CXGB4_FW_OK)) {
 		eth_test->flags |= ETH_TEST_FL_FAILED;
 		return;
 	}
 
-	if (eth_test->flags == ETH_TEST_FL_OFFLINE)
+	if (eth_test->flags & ETH_TEST_FL_OFFLINE)
 		cxgb4_lb_test(netdev, &data[CXGB4_ETHTOOL_LB_TEST]);
 
 	if (data[CXGB4_ETHTOOL_LB_TEST])
-- 
2.24.0


^ permalink raw reply related

* Re: [net-next v4 1/6] net: marvell: prestera: Add driver for Prestera family ASIC devices
From: Vadym Kochan @ 2020-07-31 20:55 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: David S. Miller, Jakub Kicinski, Jiri Pirko, Ido Schimmel,
	Andrew Lunn, Oleksandr Mazur, Serhiy Boiko, Serhiy Pshyk,
	Volodymyr Mytnyk, Taras Chornyi, Andrii Savka, netdev,
	Linux Kernel Mailing List, Mickey Rachamim
In-Reply-To: <CAHp75VcS4fEak3z0exODErs5FbDwf+Di1RJmf7JfMgnD8xgXOA@mail.gmail.com>

Hi Andy,

On Fri, Jul 31, 2020 at 07:02:47PM +0300, Andy Shevchenko wrote:
> On Fri, Jul 31, 2020 at 6:22 PM Vadym Kochan <vadym.kochan@plvision.eu> wrote:
> > On Mon, Jul 27, 2020 at 03:59:13PM +0300, Andy Shevchenko wrote:
> > > On Mon, Jul 27, 2020 at 3:23 PM Vadym Kochan <vadym.kochan@plvision.eu> wrote:
> 
> ...
> 
> > > > Signed-off-by: Andrii Savka <andrii.savka@plvision.eu>
> > > > Signed-off-by: Oleksandr Mazur <oleksandr.mazur@plvision.eu>
> > > > Signed-off-by: Serhiy Boiko <serhiy.boiko@plvision.eu>
> > > > Signed-off-by: Serhiy Pshyk <serhiy.pshyk@plvision.eu>
> > > > Signed-off-by: Taras Chornyi <taras.chornyi@plvision.eu>
> > > > Signed-off-by: Volodymyr Mytnyk <volodymyr.mytnyk@plvision.eu>
> > > > Signed-off-by: Vadym Kochan <vadym.kochan@plvision.eu>
> > >
> > > This needs more work. You have to really understand the role of each
> > > person in the above list.
> > > I highly recommend (re-)read sections 11-13 of Submitting Patches.
> > >
> > At least looks like I need to add these persons as Co-author's.
> 
> I don't know, you are!
> And I think you meant Co-developer's
> 
> ...
> 
> > > > +#include <linux/string.h>
> > > > +#include <linux/bitops.h>
> > > > +#include <linux/bitfield.h>
> > > > +#include <linux/errno.h>
> > >
> > > Perhaps ordered?
> > >
> > alphabetical ?
> 
> Yes.
> 
> ...
> 
> > > > +       struct prestera_msg_event_port *hw_evt;
> > > > +
> > > > +       hw_evt = (struct prestera_msg_event_port *)msg;
> > >
> > > Can be one line I suppose.
> > >
> > Yes, but I am trying to avoid line-breaking because of 80 chars
> > limitation.
> 
> We have 100, but okay.
> 
> ...
> 
> > > > +       if (evt->id == PRESTERA_PORT_EVENT_STATE_CHANGED)
> > > > +               evt->port_evt.data.oper_state = hw_evt->param.oper_state;
> > > > +       else
> > > > +               return -EINVAL;
> > > > +
> > > > +       return 0;
> > >
> > > Perhaps traditional pattern, i.e.
> > >
> > >   if (...)
> > >     return -EINVAL;
> > >   ...
> > >   return 0;
> > >
> > I am not sure if it is applicable here, because the error state here
> > is if 'evt->id' did not matched after all checks. Actually this is
> > simply a 'switch', but I use 'if' to have shorter code.
> 
> Then do it a switch-case. I can see that other reviewers/contributors
> may stumble over this.
> 
> ...
> 
> > > > +       /* Only 0xFF mac addrs are supported */
> > > > +       if (port->fp_id >= 0xFF)
> > > > +               goto err_port_init;
> > >
> > > You meant 255, right?
> > > Otherwise you have to mentioned is it byte limitation or what?
> > >
> > > ...
> > Yes, 255 is a limitation because of max byte value.
> 
> But 255 itself is some kind of error value? Perhaps it deserves a definition.
> 
> ...
> 
> > > > +       np = of_find_compatible_node(NULL, NULL, "marvell,prestera");
> > > > +       if (np) {
> > > > +               base_mac_np = of_parse_phandle(np, "base-mac-provider", 0);
> > > > +               if (base_mac_np) {
> > > > +                       const char *base_mac;
> > > > +
> > > > +                       base_mac = of_get_mac_address(base_mac_np);
> > > > +                       of_node_put(base_mac_np);
> > > > +                       if (!IS_ERR(base_mac))
> > > > +                               ether_addr_copy(sw->base_mac, base_mac);
> > > > +               }
> > > > +       }
> > > > +
> > > > +       if (!is_valid_ether_addr(sw->base_mac)) {
> > > > +               eth_random_addr(sw->base_mac);
> > > > +               dev_info(sw->dev->dev, "using random base mac address\n");
> > > > +       }
> > >
> > > Isn't it device_get_mac_address() reimplementation?
> > >
> > device_get_mac_address() just tries to get mac via fwnode.
> 
> Yes, and how is it different from here? (consider
> fwnode_get_mac_address() if it suits better).
> 
In this case of_get_mac_address() tries to get mac address from
different sources:

    1) device-tree - if it is defined here

    2) nvmem device (eeprom) - if nvmem ref node was pointed in device-tree
       and nvmem provider has the mac address cell.

otherwise the driver will just use the random one.

> ...
> 
> > > > +       new_skb = alloc_skb(len, GFP_ATOMIC | GFP_DMA);
> > >
> > > Atomic? Why?
> > >
> > TX path might be called from net_tx_action which is softirq.
> 
> Okay, but GFP_DMA is quite a limitation to the memory region. Can't  be 32-bit?
> 
Yes in this case there are a limitation when the device supports only
30bit (this dma mask is used in prestera_pci.c), physical address range
even on 64bit platform.

And thankfully few months ago someone added such ability for RaspberryPI
(aarch64) to support ZONE_DMA.

> ...
> 
> > > > +       int tx_retry_num = 10 * tx_ring->max_burst;
> > >
> > > Magic!
> > You mean the code is magic ? Yes, I am trying to relax the
> > calling of SDMA engine.
> 
> Usually when reviewers tell you about magic it assumes magic numbers
> whose meaning is not clear.
> (Requires either to be defined or commented)
> 
> -- 
> With Best Regards,
> Andy Shevchenko

^ permalink raw reply

* [PATCH bpf-next] selftests/bpf: fix spurious test failures in core_retro selftest
From: Andrii Nakryiko @ 2020-07-31 20:49 UTC (permalink / raw)
  To: bpf, netdev, ast, daniel; +Cc: andrii.nakryiko, kernel-team, Andrii Nakryiko

core_retro selftest uses BPF program that's triggered on sys_enter
system-wide, but has no protection from some unrelated process doing syscall
while selftest is running. This leads to occasional test failures with
unexpected PIDs being returned. Fix that by filtering out all processes that
are not test_progs process.

Fixes: fcda189a5133 ("selftests/bpf: Add test relying only on CO-RE and no recent kernel features")
Signed-off-by: Andrii Nakryiko <andriin@fb.com>
---
 tools/testing/selftests/bpf/prog_tests/core_retro.c |  8 ++++++--
 tools/testing/selftests/bpf/progs/test_core_retro.c | 13 +++++++++++++
 2 files changed, 19 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/bpf/prog_tests/core_retro.c b/tools/testing/selftests/bpf/prog_tests/core_retro.c
index 78e30d3a23d5..6acb0e94d4d7 100644
--- a/tools/testing/selftests/bpf/prog_tests/core_retro.c
+++ b/tools/testing/selftests/bpf/prog_tests/core_retro.c
@@ -6,7 +6,7 @@
 
 void test_core_retro(void)
 {
-	int err, zero = 0, res, duration = 0;
+	int err, zero = 0, res, duration = 0, my_pid = getpid();
 	struct test_core_retro *skel;
 
 	/* load program */
@@ -14,6 +14,10 @@ void test_core_retro(void)
 	if (CHECK(!skel, "skel_load", "skeleton open/load failed\n"))
 		goto out_close;
 
+	err = bpf_map_update_elem(bpf_map__fd(skel->maps.exp_tgid_map), &zero, &my_pid, 0);
+	if (CHECK(err, "map_update", "failed to set expected PID: %d\n", errno))
+		goto out_close;
+
 	/* attach probe */
 	err = test_core_retro__attach(skel);
 	if (CHECK(err, "attach_kprobe", "err %d\n", err))
@@ -26,7 +30,7 @@ void test_core_retro(void)
 	if (CHECK(err, "map_lookup", "failed to lookup result: %d\n", errno))
 		goto out_close;
 
-	CHECK(res != getpid(), "pid_check", "got %d != exp %d\n", res, getpid());
+	CHECK(res != my_pid, "pid_check", "got %d != exp %d\n", res, my_pid);
 
 out_close:
 	test_core_retro__destroy(skel);
diff --git a/tools/testing/selftests/bpf/progs/test_core_retro.c b/tools/testing/selftests/bpf/progs/test_core_retro.c
index 75c60c3c29cf..20861ec2f674 100644
--- a/tools/testing/selftests/bpf/progs/test_core_retro.c
+++ b/tools/testing/selftests/bpf/progs/test_core_retro.c
@@ -8,6 +8,13 @@ struct task_struct {
 	int tgid;
 } __attribute__((preserve_access_index));
 
+struct {
+	__uint(type, BPF_MAP_TYPE_ARRAY);
+	__uint(max_entries, 1);
+	__type(key, int);
+	__type(value, int);
+} exp_tgid_map SEC(".maps");
+
 struct {
 	__uint(type, BPF_MAP_TYPE_ARRAY);
 	__uint(max_entries, 1);
@@ -21,6 +28,12 @@ int handle_sys_enter(void *ctx)
 	struct task_struct *task = (void *)bpf_get_current_task();
 	int tgid = BPF_CORE_READ(task, tgid);
 	int zero = 0;
+	int real_tgid = bpf_get_current_pid_tgid() >> 32;
+	int *exp_tgid = bpf_map_lookup_elem(&exp_tgid_map, &zero);
+
+	/* only pass through sys_enters from test process */
+	if (!exp_tgid || *exp_tgid != real_tgid)
+		return 0;
 
 	bpf_map_update_elem(&results, &zero, &tgid, 0);
 
-- 
2.24.1


^ permalink raw reply related

* Re: [PATCH v2] drivers/net/wan/lapbether: Use needed_headroom instead of hard_header_len
From: Xie He @ 2020-07-31 20:40 UTC (permalink / raw)
  To: Willem de Bruijn
  Cc: David S. Miller, Jakub Kicinski, Linux Kernel Network Developers,
	LKML, Linux X25, Brian Norris
In-Reply-To: <CA+FuTSf_nuiah6rFy-KC1Taw+Wc4z0G7LzkAm-+Ms4FzYmTPEw@mail.gmail.com>

Thank you for your thorough review comment!

On Fri, Jul 31, 2020 at 7:13 AM Willem de Bruijn
<willemdebruijn.kernel@gmail.com> wrote:
>
> Thanks for fixing a kernel panic. The existing line was added recently
> in commit 9dc829a135fb ("drivers/net/wan/lapbether: Fixed the value of
> hard_header_len"). I assume a kernel with that commit reverted also
> panics? It does looks like it would.

Yes, that commit also fixed kernel panic. But that patch only fixed
kernel panic when using AF_PACKET/DGRAM sockets. It didn't fix kernel
panic when using AF_PACKET/RAW sockets. This patch attempts to fix
kernel panic when using AF_PACKET/RAW sockets, too.

> If this driver submits a modified packet to an underlying eth device,
> it is akin to tunnel drivers. The hard_header_len vs needed_headroom
> discussion also came up there recently [1]. That discussion points to
> commit c95b819ad75b ("gre: Use needed_headroom"). So the general
> approach in this patch is fine. Do note the point about mtu
> calculations -- but this device just hardcodes a 1000 byte dev->mtu
> irrespective of underlying ethernet device mtu, so I guess it has
> bigger issues on that point.

Yes, I didn't consider the issue of mtu calculation. Maybe we need to
calculate the mtu of this device based on the underlying Ethernet
device, too.

We may also need to handle the situation where the mtu of the
underlying Ethernet device changes.

I'm not sure if the mtu of the device can be changed by the user
without explicit support from the driver. If it can, we may also need
to set max_mtu and min_mtu properly to prevent the user from setting
it to invalid values.

> But, packet sockets with SOCK_RAW have to pass a fully formed packet
> with all the headers the ndo_start_xmit expects, i.e., it should be
> safe for the device to just pull that many bytes. X25 requires the
> peculiar one byte pseudo header you mention: lapbeth_xmit
> unconditionally reads skb->data[0] and then calls skb_pull(skb, 1).
> This could be considered the device hard header len.

Yes, I agree that we can use hard_header_len (and min_header_len) to
prevent packets shorter than 1 byte from passing.

But because af_packet.c reserves a header space of needed_headroom for
RAW sockets, but hard_header_len + needed_headroom for DGRAM sockets,
it appears to me that af_packet.c expects hard_header_len to be the
header length created by dev_hard_header. We can, however, set
hard_header_len to 1 and let dev_hard_header generate a 0-sized
header, but this makes af_packet.c to reserve an extra unused 1-byte
header space for DGRAM sockets, and DGRAM sockets will not be
protected by the 1-byte minimum length check like RAW sockets.

The best solution might be to implement header_ops for X.25 drivers
and let dev_hard_header create this 1-byte header, so that
hard_header_len can equal to the header length created by
dev_hard_header. This might be the best way to fit the logic of
af_packet.c. But this requires changing the interface of X.25 drivers
so it might be a big change.

^ permalink raw reply

* Re: [net-next 0/4] devlink flash update overwrite mask
From: Jacob Keller @ 2020-07-31 20:35 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Jiri Pirko, Jonathan Corbet, Michael Chan, Bin Luo,
	Saeed Mahameed, Leon Romanovsky, Ido Schimmel, Danielle Ratson
In-Reply-To: <20200730173928.676a7a29@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>



On 7/30/2020 5:39 PM, Jakub Kicinski wrote:
> LGTM,
> 
> minor suggestions:
>  - could you add opt-in support flags to struct devlink_ops, a'la
>    ethtool_ops->supported_coalesce_params so that you don't have to
>    modify all drivers to reject unsupported things?

Sure that makes sense.

>  - could you split patch 2 into an ice change and a devlink core
>    change?
> 

Yep.

^ permalink raw reply

* Re: [PATCH v2 0/3]
From: Jakub Kicinski @ 2020-07-31 20:31 UTC (permalink / raw)
  To: Rakesh Pillai; +Cc: ath10k, linux-wireless, linux-kernel, kvalo, davem, netdev
In-Reply-To: <1596220042-2778-1-git-send-email-pillair@codeaurora.org>

On Fri, 31 Jul 2020 23:57:19 +0530 Rakesh Pillai wrote:
> The history recording will be compiled only if
> ATH10K_DEBUG is enabled, and also enabled via
> the module parameter. Once the history recording
> is enabled via module parameter, it can be enabled
> or disabled runtime via debugfs.

Have you seen the trace_devlink_hwmsg() interface?
Could it be used here?

^ permalink raw reply

* Re: [PATCH v2 net-next 08/11] sfc_ef100: statistics gathering
From: Jakub Kicinski @ 2020-07-31 20:27 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev
In-Reply-To: <64c11143-4749-5891-85eb-c312f1de721e@solarflare.com>

On Fri, 31 Jul 2020 14:00:24 +0100 Edward Cree wrote:
> +	core_stats->rx_packets = stats[EF100_STAT_port_rx_packets];
> +	core_stats->tx_packets = stats[EF100_STAT_port_tx_packets];
> +	core_stats->rx_bytes = stats[EF100_STAT_port_rx_bytes];
> +	core_stats->tx_bytes = stats[EF100_STAT_port_tx_bytes];

> +	core_stats->multicast = stats[EF100_STAT_port_rx_multicast];

> +	core_stats->rx_crc_errors = stats[EF100_STAT_port_rx_bad];
> +	core_stats->rx_frame_errors =
> +			stats[EF100_STAT_port_rx_align_error];
> +	core_stats->rx_fifo_errors = stats[EF100_STAT_port_rx_overflow];

Since this is a new driver please stop reporting this in ethtool.
They clearly have a perfect match with the standard stats there is 
no need for duplication.

^ permalink raw reply

* Re: KASAN: slab-out-of-bounds Read in qrtr_endpoint_post (2)
From: syzbot @ 2020-07-31 20:27 UTC (permalink / raw)
  To: bjorn.andersson, davem, kuba, linux-kernel, manivannan.sadhasivam,
	netdev, syzkaller-bugs
In-Reply-To: <000000000000c5c9ad05abbfc71b@google.com>

syzbot has bisected this issue to:

commit e42671084361302141a09284fde9bbc14fdd16bf
Author: Manivannan Sadhasivam <manivannan.sadhasivam@linaro.org>
Date:   Thu May 7 12:53:06 2020 +0000

    net: qrtr: Do not depend on ARCH_QCOM

bisection log:  https://syzkaller.appspot.com/x/bisect.txt?x=12a8076c900000
start commit:   83bdc727 random32: remove net_rand_state from the latent e..
git tree:       upstream
final oops:     https://syzkaller.appspot.com/x/report.txt?x=11a8076c900000
console output: https://syzkaller.appspot.com/x/log.txt?x=16a8076c900000
kernel config:  https://syzkaller.appspot.com/x/.config?x=e59ee776d5aa8d55
dashboard link: https://syzkaller.appspot.com/bug?extid=1917d778024161609247
syz repro:      https://syzkaller.appspot.com/x/repro.syz?x=14ac9b60900000
C reproducer:   https://syzkaller.appspot.com/x/repro.c?x=14256c5c900000

Reported-by: syzbot+1917d778024161609247@syzkaller.appspotmail.com
Fixes: e42671084361 ("net: qrtr: Do not depend on ARCH_QCOM")

For information about bisection process see: https://goo.gl/tpsmEJ#bisection

^ permalink raw reply

* Re: [PATCH v2 net-next 03/11] sfc_ef100: read Design Parameters at probe time
From: Jakub Kicinski @ 2020-07-31 20:18 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev
In-Reply-To: <48b1fedf-0863-8fab-7f7a-e2df6946b764@solarflare.com>

On Fri, 31 Jul 2020 13:58:35 +0100 Edward Cree wrote:
> +	default:
> +		/* Host interface says "Drivers should ignore design parameters
> +		 * that they do not recognise."
> +		 */
> +		netif_info(efx, probe, efx->net_dev,
> +			   "Ignoring unrecognised design parameter %u\n",
> +			   reader->type);

Is this really important enough to spam the logs with?

> +		return 0;
> +	}
> +}
> +
> +static int ef100_check_design_params(struct efx_nic *efx)
> +{
> +	struct ef100_tlv_state reader = {};
> +	u32 total_len, offset = 0;
> +	efx_dword_t reg;
> +	int rc = 0, i;
> +	u32 data;
> +
> +	efx_readd(efx, &reg, ER_GZ_PARAMS_TLV_LEN);
> +	total_len = EFX_DWORD_FIELD(reg, EFX_DWORD_0);
> +	netif_dbg(efx, probe, efx->net_dev, "%u bytes of design parameters\n",
> +		  total_len);
> +	while (offset < total_len) {
> +		efx_readd(efx, &reg, ER_GZ_PARAMS_TLV + offset);
> +		data = EFX_DWORD_FIELD(reg, EFX_DWORD_0);
> +		for (i = 0; i < sizeof(data); i++) {
> +			rc = ef100_tlv_feed(&reader, data);
> +			/* Got a complete value? */
> +			if (!rc && reader.state == EF100_TLV_TYPE)
> +				rc = ef100_process_design_param(efx, &reader);
> +			if (rc)
> +				goto out;
> +			data >>= 8;
> +			offset++;
> +		}
> +	}

Should you warn if the TLV stream ends half-way through an entry?

> +out:
> +	return rc;
> +}

^ permalink raw reply

* [PATCH v3 net-next 2/3] ionic: tx separate servicing
From: Shannon Nelson @ 2020-07-31 20:15 UTC (permalink / raw)
  To: netdev, davem; +Cc: Shannon Nelson
In-Reply-To: <20200731201536.18246-1-snelson@pensando.io>

We give the tx clean path its own budget and service routine in
order to give a little more leeway to be more aggressive, and
in preparation for coming changes.  We've found this gives us
a little better performance in some packet processing scenarios
without hurting other scenarios.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
---
 .../net/ethernet/pensando/ionic/ionic_lif.c   |   1 +
 .../net/ethernet/pensando/ionic/ionic_lif.h   |   2 +
 .../net/ethernet/pensando/ionic/ionic_txrx.c  | 106 +++++++++---------
 3 files changed, 53 insertions(+), 56 deletions(-)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
index 728dd6429d80..b228362363ca 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
@@ -2063,6 +2063,7 @@ static struct ionic_lif *ionic_lif_alloc(struct ionic *ionic, unsigned int index
 	lif->index = index;
 	lif->ntxq_descs = IONIC_DEF_TXRX_DESC;
 	lif->nrxq_descs = IONIC_DEF_TXRX_DESC;
+	lif->tx_budget = IONIC_TX_BUDGET_DEFAULT;
 
 	/* Convert the default coalesce value to actual hw resolution */
 	lif->rx_coalesce_usecs = IONIC_ITR_COAL_USEC_DEFAULT;
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.h b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
index edad17d7aeeb..13f173d83970 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
@@ -13,6 +13,7 @@
 #define IONIC_MAX_NUM_NAPI_CNTR		(NAPI_POLL_WEIGHT + 1)
 #define IONIC_MAX_NUM_SG_CNTR		(IONIC_TX_MAX_SG_ELEMS + 1)
 #define IONIC_RX_COPYBREAK_DEFAULT	256
+#define IONIC_TX_BUDGET_DEFAULT		256
 
 struct ionic_tx_stats {
 	u64 dma_map_err;
@@ -176,6 +177,7 @@ struct ionic_lif {
 	unsigned int ntxq_descs;
 	unsigned int nrxq_descs;
 	u32 rx_copybreak;
+	u32 tx_budget;
 	unsigned int rx_mode;
 	u64 hw_features;
 	bool mc_overflow;
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_txrx.c b/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
index e660cd66f9a8..766f4595895c 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
@@ -15,6 +15,10 @@ static void ionic_rx_clean(struct ionic_queue *q,
 			   struct ionic_cq_info *cq_info,
 			   void *cb_arg);
 
+static bool ionic_rx_service(struct ionic_cq *cq, struct ionic_cq_info *cq_info);
+
+static bool ionic_tx_service(struct ionic_cq *cq, struct ionic_cq_info *cq_info);
+
 static inline void ionic_txq_post(struct ionic_queue *q, bool ring_dbell,
 				  ionic_desc_cb cb_func, void *cb_arg)
 {
@@ -249,29 +253,13 @@ static bool ionic_rx_service(struct ionic_cq *cq, struct ionic_cq_info *cq_info)
 	return true;
 }
 
-static u32 ionic_rx_walk_cq(struct ionic_cq *rxcq, u32 limit)
-{
-	u32 work_done = 0;
-
-	while (ionic_rx_service(rxcq, rxcq->tail)) {
-		if (rxcq->tail->last)
-			rxcq->done_color = !rxcq->done_color;
-		rxcq->tail = rxcq->tail->next;
-		DEBUG_STATS_CQE_CNT(rxcq);
-
-		if (++work_done >= limit)
-			break;
-	}
-
-	return work_done;
-}
-
 void ionic_rx_flush(struct ionic_cq *cq)
 {
 	struct ionic_dev *idev = &cq->lif->ionic->idev;
 	u32 work_done;
 
-	work_done = ionic_rx_walk_cq(cq, cq->num_descs);
+	work_done = ionic_cq_service(cq, cq->num_descs,
+				     ionic_rx_service, NULL, NULL);
 
 	if (work_done)
 		ionic_intr_credits(idev->intr_ctrl, cq->bound_intr->index,
@@ -439,32 +427,42 @@ int ionic_rx_napi(struct napi_struct *napi, int budget)
 	struct ionic_dev *idev;
 	struct ionic_lif *lif;
 	struct ionic_cq *txcq;
+	u32 rx_work_done = 0;
+	u32 tx_work_done = 0;
 	u32 work_done = 0;
 	u32 flags = 0;
+	bool unmask;
 
 	lif = rxcq->bound_q->lif;
 	idev = &lif->ionic->idev;
 	txcq = &lif->txqcqs[qi].qcq->cq;
 
-	ionic_tx_flush(txcq);
-
-	work_done = ionic_rx_walk_cq(rxcq, budget);
+	tx_work_done = ionic_cq_service(txcq, lif->tx_budget,
+					ionic_tx_service, NULL, NULL);
 
-	if (work_done)
+	rx_work_done = ionic_cq_service(rxcq, budget,
+					ionic_rx_service, NULL, NULL);
+	if (rx_work_done)
 		ionic_rx_fill_cb(rxcq->bound_q);
 
-	if (work_done < budget && napi_complete_done(napi, work_done)) {
+	unmask = (rx_work_done < budget) && (tx_work_done < lif->tx_budget);
+
+	if (unmask && napi_complete_done(napi, rx_work_done)) {
 		flags |= IONIC_INTR_CRED_UNMASK;
 		DEBUG_STATS_INTR_REARM(rxcq->bound_intr);
+		work_done = rx_work_done;
+	} else {
+		work_done = budget;
 	}
 
 	if (work_done || flags) {
 		flags |= IONIC_INTR_CRED_RESET_COALESCE;
 		ionic_intr_credits(idev->intr_ctrl, rxcq->bound_intr->index,
-				   work_done, flags);
+				   tx_work_done + rx_work_done, flags);
 	}
 
-	DEBUG_STATS_NAPI_POLL(qcq, work_done);
+	DEBUG_STATS_NAPI_POLL(qcq, rx_work_done);
+	DEBUG_STATS_NAPI_POLL(qcq, tx_work_done);
 
 	return work_done;
 }
@@ -552,43 +550,39 @@ static void ionic_tx_clean(struct ionic_queue *q,
 	}
 }
 
-void ionic_tx_flush(struct ionic_cq *cq)
+static bool ionic_tx_service(struct ionic_cq *cq, struct ionic_cq_info *cq_info)
 {
-	struct ionic_txq_comp *comp = cq->tail->cq_desc;
-	struct ionic_dev *idev = &cq->lif->ionic->idev;
+	struct ionic_txq_comp *comp = cq_info->cq_desc;
 	struct ionic_queue *q = cq->bound_q;
 	struct ionic_desc_info *desc_info;
-	unsigned int work_done = 0;
-
-	/* walk the completed cq entries */
-	while (work_done < cq->num_descs &&
-	       color_match(comp->color, cq->done_color)) {
-
-		/* clean the related q entries, there could be
-		 * several q entries completed for each cq completion
-		 */
-		do {
-			desc_info = q->tail;
-			q->tail = desc_info->next;
-			ionic_tx_clean(q, desc_info, cq->tail,
-				       desc_info->cb_arg);
-			desc_info->cb = NULL;
-			desc_info->cb_arg = NULL;
-		} while (desc_info->index != le16_to_cpu(comp->comp_index));
-
-		if (cq->tail->last)
-			cq->done_color = !cq->done_color;
-
-		cq->tail = cq->tail->next;
-		comp = cq->tail->cq_desc;
-		DEBUG_STATS_CQE_CNT(cq);
-
-		work_done++;
-	}
 
+	if (!color_match(comp->color, cq->done_color))
+		return false;
+
+	/* clean the related q entries, there could be
+	 * several q entries completed for each cq completion
+	 */
+	do {
+		desc_info = q->tail;
+		q->tail = desc_info->next;
+		ionic_tx_clean(q, desc_info, cq->tail, desc_info->cb_arg);
+		desc_info->cb = NULL;
+		desc_info->cb_arg = NULL;
+	} while (desc_info->index != le16_to_cpu(comp->comp_index));
+
+	return true;
+}
+
+void ionic_tx_flush(struct ionic_cq *cq)
+{
+	struct ionic_dev *idev = &cq->lif->ionic->idev;
+	u32 work_done;
+
+	work_done = ionic_cq_service(cq, cq->num_descs,
+				     ionic_tx_service, NULL, NULL);
 	if (work_done)
 		ionic_intr_credits(idev->intr_ctrl, cq->bound_intr->index,
-				   work_done, 0);
+				   work_done, IONIC_INTR_CRED_RESET_COALESCE);
 }
 
 void ionic_tx_empty(struct ionic_queue *q)
-- 
2.17.1


^ permalink raw reply related

* [PATCH v3 net-next 3/3] ionic: separate interrupt for Tx and Rx
From: Shannon Nelson @ 2020-07-31 20:15 UTC (permalink / raw)
  To: netdev, davem; +Cc: Shannon Nelson
In-Reply-To: <20200731201536.18246-1-snelson@pensando.io>

Add the capability to split the Tx queues onto their own
interrupts with their own napi contexts.  This gives the
opportunity for more direct control of Tx interrupt
handling, such as CPU affinity and interrupt coalescing,
useful for some traffic loads.

v2: use ethtool -L, not a vendor specific priv-flag
v3: simplify logging, drop unnecessary "no-change" tests

Reviewed-by: Jakub Kicinski <kuba@kernel.org>
Signed-off-by: Shannon Nelson <snelson@pensando.io>
---
 .../ethernet/pensando/ionic/ionic_ethtool.c   | 96 +++++++++++++++----
 .../net/ethernet/pensando/ionic/ionic_lif.c   | 41 ++++++--
 .../net/ethernet/pensando/ionic/ionic_lif.h   |  3 +
 .../net/ethernet/pensando/ionic/ionic_txrx.c  | 67 +++++++++++++
 .../net/ethernet/pensando/ionic/ionic_txrx.h  |  2 +
 5 files changed, 182 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
index 095561924bdc..3c57c331729f 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_ethtool.c
@@ -403,8 +403,7 @@ static int ionic_get_coalesce(struct net_device *netdev,
 {
 	struct ionic_lif *lif = netdev_priv(netdev);
 
-	/* Tx uses Rx interrupt */
-	coalesce->tx_coalesce_usecs = lif->rx_coalesce_usecs;
+	coalesce->tx_coalesce_usecs = lif->tx_coalesce_usecs;
 	coalesce->rx_coalesce_usecs = lif->rx_coalesce_usecs;
 
 	return 0;
@@ -417,7 +416,8 @@ static int ionic_set_coalesce(struct net_device *netdev,
 	struct ionic_identity *ident;
 	struct ionic_qcq *qcq;
 	unsigned int i;
-	u32 coal;
+	u32 rx_coal;
+	u32 tx_coal;
 
 	ident = &lif->ionic->ident;
 	if (ident->dev.intr_coal_div == 0) {
@@ -426,26 +426,31 @@ static int ionic_set_coalesce(struct net_device *netdev,
 		return -EIO;
 	}
 
-	/* Tx uses Rx interrupt, so only change Rx */
-	if (coalesce->tx_coalesce_usecs != lif->rx_coalesce_usecs) {
+	/* Tx normally shares Rx interrupt, so only change Rx */
+	if (!test_bit(IONIC_LIF_F_SPLIT_INTR, lif->state) &&
+	    coalesce->tx_coalesce_usecs != lif->rx_coalesce_usecs) {
 		netdev_warn(netdev, "only the rx-usecs can be changed\n");
 		return -EINVAL;
 	}
 
-	/* Convert the usec request to a HW useable value.  If they asked
+	/* Convert the usec request to a HW usable value.  If they asked
 	 * for non-zero and it resolved to zero, bump it up
 	 */
-	coal = ionic_coal_usec_to_hw(lif->ionic, coalesce->rx_coalesce_usecs);
-	if (!coal && coalesce->rx_coalesce_usecs)
-		coal = 1;
-
-	if (coal > IONIC_INTR_CTRL_COAL_MAX)
+	rx_coal = ionic_coal_usec_to_hw(lif->ionic, coalesce->rx_coalesce_usecs);
+	if (!rx_coal && coalesce->rx_coalesce_usecs)
+		rx_coal = 1;
+	tx_coal = ionic_coal_usec_to_hw(lif->ionic, coalesce->tx_coalesce_usecs);
+	if (!tx_coal && coalesce->tx_coalesce_usecs)
+		tx_coal = 1;
+
+	if (rx_coal > IONIC_INTR_CTRL_COAL_MAX ||
+	    tx_coal > IONIC_INTR_CTRL_COAL_MAX)
 		return -ERANGE;
 
-	/* Save the new value */
+	/* Save the new values */
 	lif->rx_coalesce_usecs = coalesce->rx_coalesce_usecs;
-	if (coal != lif->rx_coalesce_hw) {
-		lif->rx_coalesce_hw = coal;
+	if (rx_coal != lif->rx_coalesce_hw) {
+		lif->rx_coalesce_hw = rx_coal;
 
 		if (test_bit(IONIC_LIF_F_UP, lif->state)) {
 			for (i = 0; i < lif->nxqs; i++) {
@@ -457,6 +462,23 @@ static int ionic_set_coalesce(struct net_device *netdev,
 		}
 	}
 
+	if (test_bit(IONIC_LIF_F_SPLIT_INTR, lif->state))
+		lif->tx_coalesce_usecs = coalesce->tx_coalesce_usecs;
+	else
+		lif->tx_coalesce_usecs = coalesce->rx_coalesce_usecs;
+	if (tx_coal != lif->tx_coalesce_hw) {
+		lif->tx_coalesce_hw = tx_coal;
+
+		if (test_bit(IONIC_LIF_F_UP, lif->state)) {
+			for (i = 0; i < lif->nxqs; i++) {
+				qcq = lif->txqcqs[i].qcq;
+				ionic_intr_coal_init(lif->ionic->idev.intr_ctrl,
+						     qcq->intr.index,
+						     lif->tx_coalesce_hw);
+			}
+		}
+	}
+
 	return 0;
 }
 
@@ -510,29 +532,63 @@ static void ionic_get_channels(struct net_device *netdev,
 
 	/* report maximum channels */
 	ch->max_combined = lif->ionic->ntxqs_per_lif;
+	ch->max_rx = lif->ionic->ntxqs_per_lif / 2;
+	ch->max_tx = lif->ionic->ntxqs_per_lif / 2;
 
 	/* report current channels */
-	ch->combined_count = lif->nxqs;
+	if (test_bit(IONIC_LIF_F_SPLIT_INTR, lif->state)) {
+		ch->rx_count = lif->nxqs;
+		ch->tx_count = lif->nxqs;
+	} else {
+		ch->combined_count = lif->nxqs;
+	}
 }
 
 static void ionic_set_queuecount(struct ionic_lif *lif, void *arg)
 {
 	struct ethtool_channels *ch = arg;
 
-	lif->nxqs = ch->combined_count;
+	if (ch->combined_count) {
+		lif->nxqs = ch->combined_count;
+		if (test_bit(IONIC_LIF_F_SPLIT_INTR, lif->state)) {
+			clear_bit(IONIC_LIF_F_SPLIT_INTR, lif->state);
+			lif->tx_coalesce_usecs = lif->rx_coalesce_usecs;
+			lif->tx_coalesce_hw = lif->rx_coalesce_hw;
+			netdev_info(lif->netdev, "Sharing queue interrupts\n");
+		}
+	} else {
+		lif->nxqs = ch->rx_count;
+		if (!test_bit(IONIC_LIF_F_SPLIT_INTR, lif->state)) {
+			set_bit(IONIC_LIF_F_SPLIT_INTR, lif->state);
+			netdev_info(lif->netdev, "Splitting queue interrupts\n");
+		}
+	}
 }
 
 static int ionic_set_channels(struct net_device *netdev,
 			      struct ethtool_channels *ch)
 {
 	struct ionic_lif *lif = netdev_priv(netdev);
+	int new_cnt;
 
-	if (!ch->combined_count || ch->other_count ||
-	    ch->rx_count || ch->tx_count)
+	if (ch->rx_count != ch->tx_count) {
+		netdev_info(netdev, "The rx and tx count must be equal\n");
 		return -EINVAL;
+	}
 
-	if (ch->combined_count == lif->nxqs)
-		return 0;
+	if (ch->combined_count && ch->rx_count) {
+		netdev_info(netdev, "Use either combined_count or rx/tx_count, not both\n");
+		return -EINVAL;
+	}
+
+	if (ch->combined_count)
+		new_cnt = ch->combined_count;
+	else
+		new_cnt = ch->rx_count;
+
+	if (lif->nxqs != new_cnt)
+		netdev_info(netdev, "Changing queue count from %d to %d\n",
+			    lif->nxqs, new_cnt);
 
 	return ionic_reset_queues(lif, ionic_set_queuecount, ch);
 }
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.c b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
index b228362363ca..28868d343540 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.c
@@ -616,7 +616,6 @@ static int ionic_lif_txq_init(struct ionic_lif *lif, struct ionic_qcq *qcq)
 			.index = cpu_to_le32(q->index),
 			.flags = cpu_to_le16(IONIC_QINIT_F_IRQ |
 					     IONIC_QINIT_F_SG),
-			.intr_index = cpu_to_le16(lif->rxqcqs[q->index].qcq->intr.index),
 			.pid = cpu_to_le16(q->pid),
 			.ring_size = ilog2(q->num_descs),
 			.ring_base = cpu_to_le64(q->base_pa),
@@ -624,14 +623,22 @@ static int ionic_lif_txq_init(struct ionic_lif *lif, struct ionic_qcq *qcq)
 			.sg_ring_base = cpu_to_le64(q->sg_base_pa),
 		},
 	};
+	unsigned int intr_index;
 	int err;
 
+	if (test_bit(IONIC_LIF_F_SPLIT_INTR, lif->state))
+		intr_index = qcq->intr.index;
+	else
+		intr_index = lif->rxqcqs[q->index].qcq->intr.index;
+	ctx.cmd.q_init.intr_index = cpu_to_le16(intr_index);
+
 	dev_dbg(dev, "txq_init.pid %d\n", ctx.cmd.q_init.pid);
 	dev_dbg(dev, "txq_init.index %d\n", ctx.cmd.q_init.index);
 	dev_dbg(dev, "txq_init.ring_base 0x%llx\n", ctx.cmd.q_init.ring_base);
 	dev_dbg(dev, "txq_init.ring_size %d\n", ctx.cmd.q_init.ring_size);
 	dev_dbg(dev, "txq_init.flags 0x%x\n", ctx.cmd.q_init.flags);
 	dev_dbg(dev, "txq_init.ver %d\n", ctx.cmd.q_init.ver);
+	dev_dbg(dev, "txq_init.intr_index %d\n", ctx.cmd.q_init.intr_index);
 
 	q->tail = q->info;
 	q->head = q->tail;
@@ -648,6 +655,10 @@ static int ionic_lif_txq_init(struct ionic_lif *lif, struct ionic_qcq *qcq)
 	dev_dbg(dev, "txq->hw_type %d\n", q->hw_type);
 	dev_dbg(dev, "txq->hw_index %d\n", q->hw_index);
 
+	if (test_bit(IONIC_LIF_F_SPLIT_INTR, lif->state))
+		netif_napi_add(lif->netdev, &qcq->napi, ionic_tx_napi,
+			       NAPI_POLL_WEIGHT);
+
 	qcq->flags |= IONIC_QCQ_F_INITED;
 
 	return 0;
@@ -684,6 +695,7 @@ static int ionic_lif_rxq_init(struct ionic_lif *lif, struct ionic_qcq *qcq)
 	dev_dbg(dev, "rxq_init.ring_size %d\n", ctx.cmd.q_init.ring_size);
 	dev_dbg(dev, "rxq_init.flags 0x%x\n", ctx.cmd.q_init.flags);
 	dev_dbg(dev, "rxq_init.ver %d\n", ctx.cmd.q_init.ver);
+	dev_dbg(dev, "rxq_init.intr_index %d\n", ctx.cmd.q_init.intr_index);
 
 	q->tail = q->info;
 	q->head = q->tail;
@@ -700,8 +712,12 @@ static int ionic_lif_rxq_init(struct ionic_lif *lif, struct ionic_qcq *qcq)
 	dev_dbg(dev, "rxq->hw_type %d\n", q->hw_type);
 	dev_dbg(dev, "rxq->hw_index %d\n", q->hw_index);
 
-	netif_napi_add(lif->netdev, &qcq->napi, ionic_rx_napi,
-		       NAPI_POLL_WEIGHT);
+	if (test_bit(IONIC_LIF_F_SPLIT_INTR, lif->state))
+		netif_napi_add(lif->netdev, &qcq->napi, ionic_rx_napi,
+			       NAPI_POLL_WEIGHT);
+	else
+		netif_napi_add(lif->netdev, &qcq->napi, ionic_txrx_napi,
+			       NAPI_POLL_WEIGHT);
 
 	qcq->flags |= IONIC_QCQ_F_INITED;
 
@@ -1537,6 +1553,8 @@ static int ionic_txrx_alloc(struct ionic_lif *lif)
 		sg_desc_sz = sizeof(struct ionic_txq_sg_desc);
 
 	flags = IONIC_QCQ_F_TX_STATS | IONIC_QCQ_F_SG;
+	if (test_bit(IONIC_LIF_F_SPLIT_INTR, lif->state))
+		flags |= IONIC_QCQ_F_INTR;
 	for (i = 0; i < lif->nxqs; i++) {
 		err = ionic_qcq_alloc(lif, IONIC_QTYPE_TXQ, i, "tx", flags,
 				      lif->ntxq_descs,
@@ -1547,6 +1565,11 @@ static int ionic_txrx_alloc(struct ionic_lif *lif)
 		if (err)
 			goto err_out;
 
+		if (flags & IONIC_QCQ_F_INTR)
+			ionic_intr_coal_init(lif->ionic->idev.intr_ctrl,
+					     lif->txqcqs[i].qcq->intr.index,
+					     lif->tx_coalesce_hw);
+
 		lif->txqcqs[i].qcq->stats = lif->txqcqs[i].stats;
 		ionic_debugfs_add_qcq(lif, lif->txqcqs[i].qcq);
 	}
@@ -1562,13 +1585,15 @@ static int ionic_txrx_alloc(struct ionic_lif *lif)
 		if (err)
 			goto err_out;
 
-		lif->rxqcqs[i].qcq->stats = lif->rxqcqs[i].stats;
-
 		ionic_intr_coal_init(lif->ionic->idev.intr_ctrl,
 				     lif->rxqcqs[i].qcq->intr.index,
 				     lif->rx_coalesce_hw);
-		ionic_link_qcq_interrupts(lif->rxqcqs[i].qcq,
-					  lif->txqcqs[i].qcq);
+
+		if (!test_bit(IONIC_LIF_F_SPLIT_INTR, lif->state))
+			ionic_link_qcq_interrupts(lif->rxqcqs[i].qcq,
+						  lif->txqcqs[i].qcq);
+
+		lif->rxqcqs[i].qcq->stats = lif->rxqcqs[i].stats;
 		ionic_debugfs_add_qcq(lif, lif->rxqcqs[i].qcq);
 	}
 
@@ -2069,6 +2094,8 @@ static struct ionic_lif *ionic_lif_alloc(struct ionic *ionic, unsigned int index
 	lif->rx_coalesce_usecs = IONIC_ITR_COAL_USEC_DEFAULT;
 	lif->rx_coalesce_hw = ionic_coal_usec_to_hw(lif->ionic,
 						    lif->rx_coalesce_usecs);
+	lif->tx_coalesce_usecs = lif->rx_coalesce_usecs;
+	lif->tx_coalesce_hw = lif->rx_coalesce_hw;
 
 	snprintf(lif->name, sizeof(lif->name), "lif%u", index);
 
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_lif.h b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
index 13f173d83970..1ee3b14c8d50 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_lif.h
+++ b/drivers/net/ethernet/pensando/ionic/ionic_lif.h
@@ -137,6 +137,7 @@ enum ionic_lif_state_flags {
 	IONIC_LIF_F_UP,
 	IONIC_LIF_F_LINK_CHECK_REQUESTED,
 	IONIC_LIF_F_FW_RESET,
+	IONIC_LIF_F_SPLIT_INTR,
 
 	/* leave this as last */
 	IONIC_LIF_F_STATE_SIZE
@@ -205,6 +206,8 @@ struct ionic_lif {
 	struct dentry *dentry;
 	u32 rx_coalesce_usecs;		/* what the user asked for */
 	u32 rx_coalesce_hw;		/* what the hw is using */
+	u32 tx_coalesce_usecs;		/* what the user asked for */
+	u32 tx_coalesce_hw;		/* what the hw is using */
 
 	struct work_struct tx_timeout_work;
 };
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_txrx.c b/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
index 766f4595895c..8107d32c2767 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
@@ -419,7 +419,74 @@ void ionic_rx_empty(struct ionic_queue *q)
 	}
 }
 
+int ionic_tx_napi(struct napi_struct *napi, int budget)
+{
+	struct ionic_qcq *qcq = napi_to_qcq(napi);
+	struct ionic_cq *cq = napi_to_cq(napi);
+	struct ionic_dev *idev;
+	struct ionic_lif *lif;
+	u32 work_done = 0;
+	u32 flags = 0;
+
+	lif = cq->bound_q->lif;
+	idev = &lif->ionic->idev;
+
+	work_done = ionic_cq_service(cq, budget,
+				     ionic_tx_service, NULL, NULL);
+
+	if (work_done < budget && napi_complete_done(napi, work_done)) {
+		flags |= IONIC_INTR_CRED_UNMASK;
+		DEBUG_STATS_INTR_REARM(cq->bound_intr);
+	}
+
+	if (work_done || flags) {
+		flags |= IONIC_INTR_CRED_RESET_COALESCE;
+		ionic_intr_credits(idev->intr_ctrl,
+				   cq->bound_intr->index,
+				   work_done, flags);
+	}
+
+	DEBUG_STATS_NAPI_POLL(qcq, work_done);
+
+	return work_done;
+}
+
 int ionic_rx_napi(struct napi_struct *napi, int budget)
+{
+	struct ionic_qcq *qcq = napi_to_qcq(napi);
+	struct ionic_cq *cq = napi_to_cq(napi);
+	struct ionic_dev *idev;
+	struct ionic_lif *lif;
+	u32 work_done = 0;
+	u32 flags = 0;
+
+	lif = cq->bound_q->lif;
+	idev = &lif->ionic->idev;
+
+	work_done = ionic_cq_service(cq, budget,
+				     ionic_rx_service, NULL, NULL);
+
+	if (work_done)
+		ionic_rx_fill(cq->bound_q);
+
+	if (work_done < budget && napi_complete_done(napi, work_done)) {
+		flags |= IONIC_INTR_CRED_UNMASK;
+		DEBUG_STATS_INTR_REARM(cq->bound_intr);
+	}
+
+	if (work_done || flags) {
+		flags |= IONIC_INTR_CRED_RESET_COALESCE;
+		ionic_intr_credits(idev->intr_ctrl,
+				   cq->bound_intr->index,
+				   work_done, flags);
+	}
+
+	DEBUG_STATS_NAPI_POLL(qcq, work_done);
+
+	return work_done;
+}
+
+int ionic_txrx_napi(struct napi_struct *napi, int budget)
 {
 	struct ionic_qcq *qcq = napi_to_qcq(napi);
 	struct ionic_cq *rxcq = napi_to_cq(napi);
diff --git a/drivers/net/ethernet/pensando/ionic/ionic_txrx.h b/drivers/net/ethernet/pensando/ionic/ionic_txrx.h
index 71973e3c35a6..a5883be0413f 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_txrx.h
+++ b/drivers/net/ethernet/pensando/ionic/ionic_txrx.h
@@ -11,6 +11,8 @@ void ionic_rx_fill(struct ionic_queue *q);
 void ionic_rx_empty(struct ionic_queue *q);
 void ionic_tx_empty(struct ionic_queue *q);
 int ionic_rx_napi(struct napi_struct *napi, int budget);
+int ionic_tx_napi(struct napi_struct *napi, int budget);
+int ionic_txrx_napi(struct napi_struct *napi, int budget);
 netdev_tx_t ionic_start_xmit(struct sk_buff *skb, struct net_device *netdev);
 
 #endif /* _IONIC_TXRX_H_ */
-- 
2.17.1


^ permalink raw reply related

* [PATCH v3 net-next 1/3] ionic: use fewer firmware doorbells on rx fill
From: Shannon Nelson @ 2020-07-31 20:15 UTC (permalink / raw)
  To: netdev, davem; +Cc: Shannon Nelson
In-Reply-To: <20200731201536.18246-1-snelson@pensando.io>

We really don't need to hit the Rx queue doorbell so many times,
we can wait to the end and cause a little less thrash.

Signed-off-by: Shannon Nelson <snelson@pensando.io>
---
 drivers/net/ethernet/pensando/ionic/ionic_txrx.c | 15 +++++----------
 1 file changed, 5 insertions(+), 10 deletions(-)

diff --git a/drivers/net/ethernet/pensando/ionic/ionic_txrx.c b/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
index 85eb8f276a37..e660cd66f9a8 100644
--- a/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
+++ b/drivers/net/ethernet/pensando/ionic/ionic_txrx.c
@@ -331,9 +331,6 @@ static void ionic_rx_page_free(struct ionic_queue *q, struct page *page,
 	__free_page(page);
 }
 
-#define IONIC_RX_RING_DOORBELL_STRIDE		((1 << 5) - 1)
-#define IONIC_RX_RING_HEAD_BUF_SZ		2048
-
 void ionic_rx_fill(struct ionic_queue *q)
 {
 	struct net_device *netdev = q->lif->netdev;
@@ -345,7 +342,6 @@ void ionic_rx_fill(struct ionic_queue *q)
 	unsigned int remain_len;
 	unsigned int seg_len;
 	unsigned int nfrags;
-	bool ring_doorbell;
 	unsigned int i, j;
 	unsigned int len;
 
@@ -360,9 +356,7 @@ void ionic_rx_fill(struct ionic_queue *q)
 		page_info = &desc_info->pages[0];
 
 		if (page_info->page) { /* recycle the buffer */
-			ring_doorbell = ((q->head->index + 1) &
-					IONIC_RX_RING_DOORBELL_STRIDE) == 0;
-			ionic_rxq_post(q, ring_doorbell, ionic_rx_clean, NULL);
+			ionic_rxq_post(q, false, ionic_rx_clean, NULL);
 			continue;
 		}
 
@@ -401,10 +395,11 @@ void ionic_rx_fill(struct ionic_queue *q)
 			page_info++;
 		}
 
-		ring_doorbell = ((q->head->index + 1) &
-				IONIC_RX_RING_DOORBELL_STRIDE) == 0;
-		ionic_rxq_post(q, ring_doorbell, ionic_rx_clean, NULL);
+		ionic_rxq_post(q, false, ionic_rx_clean, NULL);
 	}
+
+	ionic_dbell_ring(q->lif->kern_dbpage, q->hw_type,
+			 q->dbval | q->head->index);
 }
 
 static void ionic_rx_fill_cb(void *arg)
-- 
2.17.1


^ permalink raw reply related

* [PATCH v3 net-next 0/3] ionic txrx updates
From: Shannon Nelson @ 2020-07-31 20:15 UTC (permalink / raw)
  To: netdev, davem; +Cc: Shannon Nelson

These are a few patches to do some cleanup in the packet
handling and give us more flexibility in tuning performance
by allowing us to put Tx handling on separate interrupts
when it makes sense for particular traffic loads.

v3: simplified queue count change logging, removed unnecessary
    check for no count change
v2: dropped the original patch 2 for ringsize change
    changed the separated tx/rx interrupts to use ethtool -L

Shannon Nelson (3):
  ionic: use fewer firmware doorbells on rx fill
  ionic: tx separate servicing
  ionic: separate interrupt for Tx and Rx

 .../ethernet/pensando/ionic/ionic_ethtool.c   |  96 +++++++--
 .../net/ethernet/pensando/ionic/ionic_lif.c   |  42 +++-
 .../net/ethernet/pensando/ionic/ionic_lif.h   |   5 +
 .../net/ethernet/pensando/ionic/ionic_txrx.c  | 188 ++++++++++++------
 .../net/ethernet/pensando/ionic/ionic_txrx.h  |   2 +
 5 files changed, 240 insertions(+), 93 deletions(-)

-- 
2.17.1


^ permalink raw reply

* Re: [PATCH net v2 5/5] fsl/fman: fix eth hash table allocation
From: Jakub Kicinski @ 2020-07-31 20:12 UTC (permalink / raw)
  To: Florinel Iordache
  Cc: madalin.bucur, davem, netdev, Markus.Elfring, linux-kernel
In-Reply-To: <1596192562-7629-6-git-send-email-florinel.iordache@nxp.com>

On Fri, 31 Jul 2020 13:49:22 +0300 Florinel Iordache wrote:
> Fixes: 57ba4c9b56d8 ("fsl/fman: Add FMan MAC support")
> 
> Signed-off-by: Florinel Iordache <florinel.iordache@nxp.com>

Please repost without the empty lines between these tags.

^ permalink raw reply

* Re: [PATCH v4 net-next 04/10] qed: implement devlink info request
From: Jakub Kicinski @ 2020-07-31 20:04 UTC (permalink / raw)
  To: Igor Russkikh
  Cc: netdev, David S . Miller, Ariel Elior, Michal Kalderon,
	Denis Bolotin, Jiri Pirko, Alexander Lobakin, Michal Kalderon
In-Reply-To: <20200731055401.940-5-irusskikh@marvell.com>

On Fri, 31 Jul 2020 08:53:55 +0300 Igor Russkikh wrote:
> Here we return existing fw & mfw versions, we also fetch device's
> serial number.
> 
> The base device specific structure (qed_dev_info) was not directly
> available to the base driver before.
> Thus, here we create and store a private copy of this structure
> in qed_dev root object.

Please include example output of devlink info on you device.

> diff --git a/drivers/net/ethernet/qlogic/qed/qed_devlink.c b/drivers/net/ethernet/qlogic/qed/qed_devlink.c
> index a62c47c61edf..57ef2c56c884 100644
> --- a/drivers/net/ethernet/qlogic/qed/qed_devlink.c
> +++ b/drivers/net/ethernet/qlogic/qed/qed_devlink.c
> @@ -45,7 +45,55 @@ static const struct devlink_param qed_devlink_params[] = {
>  			     qed_dl_param_get, qed_dl_param_set, NULL),
>  };
>  
> -static const struct devlink_ops qed_dl_ops;
> +static int qed_devlink_info_get(struct devlink *devlink,
> +				struct devlink_info_req *req,
> +				struct netlink_ext_ack *extack)
> +{
> +	struct qed_devlink *qed_dl = devlink_priv(devlink);
> +	struct qed_dev *cdev = qed_dl->cdev;
> +	struct qed_dev_info *dev_info;
> +	char buf[100];
> +	int err;
> +
> +	dev_info = &cdev->common_dev_info;
> +
> +	err = devlink_info_driver_name_put(req, KBUILD_MODNAME);
> +	if (err)
> +		return err;
> +
> +	memcpy(buf, cdev->hwfns[0].hw_info.part_num, sizeof(cdev->hwfns[0].hw_info.part_num));
> +	buf[sizeof(cdev->hwfns[0].hw_info.part_num)] = 0;

Part number != serial number. What's the thing you're reporting here
actually identifying.

> +
> +	snprintf(buf, sizeof(buf), "%d.%d.%d.%d",
> +		 GET_MFW_FIELD(dev_info->mfw_rev, QED_MFW_VERSION_3),
> +		 GET_MFW_FIELD(dev_info->mfw_rev, QED_MFW_VERSION_2),
> +		 GET_MFW_FIELD(dev_info->mfw_rev, QED_MFW_VERSION_1),
> +		 GET_MFW_FIELD(dev_info->mfw_rev, QED_MFW_VERSION_0));
> +
> +	err = devlink_info_version_stored_put(req,
> +					      DEVLINK_INFO_VERSION_GENERIC_FW_MGMT, buf);
> +	if (err)
> +		return err;

Assuming MFW means management FW - this looks good.

> +	snprintf(buf, sizeof(buf), "%d.%d.%d.%d",
> +		 dev_info->fw_major,
> +		 dev_info->fw_minor,
> +		 dev_info->fw_rev,
> +		 dev_info->fw_eng);
> +
> +	return devlink_info_version_running_put(req,
> +						DEVLINK_INFO_VERSION_GENERIC_FW, buf);

But what's this one?

^ permalink raw reply

* Re: [PATCH net-next] Add Mellanox BlueField Gigabit Ethernet driver
From: Andrew Lunn @ 2020-07-31 19:54 UTC (permalink / raw)
  To: Asmaa Mnebhi
  Cc: David Thompson, netdev@vger.kernel.org, davem@davemloft.net,
	kuba@kernel.org, Jiri Pirko
In-Reply-To: <VI1PR05MB4110070900CF42CB3E18983EDA4E0@VI1PR05MB4110.eurprd05.prod.outlook.com>

On Fri, Jul 31, 2020 at 06:54:04PM +0000, Asmaa Mnebhi wrote:

Hi Asmaa

Please don't send HTML obfusticated emails to mailing lists.

> > +static int mlxbf_gige_mdio_read(struct mii_bus *bus, int phy_add, int
> 
> > +phy_reg) {
> 
> > +         struct mlxbf_gige *priv = bus->priv;
> 
> > +         u32 cmd;
> 
> > +         u32 ret;
> 
> > +
> 
> > +         /* If the lock is held by something else, drop the request.
> 
> > +         * If the lock is cleared, that means the busy bit was cleared.
> 
> > +         */
> 
>  
> 
> How can this happen? The mdio core has a mutex which prevents parallel access?
> 
>  
> 
> This is a HW Lock. It is an actual register. So another HW entity can be
> holding that lock and reading/changing the values in the HW registers.

You have not explains how that can happen? Is there something in the
driver i missed which takes a backdoor to read/write MDIO
transactions?

> > +         ret = mlxbf_gige_mdio_poll_bit(priv, MLXBF_GIGE_MDIO_GW_LOCK_MASK);
> 
> > +         if (ret)
> 
> > +                       return -EBUSY;
> 
>  
> 
> PHY drivers are not going to like that. They are not going to retry. What is
> likely to happen is that phylib moves into the ERROR state, and the PHY driver
> grinds to a halt.
> 
>  
> 
> This is a fairly quick HW transaction. So I don’t think it would cause and
> issue for the PHY drivers. In this case, we use the micrel KSZ9031. We haven’t
> seen issues.

So you have happy to debug hard to find and reproduce issues when it
does happen? Or would you like to spend a little bit of time now and
just prevent it happening at all?

     Andrew

^ permalink raw reply

* Re: [PATCH net-next v2 1/2] hinic: add generating mailbox random index support
From: Jakub Kicinski @ 2020-07-31 19:52 UTC (permalink / raw)
  To: Luo bin
  Cc: davem, linux-kernel, netdev, luoxianjun, yin.yinshi,
	cloud.wangxiaoyun, chiqijun
In-Reply-To: <20200731015642.17452-2-luobin9@huawei.com>

On Fri, 31 Jul 2020 09:56:41 +0800 Luo bin wrote:
> add support to generate mailbox random id of VF to ensure that
> mailbox messages PF received are from the correct VF.
> 
> Signed-off-by: Luo bin <luobin9@huawei.com>

> diff --git a/drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c b/drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c
> index 47c93f946b94..c72aa8e8bce8 100644
> --- a/drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c
> +++ b/drivers/net/ethernet/huawei/hinic/hinic_hw_mbox.c
> @@ -486,6 +486,111 @@ static void recv_mbox_handler(struct hinic_mbox_func_to_func *func_to_func,
>  	kfree(rcv_mbox_temp);
>  }
>  
> +static int set_vf_mbox_random_id(struct hinic_hwdev *hwdev, u16 func_id)
> +{
> +	struct hinic_mbox_func_to_func *func_to_func = hwdev->func_to_func;
> +	struct hinic_set_random_id rand_info = {0};
> +	u16 out_size = sizeof(rand_info);
> +	struct hinic_pfhwdev *pfhwdev;
> +	int ret;
> +
> +	pfhwdev = container_of(hwdev, struct hinic_pfhwdev, hwdev);
> +
> +	rand_info.version = HINIC_CMD_VER_FUNC_ID;
> +	rand_info.func_idx = func_id;
> +	rand_info.vf_in_pf = (u8)(func_id - hinic_glb_pf_vf_offset(hwdev->hwif));

this cast is unnecessary

> +	get_random_bytes(&rand_info.random_id, sizeof(u32));

get_random_u32()

> +
> +	func_to_func->vf_mbx_rand_id[func_id] = rand_info.random_id;
> +
> +	ret = hinic_msg_to_mgmt(&pfhwdev->pf_to_mgmt, HINIC_MOD_COMM,
> +				HINIC_MGMT_CMD_SET_VF_RANDOM_ID,
> +				&rand_info, sizeof(rand_info),
> +				&rand_info, &out_size, HINIC_MGMT_MSG_SYNC);
> +	if ((rand_info.status != HINIC_MGMT_CMD_UNSUPPORTED &&
> +	     rand_info.status) || !out_size || ret) {
> +		dev_err(&hwdev->hwif->pdev->dev, "Set VF random id failed, err: %d, status: 0x%x, out size: 0x%x\n",
> +			ret, rand_info.status, out_size);
> +		return -EIO;
> +	}
> +
> +	if (rand_info.status == HINIC_MGMT_CMD_UNSUPPORTED)
> +		return rand_info.status;
> +
> +	func_to_func->vf_mbx_old_rand_id[func_id] =
> +				func_to_func->vf_mbx_rand_id[func_id];
> +
> +	return 0;
> +}

> +static bool check_vf_mbox_random_id(struct hinic_mbox_func_to_func *func_to_func,
> +				    u8 *header)
> +{
> +	struct hinic_hwdev *hwdev = func_to_func->hwdev;
> +	struct hinic_mbox_work *mbox_work = NULL;
> +	u64 mbox_header = *((u64 *)header);
> +	u16 offset, src;
> +	u32 random_id;
> +	int vf_in_pf;
> +
> +	src = HINIC_MBOX_HEADER_GET(mbox_header, SRC_GLB_FUNC_IDX);
> +
> +	if (IS_PF_OR_PPF_SRC(src) || !func_to_func->support_vf_random)
> +		return true;
> +
> +	if (!HINIC_IS_PPF(hwdev->hwif)) {
> +		offset = hinic_glb_pf_vf_offset(hwdev->hwif);
> +		vf_in_pf = src - offset;
> +
> +		if (vf_in_pf < 1 || vf_in_pf > hwdev->nic_cap.max_vf) {
> +			dev_warn(&hwdev->hwif->pdev->dev,
> +				 "Receive vf id(0x%x) is invalid, vf id should be from 0x%x to 0x%x\n",
> +				 src, offset + 1,
> +				 hwdev->nic_cap.max_vf + offset);
> +			return false;
> +		}
> +	}
> +
> +	random_id = be32_to_cpu(*(u32 *)(header + MBOX_SEG_LEN +
> +					 MBOX_HEADER_SZ));
> +
> +	if (random_id == func_to_func->vf_mbx_rand_id[src] ||
> +	    random_id == func_to_func->vf_mbx_old_rand_id[src])

What guarantees src < MAX_FUNCTION_NUM ?

> +		return true;
> +
> +	dev_warn(&hwdev->hwif->pdev->dev,
> +		 "The mailbox random id(0x%x) of func_id(0x%x) doesn't match with pf reservation(0x%x)\n",
> +		 random_id, src, func_to_func->vf_mbx_rand_id[src]);
> +
> +	mbox_work = kzalloc(sizeof(*mbox_work), GFP_KERNEL);
> +	if (!mbox_work)
> +		return false;
> +
> +	mbox_work->func_to_func = func_to_func;
> +	mbox_work->src_func_idx = src;
> +
> +	INIT_WORK(&mbox_work->work, update_random_id_work_handler);
> +	queue_work(func_to_func->workq, &mbox_work->work);
> +
> +	return false;
> +}

> +int hinic_vf_mbox_random_id_init(struct hinic_hwdev *hwdev)
> +{
> +	u8 vf_in_pf;
> +	int err = 0;
> +
> +	if (HINIC_IS_VF(hwdev->hwif))
> +		return 0;
> +
> +	for (vf_in_pf = 1; vf_in_pf <= hwdev->nic_cap.max_vf; vf_in_pf++) {
> +		err = set_vf_mbox_random_id(hwdev, hinic_glb_pf_vf_offset
> +					    (hwdev->hwif) + vf_in_pf);

Parenthesis around hwdev->hwif not necessary

> +		if (err)
> +			break;
> +	}
> +
> +	if (err == HINIC_MGMT_CMD_UNSUPPORTED) {
> +		hwdev->func_to_func->support_vf_random = false;

So all VFs need to support the feature for it to be used?

> +		err = 0;
> +		dev_warn(&hwdev->hwif->pdev->dev, "Mgmt is unsupported to set VF%d random id\n",
> +			 vf_in_pf - 1);
> +	} else if (!err) {
> +		hwdev->func_to_func->support_vf_random = true;
> +		dev_info(&hwdev->hwif->pdev->dev, "PF Set VF random id success\n");

Is this info message really necessary?

> +	}


^ permalink raw reply

* Re: [PATCH v2 net-next 04/11] sfc_ef100: TX path for EF100 NICs
From: Jakub Kicinski @ 2020-07-31 19:39 UTC (permalink / raw)
  To: Edward Cree; +Cc: linux-net-drivers, davem, netdev
In-Reply-To: <9776704b-d4b0-7477-42ba-f82ad3d4ec48@solarflare.com>

On Fri, 31 Jul 2020 13:59:04 +0100 Edward Cree wrote:
> +static inline efx_oword_t *ef100_tx_desc(struct efx_tx_queue *tx_queue,
> +					 unsigned int index)

Does this static inline make any difference?

You know the general policy...

> +{
> +	if (likely(tx_queue->txd.buf.addr))
> +		return ((efx_oword_t *)tx_queue->txd.buf.addr) + index;
> +	else
> +		return NULL;
> +}

^ permalink raw reply

* Re: [PATCH v2 0/3]
From: Florian Fainelli @ 2020-07-31 18:46 UTC (permalink / raw)
  To: Rakesh Pillai, ath10k
  Cc: linux-wireless, linux-kernel, kvalo, davem, kuba, netdev
In-Reply-To: <1596220042-2778-1-git-send-email-pillair@codeaurora.org>

On 7/31/20 11:27 AM, Rakesh Pillai wrote:
> The history recording will be compiled only if
> ATH10K_DEBUG is enabled, and also enabled via
> the module parameter. Once the history recording
> is enabled via module parameter, it can be enabled
> or disabled runtime via debugfs.

Why not use trace prints and retrieving them via the function tracer?
This seems very ad-hoc.

> 
> ---
> Changes from v1:
> - Add module param and debugfs to enable/disable history recording.
> 
> Rakesh Pillai (3):
>   ath10k: Add history for tracking certain events
>   ath10k: Add module param to enable history
>   ath10k: Add debugfs support to enable event history
> 
>  drivers/net/wireless/ath/ath10k/ce.c      |   1 +
>  drivers/net/wireless/ath/ath10k/core.c    |   3 +
>  drivers/net/wireless/ath/ath10k/core.h    |  82 ++++++++++++
>  drivers/net/wireless/ath/ath10k/debug.c   | 207 ++++++++++++++++++++++++++++++
>  drivers/net/wireless/ath/ath10k/debug.h   |  75 +++++++++++
>  drivers/net/wireless/ath/ath10k/snoc.c    |  15 ++-
>  drivers/net/wireless/ath/ath10k/wmi-tlv.c |   1 +
>  drivers/net/wireless/ath/ath10k/wmi.c     |  10 ++
>  8 files changed, 393 insertions(+), 1 deletion(-)
> 


-- 
Florian

^ permalink raw reply

* Re: [PATCH net-next] Add Mellanox BlueField Gigabit Ethernet driver
From: Andrew Lunn @ 2020-07-31 18:41 UTC (permalink / raw)
  To: David Thompson, f; +Cc: netdev, davem, kuba, jiri, Asmaa Mnebhi
In-Reply-To: <1596047355-28777-1-git-send-email-dthompson@mellanox.com>

> +static int mlxbf_gige_mdio_read(struct mii_bus *bus, int phy_add, int phy_reg)
> +{
> +	struct mlxbf_gige *priv = bus->priv;
> +	u32 cmd;
> +	u32 ret;

Here and in write, please check if it is a C45 transaction request and
return -EOPNOTSUPP.

       Andrew

^ permalink raw reply

* Re: [PATCH net-next] Add Mellanox BlueField Gigabit Ethernet driver
From: Andrew Lunn @ 2020-07-31 18:38 UTC (permalink / raw)
  To: David Thompson; +Cc: netdev, davem, kuba, jiri, Asmaa Mnebhi
In-Reply-To: <1596047355-28777-1-git-send-email-dthompson@mellanox.com>

> +config MLXBF_GIGE
> +	tristate "Mellanox Technologies BlueField Gigabit Ethernet support"
> +	depends on (ARM64 || COMPILE_TEST) && ACPI && INET
> +	select PHYLIB
> +	help
> +	  The second generation BlueField SoC from Mellanox Technologies
> +	  supports an out-of-band Gigabit Ethernet management port to the
> +	  Arm subsystem.

You might want to additionally select the PHY driver you are using.

    Andrew

^ permalink raw reply

* Re: [PATCH v2 1/3] ath10k: Add history for tracking certain events
From: Ben Greear @ 2020-07-31 18:38 UTC (permalink / raw)
  To: Rakesh Pillai, ath10k
  Cc: linux-wireless, linux-kernel, kvalo, davem, kuba, netdev
In-Reply-To: <1596220042-2778-2-git-send-email-pillair@codeaurora.org>

On 7/31/20 11:27 AM, Rakesh Pillai wrote:
> Add history for tracking the below events
> - register read
> - register write
> - IRQ trigger
> - NAPI poll
> - CE service
> - WMI cmd
> - WMI event
> - WMI tx completion
> 
> This will help in debugging any crash or any
> improper behaviour.
> 
> Tested-on: WCN3990 hw1.0 SNOC WLAN.HL.3.1-01040-QCAHLSWMTPLZ-1
> 
> Signed-off-by: Rakesh Pillai <pillair@codeaurora.org>
> ---
>   drivers/net/wireless/ath/ath10k/ce.c      |   1 +
>   drivers/net/wireless/ath/ath10k/core.h    |  74 +++++++++++++++++
>   drivers/net/wireless/ath/ath10k/debug.c   | 133 ++++++++++++++++++++++++++++++
>   drivers/net/wireless/ath/ath10k/debug.h   |  74 +++++++++++++++++
>   drivers/net/wireless/ath/ath10k/snoc.c    |  15 +++-
>   drivers/net/wireless/ath/ath10k/wmi-tlv.c |   1 +
>   drivers/net/wireless/ath/ath10k/wmi.c     |  10 +++
>   7 files changed, 307 insertions(+), 1 deletion(-)
> 

> +void ath10k_record_wmi_event(struct ath10k *ar, enum ath10k_wmi_type type,
> +			     u32 id, unsigned char *data)
> +{
> +	struct ath10k_wmi_event_entry *entry;
> +	u32 idx;
> +
> +	if (type == ATH10K_WMI_EVENT) {
> +		if (!ar->wmi_event_history.record)
> +			return;

This check above is duplicated below, add it once at top of the method
instead.

> +
> +		spin_lock_bh(&ar->wmi_event_history.hist_lock);
> +		idx = ath10k_core_get_next_idx(&ar->reg_access_history.index,
> +					       ar->wmi_event_history.max_entries);
> +		spin_unlock_bh(&ar->wmi_event_history.hist_lock);
> +		entry = &ar->wmi_event_history.record[idx];
> +	} else {
> +		if (!ar->wmi_cmd_history.record)
> +			return;
> +
> +		spin_lock_bh(&ar->wmi_cmd_history.hist_lock);
> +		idx = ath10k_core_get_next_idx(&ar->reg_access_history.index,
> +					       ar->wmi_cmd_history.max_entries);
> +		spin_unlock_bh(&ar->wmi_cmd_history.hist_lock);
> +		entry = &ar->wmi_cmd_history.record[idx];
> +	}
> +
> +	entry->timestamp = ath10k_core_get_timestamp();
> +	entry->cpu_id = smp_processor_id();
> +	entry->type = type;
> +	entry->id = id;
> +	memcpy(&entry->data, data + 4, ATH10K_WMI_DATA_LEN);
> +}
> +EXPORT_SYMBOL(ath10k_record_wmi_event);

> @@ -1660,6 +1668,11 @@ static int ath10k_snoc_probe(struct platform_device *pdev)
>   	ar->ce_priv = &ar_snoc->ce;
>   	msa_size = drv_data->msa_size;
>   
> +	ath10k_core_reg_access_history_init(ar, ATH10K_REG_ACCESS_HISTORY_MAX);
> +	ath10k_core_wmi_event_history_init(ar, ATH10K_WMI_EVENT_HISTORY_MAX);
> +	ath10k_core_wmi_cmd_history_init(ar, ATH10K_WMI_CMD_HISTORY_MAX);
> +	ath10k_core_ce_event_history_init(ar, ATH10K_CE_EVENT_HISTORY_MAX);

Maybe only enable this once user turns it on?  It sucks up a bit of memory?

> +
>   	ath10k_snoc_quirks_init(ar);
>   
>   	ret = ath10k_snoc_resource_init(ar);
> diff --git a/drivers/net/wireless/ath/ath10k/wmi-tlv.c b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
> index 932266d..9df5748 100644
> --- a/drivers/net/wireless/ath/ath10k/wmi-tlv.c
> +++ b/drivers/net/wireless/ath/ath10k/wmi-tlv.c
> @@ -627,6 +627,7 @@ static void ath10k_wmi_tlv_op_rx(struct ath10k *ar, struct sk_buff *skb)
>   	if (skb_pull(skb, sizeof(struct wmi_cmd_hdr)) == NULL)
>   		goto out;
>   
> +	ath10k_record_wmi_event(ar, ATH10K_WMI_EVENT, id, skb->data);
>   	trace_ath10k_wmi_event(ar, id, skb->data, skb->len);
>   
>   	consumed = ath10k_tm_event_wmi(ar, id, skb);
> diff --git a/drivers/net/wireless/ath/ath10k/wmi.c b/drivers/net/wireless/ath/ath10k/wmi.c
> index a81a1ab..8ebd05c 100644
> --- a/drivers/net/wireless/ath/ath10k/wmi.c
> +++ b/drivers/net/wireless/ath/ath10k/wmi.c
> @@ -1802,6 +1802,15 @@ struct sk_buff *ath10k_wmi_alloc_skb(struct ath10k *ar, u32 len)
>   
>   static void ath10k_wmi_htc_tx_complete(struct ath10k *ar, struct sk_buff *skb)
>   {
> +	struct wmi_cmd_hdr *cmd_hdr;
> +	enum wmi_tlv_event_id id;
> +
> +	cmd_hdr = (struct wmi_cmd_hdr *)skb->data;
> +	id = MS(__le32_to_cpu(cmd_hdr->cmd_id), WMI_CMD_HDR_CMD_ID);
> +
> +	ath10k_record_wmi_event(ar, ATH10K_WMI_TX_COMPL, id,
> +				skb->data + sizeof(struct wmi_cmd_hdr));
> +
>   	dev_kfree_skb(skb);
>   }

I think guard the above new code with if (unlikely(ar->ce_event_history.record)) { ... }

All in all, I think I'd want to compile this out (while leaving other debug compiled
in) since it seems this stuff would be rarely used and it adds method calls to hot
paths.

That is a decision for Kalle though, so see what he says...

Thanks,
Ben


-- 
Ben Greear <greearb@candelatech.com>
Candela Technologies Inc  http://www.candelatech.com

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox