Netdev List
* RE: Is bug 200755 in anyone's queue??
From: Steve Zabele @ 2019-08-30  8:48 UTC (permalink / raw)
  To: 'Willem de Bruijn'
  Cc: 'Network Development', shum, vladimir116, saifi.khan,
	saifi.khan, 'Daniel Borkmann', on2k16nm,
	'Stephen Hemminger', mark.keaton
In-Reply-To: <CA+FuTSdu5inPWp_jkUcFnb-Fs-rdk0AMiieCYtjLE7Qs5oFWZQ@mail.gmail.com>

Hi Willem!

**Thank you** for the reply and the code segment, very much appreciated.

Can we expect this to make its way into a near-term official release of the kernel? Our customers are really not up to patching and rebuilding kernels, and a patched kernel is "tainted" from a security perspective. Also, whenever there is a new release of the kernel (you come in one morning and your kernel has been magically upgraded for you because you forgot to disable auto updates), you need to rebuild and hope that the previous patch is still good for the new code, etc.

Getting this onto the main branch as part of the official release cycle would be greatly appreciated!

Note that an eBPF approach can't solve this problem (we know because we tried for quite a while to make it work, with no luck). The key issue is that when the eBPF filter gets the packet buffer reference, it points to the start of the UDP portion of the packet, so the filter cannot access the IP source address, which is earlier in the buffer. Plus, every time a socket is opened or closed, a new eBPF program has to be created and inserted -- and there is really no good way to figure out which index is (now) associated with which file descriptor.

So thank you and the group for your attention to this.

With respect to your comment

>SO_REUSEPORT was not intended to be used in this way. Opening
>multiple connected sockets with the same local port.

I'd like to offer that there are a number of reliable transport protocols (alternatives to TCP) that run over UDP. NORM (IETF RFC 5740) and Google's new QUIC protocol (https://www.ietf.org/blog/whats-happening-quic) are good examples.

Now consider that users of these protocols will want to create servers using these protocols -- a webserver is a good example. In fact Google has one running on QUIC, and many Chrome users don't even know they are using QUIC when they access Google webservers.

With a client-server model, clients contact the server at a well known server address and port. Upon first contact from a new client, the server opens another socket with the same local address and port and "connects" to the client's address and ephemeral port so that only traffic for the given five tuple arrives on the new file descriptor -- this allows the server application to keep concurrent sessions with different clients cleanly separated, even though all sessions use the same local server port. In fact, reusing the same port for different sessions is really important from a firewalling perspective.

This is pretty much what our application does, i.e., it uses different sockets/file descriptors to keep sessions straight.

And if it's worth anything, we have been using this mechanism with UDP for a *very* long time; the change in behavior appears to have happened with the 4.5 kernel.

So **thank you**!!

Steve

-----Original Message-----
From: Willem de Bruijn [mailto:willemdebruijn.kernel@gmail.com] 
Sent: Thursday, August 29, 2019 3:27 PM
To: Steve Zabele
Cc: Network Development; shum@canndrew.org; vladimir116@gmail.com; saifi.khan@datasynergy.org; saifi.khan@strikr.in; Daniel Borkmann; on2k16nm@gmail.com; Stephen Hemminger
Subject: Re: Is bug 200755 in anyone's queue??

On Fri, Aug 23, 2019 at 3:11 PM Steve Zabele <zabele@comcast.net> wrote:
>
> Hi folks,
>
> Is there a way to find out where the SO_REUSEPORT bug reported a year ago in
> August (and apparently has been a bug with kernels later than 4.4) is being
> addressed?
>
> The bug characteristics, simple standalone test code demonstrating the bug,
> and an assessment of the likely location/cause of the bug within the kernel
> are all described here
>
> https://bugzilla.kernel.org/show_bug.cgi?id=200755
>
> I'm really hoping this gets fixed so we can move forward on updating our
> kernels/Ubuntu release from our aging 4.4/16.04 release
>
> Thanks!
>
> Steve
>
>
>
> -----Original Message-----
> From: Stephen Hemminger [mailto:stephen@networkplumber.org]
> Sent: Tuesday, July 16, 2019 10:03 AM
> To: Steve Zabele
> Cc: shum@canndrew.org; vladimir116@gmail.com; saifi.khan@DataSynergy.org;
> saifi.khan@strikr.in; daniel@iogearbox.net; on2k16nm@gmail.com
> Subject: Re: Is bug 200755 in anyone's queue??
>
> On Tue, 16 Jul 2019 09:43:24 -0400
> "Steve Zabele" <zabele@comcast.net> wrote:
>
>
> > I came across bug report 200755 trying to figure out why some code I had
> > provided to customers a while ago no longer works with the current Linux
> > kernel. See
> >
> > https://bugzilla.kernel.org/show_bug.cgi?id=200755
> >
> > I've verified that, as reported, 'connect' no longer works for UDP.
> > Moreover, it appears it has been broken since the 4.5 kernel has been
> > released.
> >
> >
> >
> > It does also appear that the intended new feature of doing round robin
> > assignments to different UDP sockets opened with SO_REUSEPORT also does
> not
> > work as described.
> >
> >
> >
> > Since the original bug report was made nearly a year ago for the 4.14
> kernel
> > (and the bug is also still present in the 4.15 kernel) I'm curious if
> anyone
> > is on the hook to get this fixed any time soon.
> >
> >
> >
> > I'd rather not have to do my own demultiplexing using a single socket in
> > user space to work around what is clearly a (maybe not so recently
> > introduced) kernel bug if at all possible. My code had worked just fine on
> > 3.X kernels, and appears to work okay up through 4.4.
> >
>
> Kernel developers do not use bugzilla, I forward bug reports
> to netdev@vger.kernel.org (after filtering).

SO_REUSEPORT was not intended to be used in this way. Opening
multiple connected sockets with the same local port.

But since the interface allowed connect after joining a group, and
that is being used, I guess that point is moot. Still, I'm a bit
surprised that it ever worked as described.

Also note that the default distribution algorithm is not round robin
assignment, but hash based. So multiple consecutive datagrams arriving
at the same socket is not unexpected.
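
To illustrate why hash based selection keeps sending one flow to the same socket, here is a rough userspace sketch. The mixing function toy_flow_hash() is a made-up stand-in, not the kernel's hash, and pick_socket() is a hypothetical name; only the multiply-and-shift range mapping follows the style of the kernel's reciprocal_scale():

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a flow hash; the constants are assumptions for the sketch. */
static uint32_t toy_flow_hash(uint32_t saddr, uint16_t sport,
                              uint32_t daddr, uint16_t dport)
{
    uint32_t h = saddr ^ (daddr * 2654435761u);

    h ^= ((uint32_t)sport << 16) | dport;
    h ^= h >> 16;
    h *= 0x85ebca6bu;
    h ^= h >> 13;
    return h;
}

/* Map a 32-bit hash onto [0, n) with a multiply and shift instead of a modulo. */
static uint32_t scale_to_range(uint32_t hash, uint32_t n)
{
    return (uint32_t)(((uint64_t)hash * n) >> 32);
}

/* Pick a socket index for a datagram: same 4-tuple, same index, every time. */
static uint32_t pick_socket(uint32_t saddr, uint16_t sport,
                            uint32_t daddr, uint16_t dport,
                            uint32_t num_socks)
{
    assert(num_socks > 0);
    return scale_to_range(toy_flow_hash(saddr, sport, daddr, dport),
                          num_socks);
}
```

Because the index depends only on the addresses and ports, consecutive datagrams from one client always map to the same group member, while different clients may or may not collide on one socket.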

I suspect that this quick hack might "work". It seemed to on the
supplied .c file:

                  score = compute_score(sk, net, saddr, sport,
                                        daddr, hnum, dif, sdif);
                  if (score > badness) {
  -                       if (sk->sk_reuseport) {
  +                       if (sk->sk_reuseport && sk->sk_state != TCP_ESTABLISHED) {

But a more robust approach, that also works on existing kernels, is to
swap the default distribution algorithm with a custom BPF based one (
SO_ATTACH_REUSEPORT_EBPF).


* linux-next: manual merge of the staging tree with the net-next and usb trees
From: Stephen Rothwell @ 2019-08-30  8:34 UTC (permalink / raw)
  To: Greg KH, David Miller, Networking
  Cc: Linux Next Mailing List, Linux Kernel Mailing List,
	Benjamin Poirier, Valdis Klētnieks, Sasha Levin

Hi all,

Today's linux-next merge of the staging tree got conflicts in:

  drivers/staging/Kconfig
  drivers/staging/Makefile

between commits:

  955315b0dc8c ("qlge: Move drivers/net/ethernet/qlogic/qlge/ to drivers/staging/qlge/")
  71ed79b0e4be ("USB: Move wusbcore and UWB to staging as it is obsolete")

from the net-next and usb trees and commit:

  c48c9f7ff32b ("staging: exfat: add exfat filesystem code to staging")

from the staging tree.

I fixed it up (see below) and can carry the fix as necessary. This
is now fixed as far as linux-next is concerned, but any non trivial
conflicts should be mentioned to your upstream maintainer when your tree
is submitted for merging.  You may also want to consider cooperating
with the maintainer of the conflicting tree to minimise any particularly
complex conflicts.

-- 
Cheers,
Stephen Rothwell

diff --cc drivers/staging/Kconfig
index fc1420f2a949,fbdc33874780..000000000000
--- a/drivers/staging/Kconfig
+++ b/drivers/staging/Kconfig
@@@ -120,9 -118,6 +118,11 @@@ source "drivers/staging/kpc2000/Kconfig
  
  source "drivers/staging/isdn/Kconfig"
  
 +source "drivers/staging/qlge/Kconfig"
 +
 +source "drivers/staging/wusbcore/Kconfig"
 +source "drivers/staging/uwb/Kconfig"
 +
+ source "drivers/staging/exfat/Kconfig"
+ 
  endif # STAGING
diff --cc drivers/staging/Makefile
index b08ab677e49b,ca13f87b1e1b..000000000000
--- a/drivers/staging/Makefile
+++ b/drivers/staging/Makefile
@@@ -49,7 -49,4 +49,7 @@@ obj-$(CONFIG_XIL_AXIS_FIFO)	+= axis-fif
  obj-$(CONFIG_FIELDBUS_DEV)     += fieldbus/
  obj-$(CONFIG_KPC2000)		+= kpc2000/
  obj-$(CONFIG_ISDN_CAPI)		+= isdn/
 +obj-$(CONFIG_QLGE)		+= qlge/
 +obj-$(CONFIG_UWB)		+= uwb/
 +obj-$(CONFIG_USB_WUSB)		+= wusbcore/
+ obj-$(CONFIG_EXFAT_FS)		+= exfat/


* [PATCH 0/1] Fix deadlock problem and make performance better
From: Zhu Yanjun @ 2019-08-30  8:35 UTC (permalink / raw)
  To: yanjun.zhu, netdev, davem, nan.1986san

When running with about 1Gbit/sec for a very long time, running ifconfig
and netstat causes deadlocks. These symptoms are similar to those fixed
by commit 5f6b4e14cada ("net: dsa: User per-cpu 64-bit statistics"). After
replacing the network device statistics with per-cpu 64-bit statistics,
the deadlocks disappear even after running at 1Gbit/sec for a very long
time.

Zhu Yanjun (1):
  forcedeth: use per cpu to collect xmit/recv statistics

 drivers/net/ethernet/nvidia/forcedeth.c | 132 +++++++++++++++++++++-----------
 1 file changed, 88 insertions(+), 44 deletions(-)

-- 
2.7.4


* [PATCH 1/1] forcedeth: use per cpu to collect xmit/recv statistics
From: Zhu Yanjun @ 2019-08-30  8:35 UTC (permalink / raw)
  To: yanjun.zhu, netdev, davem, nan.1986san
In-Reply-To: <1567154111-23315-1-git-send-email-yanjun.zhu@oracle.com>

When testing with a background iperf pushing 1Gbit/sec traffic and running
both ifconfig and netstat to collect statistics, some deadlocks occurred.

Ifconfig and netstat call nv_get_stats64 to get the software xmit/recv
statistics. Commit f5d827aece36 ("forcedeth: implement
ndo_get_stats64() API") uses normal tx/rx variables to collect the tx/rx
statistics. The fix is to replace these normal tx/rx variables with per
cpu 64-bit variables. The per cpu variables avoid the deadlocks and
provide fast, efficient statistics updates.

In nv_probe, the per cpu variable is allocated. In nv_remove, it is
freed.

In the xmit/recv paths, the per cpu variable is updated.

In nv_get_stats64, the per cpu values from each cpu are added up, which
gives the driver its xmit/recv packet statistics.
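
As a rough userspace illustration of the idea (not driver code: NCPU, account_rx(), account_tx() and sum_stats() are hypothetical names, and the u64_stats_sync protection used by the driver is left out), each cpu updates only its own slot and the reader sums all slots:

```c
#include <assert.h>
#include <stdint.h>

#define NCPU 4 /* assumed cpu count for this sketch */

struct txrx_stats {
    uint64_t rx_packets;
    uint64_t rx_bytes;
    uint64_t tx_packets;
    uint64_t tx_bytes;
};

/* One slot per cpu; a cpu only ever writes to its own slot. */
static struct txrx_stats percpu_stats[NCPU];

static void account_rx(unsigned int cpu, uint64_t len)
{
    assert(cpu < NCPU);
    percpu_stats[cpu].rx_packets++;
    percpu_stats[cpu].rx_bytes += len;
}

static void account_tx(unsigned int cpu, uint64_t len)
{
    assert(cpu < NCPU);
    percpu_stats[cpu].tx_packets++;
    percpu_stats[cpu].tx_bytes += len;
}

/* Reader side: add up every cpu's slot to get the device totals. */
static struct txrx_stats sum_stats(void)
{
    struct txrx_stats total = {0};
    unsigned int cpu;

    for (cpu = 0; cpu < NCPU; cpu++) {
        total.rx_packets += percpu_stats[cpu].rx_packets;
        total.rx_bytes   += percpu_stats[cpu].rx_bytes;
        total.tx_packets += percpu_stats[cpu].tx_packets;
        total.tx_bytes   += percpu_stats[cpu].tx_bytes;
    }
    return total;
}
```
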

A test ran for several days with this commit applied; the deadlocks
disappeared and the performance was better.

Tested:
	- iperf SMP x86_64 ->
	Client connecting to 1.1.1.108, TCP port 5001
	TCP window size: 85.0 KByte (default)
	------------------------------------------------------------
	[  3] local 1.1.1.105 port 38888 connected with 1.1.1.108 port 5001
	[ ID] Interval       Transfer     Bandwidth
	[  3]  0.0-10.0 sec  1.10 GBytes   943 Mbits/sec

	ifconfig results:

	enp0s9    Link encap:Ethernet  HWaddr 00:21:28:6f:de:0f
		  inet addr:1.1.1.105  Bcast:0.0.0.0  Mask:255.255.255.0
		  UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
		  RX packets:5774764531 errors:0 dropped:0 overruns:0 frame:0
		  TX packets:633534193 errors:0 dropped:0 overruns:0 carrier:0
		  collisions:0 txqueuelen:1000
		  RX bytes:7646159340904 (7.6 TB) TX bytes:11425340407722 (11.4 TB)

	netstat results:

	Kernel Interface table
	Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
	...
	enp0s9 1500 0  5774764531 0    0 0      633534193      0      0  0 BMRU
	...

Fixes: f5d827aece36 ("forcedeth: implement ndo_get_stats64() API")
CC: Joe Jin <joe.jin@oracle.com>
CC: JUNXIAO_BI <junxiao.bi@oracle.com>
Reported-and-tested-by: Nan san <nan.1986san@gmail.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
---
 drivers/net/ethernet/nvidia/forcedeth.c | 132 +++++++++++++++++++++-----------
 1 file changed, 88 insertions(+), 44 deletions(-)

diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index b327b29..ee8bb9d 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -713,6 +713,21 @@ struct nv_skb_map {
 	struct nv_skb_map *next_tx_ctx;
 };
 
+struct nv_txrx_stats {
+	u64 stat_rx_packets;
+	u64 stat_rx_bytes; /* not always available in HW */
+	u64 stat_rx_missed_errors;
+	u64 stat_rx_dropped;
+	u64 stat_tx_packets; /* not always available in HW */
+	u64 stat_tx_bytes;
+	u64 stat_tx_dropped;
+};
+
+#define nv_txrx_stats_inc(member) \
+		__this_cpu_inc(np->txrx_stats->member)
+#define nv_txrx_stats_add(member, count) \
+		__this_cpu_add(np->txrx_stats->member, (count))
+
 /*
  * SMP locking:
  * All hardware access under netdev_priv(dev)->lock, except the performance
@@ -797,10 +812,7 @@ struct fe_priv {
 
 	/* RX software stats */
 	struct u64_stats_sync swstats_rx_syncp;
-	u64 stat_rx_packets;
-	u64 stat_rx_bytes; /* not always available in HW */
-	u64 stat_rx_missed_errors;
-	u64 stat_rx_dropped;
+	struct nv_txrx_stats __percpu *txrx_stats;
 
 	/* media detection workaround.
 	 * Locking: Within irq hander or disable_irq+spin_lock(&np->lock);
@@ -826,9 +838,6 @@ struct fe_priv {
 
 	/* TX software stats */
 	struct u64_stats_sync swstats_tx_syncp;
-	u64 stat_tx_packets; /* not always available in HW */
-	u64 stat_tx_bytes;
-	u64 stat_tx_dropped;
 
 	/* msi/msi-x fields */
 	u32 msi_flags;
@@ -1721,6 +1730,28 @@ static void nv_update_stats(struct net_device *dev)
 	}
 }
 
+static inline void nv_get_stats(int cpu, struct fe_priv *np,
+				struct rtnl_link_stats64 *storage)
+{
+	struct nv_txrx_stats *src = per_cpu_ptr(np->txrx_stats, cpu);
+	unsigned int syncp_start;
+
+	do {
+		syncp_start = u64_stats_fetch_begin_irq(&np->swstats_rx_syncp);
+		storage->rx_packets       += src->stat_rx_packets;
+		storage->rx_bytes         += src->stat_rx_bytes;
+		storage->rx_dropped       += src->stat_rx_dropped;
+		storage->rx_missed_errors += src->stat_rx_missed_errors;
+	} while (u64_stats_fetch_retry_irq(&np->swstats_rx_syncp, syncp_start));
+
+	do {
+		syncp_start = u64_stats_fetch_begin_irq(&np->swstats_tx_syncp);
+		storage->tx_packets += src->stat_tx_packets;
+		storage->tx_bytes   += src->stat_tx_bytes;
+		storage->tx_dropped += src->stat_tx_dropped;
+	} while (u64_stats_fetch_retry_irq(&np->swstats_tx_syncp, syncp_start));
+}
+
 /*
  * nv_get_stats64: dev->ndo_get_stats64 function
  * Get latest stats value from the nic.
@@ -1733,7 +1764,7 @@ nv_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *storage)
 	__releases(&netdev_priv(dev)->hwstats_lock)
 {
 	struct fe_priv *np = netdev_priv(dev);
-	unsigned int syncp_start;
+	int cpu;
 
 	/*
 	 * Note: because HW stats are not always available and for
@@ -1746,20 +1777,8 @@ nv_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *storage)
 	 */
 
 	/* software stats */
-	do {
-		syncp_start = u64_stats_fetch_begin_irq(&np->swstats_rx_syncp);
-		storage->rx_packets       = np->stat_rx_packets;
-		storage->rx_bytes         = np->stat_rx_bytes;
-		storage->rx_dropped       = np->stat_rx_dropped;
-		storage->rx_missed_errors = np->stat_rx_missed_errors;
-	} while (u64_stats_fetch_retry_irq(&np->swstats_rx_syncp, syncp_start));
-
-	do {
-		syncp_start = u64_stats_fetch_begin_irq(&np->swstats_tx_syncp);
-		storage->tx_packets = np->stat_tx_packets;
-		storage->tx_bytes   = np->stat_tx_bytes;
-		storage->tx_dropped = np->stat_tx_dropped;
-	} while (u64_stats_fetch_retry_irq(&np->swstats_tx_syncp, syncp_start));
+	for_each_online_cpu(cpu)
+		nv_get_stats(cpu, np, storage);
 
 	/* If the nic supports hw counters then retrieve latest values */
 	if (np->driver_data & DEV_HAS_STATISTICS_V123) {
@@ -1827,7 +1846,7 @@ static int nv_alloc_rx(struct net_device *dev)
 		} else {
 packet_dropped:
 			u64_stats_update_begin(&np->swstats_rx_syncp);
-			np->stat_rx_dropped++;
+			nv_txrx_stats_inc(stat_rx_dropped);
 			u64_stats_update_end(&np->swstats_rx_syncp);
 			return 1;
 		}
@@ -1869,7 +1888,7 @@ static int nv_alloc_rx_optimized(struct net_device *dev)
 		} else {
 packet_dropped:
 			u64_stats_update_begin(&np->swstats_rx_syncp);
-			np->stat_rx_dropped++;
+			nv_txrx_stats_inc(stat_rx_dropped);
 			u64_stats_update_end(&np->swstats_rx_syncp);
 			return 1;
 		}
@@ -2013,7 +2032,7 @@ static void nv_drain_tx(struct net_device *dev)
 		}
 		if (nv_release_txskb(np, &np->tx_skb[i])) {
 			u64_stats_update_begin(&np->swstats_tx_syncp);
-			np->stat_tx_dropped++;
+			nv_txrx_stats_inc(stat_tx_dropped);
 			u64_stats_update_end(&np->swstats_tx_syncp);
 		}
 		np->tx_skb[i].dma = 0;
@@ -2227,7 +2246,7 @@ static netdev_tx_t nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			/* on DMA mapping error - drop the packet */
 			dev_kfree_skb_any(skb);
 			u64_stats_update_begin(&np->swstats_tx_syncp);
-			np->stat_tx_dropped++;
+			nv_txrx_stats_inc(stat_tx_dropped);
 			u64_stats_update_end(&np->swstats_tx_syncp);
 			return NETDEV_TX_OK;
 		}
@@ -2273,7 +2292,7 @@ static netdev_tx_t nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
 				dev_kfree_skb_any(skb);
 				np->put_tx_ctx = start_tx_ctx;
 				u64_stats_update_begin(&np->swstats_tx_syncp);
-				np->stat_tx_dropped++;
+				nv_txrx_stats_inc(stat_tx_dropped);
 				u64_stats_update_end(&np->swstats_tx_syncp);
 				return NETDEV_TX_OK;
 			}
@@ -2384,7 +2403,7 @@ static netdev_tx_t nv_start_xmit_optimized(struct sk_buff *skb,
 			/* on DMA mapping error - drop the packet */
 			dev_kfree_skb_any(skb);
 			u64_stats_update_begin(&np->swstats_tx_syncp);
-			np->stat_tx_dropped++;
+			nv_txrx_stats_inc(stat_tx_dropped);
 			u64_stats_update_end(&np->swstats_tx_syncp);
 			return NETDEV_TX_OK;
 		}
@@ -2431,7 +2450,7 @@ static netdev_tx_t nv_start_xmit_optimized(struct sk_buff *skb,
 				dev_kfree_skb_any(skb);
 				np->put_tx_ctx = start_tx_ctx;
 				u64_stats_update_begin(&np->swstats_tx_syncp);
-				np->stat_tx_dropped++;
+				nv_txrx_stats_inc(stat_tx_dropped);
 				u64_stats_update_end(&np->swstats_tx_syncp);
 				return NETDEV_TX_OK;
 			}
@@ -2560,9 +2579,12 @@ static int nv_tx_done(struct net_device *dev, int limit)
 					    && !(flags & NV_TX_RETRYCOUNT_MASK))
 						nv_legacybackoff_reseed(dev);
 				} else {
+					unsigned int len;
+
 					u64_stats_update_begin(&np->swstats_tx_syncp);
-					np->stat_tx_packets++;
-					np->stat_tx_bytes += np->get_tx_ctx->skb->len;
+					nv_txrx_stats_inc(stat_tx_packets);
+					len = np->get_tx_ctx->skb->len;
+					nv_txrx_stats_add(stat_tx_bytes, len);
 					u64_stats_update_end(&np->swstats_tx_syncp);
 				}
 				bytes_compl += np->get_tx_ctx->skb->len;
@@ -2577,9 +2599,12 @@ static int nv_tx_done(struct net_device *dev, int limit)
 					    && !(flags & NV_TX2_RETRYCOUNT_MASK))
 						nv_legacybackoff_reseed(dev);
 				} else {
+					unsigned int len;
+
 					u64_stats_update_begin(&np->swstats_tx_syncp);
-					np->stat_tx_packets++;
-					np->stat_tx_bytes += np->get_tx_ctx->skb->len;
+					nv_txrx_stats_inc(stat_tx_packets);
+					len = np->get_tx_ctx->skb->len;
+					nv_txrx_stats_add(stat_tx_bytes, len);
 					u64_stats_update_end(&np->swstats_tx_syncp);
 				}
 				bytes_compl += np->get_tx_ctx->skb->len;
@@ -2627,9 +2652,12 @@ static int nv_tx_done_optimized(struct net_device *dev, int limit)
 						nv_legacybackoff_reseed(dev);
 				}
 			} else {
+				unsigned int len;
+
 				u64_stats_update_begin(&np->swstats_tx_syncp);
-				np->stat_tx_packets++;
-				np->stat_tx_bytes += np->get_tx_ctx->skb->len;
+				nv_txrx_stats_inc(stat_tx_packets);
+				len = np->get_tx_ctx->skb->len;
+				nv_txrx_stats_add(stat_tx_bytes, len);
 				u64_stats_update_end(&np->swstats_tx_syncp);
 			}
 
@@ -2806,6 +2834,15 @@ static int nv_getlen(struct net_device *dev, void *packet, int datalen)
 	}
 }
 
+static inline void rx_missing_handler(u32 flags, struct fe_priv *np)
+{
+	if (flags & NV_RX_MISSEDFRAME) {
+		u64_stats_update_begin(&np->swstats_rx_syncp);
+		nv_txrx_stats_inc(stat_rx_missed_errors);
+		u64_stats_update_end(&np->swstats_rx_syncp);
+	}
+}
+
 static int nv_rx_process(struct net_device *dev, int limit)
 {
 	struct fe_priv *np = netdev_priv(dev);
@@ -2848,11 +2885,7 @@ static int nv_rx_process(struct net_device *dev, int limit)
 					}
 					/* the rest are hard errors */
 					else {
-						if (flags & NV_RX_MISSEDFRAME) {
-							u64_stats_update_begin(&np->swstats_rx_syncp);
-							np->stat_rx_missed_errors++;
-							u64_stats_update_end(&np->swstats_rx_syncp);
-						}
+						rx_missing_handler(flags, np);
 						dev_kfree_skb(skb);
 						goto next_pkt;
 					}
@@ -2896,8 +2929,8 @@ static int nv_rx_process(struct net_device *dev, int limit)
 		skb->protocol = eth_type_trans(skb, dev);
 		napi_gro_receive(&np->napi, skb);
 		u64_stats_update_begin(&np->swstats_rx_syncp);
-		np->stat_rx_packets++;
-		np->stat_rx_bytes += len;
+		nv_txrx_stats_inc(stat_rx_packets);
+		nv_txrx_stats_add(stat_rx_bytes, len);
 		u64_stats_update_end(&np->swstats_rx_syncp);
 next_pkt:
 		if (unlikely(np->get_rx.orig++ == np->last_rx.orig))
@@ -2982,8 +3015,8 @@ static int nv_rx_process_optimized(struct net_device *dev, int limit)
 			}
 			napi_gro_receive(&np->napi, skb);
 			u64_stats_update_begin(&np->swstats_rx_syncp);
-			np->stat_rx_packets++;
-			np->stat_rx_bytes += len;
+			nv_txrx_stats_inc(stat_rx_packets);
+			nv_txrx_stats_add(stat_rx_bytes, len);
 			u64_stats_update_end(&np->swstats_rx_syncp);
 		} else {
 			dev_kfree_skb(skb);
@@ -5651,6 +5684,12 @@ static int nv_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 	SET_NETDEV_DEV(dev, &pci_dev->dev);
 	u64_stats_init(&np->swstats_rx_syncp);
 	u64_stats_init(&np->swstats_tx_syncp);
+	np->txrx_stats = alloc_percpu(struct nv_txrx_stats);
+	if (!np->txrx_stats) {
+		pr_err("np->txrx_stats, alloc memory error.\n");
+		err = -ENOMEM;
+		goto out_alloc_percpu;
+	}
 
 	timer_setup(&np->oom_kick, nv_do_rx_refill, 0);
 	timer_setup(&np->nic_poll, nv_do_nic_poll, 0);
@@ -6060,6 +6099,8 @@ static int nv_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 out_disable:
 	pci_disable_device(pci_dev);
 out_free:
+	free_percpu(np->txrx_stats);
+out_alloc_percpu:
 	free_netdev(dev);
 out:
 	return err;
@@ -6105,6 +6146,9 @@ static void nv_restore_mac_addr(struct pci_dev *pci_dev)
 static void nv_remove(struct pci_dev *pci_dev)
 {
 	struct net_device *dev = pci_get_drvdata(pci_dev);
+	struct fe_priv *np = netdev_priv(dev);
+
+	free_percpu(np->txrx_stats);
 
 	unregister_netdev(dev);
 
-- 
2.7.4


* [patch net-next v2] mlx5: Add missing init_net check in FIB notifier
From: Jiri Pirko @ 2019-08-30  8:25 UTC (permalink / raw)
  To: netdev; +Cc: davem, saeedm, leon, roid, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Take only FIB events that are happening in init_net into account. No other
namespaces are supported.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
v1->v2:
- no change, just cced maintainers (fat finger made me avoid them in v1)
---
 drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c b/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c
index e69766393990..5d20d615663e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c
@@ -248,6 +248,9 @@ static int mlx5_lag_fib_event(struct notifier_block *nb,
 	struct net_device *fib_dev;
 	struct fib_info *fi;
 
+	if (!net_eq(info->net, &init_net))
+		return NOTIFY_DONE;
+
 	if (info->family != AF_INET)
 		return NOTIFY_DONE;
 
-- 
2.21.0


* [patch net-next] mlx5: Add missing init_net check in FIB notifier
From: Jiri Pirko @ 2019-08-30  8:23 UTC (permalink / raw)
  To: netdev; +Cc: davem, mlxsw

From: Jiri Pirko <jiri@mellanox.com>

Take only FIB events that are happening in init_net into account. No other
namespaces are supported.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c b/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c
index e69766393990..5d20d615663e 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/lag_mp.c
@@ -248,6 +248,9 @@ static int mlx5_lag_fib_event(struct notifier_block *nb,
 	struct net_device *fib_dev;
 	struct fib_info *fi;
 
+	if (!net_eq(info->net, &init_net))
+		return NOTIFY_DONE;
+
 	if (info->family != AF_INET)
 		return NOTIFY_DONE;
 
-- 
2.21.0


* Re: [PATCH v2 net-next 05/15] net: sgi: ioc3-eth: allocate space for desc rings only once
From: Thomas Bogendoerfer @ 2019-08-30  8:09 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Ralf Baechle, Paul Burton, James Hogan, David S. Miller,
	linux-mips, linux-kernel, netdev
In-Reply-To: <20190829150504.68a04fe4@cakuba.netronome.com>

On Thu, 29 Aug 2019 15:05:04 -0700
Jakub Kicinski <jakub.kicinski@netronome.com> wrote:

> On Fri, 30 Aug 2019 00:00:58 +0200, Thomas Bogendoerfer wrote:

> > Out of curiosity does kcalloc/kmalloc_array give me the same guarantees about
> > alignment ? rx ring needs to be 4KB aligned, tx ring 16KB aligned.
> 
> I don't think so, actually, I was mostly worried you are passing
> address from get_page() into kfree() here ;) But patch 11 cures that,
> so that's good, too.

I realized that after sending my last mail. I'll fix that in v3 even
though it's just a transient bug.

Thomas.

-- 
SUSE Software Solutions Germany GmbH
HRB 247165 (AG München)
Geschäftsführer: Felix Imendörffer

* Re: [PATCH v3 1/2] net: core: Notify on changes to dev->promiscuity.
From: Jiri Pirko @ 2019-08-30  8:01 UTC (permalink / raw)
  To: David Miller
  Cc: idosch, andrew, horatiu.vultur, alexandre.belloni, UNGLinuxDriver,
	allan.nielsen, ivecera, f.fainelli, netdev, linux-kernel
In-Reply-To: <20190830.003225.292019185488425085.davem@davemloft.net>

Fri, Aug 30, 2019 at 09:32:25AM CEST, davem@davemloft.net wrote:
>From: Jiri Pirko <jiri@resnulli.us>
>Date: Fri, 30 Aug 2019 09:21:33 +0200
>
>> Fri, Aug 30, 2019 at 09:12:23AM CEST, davem@davemloft.net wrote:
>>>From: Jiri Pirko <jiri@resnulli.us>
>>>Date: Fri, 30 Aug 2019 08:36:24 +0200
>>>
>>>> The promiscuity is a way to set up the rx filter. So promisc == rx filter
>>>> off. For normal nics, where there is no hw fwd datapath,
>>>> this coincidentally means all received packets go to cpu.
>>>
>>>You cannot convince me that the HW datapath isn't a "rx filter" too, sorry.
>> 
>> If you look at it that way, then we have 2: rx_filter and hw_rx_filter.
>> The point is, those 2 are not one item, that is the point I'm trying to
>> make :/
>
>And you can turn both of them off when I ask for promiscuous mode, that's
>a detail of the device not a semantic issue.

Well, bridge asks for promiscuous mode during enslave -> hw_rx_filter off
When you want to see all traffic in tcpdump -> rx_filter off

So basically there are 2 flavours of promiscuous mode we have to somehow
distinguish between, so the driver knows what to do.

Note that turning the hw_rx_filter off is not something special to bridge.
There is a use case for this when no bridge is there, only TC filters for
example.

* Re: [PATCH v2 2/2] PTP: add support for one-shot output
From: Felipe Balbi @ 2019-08-30  8:00 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Christopher S Hall, netdev, linux-kernel, davem
In-Reply-To: <20190829172848.GC2166@localhost>


Hi,

Richard Cochran <richardcochran@gmail.com> writes:
> Adding davem onto CC...
>
> On Thu, Aug 29, 2019 at 12:58:25PM +0300, Felipe Balbi wrote:
>> diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
>> index 98ec1395544e..a407e5f76e2d 100644
>> --- a/drivers/ptp/ptp_chardev.c
>> +++ b/drivers/ptp/ptp_chardev.c
>> @@ -177,9 +177,8 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
>>  			err = -EFAULT;
>>  			break;
>>  		}
>> -		if ((req.perout.flags || req.perout.rsv[0] || req.perout.rsv[1]
>> -				|| req.perout.rsv[2] || req.perout.rsv[3])
>> -			&& cmd == PTP_PEROUT_REQUEST2) {
>> +		if ((req.perout.rsv[0] || req.perout.rsv[1] || req.perout.rsv[2]
>> +			|| req.perout.rsv[3]) && cmd == PTP_PEROUT_REQUEST2) {
>
> Please check that the reserved bits of req.perout.flags, namely
> ~PTP_PEROUT_ONE_SHOT, are clear.

Actually, we should check more. PEROUT_FEATURE_ENABLE is still valid
here, right? So are RISING and FALLING edges, no?

>
>>  			err = -EINVAL;
>>  			break;
>>  		} else if (cmd == PTP_PEROUT_REQUEST) {
>> diff --git a/include/uapi/linux/ptp_clock.h b/include/uapi/linux/ptp_clock.h
>> index 039cd62ec706..95840e5f5c53 100644
>> --- a/include/uapi/linux/ptp_clock.h
>> +++ b/include/uapi/linux/ptp_clock.h
>> @@ -67,7 +67,9 @@ struct ptp_perout_request {
>>  	struct ptp_clock_time start;  /* Absolute start time. */
>>  	struct ptp_clock_time period; /* Desired period, zero means disable. */
>>  	unsigned int index;           /* Which channel to configure. */
>> -	unsigned int flags;           /* Reserved for future use. */
>> +
>> +#define PTP_PEROUT_ONE_SHOT BIT(0)
>> +	unsigned int flags;
>
> @davem  Any CodingStyle policy on #define within a struct?  (Some
> maintainers won't allow it.)

Seems like this should be defined together with the other flags? If
that's the case, it seems like we would need EXTTS and PEROUT masks.

-- 
balbi


* Re: [PATCH v2 1/2] PTP: introduce new versions of IOCTLs
From: Felipe Balbi @ 2019-08-30  7:57 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Christopher S Hall, netdev, linux-kernel
In-Reply-To: <20190829172113.GA2166@localhost>


Hi,

Richard Cochran <richardcochran@gmail.com> writes:
>> @@ -139,11 +141,24 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
>>  		break;
>>  
>>  	case PTP_EXTTS_REQUEST:
>> +	case PTP_EXTTS_REQUEST2:
>> +		memset(&req, 0, sizeof(req));
>> +
>>  		if (copy_from_user(&req.extts, (void __user *)arg,
>>  				   sizeof(req.extts))) {
>>  			err = -EFAULT;
>>  			break;
>>  		}
>> +		if ((req.extts.flags || req.extts.rsv[0] || req.extts.rsv[1])
>> +			&& cmd == PTP_EXTTS_REQUEST2) {
>> +			err = -EINVAL;
>> +			break;
>> +		} else if (cmd == PTP_EXTTS_REQUEST) {
>> +			req.extts.flags = 0;
>
> This still isn't quite right.  Sorry that was my fault.
>
> The req.extts.flags can be (PTP_ENABLE_FEATURE | PTP_RISING_EDGE |
> PTP_FALLING_EDGE), and ENABLE is used immediately below in this case.
>
> Please #define those bits into a valid mask, and then:
>
> - for PTP_EXTTS_REQUEST2 check that ~mask is zero, and
> - for PTP_EXTTS_REQUEST clear the ~mask bits for the drivers. 
>
> Thanks again for cleaning this up!

Good point. This will actually reduce the size of patch 2.
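
A minimal standalone sketch of the two rules above, under some assumptions: the flag bit values are copied from include/uapi/linux/ptp_clock.h, while EXTTS_VALID_FLAGS and sanitize_extts_flags() are hypothetical names, and the "strict" argument stands in for the cmd == PTP_EXTTS_REQUEST2 test:

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flag bits as defined in include/uapi/linux/ptp_clock.h. */
#define PTP_ENABLE_FEATURE (1u << 0)
#define PTP_RISING_EDGE    (1u << 1)
#define PTP_FALLING_EDGE   (1u << 2)

/* Hypothetical mask name; the review only asks for "a valid mask". */
#define EXTTS_VALID_FLAGS \
    (PTP_ENABLE_FEATURE | PTP_RISING_EDGE | PTP_FALLING_EDGE)

/*
 * strict != 0 models the new ioctl: unknown bits are an error.
 * strict == 0 models the legacy ioctl: unknown bits are silently cleared
 * so drivers never see them.
 */
static int sanitize_extts_flags(unsigned int *flags, int strict)
{
    assert(flags != NULL);

    if (strict) {
        if (*flags & ~EXTTS_VALID_FLAGS)
            return -EINVAL;
        return 0;
    }

    *flags &= EXTTS_VALID_FLAGS;
    return 0;
}
```

The same shape would apply to the perout request with its own mask.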

-- 
balbi


* Re: [RESEND PATCH 0/5] Add bluetooth support for Orange Pi 3
From: Marcel Holtmann @ 2019-08-30  7:53 UTC (permalink / raw)
  To: megous
  Cc: Maxime Ripard, Chen-Yu Tsai, Rob Herring, Johan Hedberg,
	Mark Rutland, David S. Miller, netdev, devicetree, linux-kernel,
	linux-arm-kernel, linux-bluetooth
In-Reply-To: <20190823103139.17687-1-megous@megous.com>

Hi Ondrej,

> (Resend to add missing lists, sorry for the noise.)
> 
> This series implements bluetooth support for Xunlong Orange Pi 3 board.
> 
> The board uses AP6256 WiFi/BT 5.0 chip.
> 
> Summary of changes:
> 
> - add more delay to let the chip initialize
> - let the kernel detect firmware file path
> - add new compatible and update dt-bindings
> - update Orange Pi 3 / H6 DTS
> 
> Please take a look.
> 
> thank you and regards,
>  Ondrej Jirman
> 
> Ondrej Jirman (5):
>  dt-bindings: net: Add compatible for BCM4345C5 bluetooth device
>  bluetooth: bcm: Add support for loading firmware for BCM4345C5
>  bluetooth: hci_bcm: Give more time to come out of reset
>  arm64: dts: allwinner: h6: Add pin configs for uart1
>  arm64: dts: allwinner: orange-pi-3: Enable UART1 / Bluetooth
> 
> .../bindings/net/broadcom-bluetooth.txt       |  1 +
> .../dts/allwinner/sun50i-h6-orangepi-3.dts    | 19 +++++++++++++++++++
> arch/arm64/boot/dts/allwinner/sun50i-h6.dtsi  | 10 ++++++++++
> drivers/bluetooth/btbcm.c                     |  3 +++
> drivers/bluetooth/hci_bcm.c                   |  3 ++-
> 5 files changed, 35 insertions(+), 1 deletion(-)

All 5 patches have been applied to the bluetooth-next tree.

Regards

Marcel


* [PATCH v2 2/2] net: dsa: microchip: add KSZ8563 compatibility string
From: Razvan Stefanescu @ 2019-08-30  7:52 UTC (permalink / raw)
  To: Woojung Huh, Microchip Linux Driver Support, Andrew Lunn,
	Vivien Didelot, Florian Fainelli, David S . Miller
  Cc: netdev, linux-kernel, Razvan Stefanescu
In-Reply-To: <20190830075202.20740-1-razvan.stefanescu@microchip.com>

It is a 3-Port 10/100 Ethernet Switch with 1588v2 PTP.

Signed-off-by: Razvan Stefanescu <razvan.stefanescu@microchip.com>
---
Changelog
v2: no update

 drivers/net/dsa/microchip/ksz9477_spi.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/net/dsa/microchip/ksz9477_spi.c b/drivers/net/dsa/microchip/ksz9477_spi.c
index a226b389e12d..2e402e4d866f 100644
--- a/drivers/net/dsa/microchip/ksz9477_spi.c
+++ b/drivers/net/dsa/microchip/ksz9477_spi.c
@@ -80,6 +80,7 @@ static const struct of_device_id ksz9477_dt_ids[] = {
 	{ .compatible = "microchip,ksz9897" },
 	{ .compatible = "microchip,ksz9893" },
 	{ .compatible = "microchip,ksz9563" },
+	{ .compatible = "microchip,ksz8563" },
 	{},
 };
 MODULE_DEVICE_TABLE(of, ksz9477_dt_ids);
-- 
2.20.1
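
As a rough illustration of why a one-line table entry is enough here, the sketch below walks a sentinel-terminated compatible table the way a match against the device tree string conceptually works. It is a simplified userspace model: struct compat_id and compat_match() are hypothetical names, and the NULL sentinel is an assumption of this sketch (the kernel table ends with an empty entry).

```c
#include <stddef.h>
#include <string.h>

/* Simplified stand-in for the kernel's struct of_device_id. */
struct compat_id {
	const char *compatible;
};

/* Entries taken from the diff context above; the last entry ends the table. */
static const struct compat_id ksz_ids[] = {
	{ "microchip,ksz9897" },
	{ "microchip,ksz9893" },
	{ "microchip,ksz9563" },
	{ "microchip,ksz8563" },
	{ NULL },
};

/* Hypothetical helper: return the matching entry, or NULL if none matches. */
static const struct compat_id *compat_match(const struct compat_id *ids,
					    const char *compatible)
{
	for (; ids->compatible; ids++)
		if (strcmp(ids->compatible, compatible) == 0)
			return ids;

	return NULL;
}
```

A node whose compatible string is not in the table simply never matches, which is why the binding document and the driver table are updated together in this series.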


* [PATCH v2 1/2] dt-bindings: net: dsa: document additional Microchip KSZ8563 switch
From: Razvan Stefanescu @ 2019-08-30  7:52 UTC (permalink / raw)
  To: Woojung Huh, Microchip Linux Driver Support, Andrew Lunn,
	Vivien Didelot, Florian Fainelli, David S . Miller
  Cc: netdev, linux-kernel, Razvan Stefanescu
In-Reply-To: <20190830075202.20740-1-razvan.stefanescu@microchip.com>

It is a 3-Port 10/100 Ethernet Switch with 1588v2 PTP.

Signed-off-by: Razvan Stefanescu <razvan.stefanescu@microchip.com>
---
Changelog
v2: no update

 Documentation/devicetree/bindings/net/dsa/ksz.txt | 1 +
 1 file changed, 1 insertion(+)

diff --git a/Documentation/devicetree/bindings/net/dsa/ksz.txt b/Documentation/devicetree/bindings/net/dsa/ksz.txt
index 5e8429b6f9ca..95e91e84151c 100644
--- a/Documentation/devicetree/bindings/net/dsa/ksz.txt
+++ b/Documentation/devicetree/bindings/net/dsa/ksz.txt
@@ -15,6 +15,7 @@ Required properties:
   - "microchip,ksz8565"
   - "microchip,ksz9893"
   - "microchip,ksz9563"
+  - "microchip,ksz8563"
 
 Optional properties:
 
-- 
2.20.1


* [PATCH v2 0/2] net: dsa: microchip: add KSZ8563 support
From: Razvan Stefanescu @ 2019-08-30  7:52 UTC (permalink / raw)
  To: Woojung Huh, Microchip Linux Driver Support, Andrew Lunn,
	Vivien Didelot, Florian Fainelli, David S . Miller
  Cc: netdev, linux-kernel, Razvan Stefanescu

This patchset adds a compatibility string for the KSZ8563 switch.

Razvan Stefanescu (2):
  dt-bindings: net: dsa: document additional Microchip KSZ8563 switch
  net: dsa: microchip: add KSZ8563 compatibility string

 Documentation/devicetree/bindings/net/dsa/ksz.txt | 1 +
 drivers/net/dsa/microchip/ksz9477_spi.c           | 1 +
 2 files changed, 2 insertions(+)

--
Changelog:
v2: drop fix patches

2.20.1


* [PATCH net-next 4/4] qede: Add support for dumping the grc data.
From: Sudarsana Reddy Kalluru @ 2019-08-30  7:42 UTC (permalink / raw)
  To: davem; +Cc: netdev, mkalderon, aelior
In-Reply-To: <20190830074206.8836-1-skalluru@marvell.com>

This patch adds driver support for configuring grc dump config flags, and
dumping the grc data via ethtool get/set-dump interfaces.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
---
 drivers/net/ethernet/qlogic/qede/qede.h         |  1 +
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 29 +++++++++++++++++++++++--
 2 files changed, 28 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index 8f2adde..c303a92 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -181,6 +181,7 @@ enum qede_flags_bit {
 enum qede_dump_cmd {
 	QEDE_DUMP_CMD_NONE = 0,
 	QEDE_DUMP_CMD_NVM_CFG,
+	QEDE_DUMP_CMD_GRCDUMP,
 	QEDE_DUMP_CMD_MAX
 };
 
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index 2359293..ec27a43 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -2001,6 +2001,10 @@ static int qede_set_dump(struct net_device *dev, struct ethtool_dump *val)
 		edev->dump_info.args[edev->dump_info.num_args] = val->flag;
 		edev->dump_info.num_args++;
 		break;
+	case QEDE_DUMP_CMD_GRCDUMP:
+		rc = edev->ops->common->set_grc_config(edev->cdev,
+						       val->flag, 1);
+		break;
 	default:
 		break;
 	}
@@ -2013,14 +2017,24 @@ static int qede_get_dump_flag(struct net_device *dev,
 {
 	struct qede_dev *edev = netdev_priv(dev);
 
+	if (!edev->ops || !edev->ops->common) {
+		DP_ERR(edev, "Edev ops not populated\n");
+		return -EINVAL;
+	}
+
 	dump->version = QEDE_DUMP_VERSION;
 	switch (edev->dump_info.cmd) {
 	case QEDE_DUMP_CMD_NVM_CFG:
 		dump->flag = QEDE_DUMP_CMD_NVM_CFG;
 		dump->len = QEDE_DUMP_NVM_BUF_LEN;
 		break;
-	default:
+	case QEDE_DUMP_CMD_GRCDUMP:
+		dump->flag = QEDE_DUMP_CMD_GRCDUMP;
+		dump->len = edev->ops->common->dbg_all_data_size(edev->cdev);
 		break;
+	default:
+		DP_ERR(edev, "Invalid cmd = %d\n", edev->dump_info.cmd);
+		return -EINVAL;
 	}
 
 	DP_VERBOSE(edev, QED_MSG_DEBUG,
@@ -2033,7 +2047,14 @@ static int qede_get_dump_data(struct net_device *dev,
 			      struct ethtool_dump *dump, void *buf)
 {
 	struct qede_dev *edev = netdev_priv(dev);
-	int rc;
+	int rc = 0;
+
+	if (!edev->ops || !edev->ops->common) {
+		DP_ERR(edev, "Edev ops not populated\n");
+		edev->dump_info.cmd = QEDE_DUMP_CMD_NONE;
+		edev->dump_info.num_args = 0;
+		return -EINVAL;
+	}
 
 	switch (edev->dump_info.cmd) {
 	case QEDE_DUMP_CMD_NVM_CFG:
@@ -2047,6 +2068,10 @@ static int qede_get_dump_data(struct net_device *dev,
 						      edev->dump_info.args[0],
 						      edev->dump_info.args[1]);
 		break;
+	case QEDE_DUMP_CMD_GRCDUMP:
+		memset(buf, 0, dump->len);
+		rc = edev->ops->common->dbg_all_data(edev->cdev, buf);
+		break;
 	default:
 		DP_ERR(edev, "Invalid cmd = %d\n", edev->dump_info.cmd);
 		rc = -EINVAL;
-- 
1.8.3.1


* [PATCH net-next 3/4] qed: Add APIs for configuring grc dump config flags.
From: Sudarsana Reddy Kalluru @ 2019-08-30  7:42 UTC (permalink / raw)
  To: davem; +Cc: netdev, mkalderon, aelior
In-Reply-To: <20190830074206.8836-1-skalluru@marvell.com>

The patch adds driver support for configuring the grc dump config flags.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
---
 drivers/net/ethernet/qlogic/qed/qed_debug.c | 82 +++++++++++++++++++++++++++++
 drivers/net/ethernet/qlogic/qed/qed_hsi.h   | 15 ++++++
 drivers/net/ethernet/qlogic/qed/qed_main.c  | 21 ++++++++
 include/linux/qed/qed_if.h                  |  9 ++++
 4 files changed, 127 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_debug.c b/drivers/net/ethernet/qlogic/qed/qed_debug.c
index 5ea6c4f..859caa6 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_debug.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_debug.c
@@ -1756,6 +1756,15 @@ static u32 qed_read_unaligned_dword(u8 *buf)
 	return dword;
 }
 
+/* Sets the value of the specified GRC param */
+static void qed_grc_set_param(struct qed_hwfn *p_hwfn,
+			      enum dbg_grc_params grc_param, u32 val)
+{
+	struct dbg_tools_data *dev_data = &p_hwfn->dbg_info;
+
+	dev_data->grc.param_val[grc_param] = val;
+}
+
 /* Returns the value of the specified GRC param */
 static u32 qed_grc_get_param(struct qed_hwfn *p_hwfn,
 			     enum dbg_grc_params grc_param)
@@ -5119,6 +5128,69 @@ bool qed_read_fw_info(struct qed_hwfn *p_hwfn,
 	return false;
 }
 
+enum dbg_status qed_dbg_grc_config(struct qed_hwfn *p_hwfn,
+				   struct qed_ptt *p_ptt,
+				   enum dbg_grc_params grc_param, u32 val)
+{
+	enum dbg_status status;
+	int i;
+
+	DP_VERBOSE(p_hwfn, QED_MSG_DEBUG,
+		   "dbg_grc_config: paramId = %d, val = %d\n", grc_param, val);
+
+	status = qed_dbg_dev_init(p_hwfn, p_ptt);
+	if (status != DBG_STATUS_OK)
+		return status;
+
+	/* Initializes the GRC parameters (if not initialized). Needed in order
+	 * to set the default parameter values for the first time.
+	 */
+	qed_dbg_grc_init_params(p_hwfn);
+
+	if (grc_param >= MAX_DBG_GRC_PARAMS)
+		return DBG_STATUS_INVALID_ARGS;
+	if (val < s_grc_param_defs[grc_param].min ||
+	    val > s_grc_param_defs[grc_param].max)
+		return DBG_STATUS_INVALID_ARGS;
+
+	if (s_grc_param_defs[grc_param].is_preset) {
+		/* Preset param */
+
+		/* Disabling a preset is not allowed. Call
+		 * dbg_grc_set_params_default instead.
+		 */
+		if (!val)
+			return DBG_STATUS_INVALID_ARGS;
+
+		/* Update all params with the preset values */
+		for (i = 0; i < MAX_DBG_GRC_PARAMS; i++) {
+			u32 preset_val;
+
+			/* Skip persistent params */
+			if (s_grc_param_defs[i].is_persistent)
+				continue;
+
+			/* Find preset value */
+			if (grc_param == DBG_GRC_PARAM_EXCLUDE_ALL)
+				preset_val =
+				    s_grc_param_defs[i].exclude_all_preset_val;
+			else if (grc_param == DBG_GRC_PARAM_CRASH)
+				preset_val =
+				    s_grc_param_defs[i].crash_preset_val;
+			else
+				return DBG_STATUS_INVALID_ARGS;
+
+			qed_grc_set_param(p_hwfn,
+					  (enum dbg_grc_params)i, preset_val);
+		}
+	} else {
+		/* Regular param - set its value */
+		qed_grc_set_param(p_hwfn, grc_param, val);
+	}
+
+	return DBG_STATUS_OK;
+}
+
 /* Assign default GRC param values */
 void qed_dbg_grc_set_params_default(struct qed_hwfn *p_hwfn)
 {
@@ -7997,9 +8069,16 @@ static u32 qed_calc_regdump_header(enum debug_print_features feature,
 int qed_dbg_all_data(struct qed_dev *cdev, void *buffer)
 {
 	u8 cur_engine, omit_engine = 0, org_engine;
+	struct qed_hwfn *p_hwfn =
+		&cdev->hwfns[cdev->dbg_params.engine_for_debug];
+	struct dbg_tools_data *dev_data = &p_hwfn->dbg_info;
+	int grc_params[MAX_DBG_GRC_PARAMS], i;
 	u32 offset = 0, feature_size;
 	int rc;
 
+	for (i = 0; i < MAX_DBG_GRC_PARAMS; i++)
+		grc_params[i] = dev_data->grc.param_val[i];
+
 	if (cdev->num_hwfns == 1)
 		omit_engine = 1;
 
@@ -8087,6 +8166,9 @@ int qed_dbg_all_data(struct qed_dev *cdev, void *buffer)
 			       rc);
 		}
 
+		for (i = 0; i < MAX_DBG_GRC_PARAMS; i++)
+			dev_data->grc.param_val[i] = grc_params[i];
+
 		/* GRC dump - must be last because when mcp stuck it will
 		 * clutter idle_chk, reg_fifo, ...
 		 */
diff --git a/drivers/net/ethernet/qlogic/qed/qed_hsi.h b/drivers/net/ethernet/qlogic/qed/qed_hsi.h
index 557a12e..cf3ceb6 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_hsi.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_hsi.h
@@ -3024,6 +3024,21 @@ void qed_read_regs(struct qed_hwfn *p_hwfn,
  */
 bool qed_read_fw_info(struct qed_hwfn *p_hwfn,
 		      struct qed_ptt *p_ptt, struct fw_info *fw_info);
+/**
+ * @brief qed_dbg_grc_config - Sets the value of a GRC parameter.
+ *
+ * @param p_hwfn -	HW device data
+ * @param grc_param -	GRC parameter
+ * @param val -		Value to set.
+ *
+ * @return error if one of the following holds:
+ *	- the version wasn't set
+ *	- grc_param is invalid
+ *	- val is outside the allowed boundaries
+ */
+enum dbg_status qed_dbg_grc_config(struct qed_hwfn *p_hwfn,
+				   struct qed_ptt *p_ptt,
+				   enum dbg_grc_params grc_param, u32 val);
 
 /**
  * @brief qed_dbg_grc_set_params_default - Reverts all GRC parameters to their
diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
index c9a7571..ac1511a8 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -2583,6 +2583,26 @@ static int qed_read_module_eeprom(struct qed_dev *cdev, char *buf,
 	return rc;
 }
 
+static int qed_set_grc_config(struct qed_dev *cdev, u32 cfg_id, u32 val)
+{
+	struct qed_hwfn *hwfn = QED_LEADING_HWFN(cdev);
+	struct qed_ptt *ptt;
+	int rc = 0;
+
+	if (IS_VF(cdev))
+		return 0;
+
+	ptt = qed_ptt_acquire(hwfn);
+	if (!ptt)
+		return -EAGAIN;
+
+	rc = qed_dbg_grc_config(hwfn, ptt, cfg_id, val);
+
+	qed_ptt_release(hwfn, ptt);
+
+	return rc;
+}
+
 static u8 qed_get_affin_hwfn_idx(struct qed_dev *cdev)
 {
 	return QED_AFFIN_HWFN_IDX(cdev);
@@ -2637,6 +2657,7 @@ static u8 qed_get_affin_hwfn_idx(struct qed_dev *cdev)
 	.read_module_eeprom = &qed_read_module_eeprom,
 	.get_affin_hwfn_idx = &qed_get_affin_hwfn_idx,
 	.read_nvm_cfg = &qed_nvm_flash_cfg_read,
+	.set_grc_config = &qed_set_grc_config,
 };
 
 void qed_get_protocol_stats(struct qed_dev *cdev,
diff --git a/include/linux/qed/qed_if.h b/include/linux/qed/qed_if.h
index 06fd958..e354638 100644
--- a/include/linux/qed/qed_if.h
+++ b/include/linux/qed/qed_if.h
@@ -1143,6 +1143,15 @@ struct qed_common_ops {
  */
 	int (*read_nvm_cfg)(struct qed_dev *cdev, u8 **buf, u32 cmd,
 			    u32 entity_id);
+
+/**
+ * @brief set_grc_config - Configure value for grc config id.
+ * @param cdev
+ * @param cfg_id - grc config id
+ * @param val - grc config value
+ *
+ */
+	int (*set_grc_config)(struct qed_dev *cdev, u32 cfg_id, u32 val);
 };
 
 #define MASK_FIELD(_name, _value) \
-- 
1.8.3.1
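
The preset-versus-regular parameter handling in qed_dbg_grc_config() can be modelled in a few lines of plain C. The sketch below is a hedged approximation only: the parameter ids, the table contents, the -1 error return and the param_config() name are invented for illustration and do not correspond to the driver's real GRC parameter table.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical parameter ids; the real driver has many more. */
enum param_id {
	PARAM_DUMP_A,
	PARAM_DUMP_B,
	PARAM_KEEP,
	PARAM_EXCLUDE_ALL,
	PARAM_CRASH,
	PARAM_MAX
};

struct param_def {
	uint32_t min, max;
	bool is_preset;           /* setting it rewrites the other params */
	bool is_persistent;       /* never touched by a preset */
	uint32_t exclude_all_val; /* value applied by the EXCLUDE_ALL preset */
	uint32_t crash_val;       /* value applied by the CRASH preset */
};

/* Made-up table contents for the sketch. */
static const struct param_def defs[PARAM_MAX] = {
	[PARAM_DUMP_A]      = { 0, 1, false, false, 0, 1 },
	[PARAM_DUMP_B]      = { 0, 1, false, false, 0, 1 },
	[PARAM_KEEP]        = { 0, 4, false, true,  0, 0 },
	[PARAM_EXCLUDE_ALL] = { 0, 1, true,  false, 1, 0 },
	[PARAM_CRASH]       = { 0, 1, true,  false, 0, 1 },
};

/* Returns 0 on success, -1 on invalid arguments. */
static int param_config(uint32_t vals[PARAM_MAX], unsigned int id, uint32_t val)
{
	unsigned int i;

	if (id >= PARAM_MAX)
		return -1;
	if (val < defs[id].min || val > defs[id].max)
		return -1;

	if (!defs[id].is_preset) {
		/* Regular param: just store the value. */
		vals[id] = val;
		return 0;
	}

	/* A preset can only be enabled, never "disabled". */
	if (!val)
		return -1;

	/* Apply the preset to every non-persistent param. */
	for (i = 0; i < PARAM_MAX; i++) {
		if (defs[i].is_persistent)
			continue;
		vals[i] = (id == PARAM_EXCLUDE_ALL) ? defs[i].exclude_all_val
						    : defs[i].crash_val;
	}

	return 0;
}
```

The point of the persistent flag is visible in the model: selecting a preset rewrites the dump selection but leaves persistent settings as the user configured them.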


* [PATCH net-next 2/4] qede: Add support for reading the config id attributes.
From: Sudarsana Reddy Kalluru @ 2019-08-30  7:42 UTC (permalink / raw)
  To: davem; +Cc: netdev, mkalderon, aelior
In-Reply-To: <20190830074206.8836-1-skalluru@marvell.com>

Add driver support for dumping the config id attributes via ethtool dump
interfaces.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
---
 drivers/net/ethernet/qlogic/qede/qede.h         | 14 ++++
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 89 +++++++++++++++++++++++++
 2 files changed, 103 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qede/qede.h b/drivers/net/ethernet/qlogic/qede/qede.h
index 0e931c0..8f2adde 100644
--- a/drivers/net/ethernet/qlogic/qede/qede.h
+++ b/drivers/net/ethernet/qlogic/qede/qede.h
@@ -177,6 +177,19 @@ enum qede_flags_bit {
 	QEDE_FLAGS_TX_TIMESTAMPING_EN
 };
 
+#define QEDE_DUMP_MAX_ARGS 4
+enum qede_dump_cmd {
+	QEDE_DUMP_CMD_NONE = 0,
+	QEDE_DUMP_CMD_NVM_CFG,
+	QEDE_DUMP_CMD_MAX
+};
+
+struct qede_dump_info {
+	enum qede_dump_cmd cmd;
+	u8 num_args;
+	u32 args[QEDE_DUMP_MAX_ARGS];
+};
+
 struct qede_dev {
 	struct qed_dev			*cdev;
 	struct net_device		*ndev;
@@ -262,6 +275,7 @@ struct qede_dev {
 	struct qede_rdma_dev		rdma_info;
 
 	struct bpf_prog *xdp_prog;
+	struct qede_dump_info		dump_info;
 };
 
 enum QEDE_STATE {
diff --git a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
index abcee47..2359293 100644
--- a/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
+++ b/drivers/net/ethernet/qlogic/qede/qede_ethtool.c
@@ -48,6 +48,9 @@
 	 {QEDE_RQSTAT_OFFSET(stat_name), QEDE_RQSTAT_STRING(stat_name)}
 
 #define QEDE_SELFTEST_POLL_COUNT 100
+#define QEDE_DUMP_VERSION	0x1
+#define QEDE_DUMP_NVM_BUF_LEN	32
+#define QEDE_DUMP_NVM_ARG_COUNT	2
 
 static const struct {
 	u64 offset;
@@ -1973,6 +1976,89 @@ static int qede_get_module_eeprom(struct net_device *dev,
 	return rc;
 }
 
+static int qede_set_dump(struct net_device *dev, struct ethtool_dump *val)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+	int rc = 0;
+
+	if (edev->dump_info.cmd == QEDE_DUMP_CMD_NONE) {
+		if (val->flag > QEDE_DUMP_CMD_MAX) {
+			DP_ERR(edev, "Invalid command %d\n", val->flag);
+			return -EINVAL;
+		}
+		edev->dump_info.cmd = val->flag;
+		edev->dump_info.num_args = 0;
+		return 0;
+	}
+
+	if (edev->dump_info.num_args == QEDE_DUMP_MAX_ARGS) {
+		DP_ERR(edev, "Arg count = %d\n", edev->dump_info.num_args);
+		return -EINVAL;
+	}
+
+	switch (edev->dump_info.cmd) {
+	case QEDE_DUMP_CMD_NVM_CFG:
+		edev->dump_info.args[edev->dump_info.num_args] = val->flag;
+		edev->dump_info.num_args++;
+		break;
+	default:
+		break;
+	}
+
+	return rc;
+}
+
+static int qede_get_dump_flag(struct net_device *dev,
+			      struct ethtool_dump *dump)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+
+	dump->version = QEDE_DUMP_VERSION;
+	switch (edev->dump_info.cmd) {
+	case QEDE_DUMP_CMD_NVM_CFG:
+		dump->flag = QEDE_DUMP_CMD_NVM_CFG;
+		dump->len = QEDE_DUMP_NVM_BUF_LEN;
+		break;
+	default:
+		break;
+	}
+
+	DP_VERBOSE(edev, QED_MSG_DEBUG,
+		   "dump->version = 0x%x dump->flag = %d dump->len = %d\n",
+		   dump->version, dump->flag, dump->len);
+	return 0;
+}
+
+static int qede_get_dump_data(struct net_device *dev,
+			      struct ethtool_dump *dump, void *buf)
+{
+	struct qede_dev *edev = netdev_priv(dev);
+	int rc;
+
+	switch (edev->dump_info.cmd) {
+	case QEDE_DUMP_CMD_NVM_CFG:
+		if (edev->dump_info.num_args != QEDE_DUMP_NVM_ARG_COUNT) {
+			DP_ERR(edev, "Arg count = %d required = %d\n",
+			       edev->dump_info.num_args,
+			       QEDE_DUMP_NVM_ARG_COUNT);
+			return -EINVAL;
+		}
+		rc =  edev->ops->common->read_nvm_cfg(edev->cdev, (u8 **)&buf,
+						      edev->dump_info.args[0],
+						      edev->dump_info.args[1]);
+		break;
+	default:
+		DP_ERR(edev, "Invalid cmd = %d\n", edev->dump_info.cmd);
+		rc = -EINVAL;
+		break;
+	}
+
+	edev->dump_info.cmd = QEDE_DUMP_CMD_NONE;
+	edev->dump_info.num_args = 0;
+
+	return rc;
+}
+
 static const struct ethtool_ops qede_ethtool_ops = {
 	.get_link_ksettings = qede_get_link_ksettings,
 	.set_link_ksettings = qede_set_link_ksettings,
@@ -2014,6 +2100,9 @@ static int qede_get_module_eeprom(struct net_device *dev,
 	.get_tunable = qede_get_tunable,
 	.set_tunable = qede_set_tunable,
 	.flash_device = qede_flash_device,
+	.get_dump_flag = qede_get_dump_flag,
+	.get_dump_data = qede_get_dump_data,
+	.set_dump = qede_set_dump,
 };
 
 static const struct ethtool_ops qede_vf_ethtool_ops = {
-- 
1.8.3.1
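
The set_dump/get_dump_data pair in this patch behaves like a small state machine: the first "set" selects a command, later "set" calls append arguments, and the "get" consumes them. The following userspace sketch models that flow under stated assumptions. The type and function names are hypothetical, it rejects the MAX sentinel with ">=" where the patch compares with ">", and it resets the state on every "get", including the argument-count error path.

```c
#include <errno.h>
#include <stdint.h>

#define DUMP_MAX_ARGS      4
#define DUMP_NVM_ARG_COUNT 2

enum dump_cmd { DUMP_CMD_NONE = 0, DUMP_CMD_NVM_CFG, DUMP_CMD_MAX };

struct dump_info {
	enum dump_cmd cmd;
	uint8_t num_args;
	uint32_t args[DUMP_MAX_ARGS];
};

/* Each "set" call carries one value: first the command, then its args. */
static int dump_set(struct dump_info *di, uint32_t flag)
{
	if (di->cmd == DUMP_CMD_NONE) {
		/* Sketch choice: also reject the MAX sentinel itself. */
		if (flag >= DUMP_CMD_MAX)
			return -EINVAL;
		di->cmd = (enum dump_cmd)flag;
		di->num_args = 0;
		return 0;
	}

	if (di->num_args == DUMP_MAX_ARGS)
		return -EINVAL;

	di->args[di->num_args++] = flag;
	return 0;
}

/* The "get" step validates the collected args, hands them out, and
 * resets the state so the next sequence starts from scratch. */
static int dump_get(struct dump_info *di, uint32_t *cmd_arg, uint32_t *entity)
{
	int rc = 0;

	if (di->cmd != DUMP_CMD_NVM_CFG ||
	    di->num_args != DUMP_NVM_ARG_COUNT) {
		rc = -EINVAL;
	} else {
		*cmd_arg = di->args[0];
		*entity = di->args[1];
	}

	di->cmd = DUMP_CMD_NONE;
	di->num_args = 0;
	return rc;
}
```

In this model a complete read is three "set" calls (command, config id, entity id) followed by one "get".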


* [PATCH net-next 1/4] qed: Add APIs for reading config id attributes.
From: Sudarsana Reddy Kalluru @ 2019-08-30  7:42 UTC (permalink / raw)
  To: davem; +Cc: netdev, mkalderon, aelior
In-Reply-To: <20190830074206.8836-1-skalluru@marvell.com>

The patch adds driver support for reading the config id attributes from NVM
flash partition.

Signed-off-by: Sudarsana Reddy Kalluru <skalluru@marvell.com>
Signed-off-by: Ariel Elior <aelior@marvell.com>
---
 drivers/net/ethernet/qlogic/qed/qed_main.c | 27 +++++++++++++++++++++++++++
 drivers/net/ethernet/qlogic/qed/qed_mcp.c  | 29 +++++++++++++++++++++++++++++
 drivers/net/ethernet/qlogic/qed/qed_mcp.h  | 15 +++++++++++++++
 include/linux/qed/qed_if.h                 | 11 +++++++++++
 4 files changed, 82 insertions(+)

diff --git a/drivers/net/ethernet/qlogic/qed/qed_main.c b/drivers/net/ethernet/qlogic/qed/qed_main.c
index 7891f8c..c9a7571 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_main.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_main.c
@@ -69,6 +69,8 @@
 #define QED_RDMA_SRQS                   QED_ROCE_QPS
 #define QED_NVM_CFG_SET_FLAGS		0xE
 #define QED_NVM_CFG_SET_PF_FLAGS	0x1E
+#define QED_NVM_CFG_GET_FLAGS		0xA
+#define QED_NVM_CFG_GET_PF_FLAGS	0x1A
 
 static char version[] =
 	"QLogic FastLinQ 4xxxx Core Module qed " DRV_MODULE_VERSION "\n";
@@ -2298,6 +2300,30 @@ static int qed_nvm_flash_cfg_write(struct qed_dev *cdev, const u8 **data)
 	return rc;
 }
 
+static int qed_nvm_flash_cfg_read(struct qed_dev *cdev, u8 **data,
+				  u32 cmd, u32 entity_id)
+{
+	struct qed_hwfn *hwfn = QED_LEADING_HWFN(cdev);
+	struct qed_ptt *ptt;
+	u32 flags, len;
+	int rc = 0;
+
+	ptt = qed_ptt_acquire(hwfn);
+	if (!ptt)
+		return -EAGAIN;
+
+	DP_VERBOSE(cdev, NETIF_MSG_DRV,
+		   "Read config cmd = %d entity id %d\n", cmd, entity_id);
+	flags = entity_id ? QED_NVM_CFG_GET_PF_FLAGS : QED_NVM_CFG_GET_FLAGS;
+	rc = qed_mcp_nvm_get_cfg(hwfn, ptt, cmd, entity_id, flags, *data, &len);
+	if (rc)
+		DP_ERR(cdev, "Error %d reading %d\n", rc, cmd);
+
+	qed_ptt_release(hwfn, ptt);
+
+	return rc;
+}
+
 static int qed_nvm_flash(struct qed_dev *cdev, const char *name)
 {
 	const struct firmware *image;
@@ -2610,6 +2636,7 @@ static u8 qed_get_affin_hwfn_idx(struct qed_dev *cdev)
 	.db_recovery_del = &qed_db_recovery_del,
 	.read_module_eeprom = &qed_read_module_eeprom,
 	.get_affin_hwfn_idx = &qed_get_affin_hwfn_idx,
+	.read_nvm_cfg = &qed_nvm_flash_cfg_read,
 };
 
 void qed_get_protocol_stats(struct qed_dev *cdev,
diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.c b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
index 89462c4..36ddb89 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.c
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.c
@@ -3751,6 +3751,35 @@ int qed_mcp_get_ppfid_bitmap(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt)
 	return 0;
 }
 
+int qed_mcp_nvm_get_cfg(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt,
+			u16 option_id, u8 entity_id, u16 flags, u8 *p_buf,
+			u32 *p_len)
+{
+	u32 mb_param = 0, resp, param;
+	int rc;
+
+	QED_MFW_SET_FIELD(mb_param, DRV_MB_PARAM_NVM_CFG_OPTION_ID, option_id);
+	if (flags & QED_NVM_CFG_OPTION_INIT)
+		QED_MFW_SET_FIELD(mb_param,
+				  DRV_MB_PARAM_NVM_CFG_OPTION_INIT, 1);
+	if (flags & QED_NVM_CFG_OPTION_FREE)
+		QED_MFW_SET_FIELD(mb_param,
+				  DRV_MB_PARAM_NVM_CFG_OPTION_FREE, 1);
+	if (flags & QED_NVM_CFG_OPTION_ENTITY_SEL) {
+		QED_MFW_SET_FIELD(mb_param,
+				  DRV_MB_PARAM_NVM_CFG_OPTION_ENTITY_SEL, 1);
+		QED_MFW_SET_FIELD(mb_param,
+				  DRV_MB_PARAM_NVM_CFG_OPTION_ENTITY_ID,
+				  entity_id);
+	}
+
+	rc = qed_mcp_nvm_rd_cmd(p_hwfn, p_ptt,
+				DRV_MSG_CODE_GET_NVM_CFG_OPTION,
+				mb_param, &resp, &param, p_len, (u32 *)p_buf);
+
+	return rc;
+}
+
 int qed_mcp_nvm_set_cfg(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt,
 			u16 option_id, u8 entity_id, u16 flags, u8 *p_buf,
 			u32 len)
diff --git a/drivers/net/ethernet/qlogic/qed/qed_mcp.h b/drivers/net/ethernet/qlogic/qed/qed_mcp.h
index 83649a8..9c4c276 100644
--- a/drivers/net/ethernet/qlogic/qed/qed_mcp.h
+++ b/drivers/net/ethernet/qlogic/qed/qed_mcp.h
@@ -1209,6 +1209,21 @@ void qed_mcp_resc_lock_default_init(struct qed_resc_lock_params *p_lock,
 int qed_mcp_get_ppfid_bitmap(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt);
 
 /**
+ * @brief Get NVM config attribute value.
+ *
+ * @param p_hwfn
+ * @param p_ptt
+ * @param option_id
+ * @param entity_id
+ * @param flags
+ * @param p_buf
+ * @param p_len
+ */
+int qed_mcp_nvm_get_cfg(struct qed_hwfn *p_hwfn, struct qed_ptt *p_ptt,
+			u16 option_id, u8 entity_id, u16 flags, u8 *p_buf,
+			u32 *p_len);
+
+/**
  * @brief Set NVM config attribute value.
  *
  * @param p_hwfn
diff --git a/include/linux/qed/qed_if.h b/include/linux/qed/qed_if.h
index e366399..06fd958 100644
--- a/include/linux/qed/qed_if.h
+++ b/include/linux/qed/qed_if.h
@@ -1132,6 +1132,17 @@ struct qed_common_ops {
  * @param cdev
  */
 	u8 (*get_affin_hwfn_idx)(struct qed_dev *cdev);
+
+/**
+ * @brief read_nvm_cfg - Read NVM config attribute value.
+ * @param cdev
+ * @param buf - buffer
+ * @param cmd - NVM CFG command id
+ * @param entity_id - Entity id
+ *
+ */
+	int (*read_nvm_cfg)(struct qed_dev *cdev, u8 **buf, u32 cmd,
+			    u32 entity_id);
 };
 
 #define MASK_FIELD(_name, _value) \
-- 
1.8.3.1
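
To make the two GET flag constants in this patch easier to read, the sketch below decodes them into option bits. The bit positions of INIT, FREE and ENTITY_SEL are an assumption of this example (their definitions are not part of the diff above), as are the OPT_* and nvm_get_flags() names. Only the 0xA and 0x1A values and the entity_id selection come from the patch.

```c
#include <stdint.h>

/* Assumed bit positions; the diff above does not show these definitions. */
#define OPT_INIT        (1u << 1)
#define OPT_FREE        (1u << 3)
#define OPT_ENTITY_SEL  (1u << 4)

/* Values taken from the patch. */
#define CFG_GET_FLAGS     0xA
#define CFG_GET_PF_FLAGS  0x1A

/* Mirrors the selection in qed_nvm_flash_cfg_read(): a non-zero entity id
 * picks the per-entity variant, zero picks the plain one. */
static uint16_t nvm_get_flags(uint32_t entity_id)
{
	return entity_id ? CFG_GET_PF_FLAGS : CFG_GET_FLAGS;
}
```

Under that assumed layout the only difference between the two constants is the entity-select bit, so an entity id of 0 reads as "no entity selected".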


* [PATCH net-next 0/4] qed*: Enhancements.
From: Sudarsana Reddy Kalluru @ 2019-08-30  7:42 UTC (permalink / raw)
  To: davem; +Cc: netdev, mkalderon, aelior

The patch series adds a couple of enhancements to the qed/qede drivers.
  - Support for dumping the config id attributes via ethtool -w/W.
  - Support for dumping the GRC data of required memory regions using
    ethtool -w/W interfaces.

Patch (1) adds driver APIs for reading the config id attributes.
Patch (2) adds ethtool support for dumping the config id attributes.
Patch (3) adds support for configuring the GRC dump config flags.
Patch (4) adds ethtool support for dumping the GRC data.

Please consider applying it to net-next.

Sudarsana Reddy Kalluru (4):
  qed: Add APIs for reading config id attributes.
  qede: Add support for reading the config id attributes.
  qed: Add APIs for configuring grc dump config flags.
  qede: Add support for dumping the grc data.

 drivers/net/ethernet/qlogic/qed/qed_debug.c     |  82 +++++++++++++++++
 drivers/net/ethernet/qlogic/qed/qed_hsi.h       |  15 ++++
 drivers/net/ethernet/qlogic/qed/qed_main.c      |  48 ++++++++++
 drivers/net/ethernet/qlogic/qed/qed_mcp.c       |  29 ++++++
 drivers/net/ethernet/qlogic/qed/qed_mcp.h       |  15 ++++
 drivers/net/ethernet/qlogic/qede/qede.h         |  15 ++++
 drivers/net/ethernet/qlogic/qede/qede_ethtool.c | 114 ++++++++++++++++++++++++
 include/linux/qed/qed_if.h                      |  20 +++++
 8 files changed, 338 insertions(+)

-- 
1.8.3.1


* Re: [PATCH v3 1/2] net: core: Notify on changes to dev->promiscuity.
From: David Miller @ 2019-08-30  7:32 UTC (permalink / raw)
  To: jiri
  Cc: idosch, andrew, horatiu.vultur, alexandre.belloni, UNGLinuxDriver,
	allan.nielsen, ivecera, f.fainelli, netdev, linux-kernel
In-Reply-To: <20190830072133.GP2312@nanopsycho>

From: Jiri Pirko <jiri@resnulli.us>
Date: Fri, 30 Aug 2019 09:21:33 +0200

> Fri, Aug 30, 2019 at 09:12:23AM CEST, davem@davemloft.net wrote:
>>From: Jiri Pirko <jiri@resnulli.us>
>>Date: Fri, 30 Aug 2019 08:36:24 +0200
>>
>>> The promiscuity is a way to set up the rx filter. So promisc == rx filter
>>> off. For normal nics, where there is no hw fwd datapath,
>>> this coincidentally means all received packets go to cpu.
>>
>>You cannot convince me that the HW datapath isn't a "rx filter" too, sorry.
> 
> If you look at it that way, then we have 2: rx_filter and hw_rx_filter.
> The point is, those 2 are not one item, that is the point I'm trying to
> make :/

And you can turn both of them off when I ask for promiscuous mode, that's
a detail of the device not a semantic issue.

* [PATCH 1/1] batman-adv: Add Sven to MAINTAINERS file
From: Simon Wunderlich @ 2019-08-30  7:27 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Simon Wunderlich, Antonio Quartulli
In-Reply-To: <20190830072736.18535-1-sw@simonwunderlich.de>

Sven is taking care of tracking our patches and merging most of them in
our tree. Let's add him to the MAINTAINERS file so he will get all
patch e-mails.

Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
Acked-by: Antonio Quartulli <a@unstable.cc>
---
 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

diff --git a/MAINTAINERS b/MAINTAINERS
index 783569e3c4b4..ce8316cbe3b2 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -2911,6 +2911,7 @@ BATMAN ADVANCED
 M:	Marek Lindner <mareklindner@neomailbox.ch>
 M:	Simon Wunderlich <sw@simonwunderlich.de>
 M:	Antonio Quartulli <a@unstable.cc>
+M:	Sven Eckelmann <sven@narfation.org>
 L:	b.a.t.m.a.n@lists.open-mesh.org (moderated for non-subscribers)
 W:	https://www.open-mesh.org/
 B:	https://www.open-mesh.org/projects/batman-adv/issues
-- 
2.20.1


* [PATCH 0/1] pull request for net-next: batman-adv 2019-08-30
From: Simon Wunderlich @ 2019-08-30  7:27 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Simon Wunderlich

Hi David,

here is a small maintenance pull request of batman-adv to go into net-next.

Please pull or let me know of any problem!

Thank you,
      Simon

The following changes since commit 9cb9a17813bf0de1f8ad6deb9538296d5148b5a8:

  batman-adv: BATMAN_V: aggregate OGMv2 packets (2019-08-04 22:22:00 +0200)

are available in the Git repository at:

  git://git.open-mesh.org/linux-merge.git tags/batadv-next-for-davem-20190830

for you to fetch changes up to 2a813f1392205654aea28098f3bcc3e6e2478fa5:

  batman-adv: Add Sven to MAINTAINERS file (2019-08-17 13:11:50 +0200)

----------------------------------------------------------------
This maintenance patchset includes the following patches:

 - Add Sven to the MAINTAINERS file, by Simon Wunderlich

----------------------------------------------------------------
Simon Wunderlich (1):
      batman-adv: Add Sven to MAINTAINERS file

 MAINTAINERS | 1 +
 1 file changed, 1 insertion(+)

* Re: [PATCH v3 1/2] net: core: Notify on changes to dev->promiscuity.
From: Ivan Vecera @ 2019-08-30  7:26 UTC (permalink / raw)
  To: Jiri Pirko
  Cc: Andrew Lunn, Horatiu Vultur, alexandre.belloni, UNGLinuxDriver,
	davem, allan.nielsen, f.fainelli, netdev, linux-kernel
In-Reply-To: <20190830061327.GM2312@nanopsycho>

On Fri, 30 Aug 2019 08:13:27 +0200
Jiri Pirko <jiri@resnulli.us> wrote:

> Thu, Aug 29, 2019 at 04:37:32PM CEST, andrew@lunn.ch wrote:
> >> Wait, I believe there has been some misunderstanding. Promisc mode
> >> is NOT about getting packets to the cpu. It's about setting hw
> >> filters in a way that no rx packet is dropped.
> >> 
> >> If you want to get packets from the hw forwarding dataplane to
> >> cpu, you should not use promisc mode for that. That would be
> >> incorrect.  
> >
> >Hi Jiri
> >
> >I'm not sure a wireshark/tcpdump/pcap user would agree with you. They
> >want to see packets on an interface, so they use these tools. The
> >fact that the interface is a switch interface should not matter. The
> >switchdev model is that we try to hide away the fact that the interface
> >happens to be on a switch, you can just use it as normal. So why should
> >promisc mode not work as normal?
> 
> It does: it disables the rx filter. Why do you think it means the same
> thing as "trap all to cpu"? Hw datapath was never considered by
> wireshark.
> 
> In fact, I have a usecase where I need to see only slow-path traffic in
> wireshark, not all packets going through hw. So apparently, there is a
> need for another wireshark option and perhaps another flag,
> IFF_HW_TRAPPING?

Agree with Jiri but understand both perspectives. We can treat
IFF_PROMISC as:

1) "I want to _SEE_ all Rx traffic on the specified interface".
For a switchdev driver that means it has to trap all traffic to the CPU
implicitly. In this case we need another flag that says "I don't want
to see offloaded traffic".

2) "I want to ensure that nothing is dropped on the specified interface",
so IFF_PROMISC is treated as a filtering option only. To see offloaded
traffic you need to set up a TC rule with a trap action or use another
flag like IFF_TRAPPING.

IMO IFF_PROMISC should be considered to be a filtering option and
should not imply trapping of offloaded traffic.

Thanks,
Ivan 
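
To make the two readings above concrete, here is a small model of what a switch port would do under each one. It is purely illustrative: the F_* flag bits, struct port_cfg and both helper functions are hypothetical, and F_TRAPPING only stands in for the IFF_TRAPPING idea floated in this thread, which is not an existing interface flag.

```c
#include <stdbool.h>

/* Hypothetical flag bits for this sketch only. */
#define F_PROMISC       (1u << 0)
#define F_TRAPPING      (1u << 1) /* reading 2: explicit "trap offloaded" */
#define F_NO_OFFLOADED  (1u << 2) /* reading 1: explicit opt-out */

struct port_cfg {
	bool rx_filter_on;
	bool trap_offloaded_to_cpu;
};

/* Reading 1: promisc means "see everything", so trapping of offloaded
 * traffic is implied unless the user opts out. */
static struct port_cfg apply_reading1(unsigned int flags)
{
	struct port_cfg c = { true, false };

	if (flags & F_PROMISC) {
		c.rx_filter_on = false;
		c.trap_offloaded_to_cpu = !(flags & F_NO_OFFLOADED);
	}
	return c;
}

/* Reading 2: promisc only disables the rx filter; trapping is a
 * separate, explicit request. */
static struct port_cfg apply_reading2(unsigned int flags)
{
	struct port_cfg c = { true, false };

	if (flags & F_PROMISC)
		c.rx_filter_on = false;
	if (flags & F_TRAPPING)
		c.trap_offloaded_to_cpu = true;
	return c;
}
```

Either way a second knob is needed; the disagreement is only about which behaviour plain promiscuous mode gets by default.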

* Re: [PATCH bpf-next 00/13] bpf: adding map batch processing support
From: Yonghong Song @ 2019-08-30  7:25 UTC (permalink / raw)
  To: Jakub Kicinski, Alexei Starovoitov
  Cc: bpf@vger.kernel.org, netdev@vger.kernel.org, Brian Vazquez,
	Daniel Borkmann, Kernel Team
In-Reply-To: <20190829113932.5c058194@cakuba.netronome.com>



On 8/29/19 11:39 AM, Jakub Kicinski wrote:
> On Wed, 28 Aug 2019 23:45:02 -0700, Yonghong Song wrote:
>> Brian Vazquez has proposed a BPF_MAP_DUMP command to look up more than
>> one map entry per syscall.
>>    https://lore.kernel.org/bpf/CABCgpaU3xxX6CMMxD+1knApivtc2jLBHysDXw-0E9bQEL0qC3A@mail.gmail.com/T/#t
>>
>> During discussion, we found more use cases can be supported in a similar
>> map operation batching framework. For example, batched map lookup and delete,
>> which can be really helpful for bcc.
>>    https://github.com/iovisor/bcc/blob/master/tools/tcptop.py#L233-L243
>>    https://github.com/iovisor/bcc/blob/master/tools/slabratetop.py#L129-L138
>>      
>> Also, in bcc, we have API to delete all entries in a map.
>>    https://github.com/iovisor/bcc/blob/master/src/cc/api/BPFTable.h#L257-L264
>>
>> For map update, batched operations are also useful as sometimes applications
>> need to populate initial maps with more than one entry. For example, the
>> code below is from kernel/samples/bpf/xdp_redirect_cpu_user.c:
>>    https://github.com/torvalds/linux/blob/master/samples/bpf/xdp_redirect_cpu_user.c#L543-L550
>>
>> This patch addresses all the above use cases. To make uapi stable, it also
>> covers other potential use cases. Four bpf syscall subcommands are introduced:
>>      BPF_MAP_LOOKUP_BATCH
>>      BPF_MAP_LOOKUP_AND_DELETE_BATCH
>>      BPF_MAP_UPDATE_BATCH
>>      BPF_MAP_DELETE_BATCH
>>
>> In userspace, an application can iterate through the whole map one batch
>> at a time, e.g., with bpf_map_lookup_batch() as below:
>>      p_key = NULL;
>>      p_next_key = &key;
>>      while (true) {
>>         err = bpf_map_lookup_batch(fd, p_key, &p_next_key, keys, values,
>>                                    &batch_size, elem_flags, flags);
>>         if (err) ...
>>         if (!p_next_key) break; // done
>>         if (!p_key) p_key = p_next_key;
>>      }
>> Please look at individual patches for details of new syscall subcommands
>> and examples of user code.
>>
>> The testing is also done in a qemu VM environment:
>>        measure_lookup: max_entries 1000000, batch 10, time 342ms
>>        measure_lookup: max_entries 1000000, batch 1000, time 295ms
>>        measure_lookup: max_entries 1000000, batch 1000000, time 270ms
>>        measure_lookup: max_entries 1000000, no batching, time 1346ms
>>        measure_lookup_delete: max_entries 1000000, batch 10, time 433ms
>>        measure_lookup_delete: max_entries 1000000, batch 1000, time 363ms
>>        measure_lookup_delete: max_entries 1000000, batch 1000000, time 357ms
>>        measure_lookup_delete: max_entries 1000000, not batch, time 1894ms
>>        measure_delete: max_entries 1000000, batch, time 220ms
>>        measure_delete: max_entries 1000000, not batch, time 1289ms
>> For a 1M entry hash table, batch size of 10 can reduce cpu time
>> by 70%. Please see patch "tools/bpf: measure map batching perf"
>> for details of the test code.
> 
> Hi Yonghong!
> 
> great to see this, we have been looking at implementing some way to
> speed up map walks as well.
> 
> The direction we were looking in, after previous discussions [1],
> however, was to provide a BPF program which can run the logic entirely
> within the kernel.
> 
> We have a rough PoC on the FW side (we can offload the program which
> walks the map, which is pretty neat), but the kernel verifier side
> hasn't really progressed. It will soon.
> 
> The rough idea is that the user space provides two programs, "filter"
> and "dumper":
> 
> 	bpftool map exec id XYZ filter pinned /some/prog \
> 				dumper pinned /some/other_prog
> 
> Both programs get this context:
> 
> struct map_op_ctx {
> 	u64 key;
> 	u64 value;
> };
> 
> We need a per-map implementation of the exec side, but roughly maps
> would do:
> 
> 	LIST_HEAD(deleted);
> 
> 	for entry in map {
> 		struct map_op_ctx {
> 			.key	= entry->key,
> 			.value	= entry->value,
> 		};
> 
> 		act = BPF_PROG_RUN(filter, &map_op_ctx);
> 		if (act & ~ACT_BITS)
> 			return -EINVAL;
> 
> 		if (act & DELETE) {
> 			map_unlink(entry);
> 			list_add(entry, &deleted);
> 		}
> 		if (act & STOP)
> 			break;
> 	}
> 
> 	synchronize_rcu();
> 
> 	for entry in deleted {
> 		struct map_op_ctx {
> 			.key	= entry->key,
> 			.value	= entry->value,
> 		};
> 		
> 		BPF_PROG_RUN(dumper, &map_op_ctx);
> 		map_free(entry);
> 	}
> 
> The filter program can't perform any map operations other than lookup,
> otherwise we won't be able to guarantee that we'll walk the entire map
> (if the filter program deletes some entries in an unfortunate order).

Looks like you will provide a new program type and a per-map
implementation of the above code. My patch set indeed avoids a per-map
implementation for all of lookup/delete/get-next-key...
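
To illustrate what "no per-map implementation" means here, below is a
small user-space C sketch of a batched lookup written only in terms of
generic get-next-key and lookup operations. It is a toy model under
stated assumptions, not the code from the patch set: struct toy_map and
the toy_* functions are invented for this example, and the map is just
a pair of arrays.

```c
#include <assert.h>
#include <stddef.h>

/* Toy map (hypothetical): parallel arrays of keys and values. */
struct toy_map {
	const unsigned *keys;
	const unsigned *vals;
	size_t n;
};

/* Generic op 1: key following *prev (NULL = first key). -1 when exhausted. */
static int toy_get_next_key(const struct toy_map *m, const unsigned *prev,
			    unsigned *next)
{
	size_t i = 0;

	if (prev) {
		while (i < m->n && m->keys[i] != *prev)
			i++;
		i++;		/* step past prev (or past the end) */
	}
	if (i >= m->n)
		return -1;
	*next = m->keys[i];
	return 0;
}

/* Generic op 2: look up one key. -1 if the key is not present. */
static int toy_lookup(const struct toy_map *m, unsigned key, unsigned *val)
{
	for (size_t i = 0; i < m->n; i++) {
		if (m->keys[i] == key) {
			*val = m->vals[i];
			return 0;
		}
	}
	return -1;
}

/*
 * Batched lookup built only from the two generic ops above.
 * Copies up to max entries that follow *resume (NULL = from the start).
 * Returns the number copied; *last is the key to resume after, and
 * *more becomes 0 once the map has been walked to the end.
 */
static size_t toy_lookup_batch(const struct toy_map *m, const unsigned *resume,
			       unsigned *keys, unsigned *vals, size_t max,
			       unsigned *last, int *more)
{
	const unsigned *prev = resume;
	unsigned key;
	size_t n = 0;

	assert(m && keys && vals && last && more);

	*more = 1;
	while (n < max) {
		if (toy_get_next_key(m, prev, &key) != 0) {
			*more = 0;
			break;
		}
		if (toy_lookup(m, key, &vals[n]) == 0)
			keys[n++] = key;
		*last = key;
		prev = last;
	}
	return n;
}
```

A caller loops on toy_lookup_batch(), passing NULL as resume the first
time and a pointer to the returned last key afterwards, until *more is
0. Because only get-next-key and lookup are used, the same loop works
for any map type that provides those two operations.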

> 
> If user space just wants a pure dump it can simply load a program which
> dumps the entries into a perf ring.

A percpu perf ring is not really ideal for user space which simply
wants to get some key/value pairs back. Some kind of generic non-per-cpu
ring buffer might be better for such cases.

> 
> I'm bringing this up because that mechanism should cover what is
> achieved with this patch set and much more.

The only case it does not cover is batched update. But that may not
be super critical.

Your approach gives each element an action choice through another bpf
program. This is indeed powerful. My use case is simpler than your use
case below, hence the simpler implementation.
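
As a rough user-space illustration of the per-element action idea, here
is a toy C model of the filter/dumper walk. It only mirrors the shape
of the pseudocode quoted above: the names toy_map_exec, toy_entry,
ACT_DELETE, ACT_STOP, expire_filter and count_dumper are assumptions
made for this sketch, plain function pointers stand in for BPF
programs, and the two-phase walk with synchronize_rcu() is collapsed
into a single pass.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Action bits a filter may return (names are hypothetical). */
enum {
	ACT_DELETE = 1u << 0,
	ACT_STOP   = 1u << 1,
	ACT_BITS   = ACT_DELETE | ACT_STOP,
};

struct map_op_ctx {
	uint64_t key;
	uint64_t value;
};

struct toy_entry {
	uint64_t key;
	uint64_t value;
	int live;		/* 0 once the entry has been deleted */
};

typedef unsigned (*filter_fn)(const struct map_op_ctx *ctx, void *arg);
typedef void (*dumper_fn)(const struct map_op_ctx *ctx, void *arg);

/*
 * Run filter on every live entry; entries marked ACT_DELETE are
 * removed and handed to dumper. Returns the number of deleted
 * entries, or -1 if the filter returned an unknown action bit.
 */
static int toy_map_exec(struct toy_entry *e, size_t n,
			filter_fn filter, dumper_fn dumper, void *arg)
{
	int deleted = 0;

	assert(e && filter && dumper);

	for (size_t i = 0; i < n; i++) {
		struct map_op_ctx ctx;
		unsigned act;

		if (!e[i].live)
			continue;

		ctx.key = e[i].key;
		ctx.value = e[i].value;

		act = filter(&ctx, arg);
		if (act & ~(unsigned)ACT_BITS)
			return -1;

		if (act & ACT_DELETE) {
			e[i].live = 0;
			dumper(&ctx, arg);
			deleted++;
		}
		if (act & ACT_STOP)
			break;
	}
	return deleted;
}

/* Example policy: value holds a last-seen time; expire stale flows. */
struct prune_arg {
	uint64_t now;
	uint64_t timeout;
	size_t dumped;
};

static unsigned expire_filter(const struct map_op_ctx *ctx, void *arg)
{
	const struct prune_arg *p = arg;

	return (p->now - ctx->value > p->timeout) ? ACT_DELETE : 0;
}

static void count_dumper(const struct map_op_ctx *ctx, void *arg)
{
	struct prune_arg *p = arg;

	(void)ctx;	/* a real dumper would report ctx to user space */
	p->dumped++;
}
```

With this shape, only the entries the filter selects are ever reported,
which is the property being argued for in the flow-pruning case below.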

> 
> In particular for networking workloads where old flows have to be
> pruned from the map periodically it's far more efficient to communicate
> to user space only the flows which timed out (the delete batching from
> this set won't help at all).

Maybe an LRU map will help in this case? It is designed for such
use cases.

> 
> With a 2M entry map and this patch set we still won't be able to prune
> once a second on one core.
> 
> [1]
> https://lore.kernel.org/netdev/20190813130921.10704-4-quentin.monnet@netronome.com/
> 


* [PATCH 2/2] batman-adv: Only read OGM2 tvlv_len after buffer len check
From: Simon Wunderlich @ 2019-08-30  7:25 UTC (permalink / raw)
  To: davem; +Cc: netdev, b.a.t.m.a.n, Sven Eckelmann, Simon Wunderlich
In-Reply-To: <20190830072502.16929-1-sw@simonwunderlich.de>

From: Sven Eckelmann <sven@narfation.org>

Multiple batadv_ogm2_packet can be stored in an skbuff. The function
batadv_v_ogm_send_to_if() uses batadv_v_ogm_aggr_packet() to check if there
is another additional batadv_ogm2_packet in the skb or not before it
continues processing the packet.

The length for such an OGM2 is BATADV_OGM2_HLEN +
batadv_ogm2_packet->tvlv_len. The check must first check that at least
BATADV_OGM2_HLEN bytes are available before it accesses tvlv_len (which is
part of the header). Otherwise it might try to read outside of the
currently available skbuff to get the content of tvlv_len.

Fixes: 9323158ef9f4 ("batman-adv: OGMv2 - implement originators logic")
Signed-off-by: Sven Eckelmann <sven@narfation.org>
Signed-off-by: Simon Wunderlich <sw@simonwunderlich.de>
---
 net/batman-adv/bat_v_ogm.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/net/batman-adv/bat_v_ogm.c b/net/batman-adv/bat_v_ogm.c
index fad95ef64e01..bc06e3cdfa84 100644
--- a/net/batman-adv/bat_v_ogm.c
+++ b/net/batman-adv/bat_v_ogm.c
@@ -631,17 +631,23 @@ batadv_v_ogm_process_per_outif(struct batadv_priv *bat_priv,
  * batadv_v_ogm_aggr_packet() - checks if there is another OGM aggregated
  * @buff_pos: current position in the skb
  * @packet_len: total length of the skb
- * @tvlv_len: tvlv length of the previously considered OGM
+ * @ogm2_packet: potential OGM2 in buffer
  *
  * Return: true if there is enough space for another OGM, false otherwise.
  */
-static bool batadv_v_ogm_aggr_packet(int buff_pos, int packet_len,
-				     __be16 tvlv_len)
+static bool
+batadv_v_ogm_aggr_packet(int buff_pos, int packet_len,
+			 const struct batadv_ogm2_packet *ogm2_packet)
 {
 	int next_buff_pos = 0;
 
-	next_buff_pos += buff_pos + BATADV_OGM2_HLEN;
-	next_buff_pos += ntohs(tvlv_len);
+	/* check if there is enough space for the header */
+	next_buff_pos += buff_pos + sizeof(*ogm2_packet);
+	if (next_buff_pos > packet_len)
+		return false;
+
+	/* check if there is enough space for the optional TVLV */
+	next_buff_pos += ntohs(ogm2_packet->tvlv_len);
 
 	return (next_buff_pos <= packet_len) &&
 	       (next_buff_pos <= BATADV_MAX_AGGREGATION_BYTES);
@@ -818,7 +824,7 @@ int batadv_v_ogm_packet_recv(struct sk_buff *skb,
 	ogm_packet = (struct batadv_ogm2_packet *)skb->data;
 
 	while (batadv_v_ogm_aggr_packet(ogm_offset, skb_headlen(skb),
-					ogm_packet->tvlv_len)) {
+					ogm_packet)) {
 		batadv_v_ogm_process(skb, ogm_offset, if_incoming);
 
 		ogm_offset += BATADV_OGM2_HLEN;
-- 
2.20.1



