Netdev List


* Re: [PATCH v7 net-next 00/19] ionic: Add ionic driver
From: David Miller @ 2019-09-05  7:25 UTC (permalink / raw)
  To: snelson; +Cc: netdev
In-Reply-To: <20190903222821.46161-1-snelson@pensando.io>

From: Shannon Nelson <snelson@pensando.io>
Date: Tue,  3 Sep 2019 15:28:02 -0700

> This is a patch series that adds the ionic driver, supporting the Pensando
> ethernet device.
> 
> In this initial patchset we implement basic transmit and receive.  Later
> patchsets will add more advanced features.
> 
> Our thanks to Saeed Mahameed, David Miller, Andrew Lunn, Michal Kubecek,
> Jakub Kicinski, Jiri Pirko, Yunsheng Lin, and the ever present kbuild
> test robots for their comments and suggestions.
 ...

Series applied, thank you.

^ permalink raw reply

* Re: [PATCH][next] net/sched: cbs: remove redundant assignment to variable port_rate
From: David Miller @ 2019-09-05  7:37 UTC (permalink / raw)
  To: colin.king
  Cc: jhs, xiyou.wangcong, jiri, netdev, kernel-janitors, linux-kernel
In-Reply-To: <20190902182637.22167-1-colin.king@canonical.com>

From: Colin King <colin.king@canonical.com>
Date: Mon,  2 Sep 2019 19:26:37 +0100

> From: Colin Ian King <colin.king@canonical.com>
> 
> Variable port_rate is being initialized with a value that is never read
> and is being re-assigned a little later on. The assignment is redundant
> and hence can be removed.
> 
> Addresses-Coverity: ("Unused value")
> Signed-off-by: Colin Ian King <colin.king@canonical.com>

Applied.

^ permalink raw reply

* Re: [PATCH net-next] Convert usage of IN_MULTICAST to ipv4_is_multicast
From: David Miller @ 2019-09-05  7:38 UTC (permalink / raw)
  To: dave.taht; +Cc: netdev
In-Reply-To: <1567466976-1351-1-git-send-email-dave.taht@gmail.com>

From: Dave Taht <dave.taht@gmail.com>
Date: Mon,  2 Sep 2019 16:29:36 -0700

> IN_MULTICAST's primary intent is as a uapi macro.
> 
> Elsewhere in the kernel we use ipv4_is_multicast consistently.
> 
> This patch unifies linux's multicast checks to use that function
> rather than this macro.
> 
> Signed-off-by: Dave Taht <dave.taht@gmail.com>
> Reviewed-by: Toke Høiland-Jørgensen <toke@toke.dk>

Applied.
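
For readers comparing the two checks named in the quoted changelog, here is a minimal userspace sketch. It assumes that the uapi macro tests a host-byte-order value while the kernel helper tests a network-byte-order one; the function names host_order_is_multicast() and net_order_is_multicast() are hypothetical stand-ins written out for illustration, not the kernel definitions themselves.

```c
#include <stdbool.h>
#include <stdint.h>
#include <arpa/inet.h>	/* htonl() */

/*
 * Class D test on a host-byte-order address, written out in the
 * style of the uapi IN_MULTICAST() macro (assumption: host order).
 */
static bool host_order_is_multicast(uint32_t addr_host)
{
	return (addr_host & 0xf0000000u) == 0xe0000000u;
}

/*
 * The same test on a network-byte-order address, in the style of the
 * in-kernel ipv4_is_multicast() helper (assumption: big-endian input).
 */
static bool net_order_is_multicast(uint32_t addr_be)
{
	return (addr_be & htonl(0xf0000000u)) == htonl(0xe0000000u);
}
```

Under those assumptions, a call site that already holds a network-order address can test it directly instead of converting it to host order first.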

^ permalink raw reply

* Re: [PATCH net-next 0/5] net/tls: minor cleanups
From: David Miller @ 2019-09-05  7:51 UTC (permalink / raw)
  To: jakub.kicinski
  Cc: netdev, oss-drivers, davejwatson, borisp, aviadye, john.fastabend,
	daniel
In-Reply-To: <20190903043106.27570-1-jakub.kicinski@netronome.com>

From: Jakub Kicinski <jakub.kicinski@netronome.com>
Date: Mon,  2 Sep 2019 21:31:01 -0700

> This set is a grab bag of TLS cleanups accumulated in my tree
> in an attempt to avoid merge problems with net. Nothing stands
> out. First patch dedups context information. Next control path
> locking is very slightly optimized. Fourth patch cleans up
> ugly #ifdefs.

Series applied, thanks.

^ permalink raw reply

* Re: [PATCH net-next] vsock/virtio: a better comment on credit update
From: David Miller @ 2019-09-05  7:53 UTC (permalink / raw)
  To: mst; +Cc: linux-kernel, sgarzare, stefanha, kvm, virtualization, netdev
In-Reply-To: <20190903073748.25214-1-mst@redhat.com>

From: "Michael S. Tsirkin" <mst@redhat.com>
Date: Tue, 3 Sep 2019 03:38:16 -0400

> The comment we have is just repeating what the code does.
> Include the *reason* for the condition instead.
> 
> Cc: Stefano Garzarella <sgarzare@redhat.com>
> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

Applied.

^ permalink raw reply

* Re: pull-request: can-next 2019-09-03
From: David Miller @ 2019-09-05  7:57 UTC (permalink / raw)
  To: mkl; +Cc: netdev, kernel, linux-can
In-Reply-To: <a6751a50-f15d-612d-783b-a706098ea90e@pengutronix.de>

From: Marc Kleine-Budde <mkl@pengutronix.de>
Date: Tue, 3 Sep 2019 11:46:31 +0200

> this is a pull request for net-next/master consisting of 15 patches.
> 
> The first patch is by Christer Beskow, targets the kvaser_pciefd driver
> and fixes the PWM generator's frequency.
> 
> The next three patches are by Dan Murphy: the tcan4x5x is updated to use
> a proper interrupts/interrupt-parent DT binding to specify the device's
> IRQ line. Further, the unneeded wake-ups of the device are removed from
> the driver.
> 
> A patch by me for the mcp25xx driver removes the deprecated board file
> setup example. Three patches by Andy Shevchenko simplify clock handling,
> update the driver from OF to device property API and simplify the
> mcp251x_can_suspend() function.
> 
> The remaining 7 patches are by me and clean up checkpatch warnings in
> the generic CAN device infrastructure.

Pulled, thanks.

^ permalink raw reply

* Re: [PATCH net] tipc: add NULL pointer check before calling kfree_rcu
From: David Miller @ 2019-09-05  7:59 UTC (permalink / raw)
  To: lucien.xin; +Cc: netdev, jon.maloy, ying.xue, tipc-discussion
In-Reply-To: <f42a6270d821baf1445b5fa40dc201f7c9c5ebd0.1567504392.git.lucien.xin@gmail.com>

From: Xin Long <lucien.xin@gmail.com>
Date: Tue,  3 Sep 2019 17:53:12 +0800

> Unlike kfree(p), kfree_rcu(p, rcu) won't do NULL pointer check. When
> tipc_nametbl_remove_publ returns NULL, the panic below happens:
 ...
> Fixes: 97ede29e80ee ("tipc: convert name table read-write lock to RCU")
> Reported-by: Li Shuang <shuali@redhat.com>
> Signed-off-by: Xin Long <lucien.xin@gmail.com>

Applied and queued up for -stable, thanks.
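
The pattern described in the quoted changelog (guard the pointer before handing it to a deferred free) can be sketched in plain userspace C. Everything below is a hypothetical stand-in: defer_free() only mimics a release routine that touches a head embedded in the object, and remove_publ() only mimics a removal that may return NULL. Neither is the TIPC or RCU code.

```c
#include <stddef.h>

struct head_stub {
	struct head_stub *next;
};

struct publication {
	int key;
	struct head_stub rcu;	/* embedded head used by the deferred free */
};

/* Objects waiting to be released later. */
static struct head_stub *pending;

/*
 * Stand-in for a deferred free. It links the object through its
 * embedded head, so it must never be called with p == NULL.
 */
static void defer_free(struct publication *p)
{
	p->rcu.next = pending;
	pending = &p->rcu;
}

/* Stand-in for a removal that returns NULL when nothing matched. */
static struct publication *remove_publ(struct publication *slot, int key)
{
	return (slot && slot->key == key) ? slot : NULL;
}

/* Returns 1 if an entry was withdrawn and queued for release, else 0. */
static int withdraw(struct publication *slot, int key)
{
	struct publication *p = remove_publ(slot, key);

	if (!p)		/* the guard: nothing was removed, nothing to free */
		return 0;
	defer_free(p);
	return 1;
}
```

The point of the sketch is only the early return in withdraw(); unlike a plain free(), the deferred variant here has no NULL check of its own.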

^ permalink raw reply

* Re: [PATCH] net/skbuff: silence warnings under memory pressure
From: Eric Dumazet @ 2019-09-05  8:32 UTC (permalink / raw)
  To: Qian Cai, Sergey Senozhatsky
  Cc: Sergey Senozhatsky, Michal Hocko, Eric Dumazet, davem, netdev,
	linux-mm, linux-kernel, Petr Mladek, Steven Rostedt
In-Reply-To: <1567629737.5576.87.camel@lca.pw>

On 9/4/19 10:42 PM, Qian Cai wrote:

> To summarize, those all look to me like good long-term improvements that would
> reduce the likelihood of this kind of livelock in general, especially for other
> unknown allocations that happen while processing softirqs, but it is still up in
> the air whether they fix it 100% in all situations, as printk() is going to take
> more time and could deal with console hardware that involves irq_exit() anyway.
> 
> On the other hand, adding __GFP_NOWARN to the build_skb() allocation will fix
> this known NET_TX_SOFTIRQ case, which is common when softirqd is involved, at
> least in the short term. It even has the benefit of reducing the overall
> warn_alloc() noise out there.
> 
> I can resubmit with an updated changelog. Does it make any sense?

It does not make sense.

We have thousands of other GFP_ATOMIC allocations in the networking stack.

Soon you will have to send more and more patches adding __GFP_NOWARN once
your workloads/tests can hit all these various points.

It is really time to fix this problem generically, instead of having
to review hundreds of patches.

This was my initial feedback, really; nothing has changed since.

Whether to emit a warning with a stack trace, holding the cpu
for many milliseconds, should not be decided case by case; otherwise
every call site will decide to opt out of the harmful warnings.

^ permalink raw reply

* Re: [net-next 2/3] ravb: Remove undocumented processing
From: Simon Horman @ 2019-09-05  8:34 UTC (permalink / raw)
  To: Sergei Shtylyov
  Cc: David Miller, Magnus Damm, netdev, linux-renesas-soc,
	Kazuya Mizuguchi
In-Reply-To: <f54e244a-2d9d-7ba8-02fb-af5572b3a191@cogentembedded.com>

On Mon, Sep 02, 2019 at 08:41:14PM +0300, Sergei Shtylyov wrote:
> On 09/02/2019 11:06 AM, Simon Horman wrote:
> 
> > From: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
> > 
> > This patch removes the use of the undocumented registers
> > CDCR, LCCR, CERCR, CEECR and the undocumented BOC bit of the CCC register.
> 
>    The driver has many more #define's marked as undocumented. It's not clear
> why you crammed the counters and the endianness bit in one patch. It clearly
> needs to be split -- one patch for the MAC counters and one patch for the
> AVB-DMAC bit.

Thanks for the suggestion, I will split the patch.

> > Current documentation for EtherAVB (ravb) describes the offset of
> > what the driver uses as the BOC bit as reserved and that only a value of
> > 0 should be written. Furthermore, the offsets used for the undocumented
> > registers are also considered reserved and should not be written to.
> > 
> > After some internal investigation with Renesas it remains unclear
> > why this driver accesses these fields but regardless of what the historical
> > reasons are the current code is considered incorrect.
> > 
> > Signed-off-by: Kazuya Mizuguchi <kazuya.mizuguchi.ks@renesas.com>
> > Signed-off-by: Simon Horman <horms+renesas@verge.net.au>
> [...]
> 
> MBR, Sergei
> 

^ permalink raw reply

* Re: [net-next 1/3] ravb: correct typo in FBP field of SFO register
From: Simon Horman @ 2019-09-05  8:34 UTC (permalink / raw)
  To: David Miller
  Cc: sergei.shtylyov, magnus.damm, netdev, linux-renesas-soc,
	kazuya.mizuguchi.ks
In-Reply-To: <20190902.113355.2056970452068168668.davem@davemloft.net>

On Mon, Sep 02, 2019 at 11:33:55AM -0700, David Miller wrote:
> From: Simon Horman <horms+renesas@verge.net.au>
> Date: Mon,  2 Sep 2019 10:06:01 +0200
> 
> > -	SFO_FPB		= 0x0000003F,
> > +	SFO_FBP		= 0x0000003F,
> >  };
> > 
> >  /* RTC */
> > ---
> >  drivers/net/ethernet/renesas/ravb.h | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/drivers/net/ethernet/renesas/ravb.h b/drivers/net/ethernet/renesas/ravb.h
> 
> Simon please clean this up, I don't know what happened here :-)

Yeah, sorry about that. I don't know how it happened either.

^ permalink raw reply

* Re: [PATCH v2 bpf-next 2/3] bpf: implement CAP_BPF
From: Daniel Borkmann @ 2019-09-05  8:37 UTC (permalink / raw)
  To: Alexei Starovoitov, nicolas.dichtel@6wind.com, Alexei Starovoitov
  Cc: Alexei Starovoitov, luto@amacapital.net, davem@davemloft.net,
	peterz@infradead.org, rostedt@goodmis.org, netdev@vger.kernel.org,
	bpf@vger.kernel.org, Kernel Team, linux-api@vger.kernel.org
In-Reply-To: <99acd443-69d7-f53a-1af0-263e0b73abef@fb.com>

On 9/4/19 5:21 PM, Alexei Starovoitov wrote:
> On 9/4/19 8:16 AM, Daniel Borkmann wrote:
>> opening/creating BPF maps" error="Unable to create map
>> /run/cilium/bpffs/tc/globals/cilium_lxc: operation not permitted"
>> subsys=daemon
>> 2019-09-04T14:11:47.28178666Z level=fatal msg="Error while creating
>> daemon" error="Unable to create map
>> /run/cilium/bpffs/tc/globals/cilium_lxc: operation not permitted"
>> subsys=daemon
> 
> Ok. We have to include caps in both cap_sys_admin and cap_bpf then.
> 
>> And /same/ deployment with reverted patches, hence no CAP_BPF gets it up
>> and running again:
>>
>> # kubectl get pods --all-namespaces -o wide
> 
> Can you share what these magic commands do underneath?

What do you mean by magic commands? The latter shows all pods in the cluster:

https://kubernetes.io/docs/reference/kubectl/cheatsheet/#viewing-finding-resources

I've only been using the normal kubeadm guide for setup; it's pretty
straightforward: just kubeadm init to bootstrap and then kubectl create for
deploying, if you need to give it a spin for testing:

https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#tabs-pod-install-4

> What user do they pick to start under? and what caps are granted?

The deployment is using a 'securityContext' with 'privileged: true' for the
container spec, as the majority of CNIs do. My understanding is that this is
passed down to the underlying container runtime, e.g. docker, as one option.

Checking the moby Go code, it seems to exec with GetAllCapabilities, which
returns all of the capabilities it is aware of, and afaics, they seem to be
using the Go library below to get the hard-coded list, in which CAP_BPF is
obviously unknown; that might also explain the breakage I've been seeing:

https://github.com/syndtr/gocapability/blob/33e07d32887e1e06b7c025f27ce52f62c7990bc0/capability/enum_gen.go

Thanks,
Daniel

^ permalink raw reply

* [PATCHv3 0/1] Fix deadlock problem and make performance better
From: Zhu Yanjun @ 2019-09-05  9:15 UTC (permalink / raw)
  To: eric.dumazet, davem, netdev

When running at about 1Gbit/sec for a very long time, running ifconfig
and netstat causes a deadlock. The symptoms are similar to those addressed by
commit 5f6b4e14cada ("net: dsa: User per-cpu 64-bit statistics"). After
replacing the network device statistics with per-cpu 64-bit statistics,
the deadlocks disappear even after running at 1Gbit/sec for a very long time.

V2->V3:
Based on David's advice, "Never use the inline keyword in foo.c files,
let the compiler decide.".

The inline keyword is removed from the functions nv_get_stats and
rx_missing_handler.

V1->V2:
Based on Eric's advice, "If the loops are ever restarted, the
storage->fields will have been modified multiple times.".

A similar change in the commit 5f6b4e14cada ("net: dsa: User per-cpu
64-bit statistics") is borrowed to fix the above problem.

Zhu Yanjun (1):
  forcedeth: use per cpu to collect xmit/recv statistics

 drivers/net/ethernet/nvidia/forcedeth.c | 143 ++++++++++++++++++++++----------
 1 file changed, 99 insertions(+), 44 deletions(-)

-- 
2.7.4
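
The V1->V2 note in the cover letter above (a restarted loop modifying the storage fields multiple times) can be illustrated with a small userspace sketch. The toy_sync type and the toy_fetch_begin()/toy_fetch_retry() helpers are hypothetical: they merely force a chosen number of retries and are not the kernel's u64_stats API.

```c
#include <stdint.h>

struct counters {
	uint64_t packets;
	uint64_t bytes;
};

/* Toy sequence check: asks for a retry `forced_retries` times. */
struct toy_sync {
	int forced_retries;
};

static unsigned int toy_fetch_begin(const struct toy_sync *s)
{
	(void)s;
	return 0;
}

static int toy_fetch_retry(struct toy_sync *s, unsigned int start)
{
	(void)start;
	if (s->forced_retries > 0) {
		s->forced_retries--;
		return 1;	/* pretend a writer raced with the read */
	}
	return 0;
}

/* Problematic: every retry adds the source values to the total again. */
static void accumulate_in_loop(struct toy_sync *s, const struct counters *src,
			       struct counters *total)
{
	unsigned int start;

	do {
		start = toy_fetch_begin(s);
		total->packets += src->packets;
		total->bytes   += src->bytes;
	} while (toy_fetch_retry(s, start));
}

/* Safer: snapshot into locals inside the loop, accumulate once afterwards. */
static void accumulate_snapshot(struct toy_sync *s, const struct counters *src,
				struct counters *total)
{
	uint64_t packets, bytes;
	unsigned int start;

	do {
		start = toy_fetch_begin(s);
		packets = src->packets;
		bytes   = src->bytes;
	} while (toy_fetch_retry(s, start));

	total->packets += packets;
	total->bytes   += bytes;
}
```

With one forced retry, the first variant counts the source twice while the second counts it once, which is the reason for reading into locals before adding to the running total.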

^ permalink raw reply

* [PATCHv3 1/1] forcedeth: use per cpu to collect xmit/recv statistics
From: Zhu Yanjun @ 2019-09-05  9:15 UTC (permalink / raw)
  To: eric.dumazet, davem, netdev
In-Reply-To: <1567674942-5132-1-git-send-email-yanjun.zhu@oracle.com>

When testing with a background iperf pushing 1Gbit/sec traffic and running
both ifconfig and netstat to collect statistics, some deadlocks occurred.

Ifconfig and netstat call nv_get_stats64 to get the software xmit/recv
statistics. Since commit f5d827aece36 ("forcedeth: implement
ndo_get_stats64() API"), ordinary tx/rx variables have been used to collect
the tx/rx statistics. The fix is to replace these ordinary variables with a
per cpu 64-bit variable that collects the xmit/recv statistics. The per cpu
variable avoids the deadlocks and provides fast, efficient statistics updates.

In nv_probe, the per cpu variable is initialized. In nv_remove, this
per cpu variable is freed.

In xmit/recv process, this per cpu variable will be updated.

In nv_get_stats64, this per cpu variable on each cpu is added up. Then
the driver can get xmit/recv packets statistics.

In a test that ran for several days with this commit, the deadlocks
disappeared and the performance was better.

Tested:
   - iperf SMP x86_64 ->
   Client connecting to 1.1.1.108, TCP port 5001
   TCP window size: 85.0 KByte (default)
   ------------------------------------------------------------
   [  3] local 1.1.1.105 port 38888 connected with 1.1.1.108 port 5001
   [ ID] Interval       Transfer     Bandwidth
   [  3]  0.0-10.0 sec  1.10 GBytes   943 Mbits/sec

   ifconfig results:

   enp0s9 Link encap:Ethernet  HWaddr 00:21:28:6f:de:0f
          inet addr:1.1.1.105  Bcast:0.0.0.0  Mask:255.255.255.0
          UP BROADCAST RUNNING MULTICAST  MTU:1500  Metric:1
          RX packets:5774764531 errors:0 dropped:0 overruns:0 frame:0
          TX packets:633534193 errors:0 dropped:0 overruns:0 carrier:0
          collisions:0 txqueuelen:1000
          RX bytes:7646159340904 (7.6 TB) TX bytes:11425340407722 (11.4 TB)

   netstat results:

   Kernel Interface table
   Iface MTU Met RX-OK RX-ERR RX-DRP RX-OVR TX-OK TX-ERR TX-DRP TX-OVR Flg
   ...
   enp0s9 1500 0  5774764531 0    0 0      633534193      0      0  0 BMRU
   ...

Fixes: f5d827aece36 ("forcedeth: implement ndo_get_stats64() API")
CC: Joe Jin <joe.jin@oracle.com>
CC: JUNXIAO_BI <junxiao.bi@oracle.com>
Reported-and-tested-by: Nan san <nan.1986san@gmail.com>
Signed-off-by: Zhu Yanjun <yanjun.zhu@oracle.com>
---
V2->V3: Following David's advice, fix the problem "Never use the inline
	 keyword in foo.c files, let the compiler decide."
V1->V2: Following Eric's advice fix the problem "If the loops are ever
         restarted, the storage->fields will have been modified multiple
         times."
---
 drivers/net/ethernet/nvidia/forcedeth.c | 143 ++++++++++++++++++++++----------
 1 file changed, 99 insertions(+), 44 deletions(-)

diff --git a/drivers/net/ethernet/nvidia/forcedeth.c b/drivers/net/ethernet/nvidia/forcedeth.c
index b327b29..a6b4bfa 100644
--- a/drivers/net/ethernet/nvidia/forcedeth.c
+++ b/drivers/net/ethernet/nvidia/forcedeth.c
@@ -713,6 +713,21 @@ struct nv_skb_map {
 	struct nv_skb_map *next_tx_ctx;
 };
 
+struct nv_txrx_stats {
+	u64 stat_rx_packets;
+	u64 stat_rx_bytes; /* not always available in HW */
+	u64 stat_rx_missed_errors;
+	u64 stat_rx_dropped;
+	u64 stat_tx_packets; /* not always available in HW */
+	u64 stat_tx_bytes;
+	u64 stat_tx_dropped;
+};
+
+#define nv_txrx_stats_inc(member) \
+		__this_cpu_inc(np->txrx_stats->member)
+#define nv_txrx_stats_add(member, count) \
+		__this_cpu_add(np->txrx_stats->member, (count))
+
 /*
  * SMP locking:
  * All hardware access under netdev_priv(dev)->lock, except the performance
@@ -797,10 +812,7 @@ struct fe_priv {
 
 	/* RX software stats */
 	struct u64_stats_sync swstats_rx_syncp;
-	u64 stat_rx_packets;
-	u64 stat_rx_bytes; /* not always available in HW */
-	u64 stat_rx_missed_errors;
-	u64 stat_rx_dropped;
+	struct nv_txrx_stats __percpu *txrx_stats;
 
 	/* media detection workaround.
 	 * Locking: Within irq hander or disable_irq+spin_lock(&np->lock);
@@ -826,9 +838,6 @@ struct fe_priv {
 
 	/* TX software stats */
 	struct u64_stats_sync swstats_tx_syncp;
-	u64 stat_tx_packets; /* not always available in HW */
-	u64 stat_tx_bytes;
-	u64 stat_tx_dropped;
 
 	/* msi/msi-x fields */
 	u32 msi_flags;
@@ -1721,6 +1730,39 @@ static void nv_update_stats(struct net_device *dev)
 	}
 }
 
+static void nv_get_stats(int cpu, struct fe_priv *np,
+			 struct rtnl_link_stats64 *storage)
+{
+	struct nv_txrx_stats *src = per_cpu_ptr(np->txrx_stats, cpu);
+	unsigned int syncp_start;
+	u64 rx_packets, rx_bytes, rx_dropped, rx_missed_errors;
+	u64 tx_packets, tx_bytes, tx_dropped;
+
+	do {
+		syncp_start = u64_stats_fetch_begin_irq(&np->swstats_rx_syncp);
+		rx_packets       = src->stat_rx_packets;
+		rx_bytes         = src->stat_rx_bytes;
+		rx_dropped       = src->stat_rx_dropped;
+		rx_missed_errors = src->stat_rx_missed_errors;
+	} while (u64_stats_fetch_retry_irq(&np->swstats_rx_syncp, syncp_start));
+
+	storage->rx_packets       += rx_packets;
+	storage->rx_bytes         += rx_bytes;
+	storage->rx_dropped       += rx_dropped;
+	storage->rx_missed_errors += rx_missed_errors;
+
+	do {
+		syncp_start = u64_stats_fetch_begin_irq(&np->swstats_tx_syncp);
+		tx_packets  = src->stat_tx_packets;
+		tx_bytes    = src->stat_tx_bytes;
+		tx_dropped  = src->stat_tx_dropped;
+	} while (u64_stats_fetch_retry_irq(&np->swstats_tx_syncp, syncp_start));
+
+	storage->tx_packets += tx_packets;
+	storage->tx_bytes   += tx_bytes;
+	storage->tx_dropped += tx_dropped;
+}
+
 /*
  * nv_get_stats64: dev->ndo_get_stats64 function
  * Get latest stats value from the nic.
@@ -1733,7 +1775,7 @@ nv_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *storage)
 	__releases(&netdev_priv(dev)->hwstats_lock)
 {
 	struct fe_priv *np = netdev_priv(dev);
-	unsigned int syncp_start;
+	int cpu;
 
 	/*
 	 * Note: because HW stats are not always available and for
@@ -1746,20 +1788,8 @@ nv_get_stats64(struct net_device *dev, struct rtnl_link_stats64 *storage)
 	 */
 
 	/* software stats */
-	do {
-		syncp_start = u64_stats_fetch_begin_irq(&np->swstats_rx_syncp);
-		storage->rx_packets       = np->stat_rx_packets;
-		storage->rx_bytes         = np->stat_rx_bytes;
-		storage->rx_dropped       = np->stat_rx_dropped;
-		storage->rx_missed_errors = np->stat_rx_missed_errors;
-	} while (u64_stats_fetch_retry_irq(&np->swstats_rx_syncp, syncp_start));
-
-	do {
-		syncp_start = u64_stats_fetch_begin_irq(&np->swstats_tx_syncp);
-		storage->tx_packets = np->stat_tx_packets;
-		storage->tx_bytes   = np->stat_tx_bytes;
-		storage->tx_dropped = np->stat_tx_dropped;
-	} while (u64_stats_fetch_retry_irq(&np->swstats_tx_syncp, syncp_start));
+	for_each_online_cpu(cpu)
+		nv_get_stats(cpu, np, storage);
 
 	/* If the nic supports hw counters then retrieve latest values */
 	if (np->driver_data & DEV_HAS_STATISTICS_V123) {
@@ -1827,7 +1857,7 @@ static int nv_alloc_rx(struct net_device *dev)
 		} else {
 packet_dropped:
 			u64_stats_update_begin(&np->swstats_rx_syncp);
-			np->stat_rx_dropped++;
+			nv_txrx_stats_inc(stat_rx_dropped);
 			u64_stats_update_end(&np->swstats_rx_syncp);
 			return 1;
 		}
@@ -1869,7 +1899,7 @@ static int nv_alloc_rx_optimized(struct net_device *dev)
 		} else {
 packet_dropped:
 			u64_stats_update_begin(&np->swstats_rx_syncp);
-			np->stat_rx_dropped++;
+			nv_txrx_stats_inc(stat_rx_dropped);
 			u64_stats_update_end(&np->swstats_rx_syncp);
 			return 1;
 		}
@@ -2013,7 +2043,7 @@ static void nv_drain_tx(struct net_device *dev)
 		}
 		if (nv_release_txskb(np, &np->tx_skb[i])) {
 			u64_stats_update_begin(&np->swstats_tx_syncp);
-			np->stat_tx_dropped++;
+			nv_txrx_stats_inc(stat_tx_dropped);
 			u64_stats_update_end(&np->swstats_tx_syncp);
 		}
 		np->tx_skb[i].dma = 0;
@@ -2227,7 +2257,7 @@ static netdev_tx_t nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
 			/* on DMA mapping error - drop the packet */
 			dev_kfree_skb_any(skb);
 			u64_stats_update_begin(&np->swstats_tx_syncp);
-			np->stat_tx_dropped++;
+			nv_txrx_stats_inc(stat_tx_dropped);
 			u64_stats_update_end(&np->swstats_tx_syncp);
 			return NETDEV_TX_OK;
 		}
@@ -2273,7 +2303,7 @@ static netdev_tx_t nv_start_xmit(struct sk_buff *skb, struct net_device *dev)
 				dev_kfree_skb_any(skb);
 				np->put_tx_ctx = start_tx_ctx;
 				u64_stats_update_begin(&np->swstats_tx_syncp);
-				np->stat_tx_dropped++;
+				nv_txrx_stats_inc(stat_tx_dropped);
 				u64_stats_update_end(&np->swstats_tx_syncp);
 				return NETDEV_TX_OK;
 			}
@@ -2384,7 +2414,7 @@ static netdev_tx_t nv_start_xmit_optimized(struct sk_buff *skb,
 			/* on DMA mapping error - drop the packet */
 			dev_kfree_skb_any(skb);
 			u64_stats_update_begin(&np->swstats_tx_syncp);
-			np->stat_tx_dropped++;
+			nv_txrx_stats_inc(stat_tx_dropped);
 			u64_stats_update_end(&np->swstats_tx_syncp);
 			return NETDEV_TX_OK;
 		}
@@ -2431,7 +2461,7 @@ static netdev_tx_t nv_start_xmit_optimized(struct sk_buff *skb,
 				dev_kfree_skb_any(skb);
 				np->put_tx_ctx = start_tx_ctx;
 				u64_stats_update_begin(&np->swstats_tx_syncp);
-				np->stat_tx_dropped++;
+				nv_txrx_stats_inc(stat_tx_dropped);
 				u64_stats_update_end(&np->swstats_tx_syncp);
 				return NETDEV_TX_OK;
 			}
@@ -2560,9 +2590,12 @@ static int nv_tx_done(struct net_device *dev, int limit)
 					    && !(flags & NV_TX_RETRYCOUNT_MASK))
 						nv_legacybackoff_reseed(dev);
 				} else {
+					unsigned int len;
+
 					u64_stats_update_begin(&np->swstats_tx_syncp);
-					np->stat_tx_packets++;
-					np->stat_tx_bytes += np->get_tx_ctx->skb->len;
+					nv_txrx_stats_inc(stat_tx_packets);
+					len = np->get_tx_ctx->skb->len;
+					nv_txrx_stats_add(stat_tx_bytes, len);
 					u64_stats_update_end(&np->swstats_tx_syncp);
 				}
 				bytes_compl += np->get_tx_ctx->skb->len;
@@ -2577,9 +2610,12 @@ static int nv_tx_done(struct net_device *dev, int limit)
 					    && !(flags & NV_TX2_RETRYCOUNT_MASK))
 						nv_legacybackoff_reseed(dev);
 				} else {
+					unsigned int len;
+
 					u64_stats_update_begin(&np->swstats_tx_syncp);
-					np->stat_tx_packets++;
-					np->stat_tx_bytes += np->get_tx_ctx->skb->len;
+					nv_txrx_stats_inc(stat_tx_packets);
+					len = np->get_tx_ctx->skb->len;
+					nv_txrx_stats_add(stat_tx_bytes, len);
 					u64_stats_update_end(&np->swstats_tx_syncp);
 				}
 				bytes_compl += np->get_tx_ctx->skb->len;
@@ -2627,9 +2663,12 @@ static int nv_tx_done_optimized(struct net_device *dev, int limit)
 						nv_legacybackoff_reseed(dev);
 				}
 			} else {
+				unsigned int len;
+
 				u64_stats_update_begin(&np->swstats_tx_syncp);
-				np->stat_tx_packets++;
-				np->stat_tx_bytes += np->get_tx_ctx->skb->len;
+				nv_txrx_stats_inc(stat_tx_packets);
+				len = np->get_tx_ctx->skb->len;
+				nv_txrx_stats_add(stat_tx_bytes, len);
 				u64_stats_update_end(&np->swstats_tx_syncp);
 			}
 
@@ -2806,6 +2845,15 @@ static int nv_getlen(struct net_device *dev, void *packet, int datalen)
 	}
 }
 
+static void rx_missing_handler(u32 flags, struct fe_priv *np)
+{
+	if (flags & NV_RX_MISSEDFRAME) {
+		u64_stats_update_begin(&np->swstats_rx_syncp);
+		nv_txrx_stats_inc(stat_rx_missed_errors);
+		u64_stats_update_end(&np->swstats_rx_syncp);
+	}
+}
+
 static int nv_rx_process(struct net_device *dev, int limit)
 {
 	struct fe_priv *np = netdev_priv(dev);
@@ -2848,11 +2896,7 @@ static int nv_rx_process(struct net_device *dev, int limit)
 					}
 					/* the rest are hard errors */
 					else {
-						if (flags & NV_RX_MISSEDFRAME) {
-							u64_stats_update_begin(&np->swstats_rx_syncp);
-							np->stat_rx_missed_errors++;
-							u64_stats_update_end(&np->swstats_rx_syncp);
-						}
+						rx_missing_handler(flags, np);
 						dev_kfree_skb(skb);
 						goto next_pkt;
 					}
@@ -2896,8 +2940,8 @@ static int nv_rx_process(struct net_device *dev, int limit)
 		skb->protocol = eth_type_trans(skb, dev);
 		napi_gro_receive(&np->napi, skb);
 		u64_stats_update_begin(&np->swstats_rx_syncp);
-		np->stat_rx_packets++;
-		np->stat_rx_bytes += len;
+		nv_txrx_stats_inc(stat_rx_packets);
+		nv_txrx_stats_add(stat_rx_bytes, len);
 		u64_stats_update_end(&np->swstats_rx_syncp);
 next_pkt:
 		if (unlikely(np->get_rx.orig++ == np->last_rx.orig))
@@ -2982,8 +3026,8 @@ static int nv_rx_process_optimized(struct net_device *dev, int limit)
 			}
 			napi_gro_receive(&np->napi, skb);
 			u64_stats_update_begin(&np->swstats_rx_syncp);
-			np->stat_rx_packets++;
-			np->stat_rx_bytes += len;
+			nv_txrx_stats_inc(stat_rx_packets);
+			nv_txrx_stats_add(stat_rx_bytes, len);
 			u64_stats_update_end(&np->swstats_rx_syncp);
 		} else {
 			dev_kfree_skb(skb);
@@ -5651,6 +5695,12 @@ static int nv_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 	SET_NETDEV_DEV(dev, &pci_dev->dev);
 	u64_stats_init(&np->swstats_rx_syncp);
 	u64_stats_init(&np->swstats_tx_syncp);
+	np->txrx_stats = alloc_percpu(struct nv_txrx_stats);
+	if (!np->txrx_stats) {
+		pr_err("np->txrx_stats, alloc memory error.\n");
+		err = -ENOMEM;
+		goto out_alloc_percpu;
+	}
 
 	timer_setup(&np->oom_kick, nv_do_rx_refill, 0);
 	timer_setup(&np->nic_poll, nv_do_nic_poll, 0);
@@ -6060,6 +6110,8 @@ static int nv_probe(struct pci_dev *pci_dev, const struct pci_device_id *id)
 out_disable:
 	pci_disable_device(pci_dev);
 out_free:
+	free_percpu(np->txrx_stats);
+out_alloc_percpu:
 	free_netdev(dev);
 out:
 	return err;
@@ -6105,6 +6157,9 @@ static void nv_restore_mac_addr(struct pci_dev *pci_dev)
 static void nv_remove(struct pci_dev *pci_dev)
 {
 	struct net_device *dev = pci_get_drvdata(pci_dev);
+	struct fe_priv *np = netdev_priv(dev);
+
+	free_percpu(np->txrx_stats);
 
 	unregister_netdev(dev);
 
-- 
2.7.4
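
As a companion to the commit message above (each CPU updates its own counters, and the reader adds them up), here is a minimal userspace sketch of that idea. It assumes a fixed array of slots indexed by a CPU number; NR_SLOTS, dev_stats, count_rx(), count_tx() and sum_stats() are hypothetical names, and no synchronization is modelled.

```c
#include <stdint.h>

#define NR_SLOTS 4	/* hypothetical number of CPUs in this sketch */

struct txrx_stats {
	uint64_t rx_packets;
	uint64_t rx_bytes;
	uint64_t tx_packets;
	uint64_t tx_bytes;
};

/* One private slot per CPU; a writer only ever touches its own slot. */
struct dev_stats {
	struct txrx_stats slot[NR_SLOTS];
};

static void count_rx(struct dev_stats *d, unsigned int cpu, uint64_t len)
{
	struct txrx_stats *s = &d->slot[cpu % NR_SLOTS];

	s->rx_packets++;
	s->rx_bytes += len;
}

static void count_tx(struct dev_stats *d, unsigned int cpu, uint64_t len)
{
	struct txrx_stats *s = &d->slot[cpu % NR_SLOTS];

	s->tx_packets++;
	s->tx_bytes += len;
}

/* Reader side: walk every slot and add the values up. */
static struct txrx_stats sum_stats(const struct dev_stats *d)
{
	struct txrx_stats total = { 0, 0, 0, 0 };

	for (unsigned int cpu = 0; cpu < NR_SLOTS; cpu++) {
		total.rx_packets += d->slot[cpu].rx_packets;
		total.rx_bytes   += d->slot[cpu].rx_bytes;
		total.tx_packets += d->slot[cpu].tx_packets;
		total.tx_bytes   += d->slot[cpu].tx_bytes;
	}
	return total;
}
```

The design choice is that the hot path never shares a counter between CPUs; the cost is moved to the comparatively rare read, which has to visit every slot.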


^ permalink raw reply related

* Re: [PATCH net 00/11] net: fix nested device bugs
From: Taehee Yoo @ 2019-09-05  9:15 UTC (permalink / raw)
  To: Stephen Hemminger
  Cc: David Miller, Netdev, j.vosburgh, vfalico, Andy Gospodarek,
	Jiří Pírko, sd, Roopa Prabhu, saeedm, manishc,
	rahulv, kys, haiyangz, sthemmin, sashal, hare, varun, ubraun,
	kgraul
In-Reply-To: <20190904115839.64c27609@hermes.lan>

On Thu, 5 Sep 2019 at 03:58, Stephen Hemminger
<stephen@networkplumber.org> wrote:
>
> On Thu,  5 Sep 2019 03:38:28 +0900
> Taehee Yoo <ap420073@gmail.com> wrote:
>
> > This patchset fixes several bugs related to the nested
> > device infrastructure.
> > The current nesting infrastructure code doesn't limit the depth level of
> > devices. Nested devices could be handled recursively; in that case,
> > huge amounts of memory are needed and a stack overflow could occur.
> > The device types below have the same bug:
> > VLAN, BONDING, TEAM, MACSEC, MACVLAN and VXLAN.
> >
> > Test commands:
> >     ip link add dummy0 type dummy
> >     ip link add vlan1 link dummy0 type vlan id 1
> >
> >     for i in {2..100}
> >     do
> >           let A=$i-1
> >           ip link add name vlan$i link vlan$A type vlan id $i
> >     done
> >     ip link del dummy0
> >
> > 1st patch actually fixes the root cause.
> > It adds new common variables {upper/lower}_level that represent
> > depth level. upper_level variable is depth of upper devices.
> > lower_level variable is depth of lower devices.
> >
> >       [U][L]       [U][L]
> > vlan1  1  5  vlan4  1  4
> > vlan2  2  4  vlan5  2  3
> > vlan3  3  3    |
> >   |            |
> >   +------------+
> >   |
> > vlan6  4  2
> > dummy0 5  1
> >
> > After this patch, the nesting infrastructure code uses this variable to
> > check the depth level.
> >
> > Patches 2, 4, 5, 6 and 7 fix lockdep-related problems.
> > Before these patches, devices use a static lockdep map.
> > So, if devices of the same type are nested, lockdep will warn about
> > a recursive situation.
> > These patches make these devices use a dynamic lockdep key instead of
> > a static lock or subclass.
> >
> > The 3rd patch splits the IFF_BONDING flag into IFF_BONDING and
> > IFF_BONDING_SLAVE.
> > Before this patch, there is only the IFF_BONDING flag, which means
> > either a bonding master or a bonding slave device.
> > But this single flag could be a problem when bonding devices are
> > nested.
> >
> > 8th patch fixes a refcnt leak in the macsec module.
> >
> > 9th patch adds ignore flag to an adjacent structure.
> > In order to exchange an adjacent node safely, ignore flag is needed.
> >
> > 10th patch makes vxlan add an adjacent link to limit depth level.
> >
> > 11th patch removes unnecessary variables and callback.
> >
> > Taehee Yoo (11):
> >   net: core: limit nested device depth
> >   vlan: use dynamic lockdep key instead of subclass
> >   bonding: split IFF_BONDING into IFF_BONDING and IFF_BONDING_SLAVE
> >   bonding: use dynamic lockdep key instead of subclass
> >   team: use dynamic lockdep key instead of static key
> >   macsec: use dynamic lockdep key instead of subclass
> >   macvlan: use dynamic lockdep key instead of subclass
> >   macsec: fix refcnt leak in module exit routine
> >   net: core: add ignore flag to netdev_adjacent structure
> >   vxlan: add adjacent link to limit depth level
> >   net: remove unnecessary variables and callback
> >
> >  drivers/net/bonding/bond_alb.c                |   2 +-
> >  drivers/net/bonding/bond_main.c               |  87 ++++--
> >  .../net/ethernet/mellanox/mlx5/core/en_tc.c   |   2 +-
> >  .../ethernet/qlogic/netxen/netxen_nic_main.c  |   2 +-
> >  drivers/net/hyperv/netvsc_drv.c               |   3 +-
> >  drivers/net/macsec.c                          |  50 ++--
> >  drivers/net/macvlan.c                         |  36 ++-
> >  drivers/net/team/team.c                       |  61 ++++-
> >  drivers/net/vxlan.c                           |  71 ++++-
> >  drivers/scsi/fcoe/fcoe.c                      |   2 +-
> >  drivers/target/iscsi/cxgbit/cxgbit_cm.c       |   2 +-
> >  include/linux/if_macvlan.h                    |   3 +-
> >  include/linux/if_team.h                       |   5 +
> >  include/linux/if_vlan.h                       |  13 +-
> >  include/linux/netdevice.h                     |  29 +-
> >  include/net/bonding.h                         |   4 +-
> >  include/net/vxlan.h                           |   1 +
> >  net/8021q/vlan.c                              |   1 -
> >  net/8021q/vlan_dev.c                          |  32 +--
> >  net/core/dev.c                                | 252 ++++++++++++++++--
> >  net/core/dev_addr_lists.c                     |  12 +-
> >  net/smc/smc_core.c                            |   2 +-
> >  net/smc/smc_pnet.c                            |   2 +-
> >  23 files changed, 519 insertions(+), 155 deletions(-)
> >
>

Hi Stephen,
Thank you so much for the review!

> The network receive path already avoids excessive stack
> depth. Maybe the real problem is in the lockdep code.

Sorry, I don't understand the point you made.
I would appreciate it if you could explain your review in more detail.

^ permalink raw reply

* [PATCH bpf-next] i40e: fix xdp handle calculations
From: Kevin Laatz @ 2019-09-05  1:11 UTC (permalink / raw)
  To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
  Cc: bruce.richardson, ciara.loftus, bpf, intel-wired-lan, Kevin Laatz

Currently, we don't add headroom to the handle in i40e_zca_free,
i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc. The addition of the
headroom to the handle was removed in
commit 2f86c806a8a8 ("i40e: modify driver for handling offsets"), which
will break things when headroom is non-zero. This patch fixes this and uses
xsk_umem_adjust_offset to add it appropriately based on the mode being run.

Fixes: 2f86c806a8a8 ("i40e: modify driver for handling offsets")
Reported-by: Bjorn Topel <bjorn.topel@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
---
 drivers/net/ethernet/intel/i40e/i40e_xsk.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
index eaca6162a6e6..0373bc6c7e61 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
+++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
@@ -267,7 +267,7 @@ static bool i40e_alloc_buffer_zc(struct i40e_ring *rx_ring,
 	bi->addr = xdp_umem_get_data(umem, handle);
 	bi->addr += hr;
 
-	bi->handle = handle;
+	bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
 
 	xsk_umem_discard_addr(umem);
 	return true;
@@ -304,7 +304,7 @@ static bool i40e_alloc_buffer_slow_zc(struct i40e_ring *rx_ring,
 	bi->addr = xdp_umem_get_data(umem, handle);
 	bi->addr += hr;
 
-	bi->handle = handle;
+	bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
 
 	xsk_umem_discard_addr_rq(umem);
 	return true;
@@ -469,7 +469,8 @@ void i40e_zca_free(struct zero_copy_allocator *alloc, unsigned long handle)
 	bi->addr = xdp_umem_get_data(rx_ring->xsk_umem, handle);
 	bi->addr += hr;
 
-	bi->handle = (u64)handle;
+	bi->handle = xsk_umem_adjust_offset(rx_ring->xsk_umem, (u64)handle,
+					    rx_ring->xsk_umem->headroom);
 }
 
 /**
-- 
2.17.1


^ permalink raw reply related
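
The handle adjustment described in the i40e patch above can be sketched in
user space as follows. This is only a minimal sketch: it assumes the helper
adds the offset directly to the handle in aligned-chunk mode and stores it in
the upper bits of the handle in unaligned-chunk mode. The names
sketch_umem, sketch_adjust_offset and SKETCH_OFFSET_SHIFT (taken to be 48
here) are hypothetical stand-ins, not the kernel's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed bit position of the offset in unaligned mode (hypothetical). */
#define SKETCH_OFFSET_SHIFT 48

/* Hypothetical stand-in for the few umem fields this sketch needs. */
struct sketch_umem {
	bool unaligned;     /* true when running in unaligned-chunk mode */
	uint64_t headroom;  /* headroom configured for the umem */
};

/*
 * Add an offset to a handle. In aligned mode the offset is simply added;
 * in unaligned mode it is carried in the upper bits so that the base
 * address in the lower bits stays intact.
 */
static uint64_t sketch_adjust_offset(const struct sketch_umem *umem,
				     uint64_t handle, uint64_t offset)
{
	assert(umem != NULL);

	if (umem->unaligned)
		return handle + (offset << SKETCH_OFFSET_SHIFT);

	return handle + offset;
}
```

With a headroom of zero both branches return the handle unchanged, which is
presumably why the missing adjustment only shows up with non-zero headroom.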

* [PATCH bpf-next] ixgbe: fix xdp handle calculations
From: Kevin Laatz @ 2019-09-05  1:12 UTC (permalink / raw)
  To: netdev, ast, daniel, bjorn.topel, magnus.karlsson, jonathan.lemon
  Cc: bruce.richardson, ciara.loftus, bpf, intel-wired-lan, Kevin Laatz

Currently, we don't add headroom to the handle in ixgbe_zca_free,
ixgbe_alloc_buffer_slow_zc and ixgbe_alloc_buffer_zc. The addition of the
headroom to the handle was removed in
commit d8c3061e5edd ("ixgbe: modify driver for handling offsets"), which
will break things when headroom is non-zero. This patch fixes this and uses
xsk_umem_adjust_offset to add it appropriately based on the mode being run.

Fixes: d8c3061e5edd ("ixgbe: modify driver for handling offsets")
Reported-by: Bjorn Topel <bjorn.topel@intel.com>
Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>
---
 drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
index 17061c799f72..ad802a8909e0 100644
--- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
+++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
@@ -248,7 +248,8 @@ void ixgbe_zca_free(struct zero_copy_allocator *alloc, unsigned long handle)
 	bi->addr = xdp_umem_get_data(rx_ring->xsk_umem, handle);
 	bi->addr += hr;
 
-	bi->handle = (u64)handle;
+	bi->handle = xsk_umem_adjust_offset(rx_ring->xsk_umem, (u64)handle,
+					    rx_ring->xsk_umem->headroom);
 }
 
 static bool ixgbe_alloc_buffer_zc(struct ixgbe_ring *rx_ring,
@@ -274,7 +275,7 @@ static bool ixgbe_alloc_buffer_zc(struct ixgbe_ring *rx_ring,
 	bi->addr = xdp_umem_get_data(umem, handle);
 	bi->addr += hr;
 
-	bi->handle = handle;
+	bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
 
 	xsk_umem_discard_addr(umem);
 	return true;
@@ -301,7 +302,7 @@ static bool ixgbe_alloc_buffer_slow_zc(struct ixgbe_ring *rx_ring,
 	bi->addr = xdp_umem_get_data(umem, handle);
 	bi->addr += hr;
 
-	bi->handle = handle;
+	bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
 
 	xsk_umem_discard_addr_rq(umem);
 	return true;
-- 
2.17.1


^ permalink raw reply related

* Re: [Intel-wired-lan] [PATCH bpf-next] i40e: fix xdp handle calculations
From: Björn Töpel @ 2019-09-05  9:29 UTC (permalink / raw)
  To: Kevin Laatz
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann,
	Björn Töpel, Karlsson, Magnus, Jonathan Lemon,
	Bruce Richardson, ciara.loftus, intel-wired-lan, bpf
In-Reply-To: <20190905011144.3513-1-kevin.laatz@intel.com>

On Thu, 5 Sep 2019 at 11:27, Kevin Laatz <kevin.laatz@intel.com> wrote:
>
> Currently, we don't add headroom to the handle in i40e_zca_free,
> i40e_alloc_buffer_slow_zc and i40e_alloc_buffer_zc. The addition of the
> headroom to the handle was removed in
> commit 2f86c806a8a8 ("i40e: modify driver for handling offsets"), which
> will break things when headroom is non-zero. This patch fixes this and uses
> xsk_umem_adjust_offset to add it appropriately based on the mode being run.
>
> Fixes: 2f86c806a8a8 ("i40e: modify driver for handling offsets")
> Reported-by: Bjorn Topel <bjorn.topel@intel.com>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>

Thanks Kevin!

Acked-by: Björn Töpel <bjorn.topel@intel.com>

> ---
>  drivers/net/ethernet/intel/i40e/i40e_xsk.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/i40e/i40e_xsk.c b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> index eaca6162a6e6..0373bc6c7e61 100644
> --- a/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> +++ b/drivers/net/ethernet/intel/i40e/i40e_xsk.c
> @@ -267,7 +267,7 @@ static bool i40e_alloc_buffer_zc(struct i40e_ring *rx_ring,
>         bi->addr = xdp_umem_get_data(umem, handle);
>         bi->addr += hr;
>
> -       bi->handle = handle;
> +       bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
>
>         xsk_umem_discard_addr(umem);
>         return true;
> @@ -304,7 +304,7 @@ static bool i40e_alloc_buffer_slow_zc(struct i40e_ring *rx_ring,
>         bi->addr = xdp_umem_get_data(umem, handle);
>         bi->addr += hr;
>
> -       bi->handle = handle;
> +       bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
>
>         xsk_umem_discard_addr_rq(umem);
>         return true;
> @@ -469,7 +469,8 @@ void i40e_zca_free(struct zero_copy_allocator *alloc, unsigned long handle)
>         bi->addr = xdp_umem_get_data(rx_ring->xsk_umem, handle);
>         bi->addr += hr;
>
> -       bi->handle = (u64)handle;
> +       bi->handle = xsk_umem_adjust_offset(rx_ring->xsk_umem, (u64)handle,
> +                                           rx_ring->xsk_umem->headroom);
>  }
>
>  /**
> --
> 2.17.1
>
> _______________________________________________
> Intel-wired-lan mailing list
> Intel-wired-lan@osuosl.org
> https://lists.osuosl.org/mailman/listinfo/intel-wired-lan

^ permalink raw reply

* Re: [Intel-wired-lan] [PATCH bpf-next] ixgbe: fix xdp handle calculations
From: Björn Töpel @ 2019-09-05  9:30 UTC (permalink / raw)
  To: Kevin Laatz
  Cc: Netdev, Alexei Starovoitov, Daniel Borkmann,
	Björn Töpel, Karlsson, Magnus, Jonathan Lemon,
	Bruce Richardson, ciara.loftus, intel-wired-lan, bpf
In-Reply-To: <20190905011217.3567-1-kevin.laatz@intel.com>

On Thu, 5 Sep 2019 at 11:28, Kevin Laatz <kevin.laatz@intel.com> wrote:
>
> Currently, we don't add headroom to the handle in ixgbe_zca_free,
> ixgbe_alloc_buffer_slow_zc and ixgbe_alloc_buffer_zc. The addition of the
> headroom to the handle was removed in
> commit d8c3061e5edd ("ixgbe: modify driver for handling offsets"), which
> will break things when headroom is non-zero. This patch fixes this and uses
> xsk_umem_adjust_offset to add it appropriately based on the mode being run.
>
> Fixes: d8c3061e5edd ("ixgbe: modify driver for handling offsets")
> Reported-by: Bjorn Topel <bjorn.topel@intel.com>
> Signed-off-by: Kevin Laatz <kevin.laatz@intel.com>

Acked-by: Björn Töpel <bjorn.topel@intel.com>

> ---
>  drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
> index 17061c799f72..ad802a8909e0 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_xsk.c
> @@ -248,7 +248,8 @@ void ixgbe_zca_free(struct zero_copy_allocator *alloc, unsigned long handle)
>         bi->addr = xdp_umem_get_data(rx_ring->xsk_umem, handle);
>         bi->addr += hr;
>
> -       bi->handle = (u64)handle;
> +       bi->handle = xsk_umem_adjust_offset(rx_ring->xsk_umem, (u64)handle,
> +                                           rx_ring->xsk_umem->headroom);
>  }
>
>  static bool ixgbe_alloc_buffer_zc(struct ixgbe_ring *rx_ring,
> @@ -274,7 +275,7 @@ static bool ixgbe_alloc_buffer_zc(struct ixgbe_ring *rx_ring,
>         bi->addr = xdp_umem_get_data(umem, handle);
>         bi->addr += hr;
>
> -       bi->handle = handle;
> +       bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
>
>         xsk_umem_discard_addr(umem);
>         return true;
> @@ -301,7 +302,7 @@ static bool ixgbe_alloc_buffer_slow_zc(struct ixgbe_ring *rx_ring,
>         bi->addr = xdp_umem_get_data(umem, handle);
>         bi->addr += hr;
>
> -       bi->handle = handle;
> +       bi->handle = xsk_umem_adjust_offset(umem, handle, umem->headroom);
>
>         xsk_umem_discard_addr_rq(umem);
>         return true;
> --
> 2.17.1
>

^ permalink raw reply

* [PATCH net-next] net/mlx5: DR, Remove useless set memory to zero use memset()
From: Wei Yongjun @ 2019-09-05  9:53 UTC (permalink / raw)
  To: Saeed Mahameed, Leon Romanovsky, Mark Bloch, Alex Vesker,
	Erez Shitrit
  Cc: Wei Yongjun, netdev, linux-rdma, kernel-janitors

The memory returned by kzalloc() has already been set to zero, so
remove the useless memset().

Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c | 1 -
 1 file changed, 1 deletion(-)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
index ef0dea44f3b3..5df8436b2ae3 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_send.c
@@ -899,7 +899,6 @@ int mlx5dr_send_ring_alloc(struct mlx5dr_domain *dmn)
 		goto clean_qp;
 	}
 
-	memset(dmn->send_ring->buf, 0, size);
 	dmn->send_ring->buf_size = size;
 
 	dmn->send_ring->mr = dr_reg_mr(dmn->mdev,




^ permalink raw reply related
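
The redundancy removed by the patch above can be illustrated with a
user-space analogue. This is a minimal sketch, assuming calloc() as a
stand-in for kzalloc(); sketch_zalloc and sketch_is_zeroed are hypothetical
helper names, not kernel or mlx5 functions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for kzalloc(): returns zero-filled memory. */
static void *sketch_zalloc(size_t size)
{
	return calloc(1, size);
}

/* Return 1 if every byte of buf is zero, 0 otherwise. */
static int sketch_is_zeroed(const unsigned char *buf, size_t size)
{
	assert(buf != NULL);

	for (size_t i = 0; i < size; i++) {
		if (buf[i] != 0)
			return 0;
	}
	return 1;
}
```

Since the allocator already hands back zeroed memory, a memset(buf, 0, size)
issued right after a successful allocation writes the same bytes a second
time and can be dropped.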

* [PATCH net-next] net/mlx5: DR, Fix error return code in dr_domain_init_resources()
From: Wei Yongjun @ 2019-09-05  9:56 UTC (permalink / raw)
  To: Saeed Mahameed, Leon Romanovsky, Mark Bloch, Alex Vesker,
	Erez Shitrit
  Cc: Wei Yongjun, netdev, linux-rdma, kernel-janitors

Fix to return negative error code -ENOMEM from the error handling
case instead of 0, as done elsewhere in this function.

Fixes: 4ec9e7b02697 ("net/mlx5: DR, Expose steering domain functionality")
Signed-off-by: Wei Yongjun <weiyongjun1@huawei.com>
---
 drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c
index 3b9cf0bccf4d..461cc2c30538 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/steering/dr_domain.c
@@ -66,6 +66,7 @@ static int dr_domain_init_resources(struct mlx5dr_domain *dmn)
 	dmn->uar = mlx5_get_uars_page(dmn->mdev);
 	if (!dmn->uar) {
 		mlx5dr_err(dmn, "Couldn't allocate UAR\n");
+		ret = -ENOMEM;
 		goto clean_pd;
 	}
 
@@ -73,6 +74,7 @@ static int dr_domain_init_resources(struct mlx5dr_domain *dmn)
 	if (!dmn->ste_icm_pool) {
 		mlx5dr_err(dmn, "Couldn't get icm memory for %s\n",
 			   dev_name(dmn->mdev->device));
+		ret = -ENOMEM;
 		goto clean_uar;
 	}
 
@@ -80,6 +82,7 @@ static int dr_domain_init_resources(struct mlx5dr_domain *dmn)
 	if (!dmn->action_icm_pool) {
 		mlx5dr_err(dmn, "Couldn't get action icm memory for %s\n",
 			   dev_name(dmn->mdev->device));
+		ret = -ENOMEM;
 		goto free_ste_icm_pool;
 	}




^ permalink raw reply related
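
The error-path problem fixed by the patch above can be sketched in user
space as follows. This is a minimal sketch under stated assumptions:
sketch_domain, sketch_init_resources and the fail_at parameter are
hypothetical and exist only to simulate allocation failures; they are not
part of the mlx5 driver.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for a domain that owns two resources. */
struct sketch_domain {
	void *uar;
	void *pool;
};

/*
 * fail_at simulates failures: 1 makes the first allocation fail,
 * 2 makes the second one fail, any other value lets both succeed.
 */
static int sketch_init_resources(struct sketch_domain *dmn, int fail_at)
{
	int ret = 0;

	assert(dmn != NULL);
	dmn->uar = NULL;
	dmn->pool = NULL;

	dmn->uar = (fail_at == 1) ? NULL : malloc(16);
	if (!dmn->uar) {
		/* Without this assignment ret stays 0 and the caller sees success. */
		ret = -ENOMEM;
		goto out;
	}

	dmn->pool = (fail_at == 2) ? NULL : malloc(16);
	if (!dmn->pool) {
		ret = -ENOMEM;
		goto free_uar;
	}

	return 0;

free_uar:
	free(dmn->uar);
	dmn->uar = NULL;
out:
	return ret;
}
```

The point of the sketch is that every goto into the unwind labels must set
ret first, because the labels only release resources and return whatever
ret currently holds.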

* Re: [PATCH] net: fixed_phy: Add forward declaration for struct gpio_desc;
From: David Miller @ 2019-09-05  9:54 UTC (permalink / raw)
  To: mdf; +Cc: netdev, andrew, f.fainelli, hkallweit1, linux-kernel
In-Reply-To: <20190903184652.3148-1-mdf@kernel.org>

From: Moritz Fischer <mdf@kernel.org>
Date: Tue,  3 Sep 2019 11:46:52 -0700

> Add forward declaration for struct gpio_desc in order to address
> the following:
> 
> ./include/linux/phy_fixed.h:48:17: error: 'struct gpio_desc' declared inside parameter list [-Werror]
> ./include/linux/phy_fixed.h:48:17: error: its scope is only this definition or declaration, which is probably not what you want [-Werror]
> 
> Fixes commit 71bd106d2567 ("net: fixed-phy: Add
> fixed_phy_register_with_gpiod() API")
> Signed-off-by: Moritz Fischer <mdf@kernel.org>

Applied with Fixes tag fixed up.

^ permalink raw reply
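
The compiler diagnostic quoted above arises when a struct type first appears
inside a function prototype's parameter list. A minimal sketch of the fix
pattern follows; sketch_gpio_desc, sketch_status and sketch_register are
hypothetical names and do not reproduce the contents of phy_fixed.h.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Forward declaration at file scope. Without it, the struct named in the
 * prototype below would be declared inside the parameter list and would
 * be a different, prototype-local type.
 */
struct sketch_gpio_desc;

/* Hypothetical status structure passed alongside the descriptor. */
struct sketch_status {
	int link;
};

/* Prototype that only needs a pointer to the incomplete type. */
int sketch_register(const struct sketch_status *status,
		    struct sketch_gpio_desc *gpiod);

/* Returns 1 when a descriptor was supplied, 0 otherwise. */
int sketch_register(const struct sketch_status *status,
		    struct sketch_gpio_desc *gpiod)
{
	assert(status != NULL);

	/* The pointer is only passed around, never dereferenced here. */
	return gpiod != NULL ? 1 : 0;
}
```

A forward declaration is enough because the header only handles pointers to
the type; the full definition is needed only where the members are accessed.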

* Re: [net PATCH] net: sock_map, fix missing ulp check in sock hash case
From: David Miller @ 2019-09-05  9:57 UTC (permalink / raw)
  To: john.fastabend; +Cc: hdanton, jakub.kicinski, netdev
In-Reply-To: <156754228993.21629.4076822768659778848.stgit@john-Precision-5820-Tower>

From: John Fastabend <john.fastabend@gmail.com>
Date: Tue, 03 Sep 2019 13:24:50 -0700

> sock_map and ULP only work together when ULP is loaded after the sock
> map is loaded. In the sock_map case we added a check for this to fail
> the load if ULP is already set. However, we missed the check on the
> sock_hash side.
> 
> Add a ULP check to the sock_hash update path.
> 
> Fixes: 604326b41a6fb ("bpf, sockmap: convert to generic sk_msg interface")
> Reported-by: syzbot+7a6ee4d0078eac6bf782@syzkaller.appspotmail.com
> Signed-off-by: John Fastabend <john.fastabend@gmail.com>

Applied and queued up for -stable, thanks.

^ permalink raw reply

* Re: [PATCH net 0/2] nexthops: Fix multipath notifications for IPv6 and selftests
From: David Miller @ 2019-09-05 10:00 UTC (permalink / raw)
  To: dsahern; +Cc: netdev, sharpd, dsahern
In-Reply-To: <20190903222213.7029-1-dsahern@kernel.org>

From: David Ahern <dsahern@kernel.org>
Date: Tue,  3 Sep 2019 15:22:11 -0700

> From: David Ahern <dsahern@gmail.com>
> 
> A couple of bug fixes noticed while testing Donald's patch.

Series applied.

^ permalink raw reply

* Re: [PATCH v2 2/2] PTP: add support for one-shot output
From: Felipe Balbi @ 2019-09-05 10:03 UTC (permalink / raw)
  To: Richard Cochran; +Cc: Christopher S Hall, netdev, linux-kernel, davem
In-Reply-To: <20190831144732.GA1692@localhost>

[-- Attachment #1: Type: text/plain, Size: 2004 bytes --]


Hi,

Richard Cochran <richardcochran@gmail.com> writes:
> On Fri, Aug 30, 2019 at 11:00:20AM +0300, Felipe Balbi wrote:
>> >> @@ -177,9 +177,8 @@ long ptp_ioctl(struct posix_clock *pc, unsigned int cmd, unsigned long arg)
>> >>  			err = -EFAULT;
>> >>  			break;
>> >>  		}
>> >> -		if ((req.perout.flags || req.perout.rsv[0] || req.perout.rsv[1]
>> >> -				|| req.perout.rsv[2] || req.perout.rsv[3])
>> >> -			&& cmd == PTP_PEROUT_REQUEST2) {
>> >> +		if ((req.perout.rsv[0] || req.perout.rsv[1] || req.perout.rsv[2]
>> >> +			|| req.perout.rsv[3]) && cmd == PTP_PEROUT_REQUEST2) {
>> >
>> > Please check that the reserved bits of req.perout.flags, namely
>> > ~PTP_PEROUT_ONE_SHOT, are clear.
>> 
>> Actually, we should check more. PEROUT_FEATURE_ENABLE is still valid
>> here, right? So are RISING and FALLING edges, no?
>
> No.  The ptp_extts_request.flags are indeed defined:
>
> struct ptp_extts_request {
> 	...
> 	unsigned int flags;  /* Bit field for PTP_xxx flags. */
> 	...
> };
>
> But the ptp_perout_request.flags are reserved:
>
> struct ptp_perout_request {
> 	...
> 	unsigned int flags;           /* Reserved for future use. */
> 	...
> };

This is a bit confusing, really, especially when the comment right above
those flags states:

/* PTP_xxx bits, for the flags field within the request structures. */

The request "structures" include EXTTS and PEROUT:

struct ptp_clock_request {
	enum {
		PTP_CLK_REQ_EXTTS,
		PTP_CLK_REQ_PEROUT,
		PTP_CLK_REQ_PPS,
	} type;
	union {
		struct ptp_extts_request extts;
		struct ptp_perout_request perout;
	};
};

Seems like we should, at least, make it clear which flags are valid for
which request structures.

> For this ioctl, the test for enable/disable is whether
> ptp_perout_request.period is zero:
>
> 		enable = req.perout.period.sec || req.perout.period.nsec;
> 		err = ops->enable(ops, &req, enable);
>
> The usage pattern here is taken from timer_settime(2).

got it

-- 
balbi

[-- Attachment #2: signature.asc --]
[-- Type: application/pgp-signature, Size: 832 bytes --]

^ permalink raw reply
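
The two checks discussed in the message above, rejecting reserved flag bits
and deriving enable/disable from the period, can be sketched as follows.
This is a minimal user-space sketch: sketch_perout_request,
SKETCH_PEROUT_ONE_SHOT, SKETCH_PEROUT_VALID_FLAGS and the choice of -EINVAL
as the error code are assumptions for illustration and are not taken from
the PTP uapi headers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit and the mask of all bits considered valid. */
#define SKETCH_PEROUT_ONE_SHOT    (1u << 0)
#define SKETCH_PEROUT_VALID_FLAGS (SKETCH_PEROUT_ONE_SHOT)

/* Hypothetical, simplified periodic-output request. */
struct sketch_perout_request {
	int64_t period_sec;
	uint32_t period_nsec;
	unsigned int flags;
	unsigned int rsv[4];
};

/* Reject any flag bit outside the valid mask and any non-zero reserved word. */
static int sketch_validate_perout(const struct sketch_perout_request *req)
{
	assert(req != NULL);

	if (req->flags & ~SKETCH_PEROUT_VALID_FLAGS)
		return -EINVAL;

	for (size_t i = 0; i < 4; i++) {
		if (req->rsv[i])
			return -EINVAL;
	}
	return 0;
}

/* A zero period means "disable", mirroring the timer_settime(2) convention. */
static int sketch_perout_enabled(const struct sketch_perout_request *req)
{
	assert(req != NULL);

	return req->period_sec || req->period_nsec;
}
```

Masking with the complement of the valid set means a newly defined flag only
needs to be added to the mask, while all remaining bits stay rejected.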

* Re: [PATCH 0/3] net: Use kzfree() directly
From: David Miller @ 2019-09-05 10:06 UTC (permalink / raw)
  To: zhongjiang; +Cc: anna.schumaker, trond.myklebust, netdev, linux-kernel
In-Reply-To: <1567564752-6430-1-git-send-email-zhongjiang@huawei.com>

From: zhong jiang <zhongjiang@huawei.com>
Date: Wed, 4 Sep 2019 10:39:09 +0800

> With the help of Coccinelle, we found some places to replace.
> 
> @@
> expression M, S;
> @@
> 
> - memset(M, 0, S);
> - kfree(M);
> + kzfree(M); 

Series applied to net-next.

^ permalink raw reply
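
The transformation in the semantic patch above folds a memset() followed by
kfree() into a single call. A user-space sketch of the same idea follows.
It is only an illustration under stated assumptions: sketch_zero and
sketch_zfree are hypothetical names, and the size is passed explicitly
because plain malloc() offers no way to query the allocation size.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/*
 * Clear a buffer through a volatile pointer so the compiler does not
 * drop the stores as dead writes ahead of the free().
 */
static void sketch_zero(void *mem, size_t size)
{
	volatile unsigned char *p = mem;

	assert(mem != NULL || size == 0);

	while (size--)
		*p++ = 0;
}

/* Zero the buffer, then release it. NULL is accepted, as with free(). */
static void sketch_zfree(void *mem, size_t size)
{
	if (!mem)
		return;

	sketch_zero(mem, size);
	free(mem);
}
```

Keeping the clear and the release in one helper makes it harder to free
sensitive data without wiping it first.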
