Netdev List
 help / color / mirror / Atom feed
* Re: [PATCH] net: linkwatch: ignore events for unregistered netdevs
From: Lukas Wunner @ 2022-04-25 17:20 UTC (permalink / raw)
  To: Jann Horn
  Cc: Eric Dumazet, Jakub Kicinski, Paolo Abeni, Oliver Neukum,
	David S. Miller, Oleksij Rempel, netdev, USB list, Andrew Lunn,
	Jacky Chou, Willy Tarreau, Lino Sanfilippo, Philipp Rosenberger,
	Heiner Kallweit, Greg Kroah-Hartman
In-Reply-To: <CAG48ez0nw7coDXYozaUOTThWLkHWZuKVUpMosY2hgVSSfeM4Pw@mail.gmail.com>

On Mon, Apr 25, 2022 at 05:18:51PM +0200, Jann Horn wrote:
> Well, it's not quite a refcount. It's a count that can be incremented
> and decremented but can't be read while the device is alive, and then
> at some point it turns into a count that can be read and decremented
> but can't be incremented

Pardon me for being dense, but most other subsystems use the refcounting
built into struct device (or rather, its kobject) and tear it down
when the refcount reaches zero (e.g. pci_release_dev(), spidev_release()).

What's the rationale for struct net_device rolling its own refcounting?
Historic artifact?

I think a lot of these issues would solve themselves if that was done away
with and replaced with the generic kobject refcounting.  It's a pity that
the tracking infrastructure is now netdev-specific and other subsystems
cannot benefit from it.

Thanks,

Lukas

^ permalink raw reply

* Re: [PATCH RFC 4/5] net/tls: Add support for PF_TLSH (a TLS handshake listener)
From: Jakub Kicinski @ 2022-04-25 17:14 UTC (permalink / raw)
  To: Chuck Lever
  Cc: netdev, linux-nfs, linux-nvme, linux-cifs, linux-fsdevel, ak,
	borisp, simo
In-Reply-To: <165030059051.5073.16723746870370826608.stgit@oracle-102.nfsv4.dev>

On Mon, 18 Apr 2022 12:49:50 -0400 Chuck Lever wrote:
> In-kernel TLS consumers need a way to perform a TLS handshake. In
> the absence of a handshake implementation in the kernel itself, a
> mechanism to perform the handshake in user space, using an existing
> TLS handshake library, is necessary.
> 
> I've designed a way to pass a connected kernel socket endpoint to
> user space using the traditional listen/accept mechanism. accept(2)
> gives us a well-understood way to materialize a socket endpoint as a
> normal file descriptor in a specific user space process. Like any
> open socket descriptor, the accepted FD can then be passed to a
> library such as openSSL to perform a TLS handshake.
> 
> This prototype currently handles only initiating client-side TLS
> handshakes. Server-side handshakes and key renegotiation are left
> to do.
> 
> Security Considerations
> ~~~~~~~~ ~~~~~~~~~~~~~~
> 
> This prototype is net-namespace aware.
> 
> The kernel has no mechanism to attest that the listening user space
> agent is trustworthy.
> 
> Currently the prototype does not handle multiple listeners that
> overlap -- multiple listeners in the same net namespace that have
> overlapping bind addresses.

Create the socket in user space, do all the handshakes you need there
and then pass it to the kernel.  This is how NBD + TLS works.  Scales
better and requires much less kernel code.

^ permalink raw reply

* Re: [PATCH RFC 2/5] tls: build proto after context has been initialized
From: Jakub Kicinski @ 2022-04-25 17:11 UTC (permalink / raw)
  To: Chuck Lever
  Cc: netdev, linux-nfs, linux-nvme, linux-cifs, linux-fsdevel, ak,
	borisp, simo
In-Reply-To: <165030057652.5073.10364318727607743572.stgit@oracle-102.nfsv4.dev>

On Mon, 18 Apr 2022 12:49:36 -0400 Chuck Lever wrote:
> From: Hannes Reinecke <hare@suse.de>
> 
> We have to build the proto ops only after the context has been
> initialized, as otherwise we might crash when I/O is ongoing
> during initialisation.

Can you say more about the crash you see? protos are not used until 
at least one socket calls update_sk_prot().

^ permalink raw reply

* [PATCH 1/1] tc-bpf: added instructions to build cbpf generator
From: Ray Kinsella @ 2022-04-25 16:57 UTC (permalink / raw)
  To: netdev; +Cc: daniel, stephen, Ray Kinsella
In-Reply-To: <20220425165733.240902-1-mdr@ashroe.eu>

Updated the man page for tc-bpf, detailing how to build to the cbpf
generator and using it in the subsequent example.

Signed-off-by: Ray Kinsella <mdr@ashroe.eu>
---
 man/man8/tc-bpf.8 | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/man/man8/tc-bpf.8 b/man/man8/tc-bpf.8
index e4f68aaa..1d3ab633 100644
--- a/man/man8/tc-bpf.8
+++ b/man/man8/tc-bpf.8
@@ -892,13 +892,21 @@ int main(int argc, char **argv)
 .fi
 .in
 
+Build this helper by compiling the source above, and linking with
+.B libpcap
+as follows:
+
+.in +4n
+.B clang -g -O2 cbpf-gen.c -lpcap -o cbpf-gen
+.in
+
 Given this small helper, any
 .B tcpdump(8)
 filter expression can be abused as a classifier where a match will
 result in the default classid:
 
 .in +4n
-.B bpftool EN10MB 'tcp[tcpflags] & tcp-syn != 0' > /var/bpf/tcp-syn
+.B cbpf-gen 'tcp[tcpflags] & tcp-syn != 0' > /var/bpf/tcp-syn
 .br
 .B tc filter add dev em1 parent 1: bpf bytecode-file /var/bpf/tcp-syn flowid 1:1
 .in
-- 
2.26.2


^ permalink raw reply related

* [PATCH 0/1] updates to tc-bpf manpage
From: Ray Kinsella @ 2022-04-25 16:57 UTC (permalink / raw)
  To: netdev; +Cc: daniel, stephen, Ray Kinsella

Hi folks,

I was a bit unsure what ML this patch should go to?
It is a small update/fix for the tc-bpf manpage.

Ray Kinsella (1):
  tc-bpf: added instructions to build cbpf generator

 man/man8/tc-bpf.8 | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

-- 
2.26.2


^ permalink raw reply

* [PATCH v2 net] net: Use this_cpu_inc() to increment net->core_stats
From: Sebastian Andrzej Siewior @ 2022-04-25 16:39 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: netdev, Eric Dumazet, David S. Miller, Paolo Abeni,
	Thomas Gleixner, Peter Zijlstra

The macro dev_core_stats_##FIELD##_inc() disables preemption and invokes
netdev_core_stats_alloc() to return a per-CPU pointer.
netdev_core_stats_alloc() will allocate memory on its first invocation
which breaks on PREEMPT_RT because it requires non-atomic context for
memory allocation.

This can be avoided by enabling preemption in netdev_core_stats_alloc()
assuming the caller always disables preemption.

It might be better to replace local_inc() with this_cpu_inc() now that
dev_core_stats_##FIELD##_inc() gained a preempt-disable section and does
not rely on already disabled preemption. This results in less
instructions on x86-64:
local_inc:
|          incl %gs:__preempt_count(%rip)  # __preempt_count
|          movq    488(%rdi), %rax # _1->core_stats, _22
|          testq   %rax, %rax      # _22
|          je      .L585   #,
|          add %gs:this_cpu_off(%rip), %rax        # this_cpu_off, tcp_ptr__
|  .L586:
|          testq   %rax, %rax      # _27
|          je      .L587   #,
|          incq (%rax)            # _6->a.counter
|  .L587:
|          decl %gs:__preempt_count(%rip)  # __preempt_count

this_cpu_inc(), this patch:
|         movq    488(%rdi), %rax # _1->core_stats, _5
|         testq   %rax, %rax      # _5
|         je      .L591   #,
| .L585:
|         incq %gs:(%rax) # _18->rx_dropped

Use unsigned long as type for the counter. Use this_cpu_inc() to
increment the counter. Use a plain read of the counter.

Signed-off-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
---
v1…v2:
	- Add missing __percpu annotation as noticed by Jakub + robot
	- Use READ_ONCE() in dev_get_stats() to avoid possible split
	  reads, noticed by Eric.

 include/linux/netdevice.h | 21 +++++++++------------
 net/core/dev.c            | 14 +++++---------
 2 files changed, 14 insertions(+), 21 deletions(-)

diff --git a/include/linux/netdevice.h b/include/linux/netdevice.h
index 59e27a2b7bf04..b1fbe21650bb5 100644
--- a/include/linux/netdevice.h
+++ b/include/linux/netdevice.h
@@ -199,10 +199,10 @@ struct net_device_stats {
  * Try to fit them in a single cache line, for dev_get_stats() sake.
  */
 struct net_device_core_stats {
-	local_t		rx_dropped;
-	local_t		tx_dropped;
-	local_t		rx_nohandler;
-} __aligned(4 * sizeof(local_t));
+	unsigned long	rx_dropped;
+	unsigned long	tx_dropped;
+	unsigned long	rx_nohandler;
+} __aligned(4 * sizeof(unsigned long));
 
 #include <linux/cache.h>
 #include <linux/skbuff.h>
@@ -3843,15 +3843,15 @@ static __always_inline bool __is_skb_forwardable(const struct net_device *dev,
 	return false;
 }
 
-struct net_device_core_stats *netdev_core_stats_alloc(struct net_device *dev);
+struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev);
 
-static inline struct net_device_core_stats *dev_core_stats(struct net_device *dev)
+static inline struct net_device_core_stats __percpu *dev_core_stats(struct net_device *dev)
 {
 	/* This READ_ONCE() pairs with the write in netdev_core_stats_alloc() */
 	struct net_device_core_stats __percpu *p = READ_ONCE(dev->core_stats);
 
 	if (likely(p))
-		return this_cpu_ptr(p);
+		return p;
 
 	return netdev_core_stats_alloc(dev);
 }
@@ -3859,14 +3859,11 @@ static inline struct net_device_core_stats *dev_core_stats(struct net_device *de
 #define DEV_CORE_STATS_INC(FIELD)						\
 static inline void dev_core_stats_##FIELD##_inc(struct net_device *dev)		\
 {										\
-	struct net_device_core_stats *p;					\
+	struct net_device_core_stats __percpu *p;				\
 										\
-	preempt_disable();							\
 	p = dev_core_stats(dev);						\
-										\
 	if (p)									\
-		local_inc(&p->FIELD);						\
-	preempt_enable();							\
+		this_cpu_inc(p->FIELD);						\
 }
 DEV_CORE_STATS_INC(rx_dropped)
 DEV_CORE_STATS_INC(tx_dropped)
diff --git a/net/core/dev.c b/net/core/dev.c
index ae7f42f782e9f..19ef1006f0bf8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -10304,7 +10304,7 @@ void netdev_stats_to_stats64(struct rtnl_link_stats64 *stats64,
 }
 EXPORT_SYMBOL(netdev_stats_to_stats64);
 
-struct net_device_core_stats *netdev_core_stats_alloc(struct net_device *dev)
+struct net_device_core_stats __percpu *netdev_core_stats_alloc(struct net_device *dev)
 {
 	struct net_device_core_stats __percpu *p;
 
@@ -10315,11 +10315,7 @@ struct net_device_core_stats *netdev_core_stats_alloc(struct net_device *dev)
 		free_percpu(p);
 
 	/* This READ_ONCE() pairs with the cmpxchg() above */
-	p = READ_ONCE(dev->core_stats);
-	if (!p)
-		return NULL;
-
-	return this_cpu_ptr(p);
+	return READ_ONCE(dev->core_stats);
 }
 EXPORT_SYMBOL(netdev_core_stats_alloc);
 
@@ -10356,9 +10352,9 @@ struct rtnl_link_stats64 *dev_get_stats(struct net_device *dev,
 
 		for_each_possible_cpu(i) {
 			core_stats = per_cpu_ptr(p, i);
-			storage->rx_dropped += local_read(&core_stats->rx_dropped);
-			storage->tx_dropped += local_read(&core_stats->tx_dropped);
-			storage->rx_nohandler += local_read(&core_stats->rx_nohandler);
+			storage->rx_dropped += READ_ONCE(core_stats->rx_dropped);
+			storage->tx_dropped += READ_ONCE(core_stats->tx_dropped);
+			storage->rx_nohandler += READ_ONCE(core_stats->rx_nohandler);
 		}
 	}
 	return storage;
-- 
2.36.0


^ permalink raw reply related

* Re: [PATCH net-next v9 2/2] net: ethernet: Add driver for Sunplus SP7021
From: Stephen Hemminger @ 2022-04-25 16:32 UTC (permalink / raw)
  To: Wells Lu
  Cc: davem, kuba, robh+dt, netdev, devicetree, linux-kernel, p.zabel,
	pabeni, krzk+dt, roopa, andrew, edumazet, wells.lu
In-Reply-To: <1650882640-7106-3-git-send-email-wellslutw@gmail.com>

On Mon, 25 Apr 2022 18:30:40 +0800
Wells Lu <wellslutw@gmail.com> wrote:

> +
> +int spl2sw_rx_poll(struct napi_struct *napi, int budget)
> +{
> +	struct spl2sw_common *comm = container_of(napi, struct spl2sw_common, rx_napi);
> +	struct spl2sw_mac_desc *desc, *h_desc;
> +	struct net_device_stats *stats;
> +	struct sk_buff *skb, *new_skb;
> +	struct spl2sw_skb_info *sinfo;
> +	int budget_left = budget;
> +	u32 rx_pos, pkg_len;
> +	u32 num, rx_count;
> +	s32 queue;
> +	u32 mask;
> +	int port;
> +	u32 cmd;
> +
> +	/* Process high-priority queue and then low-priority queue. */
> +	for (queue = 0; queue < RX_DESC_QUEUE_NUM; queue++) {
> +		rx_pos = comm->rx_pos[queue];
> +		rx_count = comm->rx_desc_num[queue];
> +
> +		for (num = 0; num < rx_count && budget_left; num++) {
> +			sinfo = comm->rx_skb_info[queue] + rx_pos;
> +			desc = comm->rx_desc[queue] + rx_pos;
> +			cmd = desc->cmd1;
> +
> +			if (cmd & RXD_OWN)
> +				break;
> +
> +			port = FIELD_GET(RXD_PKT_SP, cmd);
> +			if (port < MAX_NETDEV_NUM && comm->ndev[port])
> +				stats = &comm->ndev[port]->stats;
> +			else
> +				goto spl2sw_rx_poll_rec_err;
> +
> +			pkg_len = FIELD_GET(RXD_PKT_LEN, cmd);
> +			if (unlikely((cmd & RXD_ERR_CODE) || pkg_len < ETH_ZLEN + 4)) {
> +				stats->rx_length_errors++;
> +				stats->rx_dropped++;
> +				goto spl2sw_rx_poll_rec_err;
> +			}
> +
> +			dma_unmap_single(&comm->pdev->dev, sinfo->mapping,
> +					 comm->rx_desc_buff_size, DMA_FROM_DEVICE);
> +
> +			skb = sinfo->skb;
> +			skb_put(skb, pkg_len - 4); /* Minus FCS */
> +			skb->ip_summed = CHECKSUM_NONE;
> +			skb->protocol = eth_type_trans(skb, comm->ndev[port]);
> +			netif_receive_skb(skb);
> +
> +			stats->rx_packets++;
> +			stats->rx_bytes += skb->len;
> +
> +			/* Allocate a new skb for receiving. */
> +			new_skb = netdev_alloc_skb(NULL, comm->rx_desc_buff_size);
> +			if (unlikely(!new_skb)) {
> +				desc->cmd2 = (rx_pos == comm->rx_desc_num[queue] - 1) ?
> +					     RXD_EOR : 0;
> +				sinfo->skb = NULL;
> +				sinfo->mapping = 0;
> +				desc->addr1 = 0;
> +				goto spl2sw_rx_poll_alloc_err;
> +			}
> +
> +			sinfo->mapping = dma_map_single(&comm->pdev->dev, new_skb->data,
> +							comm->rx_desc_buff_size,
> +							DMA_FROM_DEVICE);
> +			if (dma_mapping_error(&comm->pdev->dev, sinfo->mapping)) {
> +				dev_kfree_skb_irq(new_skb);
> +				desc->cmd2 = (rx_pos == comm->rx_desc_num[queue] - 1) ?
> +					     RXD_EOR : 0;
> +				sinfo->skb = NULL;
> +				sinfo->mapping = 0;
> +				desc->addr1 = 0;
> +				goto spl2sw_rx_poll_alloc_err;
> +			}
> +
> +			sinfo->skb = new_skb;
> +			desc->addr1 = sinfo->mapping;
> +
> +spl2sw_rx_poll_rec_err:
> +			desc->cmd2 = (rx_pos == comm->rx_desc_num[queue] - 1) ?
> +				     RXD_EOR | comm->rx_desc_buff_size :
> +				     comm->rx_desc_buff_size;
> +
> +			wmb();	/* Set RXD_OWN after other fields are effective. */
> +			desc->cmd1 = RXD_OWN;
> +
> +spl2sw_rx_poll_alloc_err:
> +			/* Move rx_pos to next position */
> +			rx_pos = ((rx_pos + 1) == comm->rx_desc_num[queue]) ? 0 : rx_pos + 1;
> +
> +			budget_left--;
> +
> +			/* If there are packets in high-priority queue,
> +			 * stop processing low-priority queue.
> +			 */
> +			if (queue == 1 && !(h_desc->cmd1 & RXD_OWN))
> +				break;
> +		}
> +
> +		comm->rx_pos[queue] = rx_pos;
> +
> +		/* Save pointer to last rx descriptor of high-priority queue. */
> +		if (queue == 0)
> +			h_desc = comm->rx_desc[queue] + rx_pos;
> +	}
> +
> +	wmb();	/* make sure settings are effective. */
> +	mask = readl(comm->l2sw_reg_base + L2SW_SW_INT_MASK_0);
> +	mask &= ~MAC_INT_RX;
> +	writel(mask, comm->l2sw_reg_base + L2SW_SW_INT_MASK_0);
> +
> +	napi_complete(napi);
> +	return 0;
> +}
> +
> +int spl2sw_tx_poll(struct napi_struct *napi, int budget)
> +{
> +	struct spl2sw_common *comm = container_of(napi, struct spl2sw_common, tx_napi);
> +	struct spl2sw_skb_info *skbinfo;
> +	struct net_device_stats *stats;
> +	int budget_left = budget;
> +	u32 tx_done_pos;
> +	u32 mask;
> +	u32 cmd;
> +	int i;
> +
> +	spin_lock(&comm->tx_lock);
> +
> +	tx_done_pos = comm->tx_done_pos;
> +	while (((tx_done_pos != comm->tx_pos) || (comm->tx_desc_full == 1)) && budget_left) {
> +		cmd = comm->tx_desc[tx_done_pos].cmd1;
> +		if (cmd & TXD_OWN)
> +			break;
> +
> +		skbinfo = &comm->tx_temp_skb_info[tx_done_pos];
> +		if (unlikely(!skbinfo->skb))
> +			goto spl2sw_tx_poll_next;
> +
> +		i = ffs(FIELD_GET(TXD_VLAN, cmd)) - 1;
> +		if (i < MAX_NETDEV_NUM && comm->ndev[i])
> +			stats = &comm->ndev[i]->stats;
> +		else
> +			goto spl2sw_tx_poll_unmap;
> +
> +		if (unlikely(cmd & (TXD_ERR_CODE))) {
> +			stats->tx_errors++;
> +		} else {
> +			stats->tx_packets++;
> +			stats->tx_bytes += skbinfo->len;
> +		}
> +
> +spl2sw_tx_poll_unmap:
> +		dma_unmap_single(&comm->pdev->dev, skbinfo->mapping, skbinfo->len,
> +				 DMA_TO_DEVICE);
> +		skbinfo->mapping = 0;
> +		dev_kfree_skb_irq(skbinfo->skb);
> +		skbinfo->skb = NULL;
> +
> +spl2sw_tx_poll_next:
> +		/* Move tx_done_pos to next position */
> +		tx_done_pos = ((tx_done_pos + 1) == TX_DESC_NUM) ? 0 : tx_done_pos + 1;
> +
> +		if (comm->tx_desc_full == 1)
> +			comm->tx_desc_full = 0;
> +
> +		budget_left--;
> +	}
> +
> +	comm->tx_done_pos = tx_done_pos;
> +	if (!comm->tx_desc_full)
> +		for (i = 0; i < MAX_NETDEV_NUM; i++)
> +			if (comm->ndev[i])
> +				if (netif_queue_stopped(comm->ndev[i]))
> +					netif_wake_queue(comm->ndev[i]);
> +
> +	spin_unlock(&comm->tx_lock);
> +
> +	wmb();			/* make sure settings are effective. */
> +	mask = readl(comm->l2sw_reg_base + L2SW_SW_INT_MASK_0);
> +	mask &= ~MAC_INT_TX;
> +	writel(mask, comm->l2sw_reg_base + L2SW_SW_INT_MASK_0);
> +
> +	napi_complete(napi);
> +	return 0;
> +}

This doesn't look like it is doing NAPI properly.

The driver is supposed to return the amount of packets processed.
If budget is used, it should return budget.

^ permalink raw reply

* Re: [PATCH] ath10k: skip ath10k_halt during suspend for driver state RESTARTING
From: Abhishek Kumar @ 2022-04-25 16:26 UTC (permalink / raw)
  To: Kalle Valo
  Cc: linux-kernel, linux-wireless, briannorris, ath10k, netdev,
	Wen Gong, David S. Miller, Jakub Kicinski, Paolo Abeni
In-Reply-To: <87czh5k7ua.fsf@kernel.org>

Thanks Kalle for having a look and adding this on behalf of me.
Here is the Tested-on tag,
Tested-on: QCA6174 hw3.2 PCI WLAN.RM.4.4.1-00288-QCARMSWPZ-1

Thanks
Abhishek

On Sun, Apr 24, 2022 at 11:14 PM Kalle Valo <kvalo@kernel.org> wrote:
>
> Abhishek Kumar <kuabhs@chromium.org> writes:
>
> > Double free crash is observed when FW recovery(caused by wmi
> > timeout/crash) is followed by immediate suspend event. The FW recovery
> > is triggered by ath10k_core_restart() which calls driver clean up via
> > ath10k_halt(). When the suspend event occurs between the FW recovery,
> > the restart worker thread is put into frozen state until suspend completes.
> > The suspend event triggers ath10k_stop() which again triggers ath10k_halt()
> > The double invocation of ath10k_halt() causes ath10k_htt_rx_free() to be
> > called twice(Note: ath10k_htt_rx_alloc was not called by restart worker
> > thread because of its frozen state), causing the crash.
> >
> > To fix this, during the suspend flow, skip call to ath10k_halt() in
> > ath10k_stop() when the current driver state is ATH10K_STATE_RESTARTING.
> > Also, for driver state ATH10K_STATE_RESTARTING, call
> > ath10k_wait_for_suspend() in ath10k_stop(). This is because call to
> > ath10k_wait_for_suspend() is skipped later in
> > [ath10k_halt() > ath10k_core_stop()] for the driver state
> > ATH10K_STATE_RESTARTING.
> >
> > The frozen restart worker thread will be cancelled during resume when the
> > device comes out of suspend.
> >
> > Below is the crash stack for reference:
> >
> > [  428.469167] ------------[ cut here ]------------
> > [  428.469180] kernel BUG at mm/slub.c:4150!
> > [  428.469193] invalid opcode: 0000 [#1] PREEMPT SMP NOPTI
> > [  428.469219] Workqueue: events_unbound async_run_entry_fn
> > [  428.469230] RIP: 0010:kfree+0x319/0x31b
> > [  428.469241] RSP: 0018:ffffa1fac015fc30 EFLAGS: 00010246
> > [  428.469247] RAX: ffffedb10419d108 RBX: ffff8c05262b0000
> > [  428.469252] RDX: ffff8c04a8c07000 RSI: 0000000000000000
> > [  428.469256] RBP: ffffa1fac015fc78 R08: 0000000000000000
> > [  428.469276] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [  428.469285] Call Trace:
> > [  428.469295]  ? dma_free_attrs+0x5f/0x7d
> > [  428.469320]  ath10k_core_stop+0x5b/0x6f
> > [  428.469336]  ath10k_halt+0x126/0x177
> > [  428.469352]  ath10k_stop+0x41/0x7e
> > [  428.469387]  drv_stop+0x88/0x10e
> > [  428.469410]  __ieee80211_suspend+0x297/0x411
> > [  428.469441]  rdev_suspend+0x6e/0xd0
> > [  428.469462]  wiphy_suspend+0xb1/0x105
> > [  428.469483]  ? name_show+0x2d/0x2d
> > [  428.469490]  dpm_run_callback+0x8c/0x126
> > [  428.469511]  ? name_show+0x2d/0x2d
> > [  428.469517]  __device_suspend+0x2e7/0x41b
> > [  428.469523]  async_suspend+0x1f/0x93
> > [  428.469529]  async_run_entry_fn+0x3d/0xd1
> > [  428.469535]  process_one_work+0x1b1/0x329
> > [  428.469541]  worker_thread+0x213/0x372
> > [  428.469547]  kthread+0x150/0x15f
> > [  428.469552]  ? pr_cont_work+0x58/0x58
> > [  428.469558]  ? kthread_blkcg+0x31/0x31
> >
> > Signed-off-by: Abhishek Kumar <kuabhs@chromium.org>
> > Co-developed-by: Wen Gong <quic_wgong@quicinc.com>
> > Signed-off-by: Wen Gong <quic_wgong@quicinc.com>
>
> Tested-on tag missing, but I can add it if you provide it.
>
> https://wireless.wiki.kernel.org/en/users/drivers/ath10k/submittingpatches#tested-on_tag
>
> --
> https://patchwork.kernel.org/project/linux-wireless/list/
>
> https://wireless.wiki.kernel.org/en/developers/documentation/submittingpatches

^ permalink raw reply

* Re: [PATCH net-next v9 2/2] net: ethernet: Add driver for Sunplus SP7021
From: Stephen Hemminger @ 2022-04-25 16:24 UTC (permalink / raw)
  To: Wells Lu
  Cc: davem, kuba, robh+dt, netdev, devicetree, linux-kernel, p.zabel,
	pabeni, krzk+dt, roopa, andrew, edumazet, wells.lu
In-Reply-To: <1650882640-7106-3-git-send-email-wellslutw@gmail.com>

On Mon, 25 Apr 2022 18:30:40 +0800
Wells Lu <wellslutw@gmail.com> wrote:

> diff --git a/drivers/net/ethernet/sunplus/spl2sw_driver.h b/drivers/net/ethernet/sunplus/spl2sw_driver.h
> new file mode 100644
> index 000000000..5f177b3af
> --- /dev/null
> +++ b/drivers/net/ethernet/sunplus/spl2sw_driver.h
> @@ -0,0 +1,12 @@
> +/* SPDX-License-Identifier: GPL-2.0 */
> +/* Copyright Sunplus Technology Co., Ltd.
> + *       All rights reserved.
> + */
> +
> +#ifndef __SPL2SW_DRIVER_H__
> +#define __SPL2SW_DRIVER_H__
> +
> +#define SPL2SW_RX_NAPI_WEIGHT	16
> +#define SPL2SW_TX_NAPI_WEIGHT	16

Why define your own? there is NAPI_POLL_WEIGHT already
defined in netdevice.h

^ permalink raw reply

* Re: [PATCH perf/core 1/5] libbpf: Add bpf_program__set_insns function
From: Daniel Borkmann @ 2022-04-25 16:22 UTC (permalink / raw)
  To: Jiri Olsa, Arnaldo Carvalho de Melo, Alexei Starovoitov,
	Andrii Nakryiko
  Cc: linux-perf-users, netdev, bpf, Ingo Molnar, Namhyung Kim,
	Alexander Shishkin, Peter Zijlstra, Martin KaFai Lau, Song Liu,
	Yonghong Song, John Fastabend, Ian Rogers
In-Reply-To: <20220422100025.1469207-2-jolsa@kernel.org>

On 4/22/22 12:00 PM, Jiri Olsa wrote:
> Adding bpf_program__set_insns that allows to set new
> instructions for program.
> 
> Also moving bpf_program__attach_kprobe_multi_opts on
> the proper name sorted place in map file.
> 
> Signed-off-by: Jiri Olsa <jolsa@kernel.org>
> ---
>   tools/lib/bpf/libbpf.c   |  8 ++++++++
>   tools/lib/bpf/libbpf.h   | 12 ++++++++++++
>   tools/lib/bpf/libbpf.map |  3 ++-
>   3 files changed, 22 insertions(+), 1 deletion(-)
> 
> diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c
> index 809fe209cdcc..284790d81c1b 100644
> --- a/tools/lib/bpf/libbpf.c
> +++ b/tools/lib/bpf/libbpf.c
> @@ -8457,6 +8457,14 @@ size_t bpf_program__insn_cnt(const struct bpf_program *prog)
>   	return prog->insns_cnt;
>   }
>   
> +void bpf_program__set_insns(struct bpf_program *prog,
> +			    struct bpf_insn *insns, size_t insns_cnt)
> +{
> +	free(prog->insns);
> +	prog->insns = insns;
> +	prog->insns_cnt = insns_cnt;
> +}
> +
>   int bpf_program__set_prep(struct bpf_program *prog, int nr_instances,
>   			  bpf_program_prep_t prep)
>   {
> diff --git a/tools/lib/bpf/libbpf.h b/tools/lib/bpf/libbpf.h
> index 05dde85e19a6..b31ad58d335f 100644
> --- a/tools/lib/bpf/libbpf.h
> +++ b/tools/lib/bpf/libbpf.h
> @@ -323,6 +323,18 @@ struct bpf_insn;
>    * different.
>    */
>   LIBBPF_API const struct bpf_insn *bpf_program__insns(const struct bpf_program *prog);
> +
> +/**
> + * @brief **bpf_program__set_insns()** can set BPF program's underlying
> + * BPF instructions.
> + * @param prog BPF program for which to return instructions
> + * @param insn a pointer to an array of BPF instructions
> + * @param insns_cnt number of `struct bpf_insn`'s that form
> + * specified BPF program
> + */
> +LIBBPF_API void bpf_program__set_insns(struct bpf_program *prog,
> +				       struct bpf_insn *insns, size_t insns_cnt);
> +

Iiuc, patch 2 should be squashed into this one given they logically belong to the
same change?

Fwiw, I think the API description should be elaborated a bit more, in particular that
the passed-in insns need to be from allocated dynamic memory which is later on passed
to free(), and maybe also constraints at which point in time bpf_program__set_insns()
may be called.. (as well as high-level description on potential use cases e.g. around
patch 4).

>   /**
>    * @brief **bpf_program__insn_cnt()** returns number of `struct bpf_insn`'s
>    * that form specified BPF program.
> diff --git a/tools/lib/bpf/libbpf.map b/tools/lib/bpf/libbpf.map
> index dd35ee58bfaa..afa10d24ab41 100644
> --- a/tools/lib/bpf/libbpf.map
> +++ b/tools/lib/bpf/libbpf.map
> @@ -444,7 +444,8 @@ LIBBPF_0.8.0 {
>   	global:
>   		bpf_object__destroy_subskeleton;
>   		bpf_object__open_subskeleton;
> +		bpf_program__attach_kprobe_multi_opts;
> +		bpf_program__set_insns;
>   		libbpf_register_prog_handler;
>   		libbpf_unregister_prog_handler;
> -		bpf_program__attach_kprobe_multi_opts;
>   } LIBBPF_0.7.0;
> 


^ permalink raw reply

* Re: [patch iproute2-next v2] devlink: introduce -h[ex] cmdline option to allow dumping numbers in hex format
From: Stephen Hemminger @ 2022-04-25 16:09 UTC (permalink / raw)
  To: Jiri Pirko; +Cc: netdev, sthemmin, dsahern, snelson
In-Reply-To: <20220425093036.1518107-1-jiri@resnulli.us>

On Mon, 25 Apr 2022 11:30:36 +0200
Jiri Pirko <jiri@resnulli.us> wrote:

>  static int fmsg_value_show(struct dl *dl, int type, struct nlattr *nl_data)
>  {
> +	const char *num_fmt = dl->hex ? "%#x" : "%u";
> +	const char *num64_fmt = dl->hex ? "%#"PRIx64 : "%"PRIu64;

This is going to make it harder if/when format string properties are used.
Not sure what the better way is.

^ permalink raw reply

* Re: [PATCH net-next 00/11] mlxsw: extend line card model by devices and info
From: Jakub Kicinski @ 2022-04-25 16:00 UTC (permalink / raw)
  To: Ido Schimmel; +Cc: netdev, davem, pabeni, jiri, petrm, dsahern, andrew, mlxsw
In-Reply-To: <20220425034431.3161260-1-idosch@nvidia.com>

On Mon, 25 Apr 2022 06:44:20 +0300 Ido Schimmel wrote:
> This patchset is extending the line card model by three items:
> 1) line card devices
> 2) line card info
> 3) line card device info
> 
> First three patches are introducing the necessary changes in devlink
> core.
> 
> Then, all three extensions are implemented in mlxsw alongside with
> selftest.

:/ what is a line card device? You must provide document what you're
doing, this:

 .../networking/devlink/devlink-linecard.rst   |   4 +

is not enough.

How many operations and attributes are you going to copy&paste?

Is linking devlink instances into a hierarchy a better approach?

Would you mind if I revert this? I don't understand why the line card
patches are applied in 6h on Sunday night, especially that RFCv1 had
quite a long discussion. But really any uAPI additions should warrant
longer review time, IMHO.

^ permalink raw reply

* RE: [PATCH net v4] ice: Fix race during aux device (un)plugging
From: Ertman, David M @ 2022-04-25 15:58 UTC (permalink / raw)
  To: ivecera, netdev@vger.kernel.org
  Cc: poros, mschmidt, Leon Romanovsky, Brandeburg, Jesse,
	Nguyen, Anthony L, David S. Miller, Jakub Kicinski, Paolo Abeni,
	Saleem, Shiraz, moderated list:INTEL ETHERNET DRIVERS, open list
In-Reply-To: <20220423102021.4056946-1-ivecera@redhat.com>



> -----Original Message-----
> From: Ivan Vecera <ivecera@redhat.com>
> Sent: Saturday, April 23, 2022 3:20 AM
> To: netdev@vger.kernel.org
> Cc: poros <poros@redhat.com>; mschmidt <mschmidt@redhat.com>; Leon
> Romanovsky <leonro@nvidia.com>; Brandeburg, Jesse
> <jesse.brandeburg@intel.com>; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; David S. Miller <davem@davemloft.net>;
> Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>;
> Saleem, Shiraz <shiraz.saleem@intel.com>; Ertman, David M
> <david.m.ertman@intel.com>; moderated list:INTEL ETHERNET DRIVERS
> <intel-wired-lan@lists.osuosl.org>; open list <linux-kernel@vger.kernel.org>
> Subject: [PATCH net v4] ice: Fix race during aux device (un)plugging
> 
> Function ice_plug_aux_dev() assigns pf->adev field too early prior
> aux device initialization and on other side ice_unplug_aux_dev()
> starts aux device deinit and at the end assigns NULL to pf->adev.
> This is wrong because pf->adev should always be non-NULL only when
> aux device is fully initialized and ready. This wrong order causes
> a crash when ice_send_event_to_aux() call occurs because that function
> depends on non-NULL value of pf->adev and does not assume that
> aux device is half-initialized or half-destroyed.
> After order correction the race window is tiny but it is still there,
> as Leon mentioned and manipulation with pf->adev needs to be protected
> by mutex.
> 
> Fix (un-)plugging functions so pf->adev field is set after aux device
> init and prior aux device destroy and protect pf->adev assignment by
> new mutex. This mutex is also held during ice_send_event_to_aux()
> call to ensure that aux device is valid during that call.
> Note that device lock used ice_send_event_to_aux() needs to be kept
> to avoid race with aux drv unload.
> 
> Reproducer:
> cycle=1
> while :;do
>         echo "#### Cycle: $cycle"
> 
>         ip link set ens7f0 mtu 9000
>         ip link add bond0 type bond mode 1 miimon 100
>         ip link set bond0 up
>         ifenslave bond0 ens7f0
>         ip link set bond0 mtu 9000
>         ethtool -L ens7f0 combined 1
>         ip link del bond0
>         ip link set ens7f0 mtu 1500
>         sleep 1
> 
>         let cycle++
> done
> 
> In short when the device is added/removed to/from bond the aux device
> is unplugged/plugged. When MTU of the device is changed an event is
> sent to aux device asynchronously. This can race with (un)plugging
> operation and because pf->adev is set too early (plug) or too late
> (unplug) the function ice_send_event_to_aux() can touch uninitialized
> or destroyed fields. In the case of crash below pf->adev->dev.mutex.

--SNIP--

> Fixes: f9f5301e7e2d ("ice: Register auxiliary device to provide RDMA")
> Reviewed-by: Leon Romanovsky <leonro@nvidia.com>
> Signed-off-by: Ivan Vecera <ivecera@redhat.com>
> ---
>  drivers/net/ethernet/intel/ice/ice.h      |  1 +
>  drivers/net/ethernet/intel/ice/ice_idc.c  | 25 +++++++++++++++--------
>  drivers/net/ethernet/intel/ice/ice_main.c |  2 ++
>  3 files changed, 20 insertions(+), 8 deletions(-)
> 
> diff --git a/drivers/net/ethernet/intel/ice/ice.h
> b/drivers/net/ethernet/intel/ice/ice.h
> index 8ed3c9ab7ff7..a895e3a8e988 100644
> --- a/drivers/net/ethernet/intel/ice/ice.h
> +++ b/drivers/net/ethernet/intel/ice/ice.h
> @@ -540,6 +540,7 @@ struct ice_pf {
>  	struct mutex avail_q_mutex;	/* protects access to avail_[rx|tx]qs
> */
>  	struct mutex sw_mutex;		/* lock for protecting VSI alloc
> flow */
>  	struct mutex tc_mutex;		/* lock to protect TC changes
> */
> +	struct mutex adev_mutex;	/* lock to protect aux device access
> */
>  	u32 msg_enable;
>  	struct ice_ptp ptp;
>  	struct tty_driver *ice_gnss_tty_driver;
> diff --git a/drivers/net/ethernet/intel/ice/ice_idc.c
> b/drivers/net/ethernet/intel/ice/ice_idc.c
> index 25a436d342c2..3e3b2ed4cd5d 100644
> --- a/drivers/net/ethernet/intel/ice/ice_idc.c
> +++ b/drivers/net/ethernet/intel/ice/ice_idc.c
> @@ -37,14 +37,17 @@ void ice_send_event_to_aux(struct ice_pf *pf, struct
> iidc_event *event)
>  	if (WARN_ON_ONCE(!in_task()))
>  		return;
> 
> +	mutex_lock(&pf->adev_mutex);
>  	if (!pf->adev)
> -		return;
> +		goto finish;
> 
>  	device_lock(&pf->adev->dev);
>  	iadrv = ice_get_auxiliary_drv(pf);
>  	if (iadrv && iadrv->event_handler)
>  		iadrv->event_handler(pf, event);
>  	device_unlock(&pf->adev->dev);
> +finish:
> +	mutex_unlock(&pf->adev_mutex);
>  }
> 
>  /**
> @@ -290,7 +293,6 @@ int ice_plug_aux_dev(struct ice_pf *pf)
>  		return -ENOMEM;
> 
>  	adev = &iadev->adev;
> -	pf->adev = adev;
>  	iadev->pf = pf;
> 
>  	adev->id = pf->aux_idx;
> @@ -300,18 +302,20 @@ int ice_plug_aux_dev(struct ice_pf *pf)
> 
>  	ret = auxiliary_device_init(adev);
>  	if (ret) {
> -		pf->adev = NULL;
>  		kfree(iadev);
>  		return ret;
>  	}
> 
>  	ret = auxiliary_device_add(adev);
>  	if (ret) {
> -		pf->adev = NULL;
>  		auxiliary_device_uninit(adev);
>  		return ret;
>  	}
> 
> +	mutex_lock(&pf->adev_mutex);
> +	pf->adev = adev;
> +	mutex_unlock(&pf->adev_mutex);
> +
>  	return 0;
>  }
> 
> @@ -320,12 +324,17 @@ int ice_plug_aux_dev(struct ice_pf *pf)
>   */
>  void ice_unplug_aux_dev(struct ice_pf *pf)
>  {
> -	if (!pf->adev)
> -		return;
> +	struct auxiliary_device *adev;
> 
> -	auxiliary_device_delete(pf->adev);
> -	auxiliary_device_uninit(pf->adev);
> +	mutex_lock(&pf->adev_mutex);
> +	adev = pf->adev;
>  	pf->adev = NULL;
> +	mutex_unlock(&pf->adev_mutex);
> +
> +	if (adev) {
> +		auxiliary_device_delete(adev);
> +		auxiliary_device_uninit(adev);
> +	}
>  }
> 
>  /**
> diff --git a/drivers/net/ethernet/intel/ice/ice_main.c
> b/drivers/net/ethernet/intel/ice/ice_main.c
> index 5b1198859da7..2cbbf7abefc4 100644
> --- a/drivers/net/ethernet/intel/ice/ice_main.c
> +++ b/drivers/net/ethernet/intel/ice/ice_main.c
> @@ -3769,6 +3769,7 @@ u16 ice_get_avail_rxq_count(struct ice_pf *pf)
>  static void ice_deinit_pf(struct ice_pf *pf)
>  {
>  	ice_service_task_stop(pf);
> +	mutex_destroy(&pf->adev_mutex);
>  	mutex_destroy(&pf->sw_mutex);
>  	mutex_destroy(&pf->tc_mutex);
>  	mutex_destroy(&pf->avail_q_mutex);
> @@ -3847,6 +3848,7 @@ static int ice_init_pf(struct ice_pf *pf)
> 
>  	mutex_init(&pf->sw_mutex);
>  	mutex_init(&pf->tc_mutex);
> +	mutex_init(&pf->adev_mutex);
> 
>  	INIT_HLIST_HEAD(&pf->aq_wait_list);
>  	spin_lock_init(&pf->aq_wait_lock);
> --
> 2.35.1

Reviewed-by: Dave Ertman <david.m.ertman@intel.com>

^ permalink raw reply

* Re: net: stmmac: dwmac-imx: half duplex crash
From: Andrew Lunn @ 2022-04-25 15:59 UTC (permalink / raw)
  To: Marcel Ziswiler
  Cc: linux-imx@nxp.com, peppe.cavallaro@st.com,
	linux-stm32@st-md-mailman.stormreply.com, davem@davemloft.net,
	linux-kernel@vger.kernel.org, pabeni@redhat.com,
	shawnguo@kernel.org, joabreu@synopsys.com, kernel@pengutronix.de,
	s.hauer@pengutronix.de, kuba@kernel.org,
	alexandre.torgue@foss.st.com, mcoquelin.stm32@gmail.com,
	linux-arm-kernel@lists.infradead.org, netdev@vger.kernel.org,
	festevam@gmail.com
In-Reply-To: <5e51e11bbbf6ecd0ee23b4fd2edec98e6e7fbaa8.camel@toradex.com>

> Good point. I was blinded by NXP downstream which, while listing all incl. 10baseT/Half and 100baseT/Half as
> supported link modes, also does not work. However, upstream indeed shows only full-duplex modes as supported:
> 
> root@verdin-imx8mp-07106916:~# ethtool eth1
> Settings for eth1:
>         Supported ports: [ TP MII ]
>         Supported link modes:   10baseT/Full 
>                                 100baseT/Full 
>                                 1000baseT/Full 

So maybe we actually want ethtool to report -EINVAL when asked to do
something which is not supported! Humm:

https://elixir.bootlin.com/linux/latest/source/drivers/net/phy/phy.c#L783


	/* We make sure that we don't pass unsupported values in to the PHY */
	linkmode_and(advertising, advertising, phydev->supported);

So maybe the unsupported mode got removed, and the PHY was asked to
advertise nothing!

Anyway, this is roughly there the check should go.

> ...
> 
> Once I remove them queues being setup via device tree it shows all modes as supported again:
> 
> root@verdin-imx8mp-07106916:~# ethtool eth1
> Settings for eth1:
>         Supported ports: [ TP MII ]
>         Supported link modes:   10baseT/Half 10baseT/Full 
>                                 100baseT/Half 100baseT/Full 
>                                 1000baseT/Full 
> ...
> 
> However, 10baseT/Half, while no longer just crashing, still does not seem to work right. Looking at wireshark
> traces it does send packets but seems not to ever get neither ARP nor DHCP answers (as well as any other packet
> for that matter).

So the answers are on the wire, just not received? 

> Looks like the same actually applies to 10baseT/Full as well. While 100baseT/Half and
> 100baseT/Full work fine now.
> 
> Any idea what else could still be going wrong with them 10baseT modes?

I would use mii-tool to check the status of the PHY. Make sure it
really has negotiated 10/Half mode. After that, it is very likely to
be a MAC problem, and i don't think i can help you.

> On a side note, besides modifying the device tree for such single-queue setup being half-duplex capable, is
> there any easier way? Much nicer would, of course, be if it justworkedTM (e.g. advertise all modes but once a
> half-duplex mode is chosen revert to such single-queue operation). Then, on the other hand, who still uses
> half-duplex communication in this day and age (;-p).

You seem to need it for some reason!

Anyway, it is just code. You have all the needed information in the
adjust_link callback, so you could implement it.

	    Andrew

^ permalink raw reply

* Re: [PATCH] net: phy: fix error check return value of phy_read()
From: Florian Fainelli @ 2022-04-25 15:55 UTC (permalink / raw)
  To: Andrew Lunn, cgel.zte
  Cc: f.fainelli, bcm-kernel-feedback-list, hkallweit1, linux, davem,
	kuba, pabeni, netdev, linux-kernel, Lv Ruyi, Zeal Robot
In-Reply-To: <Yl6mH0HKCGPxgejI@lunn.ch>



On 4/19/2022 5:07 AM, Andrew Lunn wrote:
> On Tue, Apr 19, 2022 at 01:44:39AM +0000, cgel.zte@gmail.com wrote:
>> From: Lv Ruyi <lv.ruyi@zte.com.cn>
>>
>> phy_read() returns a negative number if there's an error, but the
>> error-checking code in the bcm87xx driver's config_intr function
>> triggers if phy_read() returns non-zero.  Correct that.
>>
>> Reported-by: Zeal Robot <zealci@zte.com.cn>
>> Signed-off-by: Lv Ruyi <lv.ruyi@zte.com.cn>
>> ---
>>   drivers/net/phy/bcm87xx.c | 2 +-
>>   1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/phy/bcm87xx.c b/drivers/net/phy/bcm87xx.c
>> index 313563482690..e62b53718010 100644
>> --- a/drivers/net/phy/bcm87xx.c
>> +++ b/drivers/net/phy/bcm87xx.c
>> @@ -146,7 +146,7 @@ static int bcm87xx_config_intr(struct phy_device *phydev)
>>   
>>   	if (phydev->interrupts == PHY_INTERRUPT_ENABLED) {
>>   		err = phy_read(phydev, BCM87XX_LASI_STATUS);
>> -		if (err)
>> +		if (err < 0)
>>   			return err;
> 
> This should probably have a Fixes: tag, and be for net, not next-next.
> Please read the netdev FAQ about the trees, and submittinng fixes for
> netdev.

Yes, it should be:

Fixes: 15772e4ddf3f ("net: phy: broadcom: remove use of ack_interrupt()")

Also, please subject this change properly with:

net: phy: bcm87xx: Added missing error checking

Thank you
-- 
Florian

^ permalink raw reply

* [PATCH v1] net: stmmac: dwmac-imx: comment spelling fix
From: Marcel Ziswiler @ 2022-04-25 15:48 UTC (permalink / raw)
  To: netdev
  Cc: Marcel Ziswiler, Alexandre Torgue, David S. Miller, Fabio Estevam,
	Fugang Duan, Giuseppe Cavallaro, Jakub Kicinski, Jose Abreu,
	Maxime Coquelin, NXP Linux Team, Paolo Abeni,
	Pengutronix Kernel Team, Sascha Hauer, Shawn Guo,
	linux-arm-kernel, linux-kernel, linux-stm32

From: Marcel Ziswiler <marcel.ziswiler@toradex.com>

Fix spelling in comment.

Fixes: 94abdad6974a ("net: ethernet: dwmac: add ethernet glue logic for NXP imx8 chip")
Signed-off-by: Marcel Ziswiler <marcel.ziswiler@toradex.com>

---

 drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c b/drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c
index 84651207a1de..bd52fb7cf486 100644
--- a/drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c
+++ b/drivers/net/ethernet/stmicro/stmmac/dwmac-imx.c
@@ -197,9 +197,9 @@ imx_dwmac_parse_dt(struct imx_priv_data *dwmac, struct device *dev)
 	}
 
 	if (of_machine_is_compatible("fsl,imx8mp")) {
-		/* Binding doc describes the propety:
+		/* Binding doc describes the property:
 		   is required by i.MX8MP.
-		   is optinoal for i.MX8DXL.
+		   is optional for i.MX8DXL.
 		 */
 		dwmac->intf_regmap = syscon_regmap_lookup_by_phandle(np, "intf_mode");
 		if (IS_ERR(dwmac->intf_regmap))
-- 
2.35.1


^ permalink raw reply related

* Re: [PATCH bpf-next] libbpf: Remove unnecessary type cast
From: patchwork-bot+netdevbpf @ 2022-04-25 15:50 UTC (permalink / raw)
  To: Yuntao Wang
  Cc: ast, daniel, andrii, kafai, songliubraving, yhs, john.fastabend,
	kpsingh, netdev, bpf, linux-kernel
In-Reply-To: <20220424143420.457082-1-ytcoode@gmail.com>

Hello:

This patch was applied to bpf/bpf-next.git (master)
by Daniel Borkmann <daniel@iogearbox.net>:

On Sun, 24 Apr 2022 22:34:20 +0800 you wrote:
> The link variable is already of type 'struct bpf_link *', casting it to
> 'struct bpf_link *' is redundant, drop it.
> 
> Signed-off-by: Yuntao Wang <ytcoode@gmail.com>
> ---
>  tools/lib/bpf/libbpf.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)

Here is the summary with links:
  - [bpf-next] libbpf: Remove unnecessary type cast
    https://git.kernel.org/bpf/bpf-next/c/003fed595c0f

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH net-next v4 0/2] propagate extack to vxlan_fdb_delete
From: Jakub Kicinski @ 2022-04-25 15:48 UTC (permalink / raw)
  To: Alaa Mohamed
  Cc: netdev, outreachy, roopa, jdenham, sbrivio, jesse.brandeburg,
	anthony.l.nguyen, davem, pabeni, vladimir.oltean, claudiu.manoil,
	alexandre.belloni, shshaikh, manishc, razor, intel-wired-lan,
	linux-kernel, UNGLinuxDriver, GR-Linux-NIC-Dev, bridge
In-Reply-To: <cover.1650896000.git.eng.alaamohamedsoliman.am@gmail.com>

On Mon, 25 Apr 2022 16:25:05 +0200 Alaa Mohamed wrote:
> In order to propagate extack to vxlan_fdb_delete and vxlan_fdb_parse,
> add extack to .ndo_fdb_del and edit all fdb del handelers

Your patches do not apply cleanly to net-next/master. Please rebase and
repost. Please wait 24h between postings to allow additional feedback
to come in.

^ permalink raw reply

* Re: [PATCH] net: appletalk: cleanup brace styling
From: Jakub Kicinski @ 2022-04-25 15:44 UTC (permalink / raw)
  To: Ian Cowan; +Cc: David S. Miller, Paolo Abeni, netdev, linux-kernel
In-Reply-To: <20220425023512.502988-1-ian@linux.cowan.aero>

On Sun, 24 Apr 2022 22:35:12 -0400 Ian Cowan wrote:
> This cleans up some brace styling to meet the style guide. There is one
> exception where the compiler wants unnecessary braces to prevent
> multiple if/else ambiguity.
> 
> Signed-off-by: Ian Cowan <ian@linux.cowan.aero>

Please don't send pure style cleanups to networking, especially 
to dinosaur code. Thanks!

^ permalink raw reply

* [PATCH bpf] xsk: fix possible crash when multiple sockets are created
From: Maciej Fijalkowski @ 2022-04-25 15:37 UTC (permalink / raw)
  To: bpf, ast, daniel, andriin; +Cc: netdev, magnus.karlsson, Maciej Fijalkowski

Fix a crash that happens if an Rx only socket is created first, then a
second socket is created that is Tx only and bound to the same umem as
the first socket and also the same netdev and queue_id together with the
XDP_SHARED_UMEM flag. In this specific case, the tx_descs array page
pool was not created by the first socket as it was an Rx only socket.
When the second socket is bound it needs this tx_descs array of this
shared page pool as it has a Tx component, but unfortunately it was
never allocated, leading to a crash. Note that this array is only used
for zero-copy drivers using the batched Tx APIs, currently only ice and
i40e.

[ 5511.150360] BUG: kernel NULL pointer dereference, address: 0000000000000008
[ 5511.158419] #PF: supervisor write access in kernel mode
[ 5511.164472] #PF: error_code(0x0002) - not-present page
[ 5511.170416] PGD 0 P4D 0
[ 5511.173347] Oops: 0002 [#1] PREEMPT SMP PTI
[ 5511.178186] CPU: 0 PID: 0 Comm: swapper/0 Tainted: G            E     5.18.0-rc1+ #97
[ 5511.187245] Hardware name: Intel Corp. GRANTLEY/GRANTLEY, BIOS GRRFCRB1.86B.0276.D07.1605190235 05/19/2016
[ 5511.198418] RIP: 0010:xsk_tx_peek_release_desc_batch+0x198/0x310
[ 5511.205375] Code: c0 83 c6 01 84 c2 74 6d 8d 46 ff 23 07 44 89 e1 48 83 c0 14 48 c1 e1 04 48 c1 e0 04 48 03 47 10 4c 01 c1 48 8b 50 08 48 8b 00 <48> 89 51 08 48 89 01 41 80 bd d7 00 00 00 00 75 82 48 8b 19 49 8b
[ 5511.227091] RSP: 0018:ffffc90000003dd0 EFLAGS: 00010246
[ 5511.233135] RAX: 0000000000000000 RBX: ffff88810c8da600 RCX: 0000000000000000
[ 5511.241384] RDX: 000000000000003c RSI: 0000000000000001 RDI: ffff888115f555c0
[ 5511.249634] RBP: ffffc90000003e08 R08: 0000000000000000 R09: ffff889092296b48
[ 5511.257886] R10: 0000ffffffffffff R11: ffff889092296800 R12: 0000000000000000
[ 5511.266138] R13: ffff88810c8db500 R14: 0000000000000040 R15: 0000000000000100
[ 5511.274387] FS:  0000000000000000(0000) GS:ffff88903f800000(0000) knlGS:0000000000000000
[ 5511.283746] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 5511.290389] CR2: 0000000000000008 CR3: 00000001046e2001 CR4: 00000000003706f0
[ 5511.298640] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 5511.306892] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[ 5511.315142] Call Trace:
[ 5511.317972]  <IRQ>
[ 5511.320301]  ice_xmit_zc+0x68/0x2f0 [ice]
[ 5511.324977]  ? ktime_get+0x38/0xa0
[ 5511.328913]  ice_napi_poll+0x7a/0x6a0 [ice]
[ 5511.333784]  __napi_poll+0x2c/0x160
[ 5511.337821]  net_rx_action+0xdd/0x200
[ 5511.342058]  __do_softirq+0xe6/0x2dd
[ 5511.346198]  irq_exit_rcu+0xb5/0x100
[ 5511.350339]  common_interrupt+0xa4/0xc0
[ 5511.354777]  </IRQ>
[ 5511.357201]  <TASK>
[ 5511.359625]  asm_common_interrupt+0x1e/0x40
[ 5511.364466] RIP: 0010:cpuidle_enter_state+0xd2/0x360
[ 5511.370211] Code: 49 89 c5 0f 1f 44 00 00 31 ff e8 e9 00 7b ff 45 84 ff 74 12 9c 58 f6 c4 02 0f 85 72 02 00 00 31 ff e8 02 0c 80 ff fb 45 85 f6 <0f> 88 11 01 00 00 49 63 c6 4c 2b 2c 24 48 8d 14 40 48 8d 14 90 49
[ 5511.391921] RSP: 0018:ffffffff82a03e60 EFLAGS: 00000202
[ 5511.397962] RAX: ffff88903f800000 RBX: 0000000000000001 RCX: 000000000000001f
[ 5511.406214] RDX: 0000000000000000 RSI: ffffffff823400b9 RDI: ffffffff8234c046
[ 5511.424646] RBP: ffff88810a384800 R08: 000005032a28c046 R09: 0000000000000008
[ 5511.443233] R10: 000000000000000b R11: 0000000000000006 R12: ffffffff82bcf700
[ 5511.461922] R13: 000005032a28c046 R14: 0000000000000001 R15: 0000000000000000
[ 5511.480300]  cpuidle_enter+0x29/0x40
[ 5511.494329]  do_idle+0x1c7/0x250
[ 5511.507610]  cpu_startup_entry+0x19/0x20
[ 5511.521394]  start_kernel+0x649/0x66e
[ 5511.534626]  secondary_startup_64_no_verify+0xc3/0xcb
[ 5511.549230]  </TASK>

Detect such case during bind() and allocate this memory region via newly
introduced xp_alloc_tx_descs(). Also, use kvcalloc instead of kcalloc as
for other buffer pool allocations, so that it matches the kvfree() from
xp_destroy().

Fixes: d1bc532e99be ("i40e: xsk: Move tmp desc array from driver to pool")
Signed-off-by: Maciej Fijalkowski <maciej.fijalkowski@intel.com>
---
 include/net/xsk_buff_pool.h |  1 +
 net/xdp/xsk.c               | 13 +++++++++++++
 net/xdp/xsk_buff_pool.c     | 16 ++++++++++++----
 3 files changed, 26 insertions(+), 4 deletions(-)

diff --git a/include/net/xsk_buff_pool.h b/include/net/xsk_buff_pool.h
index 5554ee75e7da..647722e847b4 100644
--- a/include/net/xsk_buff_pool.h
+++ b/include/net/xsk_buff_pool.h
@@ -97,6 +97,7 @@ int xp_assign_dev(struct xsk_buff_pool *pool, struct net_device *dev,
 		  u16 queue_id, u16 flags);
 int xp_assign_dev_shared(struct xsk_buff_pool *pool, struct xdp_umem *umem,
 			 struct net_device *dev, u16 queue_id);
+int xp_alloc_tx_descs(struct xsk_buff_pool *pool, struct xdp_sock *xs);
 void xp_destroy(struct xsk_buff_pool *pool);
 void xp_get_pool(struct xsk_buff_pool *pool);
 bool xp_put_pool(struct xsk_buff_pool *pool);
diff --git a/net/xdp/xsk.c b/net/xdp/xsk.c
index 040c73345b7c..57afb96c41e8 100644
--- a/net/xdp/xsk.c
+++ b/net/xdp/xsk.c
@@ -967,6 +967,19 @@ static int xsk_bind(struct socket *sock, struct sockaddr *addr, int addr_len)
 
 			xp_get_pool(umem_xs->pool);
 			xs->pool = umem_xs->pool;
+
+			/* If underlying shared umem was created without Tx
+			 * ring, allocate Tx descs array that Tx batching API
+			 * utilizes
+			 */
+			if (xs->tx && !xs->pool->tx_descs) {
+				err = xp_alloc_tx_descs(xs->pool, xs);
+				if (err) {
+					xp_put_pool(xs->pool);
+					sockfd_put(sock);
+					goto out_unlock;
+				}
+			}
 		}
 
 		xdp_get_umem(umem_xs->umem);
diff --git a/net/xdp/xsk_buff_pool.c b/net/xdp/xsk_buff_pool.c
index af040ffa14ff..87bdd71c7bb6 100644
--- a/net/xdp/xsk_buff_pool.c
+++ b/net/xdp/xsk_buff_pool.c
@@ -42,6 +42,16 @@ void xp_destroy(struct xsk_buff_pool *pool)
 	kvfree(pool);
 }
 
+int xp_alloc_tx_descs(struct xsk_buff_pool *pool, struct xdp_sock *xs)
+{
+	pool->tx_descs = kvcalloc(xs->tx->nentries, sizeof(*pool->tx_descs),
+				  GFP_KERNEL);
+	if (!pool->tx_descs)
+		return -ENOMEM;
+
+	return 0;
+}
+
 struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs,
 						struct xdp_umem *umem)
 {
@@ -59,11 +69,9 @@ struct xsk_buff_pool *xp_create_and_assign_umem(struct xdp_sock *xs,
 	if (!pool->heads)
 		goto out;
 
-	if (xs->tx) {
-		pool->tx_descs = kcalloc(xs->tx->nentries, sizeof(*pool->tx_descs), GFP_KERNEL);
-		if (!pool->tx_descs)
+	if (xs->tx)
+		if (xp_alloc_tx_descs(pool, xs))
 			goto out;
-	}
 
 	pool->chunk_mask = ~((u64)umem->chunk_size - 1);
 	pool->addrs_cnt = umem->size;
-- 
2.27.0


^ permalink raw reply related

* Re: [PATCH] net: linkwatch: ignore events for unregistered netdevs
From: Jakub Kicinski @ 2022-04-25 15:36 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jann Horn, Lukas Wunner, Paolo Abeni, Oliver Neukum,
	David S. Miller, Oleksij Rempel, netdev, USB list, Andrew Lunn,
	Jacky Chou, Willy Tarreau, Lino Sanfilippo, Philipp Rosenberger,
	Heiner Kallweit, Greg Kroah-Hartman
In-Reply-To: <CANn89iJPs13ndN3PCs6KDAetMUg7N5RzG9_ixvQCmswmcN28mw@mail.gmail.com>

On Mon, 25 Apr 2022 08:31:23 -0700 Eric Dumazet wrote:
> > Jann described a case where someone does
> >
> >     CPU 0      CPU 1     CPU 2
> >
> >   dev_hold()
> >    ------  #unregister -------
> >              dev_hold()
> >                          dev_put()
> >
> > Our check for refcount == 0 goes over the CPUs one by one,
> > so if it sums up CPUs 0 and 1 at the "unregister" point above
> > and CPU2 after the CPU1 hold and CPU2 release it will "miss"
> > one refcount.
> >
> > That's a problem unless doing a dev_hold() on a netdev we only have
> > a reference on is illegal.  
> 
> What is 'illegal' is trying to keep using the device after #unregister.
> 
> We have barriers to prevent that.
> 
> Somehow a layer does not care about the barriers and pretends the
> device is still good to use.
> 
> It is of course perfectly fine to stack multiple dev_hold() from one
> path (if these do not leak, but this is a different issue)

So we'd need something like

WARN_ON(dev->reg_state != NETREG_REGISTERED && !rtnl_held())

in dev_hold()?

^ permalink raw reply

* Re: [PATCH] net: phy: marvell10g: fix return value on error
From: Russell King (Oracle) @ 2022-04-25 15:34 UTC (permalink / raw)
  To: Baruch Siach; +Cc: Marek Behún, netdev, Baruch Siach
In-Reply-To: <f47cb031aeae873bb008ba35001607304a171a20.1650868058.git.baruch@tkos.co.il>

On Mon, Apr 25, 2022 at 09:27:38AM +0300, Baruch Siach wrote:
> From: Baruch Siach <baruch.siach@siklu.com>
> 
> Return back the error value that we get from phy_read_mmd().
> 
> Fixes: c84786fa8f91 ("net: phy: marvell10g: read copper results from CSSR1")
> Signed-off-by: Baruch Siach <baruch.siach@siklu.com>

Reviewed-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk>

Thanks!

-- 
RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
FTTP is here! 40Mbps down 10Mbps up. Decent connectivity at last!

^ permalink raw reply

* Re: [PATCH] net: linkwatch: ignore events for unregistered netdevs
From: Eric Dumazet @ 2022-04-25 15:31 UTC (permalink / raw)
  To: Jakub Kicinski
  Cc: Jann Horn, Lukas Wunner, Paolo Abeni, Oliver Neukum,
	David S. Miller, Oleksij Rempel, netdev, USB list, Andrew Lunn,
	Jacky Chou, Willy Tarreau, Lino Sanfilippo, Philipp Rosenberger,
	Heiner Kallweit, Greg Kroah-Hartman
In-Reply-To: <20220425082804.209e3676@kernel.org>

On Mon, Apr 25, 2022 at 8:28 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 25 Apr 2022 08:13:40 -0700 Eric Dumazet wrote:
> > dev_hold() has been an increment of a refcount, and dev_put() a decrement.
> >
> > Not sure why it is fundamentally broken.
>
> Jann described a case where someone does
>
>     CPU 0      CPU 1     CPU 2
>
>   dev_hold()
>    ------  #unregister -------
>              dev_hold()
>                          dev_put()
>
> Our check for refcount == 0 goes over the CPUs one by one,
> so if it sums up CPUs 0 and 1 at the "unregister" point above
> and CPU2 after the CPU1 hold and CPU2 release it will "miss"
> one refcount.
>
> That's a problem unless doing a dev_hold() on a netdev we only have
> a reference on is illegal.

What is 'illegal' is trying to keep using the device after #unregister.

We have barriers to prevent that.

Somehow a layer does not care about the barriers and pretends the
device is still good to use.

It is of course perfectly fine to stack multiple dev_hold() from one
path (if these do not leak, but this is a different issue)

>
> > There are specific steps at device dismantles making sure no more
> > users can dev_hold()
> >
> > It is a contract. Any buggy layer can overwrite any piece of memory,
> > including a refcount_t.
> >
> > Traditionally we could not add a test in dev_hold() to prevent an
> > increment if the device is in dismantle phase.
> > Maybe the situation is better nowadays.


>

^ permalink raw reply

* Re: [PATCH] net: linkwatch: ignore events for unregistered netdevs
From: Jakub Kicinski @ 2022-04-25 15:28 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: Jann Horn, Lukas Wunner, Paolo Abeni, Oliver Neukum,
	David S. Miller, Oleksij Rempel, netdev, USB list, Andrew Lunn,
	Jacky Chou, Willy Tarreau, Lino Sanfilippo, Philipp Rosenberger,
	Heiner Kallweit, Greg Kroah-Hartman
In-Reply-To: <CANn89iLwvqUJHBNifLESJyBQ85qjK42sK85Fs=QV4M7HqUXmxQ@mail.gmail.com>

On Mon, 25 Apr 2022 08:13:40 -0700 Eric Dumazet wrote:
> dev_hold() has been an increment of a refcount, and dev_put() a decrement.
> 
> Not sure why it is fundamentally broken.

Jann described a case where someone does

    CPU 0      CPU 1     CPU 2

  dev_hold()
   ------  #unregister -------
             dev_hold()
                         dev_put()

Our check for refcount == 0 goes over the CPUs one by one,
so if it sums up CPUs 0 and 1 at the "unregister" point above
and CPU2 after the CPU1 hold and CPU2 release it will "miss"
one refcount.

That's a problem unless doing a dev_hold() on a netdev we only have 
a reference on is illegal.

> There are specific steps at device dismantles making sure no more
> users can dev_hold()
> 
> It is a contract. Any buggy layer can overwrite any piece of memory,
> including a refcount_t.
> 
> Traditionally we could not add a test in dev_hold() to prevent an
> increment if the device is in dismantle phase.
> Maybe the situation is better nowadays.


^ permalink raw reply

* Re: [PATCH] net: linkwatch: ignore events for unregistered netdevs
From: Eric Dumazet @ 2022-04-25 15:23 UTC (permalink / raw)
  To: Jann Horn
  Cc: Jakub Kicinski, Lukas Wunner, Paolo Abeni, Oliver Neukum,
	David S. Miller, Oleksij Rempel, netdev, USB list, Andrew Lunn,
	Jacky Chou, Willy Tarreau, Lino Sanfilippo, Philipp Rosenberger,
	Heiner Kallweit, Greg Kroah-Hartman
In-Reply-To: <CAG48ez0nw7coDXYozaUOTThWLkHWZuKVUpMosY2hgVSSfeM4Pw@mail.gmail.com>

On Mon, Apr 25, 2022 at 8:19 AM Jann Horn <jannh@google.com> wrote:
>
> On Mon, Apr 25, 2022 at 5:13 PM Eric Dumazet <edumazet@google.com> wrote:
> > On Mon, Apr 25, 2022 at 8:01 AM Jakub Kicinski <kuba@kernel.org> wrote:
> > >
> > > On Mon, 25 Apr 2022 16:49:34 +0200 Jann Horn wrote:
> > > > > Doesn't mean we should make it legal. We can add a warning to catch
> > > > > abuses.
> > > >
> > > > That was the idea with
> > > > https://lore.kernel.org/netdev/20220128014303.2334568-1-jannh@google.com/,
> > > > but I didn't get any replies when I asked what the precise semantics
> > > > of dev_hold() are supposed to be
> > > > (https://lore.kernel.org/netdev/CAG48ez1-OyZETvrYAfaHicYW1LbrQUVp=C0EukSWqZrYMej73w@mail.gmail.com/),
> > > > so I don't know how to proceed...
> > >
> > > Yeah, I think after you pointed out that the netdev per cpu refcounting
> > > is fundamentally broken everybody decided to hit themselves with the
> > > obliviate spell :S
> >
> > dev_hold() has been an increment of a refcount, and dev_put() a decrement.
> >
> > Not sure why it is fundamentally broken.
>
> Well, it's not quite a refcount. It's a count that can be incremented
> and decremented but can't be read while the device is alive, and then
> at some point it turns into a count that can be read and decremented
> but can't be incremented (see
> https://lore.kernel.org/netdev/CAG48ez1-OyZETvrYAfaHicYW1LbrQUVp=C0EukSWqZrYMej73w@mail.gmail.com/).
> Normal refcounts allow anyone who is holding a reference to add
> another reference.

On a live netdev nothing wants to read the 'current refcount'.
We basically do not care.

>
> > There are specific steps at device dismantles making sure no more
> > users can dev_hold()
>
> So you're saying it's intentional that even if you're already holding
> a dev_hold() reference, you may not be allowed to call dev_hold()
> again?

I think you can/should not.
We might add a test in dev_hold() and catch offenders.

Then add a new api, (dev_hold() is void and can not propagate an
error), and eventually
fix offenders.

^ permalink raw reply


This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox