Netdev List

* Re: [PATCH net-next v4 1/2] net: pse-pd: add devm_pse_poll_helper()
From: Kory Maincent @ 2026-04-14 14:11 UTC
  To: Jakub Kicinski
  Cc: github, o.rempel, andrew+netdev, davem, edumazet, pabeni, krzk,
	krzk+dt, conor+dt, robh, netdev, linux-kernel, linux-leds, lkp
In-Reply-To: <20260414160506.4ce4d543@kmaincent-XPS-13-7390>

On Tue, 14 Apr 2026 16:05:06 +0200
Kory Maincent <kory.maincent@bootlin.com> wrote:

> On Mon, 13 Apr 2026 15:50:21 -0700
> Jakub Kicinski <kuba@kernel.org> wrote:
> 
> > This is an AI-generated review of your patch. The human sending this
> > email has considered the AI review valid, or at least plausible.
> > ---
> > net: pse-pd: add devm_pse_poll_helper()
> > 
> > Extract the common event handling loop from pse_isr() into a shared
> > pse_handle_events() function, and add a generic poll-based alternative
> > to the IRQ path for PSE controllers that lack interrupt support.
> >   
> > > Work cancellation on teardown is handled via devm_add_action_or_reset()
> > > to ensure the delayed work is cancelled before poll_notifs is freed
> > > by devres, avoiding a use-after-free when devm_pse_poll_helper() is
> > > called after devm_pse_controller_register() (devres LIFO ordering).    
> > 
> > Is it safe to rely on this devm registration order to prevent a
> > use-after-free?
> > 
> > Standard Linux driver initialization typically sets up interrupts and
> > pollers before device registration to prevent race conditions where an
> > unready device is exposed to userspace.
> > 
> > If a driver follows this standard ordering by calling the helper before
> > registration, devres will unregister the controller first. This will
> > tear down resources in pse_controller_unregister() while the poll work
> > remains active.
> > 
> > If the worker executes during this race window, could it access the
> > already-freed pcdev->pi array or push to the destroyed ntf_fifo? Would
> > it be safer to explicitly cancel asynchronous event sources within
> > pse_controller_unregister() instead?
> >   
> > > --- a/drivers/net/pse-pd/pse_core.c
> > > +++ b/drivers/net/pse-pd/pse_core.c    
> > [ ... ]  
> > > +static void pse_handle_events(struct pse_controller_dev *pcdev,
> > > +			      unsigned long *notifs,
> > > +			      unsigned long notifs_mask)
> > >  {
> > > -	struct pse_controller_dev *pcdev;
> > > -	unsigned long notifs_mask = 0;
> > > -	struct pse_irq_desc *desc;
> > > -	struct pse_irq *h = data;
> > > -	int ret, i;
> > > -
> > > -	desc = &h->desc;
> > > -	pcdev = h->pcdev;
> > > -
> > > -	/* Clear notifs mask */
> > > -	memset(h->notifs, 0, pcdev->nr_lines * sizeof(*h->notifs));
> > > -	mutex_lock(&pcdev->lock);
> > > -	ret = desc->map_event(irq, pcdev, h->notifs, &notifs_mask);
> > > -	if (ret || !notifs_mask) {
> > > -		mutex_unlock(&pcdev->lock);
> > > -		return IRQ_NONE;
> > > -	}
> > > +	int i;
> > >  
> > >  	for_each_set_bit(i, &notifs_mask, pcdev->nr_lines) {    
> > 
> > Can this lead to an out-of-bounds read on the stack if a controller
> > supports more than BITS_PER_LONG ports?
> > 
> > The notifs_mask variable is passed by value to pse_handle_events(), but
> > for_each_set_bit() takes its address. If pcdev->nr_lines is greater than
> > BITS_PER_LONG (e.g., a 48-port switch on a 32-bit architecture), the
> > macro will read past the function argument on the stack into uninitialized
> > memory.  
> 
> It seems there is a possible out-of-bounds issue in my code :/ Oops.
> Carlo, could you take a look and propose a fix? Otherwise, I'll handle it.

But currently it can't be reached, as the only driver that supports interrupts
is the TPS23881 with 8 ports.
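
To make the concern from the quoted review concrete, here is a minimal
userspace sketch, not the kernel code and not necessarily the fix that will
be applied. It assumes the mask is passed by pointer and sized for nr_lines
bits, so that scanning nr_lines bits stays inside the caller's storage. All
sketch_* names are hypothetical stand-ins for the kernel's BITS_PER_LONG,
BITS_TO_LONGS() and bitmap helpers.

```c
#include <limits.h>
#include <stddef.h>

/* Userspace stand-ins for the kernel's BITS_PER_LONG / BITS_TO_LONGS(). */
#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)
#define SKETCH_BITS_TO_LONGS(n) \
	(((n) + SKETCH_BITS_PER_LONG - 1) / SKETCH_BITS_PER_LONG)

/* Set bit nr in a bitmap that spans one or more longs. */
static void sketch_set_bit(unsigned long *map, size_t nr)
{
	map[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Test bit nr; the caller guarantees nr is below the bitmap size. */
static int sketch_test_bit(const unsigned long *map, size_t nr)
{
	return (map[nr / SKETCH_BITS_PER_LONG] >>
		(nr % SKETCH_BITS_PER_LONG)) & 1UL;
}

/*
 * Count pending lines. The mask is passed by pointer and must hold
 * SKETCH_BITS_TO_LONGS(nr_lines) words, so scanning nr_lines bits
 * never reads past the caller's storage.
 */
static size_t sketch_count_events(const unsigned long *mask, size_t nr_lines)
{
	size_t i, count = 0;

	for (i = 0; i < nr_lines; i++)
		if (sketch_test_bit(mask, i))
			count++;
	return count;
}
```

With a single unsigned long passed by value, the same loop would only be
safe while nr_lines <= BITS_PER_LONG, which matches the 8-port case above.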

Regards,
-- 
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

* Re: linux-next: manual merge of the bpf-next tree with the origin tree
From: Alexei Starovoitov @ 2026-04-14 14:09 UTC
  To: Mark Brown
  Cc: Daniel Borkmann, Alexei Starovoitov, Andrii Nakryiko, bpf,
	Networking, Joel Fernandes, Kumar Kartikeya Dwivedi,
	Linux Kernel Mailing List, Linux Next Mailing List,
	Paul E. McKenney
In-Reply-To: <ad4whCJuB-viVAae@sirena.org.uk>

On Tue, Apr 14, 2026 at 5:18 AM Mark Brown <broonie@kernel.org> wrote:
>
> Hi all,
>
> Today's linux-next merge of the bpf-next tree got a conflict in:
>
>   include/linux/rcupdate.h
>
> between commit:
>
>   ad6ef775cbeff ("rcu-tasks: Document that RCU Tasks Trace grace periods now imply RCU grace periods")
>
> from the origin tree and commit:
>
>   57b23c0f612dc ("bpf: Retire rcu_trace_implies_rcu_gp()")
>
> from the bpf-next tree.
>
> I fixed it up (see below) and can carry the fix as necessary. This
> is now fixed as far as linux-next is concerned, but any non trivial
> conflicts should be mentioned to your upstream maintainer when your tree
> is submitted for merging.  You may also want to consider cooperating
> with the maintainer of the conflicting tree to minimise any particularly
> complex conflicts.
>
> diff --combined include/linux/rcupdate.h
> index 18a85c30fd4f3,bfa765132de85..0000000000000
> --- a/include/linux/rcupdate.h
> +++ b/include/linux/rcupdate.h
> @@@ -205,15 -205,6 +205,6 @@@ static inline void exit_tasks_rcu_start
>   static inline void exit_tasks_rcu_finish(void) { }
>   #endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
>
> - /**
> -  * rcu_trace_implies_rcu_gp - does an RCU Tasks Trace grace period imply an RCU grace period?
> -  *
> -  * Now that RCU Tasks Trace is implemented in terms of SRCU-fast, a
> -  * call to synchronize_rcu_tasks_trace() is guaranteed to imply at least
> -  * one call to synchronize_rcu().
> -  */
> - static inline bool rcu_trace_implies_rcu_gp(void) { return true; }
> -

Right. I mentioned it in my bpf-next PR.

But how come you're saying it was discovered "today"?

Paul's commit ad6ef775cbeff was committed to the rcu tree on Mar 30,
while Kumar's 57b23c0f612dc was committed to bpf-next on Apr 7.

"today" is April 14.

My only explanation is that the rcu tree was not in linux-next until today?!

* [PATCH net v2] net: airoha: Wait for NPU PPE configuration to complete in airoha_ppe_offload_setup()
From: Lorenzo Bianconi @ 2026-04-14 14:08 UTC
  To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
	Paolo Abeni, Lorenzo Bianconi
  Cc: linux-arm-kernel, linux-mediatek, netdev

In order to properly enable flowtable hw offloading, poll the
REG_PPE_PPE_FLOW_CFG register in the airoha_ppe_offload_setup() routine and
wait for the NPU PPE configuration triggered by the ppe_init callback to
complete before running airoha_ppe_hw_init().

Fixes: 00a7678310fe3 ("net: airoha: Introduce flowtable offload support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
Changes in v2:
- Check for both REG_PPE_PPE_FLOW_CFG(0) and REG_PPE_PPE_FLOW_CFG(1) to
  complete.
- Link to v1: https://lore.kernel.org/r/20260412-airoha-wait-for-npu-config-offload-setup-v1-1-f4e0aa2a5d85@kernel.org
---
 drivers/net/ethernet/airoha/airoha_ppe.c | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/drivers/net/ethernet/airoha/airoha_ppe.c b/drivers/net/ethernet/airoha/airoha_ppe.c
index 62cfffb4f0e5..684c8ae9576f 100644
--- a/drivers/net/ethernet/airoha/airoha_ppe.c
+++ b/drivers/net/ethernet/airoha/airoha_ppe.c
@@ -1335,6 +1335,29 @@ static struct airoha_npu *airoha_ppe_npu_get(struct airoha_eth *eth)
 	return npu;
 }
 
+static int airoha_ppe_wait_for_npu_init(struct airoha_eth *eth)
+{
+	int err;
+	u32 val;
+
+	/* PPE_FLOW_CFG default register value is 0. Since we reset FE
+	 * during the device probe we can just check the configured value
+	 * is not 0 here.
+	 */
+	err = read_poll_timeout(airoha_fe_rr, val, val, USEC_PER_MSEC,
+				100 * USEC_PER_MSEC, false, eth,
+				REG_PPE_PPE_FLOW_CFG(0));
+	if (err)
+		return err;
+
+	if (airoha_ppe_is_enabled(eth, 1))
+		err = read_poll_timeout(airoha_fe_rr, val, val, USEC_PER_MSEC,
+					100 * USEC_PER_MSEC, false, eth,
+					REG_PPE_PPE_FLOW_CFG(1));
+
+	return err;
+}
+
 static int airoha_ppe_offload_setup(struct airoha_eth *eth)
 {
 	struct airoha_npu *npu = airoha_ppe_npu_get(eth);
@@ -1348,6 +1371,11 @@ static int airoha_ppe_offload_setup(struct airoha_eth *eth)
 	if (err)
 		goto error_npu_put;
 
+	/* Wait for NPU PPE configuration to complete */
+	err = airoha_ppe_wait_for_npu_init(eth);
+	if (err)
+		goto error_npu_put;
+
 	ppe_num_stats_entries = airoha_ppe_get_total_num_stats_entries(ppe);
 	if (ppe_num_stats_entries > 0) {
 		err = npu->ops.ppe_init_stats(npu, ppe->foe_stats_dma,

---
base-commit: b9d8b856689d2b968495d79fe653d87fcb8ad98c
change-id: 20260412-airoha-wait-for-npu-config-offload-setup-19d04522412d

Best regards,
-- 
Lorenzo Bianconi <lorenzo@kernel.org>
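
The polling pattern used in the patch above can be sketched in plain
userspace C as follows. This is only an illustration under stated
assumptions: the sketch_* names and the fake register are hypothetical, and
a bounded number of attempts stands in for the sleep and timeout handling
that read_poll_timeout() provides in the kernel.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical register reader: returns the current register value. */
typedef uint32_t (*sketch_read_fn)(void *ctx);

/*
 * Poll rd() until it returns non-zero or max_polls attempts are used up.
 * Returns 0 and stores the value on success, -ETIMEDOUT otherwise.
 * A real driver would sleep between reads and bound the wait in time;
 * attempts stand in for time here.
 */
static int sketch_poll_nonzero(sketch_read_fn rd, void *ctx,
			       unsigned int max_polls, uint32_t *val)
{
	unsigned int i;

	for (i = 0; i < max_polls; i++) {
		*val = rd(ctx);
		if (*val)
			return 0;
	}
	return -ETIMEDOUT;
}

/* Fake register that reads 0 until it has been read ready_after times. */
struct sketch_fake_reg {
	unsigned int reads;
	unsigned int ready_after;
	uint32_t value;
};

static uint32_t sketch_fake_read(void *ctx)
{
	struct sketch_fake_reg *reg = ctx;

	if (reg->reads++ < reg->ready_after)
		return 0;
	return reg->value;
}
```

Checking for "non-zero" only works because, as the patch comment says, the
register's default value is 0 after the FE reset during probe.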


* RE: [EXTERNAL] [PATCH net] netvsc: transfer lower device max tso size during VF transition
From: Haiyang Zhang @ 2026-04-14 14:08 UTC
  To: Li Tian, netdev@vger.kernel.org, linux-hyperv@vger.kernel.org
  Cc: linux-kernel@vger.kernel.org, Wei Liu, Dexuan Cui, Long Li,
	Andrew Lunn, Eric Dumazet, Vitaly Kuznetsov, Paolo Abeni,
	Jakub Kicinski, Jason Wang
In-Reply-To: <20260325045006.18607-1-litian@redhat.com>

> -----Original Message-----
> From: Li Tian <litian@redhat.com>
> Sent: Wednesday, March 25, 2026 12:50 AM
> To: netdev@vger.kernel.org; linux-hyperv@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org; Haiyang Zhang <haiyangz@microsoft.com>;
> Wei Liu <wei.liu@kernel.org>; Dexuan Cui <DECUI@microsoft.com>; Long Li
> <longli@microsoft.com>; Andrew Lunn <andrew+netdev@lunn.ch>; Eric Dumazet
> <edumazet@google.com>; Vitaly Kuznetsov <vkuznets@redhat.com>; Paolo Abeni
> <pabeni@redhat.com>; Jakub Kicinski <kuba@kernel.org>; Jason Wang
> <jasowang@redhat.com>; Li Tian <litian@redhat.com>
> Subject: [EXTERNAL] [PATCH net] netvsc: transfer lower device max tso size
> during VF transition
> 
> When netvsc is accelerated by the lower device, we can advertise the
> lower device max tso size in order to get better performance.
> While a long-term migration to user-space bonding is planned, current
> users on RHEL 10 / Azure are experiencing significant performance
> regressions in 802.3ad environments. This patch provides a localized,
> safe fix within netvsc without introducing new core networking helpers.
> 
> Signed-off-by: Li Tian <litian@redhat.com>
> ---
>  drivers/net/hyperv/netvsc_drv.c | 8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/hyperv/netvsc_drv.c
> b/drivers/net/hyperv/netvsc_drv.c
> index ee5ab5ceb2be..971607c7406f 100644
> --- a/drivers/net/hyperv/netvsc_drv.c
> +++ b/drivers/net/hyperv/netvsc_drv.c
> @@ -2428,10 +2428,14 @@ static int netvsc_vf_changed(struct net_device
> *vf_netdev, unsigned long event)
>  		 * This value is only increased for netvsc NIC when datapath
> is
>  		 * switched over to the VF
>  		 */
> -		if (vf_is_up)
> +		if (vf_is_up) {
>  			netif_set_tso_max_size(ndev, vf_netdev->tso_max_size);
> -		else
> +			WRITE_ONCE(ndev->gso_max_size, READ_ONCE(vf_netdev-
> >gso_max_size));
> +			WRITE_ONCE(ndev->gso_ipv4_max_size,
> +				   READ_ONCE(vf_netdev->gso_ipv4_max_size));
> +		} else {
>  			netif_set_tso_max_size(ndev, netvsc_dev-
> >netvsc_gso_max_size);
> +		}
>  	}
> 
>  	return NOTIFY_OK;

Thanks.

Reviewed-by: Haiyang Zhang <haiyangz@microsoft.com>


* Re: [PATCH net-next v4 1/2] net: pse-pd: add devm_pse_poll_helper()
From: Kory Maincent @ 2026-04-14 14:05 UTC
  To: Jakub Kicinski
  Cc: github, o.rempel, andrew+netdev, davem, edumazet, pabeni, krzk,
	krzk+dt, conor+dt, robh, netdev, linux-kernel, linux-leds, lkp
In-Reply-To: <20260413225021.3634733-1-kuba@kernel.org>

On Mon, 13 Apr 2026 15:50:21 -0700
Jakub Kicinski <kuba@kernel.org> wrote:

> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> ---
> net: pse-pd: add devm_pse_poll_helper()
> 
> Extract the common event handling loop from pse_isr() into a shared
> pse_handle_events() function, and add a generic poll-based alternative
> to the IRQ path for PSE controllers that lack interrupt support.
> 
> > Work cancellation on teardown is handled via devm_add_action_or_reset()
> > to ensure the delayed work is cancelled before poll_notifs is freed
> > by devres, avoiding a use-after-free when devm_pse_poll_helper() is
> > called after devm_pse_controller_register() (devres LIFO ordering).  
> 
> Is it safe to rely on this devm registration order to prevent a
> use-after-free?
> 
> Standard Linux driver initialization typically sets up interrupts and
> pollers before device registration to prevent race conditions where an
> unready device is exposed to userspace.
> 
> If a driver follows this standard ordering by calling the helper before
> registration, devres will unregister the controller first. This will
> tear down resources in pse_controller_unregister() while the poll work
> remains active.
> 
> If the worker executes during this race window, could it access the
> already-freed pcdev->pi array or push to the destroyed ntf_fifo? Would
> it be safer to explicitly cancel asynchronous event sources within
> pse_controller_unregister() instead?
> 
> > --- a/drivers/net/pse-pd/pse_core.c
> > +++ b/drivers/net/pse-pd/pse_core.c  
> [ ... ]
> > +static void pse_handle_events(struct pse_controller_dev *pcdev,
> > +			      unsigned long *notifs,
> > +			      unsigned long notifs_mask)
> >  {
> > -	struct pse_controller_dev *pcdev;
> > -	unsigned long notifs_mask = 0;
> > -	struct pse_irq_desc *desc;
> > -	struct pse_irq *h = data;
> > -	int ret, i;
> > -
> > -	desc = &h->desc;
> > -	pcdev = h->pcdev;
> > -
> > -	/* Clear notifs mask */
> > -	memset(h->notifs, 0, pcdev->nr_lines * sizeof(*h->notifs));
> > -	mutex_lock(&pcdev->lock);
> > -	ret = desc->map_event(irq, pcdev, h->notifs, &notifs_mask);
> > -	if (ret || !notifs_mask) {
> > -		mutex_unlock(&pcdev->lock);
> > -		return IRQ_NONE;
> > -	}
> > +	int i;
> >  
> >  	for_each_set_bit(i, &notifs_mask, pcdev->nr_lines) {  
> 
> Can this lead to an out-of-bounds read on the stack if a controller
> supports more than BITS_PER_LONG ports?
> 
> The notifs_mask variable is passed by value to pse_handle_events(), but
> for_each_set_bit() takes its address. If pcdev->nr_lines is greater than
> BITS_PER_LONG (e.g., a 48-port switch on a 32-bit architecture), the
> macro will read past the function argument on the stack into uninitialized
> memory.

It seems there is a possible out-of-bounds issue in my code :/ Oops.
Carlo, could you take a look and propose a fix? Otherwise, I'll handle it.
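
For the devres ordering question raised in the quoted review, here is a
small userspace model of LIFO teardown. It is a sketch under assumptions:
the sketch_* names are hypothetical and only mimic the "last registered,
first released" behaviour of devres. It is not the pse_core code.

```c
#include <stddef.h>

#define SKETCH_MAX_ACTIONS 8

typedef void (*sketch_action_fn)(void *ctx);

/* Tiny stand-in for a per-device devres list. */
struct sketch_devres {
	sketch_action_fn fn[SKETCH_MAX_ACTIONS];
	void *ctx[SKETCH_MAX_ACTIONS];
	size_t count;
};

/* Register a teardown action; returns 0 on success, -1 if full. */
static int sketch_add_action(struct sketch_devres *dr,
			     sketch_action_fn fn, void *ctx)
{
	if (dr->count >= SKETCH_MAX_ACTIONS)
		return -1;
	dr->fn[dr->count] = fn;
	dr->ctx[dr->count] = ctx;
	dr->count++;
	return 0;
}

/* Run actions in reverse registration order (LIFO). */
static void sketch_release_all(struct sketch_devres *dr)
{
	while (dr->count > 0) {
		dr->count--;
		dr->fn[dr->count](dr->ctx[dr->count]);
	}
}

struct sketch_state {
	int poll_active;		/* 1 while the poller may still run */
	int freed_while_polling;	/* set if teardown races the poller */
};

static void sketch_cancel_poll(void *ctx)
{
	struct sketch_state *st = ctx;

	st->poll_active = 0;
}

static void sketch_unregister_controller(void *ctx)
{
	struct sketch_state *st = ctx;

	if (st->poll_active)
		st->freed_while_polling = 1;
}
```

Registering the controller first and the poll cancellation second releases
them in the safe order; swapping the two registrations makes the model flag
the race window the review describes.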

Regards
-- 
Köry Maincent, Bootlin
Embedded Linux and kernel engineering
https://bootlin.com

* Re: [PATCH v2] Bluetooth: Add Broadcom channel priority commands
From: Luiz Augusto von Dentz @ 2026-04-14 14:00 UTC
  To: fnkl.kernel
  Cc: Sven Peter, Janne Grunau, Neal Gompa, Marcel Holtmann,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, linux-kernel, asahi, linux-arm-kernel,
	linux-bluetooth, netdev
In-Reply-To: <20260407-brcm-prio-v2-1-3f745edf49af@gmail.com>

Hi Sasha,

On Tue, Apr 7, 2026 at 1:46 PM Sasha Finkelstein via B4 Relay
<devnull+fnkl.kernel.gmail.com@kernel.org> wrote:
>
> From: Sasha Finkelstein <fnkl.kernel@gmail.com>
>
> Certain Broadcom bluetooth chips (bcm4377/bcm4378/bcm438) need ACL
> streams carrying audio to be set as "high priority" using a vendor
> specific command to prevent 10-ish second-long dropouts whenever
> something does a device scan. This patch sends the command when the
> socket priority is set to TC_PRIO_INTERACTIVE, as BlueZ does for audio.
>
> Signed-off-by: Sasha Finkelstein <fnkl.kernel@gmail.com>
> ---
> Changes in v2:
> - new ioctl got nack-ed, so let's use sk_priority as the trigger
> - Link to v1: https://lore.kernel.org/r/20260407-brcm-prio-v1-1-f38b17376640@gmail.com
> ---
>  MAINTAINERS                       |  2 ++
>  drivers/bluetooth/hci_bcm4377.c   |  2 ++
>  include/net/bluetooth/bluetooth.h |  4 ++++
>  include/net/bluetooth/hci_core.h  | 11 +++++++++++
>  net/bluetooth/Kconfig             |  7 +++++++
>  net/bluetooth/Makefile            |  1 +
>  net/bluetooth/brcm.c              | 29 +++++++++++++++++++++++++++++
>  net/bluetooth/brcm.h              | 17 +++++++++++++++++
>  net/bluetooth/hci_conn.c          | 28 ++++++++++++++++++++++++++++
>  net/bluetooth/l2cap_sock.c        | 13 +++++++++++++
>  10 files changed, 114 insertions(+)
>
> diff --git a/MAINTAINERS b/MAINTAINERS
> index c3fe46d7c4bc..81be021367ec 100644
> --- a/MAINTAINERS
> +++ b/MAINTAINERS
> @@ -2562,6 +2562,8 @@ F:        include/dt-bindings/pinctrl/apple.h
>  F:     include/linux/mfd/macsmc.h
>  F:     include/linux/soc/apple/*
>  F:     include/uapi/drm/asahi_drm.h
> +F:     net/bluetooth/brcm.c
> +F:     net/bluetooth/brcm.h
>
>  ARM/ARTPEC MACHINE SUPPORT
>  M:     Jesper Nilsson <jesper.nilsson@axis.com>
> diff --git a/drivers/bluetooth/hci_bcm4377.c b/drivers/bluetooth/hci_bcm4377.c
> index 925d0a635945..5f79920c0306 100644
> --- a/drivers/bluetooth/hci_bcm4377.c
> +++ b/drivers/bluetooth/hci_bcm4377.c
> @@ -2397,6 +2397,8 @@ static int bcm4377_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>         if (bcm4377->hw->broken_le_ext_adv_report_phy)
>                 hci_set_quirk(hdev, HCI_QUIRK_FIXUP_LE_EXT_ADV_REPORT_PHY);
>
> +       hci_set_brcm_capable(hdev);
> +
>         pci_set_drvdata(pdev, bcm4377);
>         hci_set_drvdata(hdev, bcm4377);
>         SET_HCIDEV_DEV(hdev, &pdev->dev);
> diff --git a/include/net/bluetooth/bluetooth.h b/include/net/bluetooth/bluetooth.h
> index 69eed69f7f26..07a250673950 100644
> --- a/include/net/bluetooth/bluetooth.h
> +++ b/include/net/bluetooth/bluetooth.h
> @@ -457,6 +457,7 @@ struct l2cap_ctrl {
>  };
>
>  struct hci_dev;
> +struct hci_conn;
>
>  typedef void (*hci_req_complete_t)(struct hci_dev *hdev, u8 status, u16 opcode);
>  typedef void (*hci_req_complete_skb_t)(struct hci_dev *hdev, u8 status,
> @@ -469,6 +470,9 @@ void hci_req_cmd_complete(struct hci_dev *hdev, u16 opcode, u8 status,
>  int hci_ethtool_ts_info(unsigned int index, int sk_proto,
>                         struct kernel_ethtool_ts_info *ts_info);
>
> +int hci_conn_setsockopt(struct hci_conn *conn, struct sock *sk, int level,
> +                       int optname, sockptr_t optval, unsigned int optlen);
> +
>  #define HCI_REQ_START  BIT(0)
>  #define HCI_REQ_SKB    BIT(1)
>
> diff --git a/include/net/bluetooth/hci_core.h b/include/net/bluetooth/hci_core.h
> index a7bffb908c1e..947e7c2b08dd 100644
> --- a/include/net/bluetooth/hci_core.h
> +++ b/include/net/bluetooth/hci_core.h
> @@ -642,6 +642,10 @@ struct hci_dev {
>         bool                    aosp_quality_report;
>  #endif
>
> +#if IS_ENABLED(CONFIG_BT_BRCMEXT)
> +       bool                    brcm_capable;
> +#endif
> +
>         int (*open)(struct hci_dev *hdev);
>         int (*close)(struct hci_dev *hdev);
>         int (*flush)(struct hci_dev *hdev);
> @@ -1791,6 +1795,13 @@ static inline void hci_set_aosp_capable(struct hci_dev *hdev)
>  #endif
>  }
>
> +static inline void hci_set_brcm_capable(struct hci_dev *hdev)
> +{
> +#if IS_ENABLED(CONFIG_BT_BRCMEXT)
> +       hdev->brcm_capable = true;
> +#endif
> +}
> +
>  static inline void hci_devcd_setup(struct hci_dev *hdev)
>  {
>  #ifdef CONFIG_DEV_COREDUMP
> diff --git a/net/bluetooth/Kconfig b/net/bluetooth/Kconfig
> index 6b2b65a66700..0f2a5fbcafc5 100644
> --- a/net/bluetooth/Kconfig
> +++ b/net/bluetooth/Kconfig
> @@ -110,6 +110,13 @@ config BT_AOSPEXT
>           This options enables support for the Android Open Source
>           Project defined HCI vendor extensions.
>
> +config BT_BRCMEXT
> +       bool "Enable Broadcom extensions"
> +       depends on BT
> +       help
> +         This option enables support for the Broadcom defined HCI
> +         vendor extensions.
> +
>  config BT_DEBUGFS
>         bool "Export Bluetooth internals in debugfs"
>         depends on BT && DEBUG_FS
> diff --git a/net/bluetooth/Makefile b/net/bluetooth/Makefile
> index a7eede7616d8..b4c9013a46ce 100644
> --- a/net/bluetooth/Makefile
> +++ b/net/bluetooth/Makefile
> @@ -24,5 +24,6 @@ bluetooth-$(CONFIG_BT_LE) += iso.o
>  bluetooth-$(CONFIG_BT_LEDS) += leds.o
>  bluetooth-$(CONFIG_BT_MSFTEXT) += msft.o
>  bluetooth-$(CONFIG_BT_AOSPEXT) += aosp.o
> +bluetooth-$(CONFIG_BT_BRCMEXT) += brcm.o
>  bluetooth-$(CONFIG_BT_DEBUGFS) += hci_debugfs.o
>  bluetooth-$(CONFIG_BT_SELFTEST) += selftest.o
> diff --git a/net/bluetooth/brcm.c b/net/bluetooth/brcm.c
> new file mode 100644
> index 000000000000..9aa0a265ab3d
> --- /dev/null
> +++ b/net/bluetooth/brcm.c
> @@ -0,0 +1,29 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Copyright (C) 2026 The Asahi Linux Contributors
> + */
> +
> +#include <net/bluetooth/bluetooth.h>
> +#include <net/bluetooth/hci_core.h>
> +
> +#include "brcm.h"
> +
> +int brcm_set_high_priority(struct hci_dev *hdev, u16 handle, bool enable)
> +{
> +       struct sk_buff *skb;
> +       u8 cmd[3];
> +
> +       if (!hdev->brcm_capable)
> +               return 0;
> +
> +       cmd[0] = handle;
> +       cmd[1] = handle >> 8;

Adding a packed struct and then using something like cpu_to_le16 is
probably preferable to the above.
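
Something along these lines, shown here as a userspace sketch only. The
struct name, field names and sketch_cpu_to_le16() are hypothetical; in the
kernel the field would be __le16 and the conversion cpu_to_le16(). The
layout simply mirrors the three bytes built by hand in the patch.

```c
#include <stdint.h>
#include <string.h>

/*
 * Hypothetical wire layout for the vendor command parameters:
 * a little-endian connection handle followed by an enable flag.
 */
struct sketch_brcm_prio_cmd {
	uint16_t handle_le;	/* stored little-endian on the wire */
	uint8_t enable;
} __attribute__((packed));

/* Userspace stand-in for cpu_to_le16(): host order to little-endian. */
static uint16_t sketch_cpu_to_le16(uint16_t v)
{
	uint8_t b[2] = { (uint8_t)(v & 0xff), (uint8_t)(v >> 8) };
	uint16_t out;

	memcpy(&out, b, sizeof(out));
	return out;
}

/* Fill the command parameters for a given handle and enable state. */
static void sketch_fill_cmd(struct sketch_brcm_prio_cmd *cmd,
			    uint16_t handle, int enable)
{
	cmd->handle_le = sketch_cpu_to_le16(handle);
	cmd->enable = !!enable;
}
```
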

> +       cmd[2] = !!enable;
> +
> +       skb = hci_cmd_sync(hdev, 0xfc57, sizeof(cmd), cmd, HCI_CMD_TIMEOUT);
> +       if (IS_ERR(skb))
> +               return PTR_ERR(skb);
> +
> +       kfree_skb(skb);
> +       return 0;
> +}
> diff --git a/net/bluetooth/brcm.h b/net/bluetooth/brcm.h
> new file mode 100644
> index 000000000000..fdaee63bd1d2
> --- /dev/null
> +++ b/net/bluetooth/brcm.h
> @@ -0,0 +1,17 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2026 The Asahi Linux Contributors
> + */
> +
> +#if IS_ENABLED(CONFIG_BT_BRCMEXT)
> +
> +int brcm_set_high_priority(struct hci_dev *hdev, u16 handle, bool enable);
> +
> +#else
> +
> +static inline int brcm_set_high_priority(struct hci_dev *hdev, u16 handle, bool enable)
> +{
> +       return 0;
> +}
> +
> +#endif
> diff --git a/net/bluetooth/hci_conn.c b/net/bluetooth/hci_conn.c
> index 11d3ad8d2551..096163840f62 100644
> --- a/net/bluetooth/hci_conn.c
> +++ b/net/bluetooth/hci_conn.c
> @@ -35,6 +35,7 @@
>  #include <net/bluetooth/iso.h>
>  #include <net/bluetooth/mgmt.h>
>
> +#include "brcm.h"
>  #include "smp.h"
>  #include "eir.h"
>
> @@ -3070,6 +3071,33 @@ int hci_conn_set_phy(struct hci_conn *conn, u32 phys)
>         }
>  }
>
> +int hci_conn_setsockopt(struct hci_conn *conn, struct sock *sk, int level,
> +                       int optname, sockptr_t optval, unsigned int optlen)
> +{
> +       int val;
> +       bool old_high, new_high, changed;
> +
> +       if (level != SOL_SOCKET)
> +               return 0;
> +
> +       if (optname != SO_PRIORITY)
> +               return 0;
> +
> +       if (optlen < sizeof(int))
> +               return -EINVAL;
> +
> +       if (copy_from_sockptr(&val, optval, sizeof(val)))
> +               return -EFAULT;
> +
> +       old_high = sk->sk_priority >= TC_PRIO_INTERACTIVE;
> +       new_high = val >= TC_PRIO_INTERACTIVE;
> +       changed = old_high != new_high;
> +       if (!changed)
> +               return 0;
> +
> +       return brcm_set_high_priority(conn->hdev, conn->handle, new_high);

The skb carries the priority (skb->priority), so I'm not sure why you need
to capture sk_priority instead. Doing so ignores the load balancing that
hci_core performs to avoid starving connections.

> +}
> +
>  static int abort_conn_sync(struct hci_dev *hdev, void *data)
>  {
>         struct hci_conn *conn = data;
> diff --git a/net/bluetooth/l2cap_sock.c b/net/bluetooth/l2cap_sock.c
> index 71e8c1b45bce..d5eef87accc4 100644
> --- a/net/bluetooth/l2cap_sock.c
> +++ b/net/bluetooth/l2cap_sock.c
> @@ -891,6 +891,16 @@ static int l2cap_sock_setsockopt(struct socket *sock, int level, int optname,
>
>         BT_DBG("sk %p", sk);
>
> +       if (level == SOL_SOCKET) {
> +               conn = chan->conn;
> +               if (conn)
> +                       err = hci_conn_setsockopt(conn->hcon, sock->sk, level,
> +                                                 optname, optval, optlen);
> +               if (err)
> +                       return err;
> +               return sock_setsockopt(sock, level, optname, optval, optlen);
> +       }
> +
>         if (level == SOL_L2CAP)
>                 return l2cap_sock_setsockopt_old(sock, optname, optval, optlen);
>
> @@ -1931,6 +1941,9 @@ static struct sock *l2cap_sock_alloc(struct net *net, struct socket *sock,
>
>         INIT_LIST_HEAD(&l2cap_pi(sk)->rx_busy);
>
> +       if (sock)
> +               set_bit(SOCK_CUSTOM_SOCKOPT, &sock->flags);

This is more complicated than it needs to be. I'd just add a new
callback, `hdev->set_priority(handle, skb->priority)`, so the driver
is called whenever it needs to elevate a connection's priority. That
said, there could be cases where a connection needs its priority set
momentarily to transmit A2DP, followed by OBEX packets that are best
effort. Therefore, `hci_conn` will probably need to track the priority
so it can detect when it needs changing on a per-skb basis.
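
A rough userspace sketch of that tracking, under stated assumptions: the
sketch_* types and the callback signature are hypothetical and only model
the idea of remembering the last priority per connection and calling into
the driver when it changes. It is not the hci_core API.

```c
#include <stdint.h>

struct sketch_conn;

/* Hypothetical driver hook, modelled on the suggested callback. */
typedef int (*sketch_set_priority_fn)(struct sketch_conn *conn,
				      uint32_t priority);

struct sketch_conn {
	uint16_t handle;
	uint32_t cur_priority;		/* last priority pushed to the driver */
	sketch_set_priority_fn set_priority;
	unsigned int driver_calls;	/* bookkeeping for the example only */
};

/*
 * Called for each outgoing packet: notify the driver only when the
 * packet's priority differs from what the connection last used.
 */
static int sketch_conn_tx_priority(struct sketch_conn *conn,
				   uint32_t skb_priority)
{
	int err;

	if (skb_priority == conn->cur_priority)
		return 0;

	if (conn->set_priority) {
		err = conn->set_priority(conn, skb_priority);
		if (err)
			return err;
	}
	conn->cur_priority = skb_priority;
	return 0;
}

/* Example driver hook that only counts how often it is invoked. */
static int sketch_count_calls(struct sketch_conn *conn, uint32_t priority)
{
	(void)priority;
	conn->driver_calls++;
	return 0;
}
```

Two consecutive packets with the same priority cause a single driver call,
and a later best-effort packet causes a second call to lower it again.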

>         chan = l2cap_chan_create();
>         if (!chan) {
>                 sk_free(sk);
>
> ---
> base-commit: bfe62a454542cfad3379f6ef5680b125f41e20f4
> change-id: 20260407-brcm-prio-b630e6cc3834
>
> Best regards,
> --
> Sasha Finkelstein <fnkl.kernel@gmail.com>
>
>


-- 
Luiz Augusto von Dentz

* Re: [PATCH bpf-next 1/2] bpf: tcp: Reject TCP_NODELAY from BPF hdr opt callbacks
From: KaFai Wan @ 2026-04-14 13:56 UTC
  To: edumazet, ncardwell, kuniyu, davem, dsahern, kuba, pabeni, horms,
	ast, daniel, andrii, martin.lau, eddyz87, memxor, song,
	yonghong.song, jolsa, shuah, sdf, netdev, linux-kernel, bpf,
	linux-kselftest
  Cc: Quan Sun, Yinhao Hu, Kaiyan Mei
In-Reply-To: <20260414112310.1285783-2-kafai.wan@linux.dev>

On Tue, 2026-04-14 at 19:23 +0800, KaFai Wan wrote:

The AI is right, and I'm late on this issue. Please ignore this. Sorry for the noise.

> A BPF_SOCK_OPS program can enable
> BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG and then call
> bpf_setsockopt(TCP_NODELAY) from BPF_SOCK_OPS_HDR_OPT_LEN_CB.
> 
> That reaches __tcp_sock_set_nodelay(), which may call
> tcp_push_pending_frames(). The transmit path then computes TCP
> options again, re-enters bpf_skops_hdr_opt_len(), and invokes the
> same BPF callback recursively. This can loop until the kernel
> stack overflows.
> 
> TCP_NODELAY is not safe from the header option callback context.
> Reject it with -EOPNOTSUPP when TCP header option callbacks are
> enabled on the socket, so the callback cannot recurse back into
> tcp_push_pending_frames() through do_tcp_setsockopt().
> 
> Reported-by: Quan Sun <2022090917019@std.uestc.edu.cn>
> Reported-by: Yinhao Hu <dddddd@hust.edu.cn>
> Reported-by: Kaiyan Mei <M202472210@hust.edu.cn>
> Closes: https://lore.kernel.org/bpf/d1d523c9-6901-4454-a183-94462b8f3e4e@std.uestc.edu.cn/
> Fixes: 7e41df5dbba2 ("bpf: Add a few optnames to bpf_setsockopt")
> Signed-off-by: KaFai Wan <kafai.wan@linux.dev>
> ---
>  net/ipv4/tcp.c | 5 ++++-
>  1 file changed, 4 insertions(+), 1 deletion(-)
> 
> diff --git a/net/ipv4/tcp.c b/net/ipv4/tcp.c
> index 202a4e57a218..7ac4c98be19d 100644
> --- a/net/ipv4/tcp.c
> +++ b/net/ipv4/tcp.c
> @@ -4004,7 +4004,10 @@ int do_tcp_setsockopt(struct sock *sk, int level, int optname,
>  
>  	switch (optname) {
>  	case TCP_NODELAY:
> -		__tcp_sock_set_nodelay(sk, val);
> +		if (val && BPF_SOCK_OPS_TEST_FLAG(tp, BPF_SOCK_OPS_WRITE_HDR_OPT_CB_FLAG))
> +			err = -EOPNOTSUPP;
> +		else
> +			__tcp_sock_set_nodelay(sk, val);
>  		break;
>  
>  	case TCP_THIN_LINEAR_TIMEOUTS:

-- 
Thanks,
KaFai

* Re: [PATCH net] net: ethernet: ravb: Do not check URAM suspension when WoL is active
From: Simon Horman @ 2026-04-14 13:56 UTC
  To: Niklas Söderlund
  Cc: Paul Barker, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Paolo Abeni, Yoshihiro Shimoda,
	Geert Uytterhoeven, netdev, linux-renesas-soc
In-Reply-To: <20260412173213.3179426-1-niklas.soderlund+renesas@ragnatech.se>

On Sun, Apr 12, 2026 at 07:32:13PM +0200, Niklas Söderlund wrote:
> When updating the driver to match latest datasheet to suspend access to
> URAM when suspending DMA transfers a corner-case was missed, URAM access
> will not be suspended if WoL is enabled. This lead to the error message
> (correctly) being triggered as URAM access is not suspended even tho
> it's requested as part of stopping DMA.
> 
> Avoid checking if URAM access is suspended and printing the error
> message if WoL is enabled when we suspend the system, as we know it will
> not be.
> 
> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org>
> Closes: https://lore.kernel.org/all/CAMuHMdWnjV%3DHGE1o08zLhUfTgOSene5fYx1J5GG10mB%2BToq8qg@mail.gmail.com/
> Fixes: 353d8e7989b6 ("net: ethernet: ravb: Suspend and resume the transmission flow")
> Signed-off-by: Niklas Söderlund <niklas.soderlund+renesas@ragnatech.se>

Hi Niklas,

This is a bit awkward.

1. This patch doesn't apply cleanly to net (yet), because the cited
   commit, which is a dependency, has not propagated there.

2. OTOH, net-next is closed for the merge window.

Regardless of the second point, I suspect that the best option is to
repost this targeting net-next.

...

-- 
pw-bot: changes-requested

* Re: [PATCH v3 1/3] net: dsa: microchip: implement KSZ87xx Module 3 low-loss cable errata
From: Fidelio LAWSON @ 2026-04-14 13:48 UTC
  To: Andrew Lunn, Marek Vasut
  Cc: Woojung Huh, UNGLinuxDriver, Vladimir Oltean, David S. Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, Marek Vasut,
	Maxime Chevallier, Simon Horman, Heiner Kallweit, Russell King,
	netdev, linux-kernel, Fidelio Lawson
In-Reply-To: <d9b161dd-f698-4d7e-8ccb-9ec12411bf87@lunn.ch>

On 4/14/26 14:40, Andrew Lunn wrote:
> On Tue, Apr 14, 2026 at 01:05:49PM +0200, Marek Vasut wrote:
>> On 4/14/26 11:12 AM, Fidelio Lawson wrote:
>>> Implement the "Module 3: Equalizer fix for short cables" erratum from
>>> Microchip document DS80000687C for KSZ87xx switches.
>>>
>>> The issue affects short or low-loss cable links (e.g. CAT5e/CAT6),
>>> where the PHY receiver equalizer may amplify high-amplitude signals
>>> excessively, resulting in internal distortion and link establishment
>>> failures.
>>>
>>> KSZ87xx devices require a workaround for the Module 3 low-loss cable
>>> condition, controlled through the switch TABLE_LINK_MD_V indirect
>>> registers.
>>>
>>> The affected registers are part of the switch address space and are not
>>> directly accessible from the PHY driver. To keep the PHY-facing API
>>> clean and avoid leaking switch-specific details, model this errata
>>> control as vendor-specific Clause 22 PHY registers.
>>>
>>> A vendor-specific Clause 22 PHY register is introduced as a mode
>>> selector in PHY_REG_LOW_LOSS_CTRL, and ksz8_r_phy() / ksz8_w_phy()
>>> translate accesses to these bits into the appropriate indirect
>>> TABLE_LINK_MD_V accesses.
>>>
>>> The control register defines the following modes:
>>> 0: disabled (default behavior)
>>> 1: EQ training workaround
>>> 2: LPF 90 MHz
>>> 3: LPF 62 MHz
>>> 4: LPF 55 MHz
>>> 5: LPF 44 MHz
>> I may not fully understand this, but aren't the EQ and LPF settings
>> orthogonal ?
> 
> What is the real life experience using this feature? Is it needed for
> 1cm cables, but most > 1m cables are O.K with the defaults? Do we need
> all these configuration options? How is a user supposed to discover
> the different options? Can we simplify it down to a Boolean?
> 
> Ethernet is just supposed to work with any valid length of cable,
> KISS. So maybe we should try to keep this feature KISS. Just tell the
> driver it is a short cable, pick different defaults which should work
> with any short cable?
> 
> A boolean should also help with making this tunable reusable with
> other devices. It is unlikely any other devices have these same
> configuration options, unless it is from the same vendor.
> 
>       Andrew

The issue has been observed with very short or low‑loss
cables, typically in industrial or embedded setups where the cable is
below 3m or in a board-to-board setup.

In our own setup, the issue occurs with a very short CAT‑6e cable
(~20cm). We were seeing random link dropouts with the default settings;
since enabling workaround 2, the link has remained stable and we have
not observed any further issues.

We don’t need all these configuration options.

According to the Microchip erratum, the user should try workaround 1 (EQ 
training), and if that does not resolve the random link dropouts,
fall back to workaround 2 by reducing the LPF bandwidth to 62MHz.

Determining which workaround is effective is inherently experimental
and requires observation in real deployments. That is why I originally
chose to expose the selection to the user, at least allowing a choice
between workaround 1 and workaround 2.
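
The try-one-then-fall-back procedure described above could be sketched
roughly as follows. This is a hypothetical userspace illustration, not
driver code: the enum names and next_low_loss_mode() are invented here,
and only the numeric mode values are taken from the patch description.

```c
#include <assert.h>
#include <stdbool.h>

/* Mode values as listed in the patch description; names are hypothetical. */
enum low_loss_mode {
	LOW_LOSS_DISABLED    = 0,
	LOW_LOSS_EQ_TRAINING = 1,
	LOW_LOSS_LPF_90MHZ   = 2,
	LOW_LOSS_LPF_62MHZ   = 3,
	LOW_LOSS_LPF_55MHZ   = 4,
	LOW_LOSS_LPF_44MHZ   = 5,
};

/*
 * Pick the next mode to try. A stable link keeps its current mode;
 * otherwise step from the default to EQ training, then to LPF 62 MHz.
 * No further step is described in the thread, so other modes are kept.
 */
enum low_loss_mode next_low_loss_mode(enum low_loss_mode cur, bool link_stable)
{
	assert(cur <= LOW_LOSS_LPF_44MHZ);

	if (link_stable)
		return cur;

	switch (cur) {
	case LOW_LOSS_DISABLED:
		return LOW_LOSS_EQ_TRAINING;
	case LOW_LOSS_EQ_TRAINING:
		return LOW_LOSS_LPF_62MHZ;
	default:
		return cur;
	}
}
```

Whether such a sequence belongs in the driver or should stay a manual
procedure is the open question in this thread; the sketch only restates
the order of attempts.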

regards

Fidelio




^ permalink raw reply

* Re: [PATCH net] slip: reject VJ frames when no receive slots are allocated
From: Simon Horman @ 2026-04-14 13:41 UTC (permalink / raw)
  To: bestswngs
  Cc: 'Simon Horman', andrew+netdev, davem, edumazet, kuba,
	pabeni, kees, netdev, xmei5
In-Reply-To: <20260412154252.2060940-2-bestswngs@gmail.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
slip: reject VJ frames when no receive slots are allocated

This commit prevents a null pointer dereference by adding checks for
a NULL comp->rstate in slhc_uncompress() and slhc_remember(). It ensures
frames are rejected when no receive slots are allocated.

> diff --git a/drivers/net/slip/slhc.c b/drivers/net/slip/slhc.c
> index e3c785da3eef3..e67052bcab57b 100644
> --- a/drivers/net/slip/slhc.c
> +++ b/drivers/net/slip/slhc.c
> @@ -502,6 +502,10 @@ slhc_uncompress(struct slcompress *comp, unsigned char *icp, int isize)
>  
>  	/* We've got a compressed packet; read the change byte */
>  	comp->sls_i_compressed++;
> +	if (!comp->rstate) {
> +		comp->sls_i_error++;
> +		return 0;
> +	}

While reviewing this code, a few pre-existing issues were noticed in and
around slhc_uncompress().

Can slhc_uncompress() cause a heap buffer overflow?

It expands compressed packets by prepending reconstructed IP and TCP
headers, which can total up to 120 bytes if options are maximized. The
caller in slip.c assumes a maximum expansion of 80 bytes:

    if (count + 80 > sl->buffsize)

If a compressed packet is sized at sl->buffsize - 80, slhc_uncompress()
shifts the payload and copies the 120-byte headers, writing up to 37 bytes
past the dynamically allocated sl->rbuff chunk. Does this corrupt memory?
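
One way to arrive at the 37-byte figure is sketched below. It assumes a
60-byte maximum IPv4 header, a 60-byte maximum TCP header and a 3-byte
minimum VJ compressed header that the reconstructed headers replace;
the review does not spell these out, and the constant and function
names are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

enum {
	MAX_IP_HDR  = 60,	/* assumed: IPv4 header with full options */
	MAX_TCP_HDR = 60,	/* assumed: TCP header with full options */
	MIN_VJ_HDR  = 3,	/* assumed: smallest compressed header */
	SLIP_MARGIN = 80,	/* margin checked by the caller in slip.c */
};

/*
 * Worst-case number of bytes written past a buffer of buffsize bytes
 * for a compressed packet of count bytes that passed the caller's
 * "count + 80 > buffsize" rejection check.
 */
size_t worst_case_overrun(size_t count, size_t buffsize)
{
	size_t expanded;

	assert(count >= MIN_VJ_HDR);
	assert(count + SLIP_MARGIN <= buffsize);

	/* The compressed header is replaced by the full IP + TCP headers. */
	expanded = count - MIN_VJ_HDR + MAX_IP_HDR + MAX_TCP_HDR;

	return expanded > buffsize ? expanded - buffsize : 0;
}
```

With count equal to buffsize - 80, the expansion is 120 - 3 = 117 bytes
against an 80-byte margin, which leaves 37 bytes beyond the buffer.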

Also, does the decode() function read variable-length deltas without
bounds checking against the packet length isize?

A malformed packet with multiple change flags set could cause decode()
to read up to 15 bytes past the end of the packet data. Because the packet
buffer sl->rbuff is allocated via kmalloc without zero-initialization,
could this read uninitialized kernel heap memory and inject it into the
reconstructed TCP/IP headers, leaking it back to the network?
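
For illustration, a bounds-checked variant of such a delta decoder
might look like the sketch below. It assumes the encoding where a
non-zero byte is the value itself and a zero byte is followed by a
16-bit big-endian value. decode_checked() and its signature are
hypothetical; this is a userspace sketch, not a proposed patch.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical bounds-checked delta decoder.
 * Returns 0 and advances *cpp on success, or -1 without touching *cpp
 * if the field would extend past end.
 */
int decode_checked(const unsigned char **cpp, const unsigned char *end,
		   long *out)
{
	const unsigned char *cp;

	assert(cpp != NULL && *cpp != NULL && end != NULL && out != NULL);
	cp = *cpp;

	/* Need at least the first byte. */
	if (end - cp < 1)
		return -1;

	/* Short form: a non-zero byte is the delta itself. */
	if (cp[0] != 0) {
		*out = cp[0];
		*cpp = cp + 1;
		return 0;
	}

	/* Long form: zero marker followed by a 16-bit big-endian value. */
	if (end - cp < 3)
		return -1;

	*out = ((long)cp[1] << 8) | cp[2];
	*cpp = cp + 3;
	return 0;
}
```

A caller would pass icp + isize as end and treat a -1 return the same
way as any other malformed compressed header.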

Finally, do slhc_compress() and slhc_uncompress() read and write the
16-bit TCP checksum using direct pointer casts?

drivers/net/slip/slhc.c:slhc_uncompress() {
    ...
    thp->check = *(__sum16 *)cp;
    ...
}

Since the compressed header varies dynamically in size, the cp pointer
frequently lands on odd byte boundaries. On architectures with strict
alignment requirements, could this direct dereference trigger an alignment
fault and kernel panic? Should the code use get_unaligned() and
put_unaligned() instead?
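
As a userspace analogue of what get_unaligned() and put_unaligned()
provide, the sketch below copies the 16-bit value with memcpy() instead
of dereferencing a cast pointer. The helper names are hypothetical and
the code only illustrates the idea; in the kernel the existing
unaligned accessors would be used.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/*
 * Read a 16-bit value from a possibly odd address. memcpy() lets the
 * compiler emit whatever access sequence is safe on the target,
 * unlike *(uint16_t *)cp, which requires cp to be suitably aligned.
 */
uint16_t load_u16_unaligned(const unsigned char *cp)
{
	uint16_t v;

	assert(cp != NULL);
	memcpy(&v, cp, sizeof(v));
	return v;
}

/* Write a 16-bit value to a possibly odd address, in host byte order. */
void store_u16_unaligned(unsigned char *cp, uint16_t v)
{
	assert(cp != NULL);
	memcpy(cp, &v, sizeof(v));
}
```

The byte order is left untouched, which matches copying a checksum
field verbatim between the compressed and reconstructed headers.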

[ ... ]

> @@ -651,8 +655,9 @@ slhc_remember(struct slcompress *comp, unsigned char *icp, int isize)
>  
>  	/* The packet is shorter than a legal IP header.
>  	 * Also make sure isize is positive.
> +	 * Reject if no receive slots are configured (rstate is NULL).
>  	 */
> -	if (isize < (int)sizeof(struct iphdr)) {
> +	if (!comp->rstate || isize < (int)sizeof(struct iphdr)) {
>  runt:
>  		comp->sls_i_runt++;
>  		return slhc_toss(comp);

When dropping a packet due to a lack of configured receive slots
(rslots == 0), should this increment comp->sls_i_error instead of
comp->sls_i_runt?

Dropping a packet because rstate is missing is an internal configuration
state, not a runt packet.

^ permalink raw reply

* Re: [PATCH RFC bpf-next 8/8] selftests/bpf: add tests to validate KASAN on JIT programs
From: Alexis Lothoré @ 2026-04-14 13:43 UTC (permalink / raw)
  To: Andrey Konovalov, Alexis Lothoré (eBPF Foundation)
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino,
	Andrew Morton, ebpf, Bastien Curutchet, Thomas Petazzoni,
	Xu Kuohai, bpf, linux-kernel, netdev, linux-kselftest,
	linux-stm32, linux-arm-kernel, kasan-dev, linux-mm
In-Reply-To: <CA+fCnZekgcEgsZnRrOB=+HoG=neRg=oLTt2jStyrPJ6mYf2ctQ@mail.gmail.com>

On Tue Apr 14, 2026 at 12:20 AM CEST, Andrey Konovalov wrote:
> On Mon, Apr 13, 2026 at 8:29 PM Alexis Lothoré (eBPF Foundation)
> <alexis.lothore@bootlin.com> wrote:
>>
>> Add a basic KASAN test runner that loads and test-run programs that can
>> trigger memory management bugs. The test captures kernel logs and ensures
>> that the expected KASAN splat is emitted by searching for the
>> corresponding first lines in the report.
>>
>> This version implements two faulty programs triggering either a
>> use-after-free, or an out-of-bounds memory usage. The bugs are
>> triggered thanks to some dedicated kfuncs in bpf_testmod.c, but two
>> different techniques are used, as some cases can be quite hard to
>> trigger in a pure "black box" approach:
>> - for reads, we can make the used kfuncs return some faulty pointers
>>   that ebpf programs will manipulate, they will generate legitimate
>>   kasan reports as a consequence
>> - applying the same trick for faulty writes is harder, as ebpf programs
>>   can't write kernel data freely. So ebpf programs can call another
>>   specific testing kfunc that will alter the shadow memory matching the
>>   passed memory (eg: a map). When the program will try to write to the
>>   corresponding memory, it will trigger a report as well.
>>
>> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
>> ---
>> The way of bringing kasan_poison into bpf_testmod is definitely not
>> ideal.  But I would like to validate the testing approach (triggering
>> real faulty accesses, which is hard on some cases, VS manually poisoning
>> BPF-manipulated memory) before eventually making clean bridges between
>> KASAN APIs and bpf_testmod.c, if the latter approach is the valid one.
>
> Would it make sense to put these tests into KASAN KUnit tests in
> mm/kasan/kasan_test_c.c? I assume there is a kernel API to JIT BPF
> programs from the kernel itself?

Possibly, but I think one important use case is running those tests in
BPF CI ([1]), which depends on them being integrated in the test_progs
framework (tools/testing/selftests/bpf).

[1] https://github.com/kernel-patches/bpf/pulls

> There, you can just call kasan_poison(), some tests already do this.
> And you can also extend the KASAN KUnit test framework to find out
> whether the bad access is a read or write, if you want to check this.

Alexis
-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply

* Re: [PATCH iwl-next v2 7/8] ixgbe: use GFP_KERNEL in ixgbe_fcoe_ddp_setup()
From: Simon Horman @ 2026-04-14 13:29 UTC (permalink / raw)
  To: aleksandr.loktionov
  Cc: Simon Horman, anthony.l.nguyen, marcin.szycik, intel-wired-lan,
	jacob.e.keller, netdev
In-Reply-To: <20260408131216.2662245-8-aleksandr.loktionov@intel.com>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.

For local reproduction steps see https://netdev-ai.bots.linux.dev/ai-local.html
---
ixgbe: use GFP_KERNEL in ixgbe_fcoe_ddp_setup()

This commit restructures ixgbe_fcoe_ddp_setup() to use GFP_KERNEL
instead of GFP_ATOMIC by splitting get_cpu()/put_cpu() usage into two
narrow critical sections, moving DMA mapping and pool allocation between
them where preemption is enabled.

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
> index 011fda9c6193..e338ff0e6522 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_fcoe.c

[ ... ]

> @@ -179,29 +180,43 @@ static int ixgbe_fcoe_ddp_setup(struct net_device *netdev, u16 xid,
>  		return 0;
>  	}
>
> +	/* Pin to current CPU only to read the per-CPU pool pointer; drop
> +	 * the pin before any allocations that may sleep under direct reclaim.
> +	 */
>  	ddp_pool = per_cpu_ptr(fcoe->ddp_pool, get_cpu());
>  	if (!ddp_pool->pool) {
>  		e_warn(drv, "xid=0x%x no ddp pool for fcoe\n", xid);
> -		goto out_noddp;
> +		put_cpu();
> +		return 0;
>  	}
> +	pool = ddp_pool->pool;
> +	put_cpu();
>
>  	/* setup dma from scsi command sgl */
>  	dmacount = dma_map_sg(&adapter->pdev->dev, sgl, sgc, DMA_FROM_DEVICE);
>  	if (dmacount == 0) {
>  		e_err(drv, "xid 0x%x DMA map error\n", xid);
> -		goto out_noddp;
> +		return 0;
>  	}
>
> -	/* alloc the udl from per cpu ddp pool */
> -	ddp->udl = dma_pool_alloc(ddp_pool->pool, GFP_ATOMIC, &ddp->udp);
> +	/* Allocate from per-CPU pool; GFP_KERNEL is safe: preemption is
> +	 * re-enabled after the put_cpu() above.  Per-CPU DMA pools are only
> +	 * destroyed under RTNL during interface teardown, so the saved pool
> +	 * pointer remains valid.
> +	 */
> +	ddp->udl = dma_pool_alloc(pool, GFP_KERNEL, &ddp->udp);

Is this safe given the calling context?

Looking at the call path from the FCoE stack:

fc_exch_seq_send()
  fc_exch_alloc()
    fc_exch_em_alloc()
      spin_lock_bh(&pool->lock)
      # allocate exchange
      # set ep->ex_lock = held
      return with ep->ex_lock still held
  fc_fcp_ddp_setup()
    lport->tt.ddp_setup()
      .ndo_fcoe_ddp_setup() == ixgbe_fcoe_ddp_get()
        ixgbe_fcoe_ddp_setup()  # <-- we are here
  # ... continues with ep->ex_lock held
  spin_unlock_bh(&ep->ex_lock)

fc_exch_em_alloc() acquires ep->ex_lock (a spinlock) and returns with
it held. The lock remains held through fc_fcp_ddp_setup() and into
ixgbe_fcoe_ddp_setup(). Additionally, spin_lock_bh() disables bottom
halves.

Can GFP_KERNEL be used while holding ep->ex_lock with BH disabled, or
does this require GFP_ATOMIC?

^ permalink raw reply

* Re: [PATCH v2] wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
From: Jason A. Donenfeld @ 2026-04-14 13:28 UTC (permalink / raw)
  To: Shardul Bankar
  Cc: kuniyu, andrew+netdev, davem, edumazet, kuba, pabeni, wireguard,
	netdev, linux-kernel, janak, kalpan.jani, shardulsb08,
	syzbot+f2fbf7478a35a94c8b7c
In-Reply-To: <20260413151232.1004611-1-shardul.b@mpiricsoftware.com>

Hi Shardul,

On Mon, Apr 13, 2026 at 5:13 PM Shardul Bankar
<shardul.b@mpiricsoftware.com> wrote:
>
> wg_netns_pre_exit() manually acquires rtnl_lock() inside the
> pernet .pre_exit callback.  This causes a hung task when another
> thread holds rtnl_mutex - the cleanup_net workqueue (or the
> setup_net failure rollback path) blocks indefinitely in
> wg_netns_pre_exit() waiting to acquire the lock.
>
> Convert to .exit_rtnl, introduced in commit 7a60d91c690b ("net:
> Add ->exit_rtnl() hook to struct pernet_operations."), where the
> framework already holds RTNL and batches all callbacks under a
> single rtnl_lock()/rtnl_unlock() pair, eliminating the contention
> window.
>
> The rcu_assign_pointer(wg->creating_net, NULL) is safe to move
> from .pre_exit to .exit_rtnl (which runs after synchronize_rcu())
> because all RCU readers of creating_net either use maybe_get_net()
> - which returns NULL for a dying namespace with zero refcount - or
> access net->user_ns which remains valid throughout the entire
> ops_undo_list sequence.
>
> Reported-by: syzbot+f2fbf7478a35a94c8b7c@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?id=cb64c22a492202ca929e18262fdb8cb89e635c70
> Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>

Thanks. Applied to the wireguard tree, and also added the missing
__net_exit and __read_mostly annotations in the process.

Jason

^ permalink raw reply

* Re: [PATCH RFC bpf-next 3/8] bpf: add BPF_JIT_KASAN for KASAN instrumentation of JITed programs
From: Alexis Lothoré @ 2026-04-14 13:24 UTC (permalink / raw)
  To: Andrey Konovalov, Alexis Lothoré (eBPF Foundation)
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino,
	Andrew Morton, ebpf, Bastien Curutchet, Thomas Petazzoni,
	Xu Kuohai, bpf, linux-kernel, netdev, linux-kselftest,
	linux-stm32, linux-arm-kernel, kasan-dev, linux-mm
In-Reply-To: <CA+fCnZf-o8tiv_tX9YB5eBUGx17OpztKZsEB6Awjw3WAqBAiUw@mail.gmail.com>

On Tue Apr 14, 2026 at 12:20 AM CEST, Andrey Konovalov wrote:
> On Mon, Apr 13, 2026 at 8:29 PM Alexis Lothoré (eBPF Foundation)
> <alexis.lothore@bootlin.com> wrote:
>>
>> Add a new Kconfig option CONFIG_BPF_JIT_KASAN that automatically enables
>> KASAN (Kernel Address Sanitizer) memory access checks for JIT-compiled
>> BPF programs, when both KASAN and JIT compiler are enabled. When
>> enabled, the JIT compiler will emit shadow memory checks before memory
>> loads and stores to detect use-after-free, out-of-bounds, and other
>> memory safety bugs at runtime. The option is gated behind
>> HAVE_EBPF_JIT_KASAN, as it needs proper arch-specific implementation.
>>
>> Signed-off-by: Alexis Lothoré (eBPF Foundation) <alexis.lothore@bootlin.com>
>> ---
>>  kernel/bpf/Kconfig | 9 +++++++++
>>  1 file changed, 9 insertions(+)
>>
>> diff --git a/kernel/bpf/Kconfig b/kernel/bpf/Kconfig
>> index eb3de35734f0..28392adb3d7e 100644
>> --- a/kernel/bpf/Kconfig
>> +++ b/kernel/bpf/Kconfig
>> @@ -17,6 +17,10 @@ config HAVE_CBPF_JIT
>>  config HAVE_EBPF_JIT
>>         bool
>>
>> +# KASAN support for JIT compiler
>> +config HAVE_EBPF_JIT_KASAN
>> +       bool
>> +
>>  # Used by archs to tell that they want the BPF JIT compiler enabled by
>>  # default for kernels that were compiled with BPF JIT support.
>>  config ARCH_WANT_DEFAULT_BPF_JIT
>> @@ -101,4 +105,9 @@ config BPF_LSM
>>
>>           If you are unsure how to answer this question, answer N.
>>
>> +config BPF_JIT_KASAN
>> +       bool
>> +       depends on HAVE_EBPF_JIT_KASAN
>> +       default y if BPF_JIT && KASAN_GENERIC
>
> Should this be "depends on KASAN && KASAN_GENERIC"?

Meaning, making it an explicit user-selectable option?

If so, the current design choice is deliberate and based on the feedback
received on the original RFC, where it was suggested that I
automatically enable the KASAN instrumentation in BPF programs if KASAN
support is enabled in the kernel ([1]). But if a user-selectable toggle
turns out to be a better solution, I'm fine with changing it.

[1] https://lore.kernel.org/bpf/CAADnVQLX7RSnOqQuU32Cgq-e0MVqyeNrtCQSBbk0W2xGkE-ZNw@mail.gmail.com/

>
>
>> +
>>  endmenu # "BPF subsystem"
>>
>> --
>> 2.53.0
>>




-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply

* Re: [PATCH iwl-next v2 5/8] ixgbe: remove ixgbe_ping_all_vfs() from link state change handlers
From: Simon Horman @ 2026-04-14 13:23 UTC (permalink / raw)
  To: aleksandr.loktionov
  Cc: 'Simon Horman', intel-wired-lan, anthony.l.nguyen, netdev
In-Reply-To: <20260408131216.2662245-6-aleksandr.loktionov@intel.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ixgbe: remove ixgbe_ping_all_vfs() from link state change handlers

This commit removes ixgbe_ping_all_vfs() from the link watchdog handlers
to prevent a race condition during VF initialization where a gratuitous
ping collides with the VF mailbox state machine.

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> index 8110bcf006a30..3071302901499 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe_main.c
> @@ -8164,9 +8164,6 @@ static void ixgbe_watchdog_link_is_up(struct ixgbe_adapter *adapter)
>  
>  	/* update the default user priority for VFs */
>  	ixgbe_update_default_up(adapter);
> -
> -	/* ping all the active vfs to let them know link has changed */
> -	ixgbe_ping_all_vfs(adapter);
>  }

How will legacy VFs (such as 82599, X540, and X550) receive link state
changes without these pings?

The commit message states that link-state changes are already communicated
to VFs through the normal mailbox protocol. However, older VFs appear to
rely on receiving an IXGBE_VT_MSGTYPE_CTS message from the PF to clear
mac->get_link_status and complete the link up sequence in
ixgbevf_check_mac_link_vf().

Without the ping, will these legacy VFs fail to read the mailbox and stay
stuck reporting the link as down?

Is the mailbox corruption race still triggerable from other paths?

If ixgbe_ping_vf() is called locklessly from process-context NDO callbacks
like ixgbe_ndo_set_vf_link_state() or ixgbe_ndo_set_vf_trust(), could it be
preempted by a hardirq triggering ixgbe_msg_task() (which correctly acquires
vfs_lock before writing to the mailbox)?

Since ixgbe_write_mbx_pf() only arbitrates between the PF and VF via the
IXGBE_PFMAILBOX_PFU hardware lock, it provides no mutual exclusion between
concurrent host threads. Could concurrent writes corrupt the IXGBE_PFMBMEM
buffer and cause the exact same VF initialization failure this patch intended
to fix?

Would wrapping the pings in spin_lock_irqsave(&adapter->vfs_lock, flags)
prevent the mailbox corruption instead of removing the pings entirely?

[ ... ]

> @@ -8197,9 +8194,6 @@ static void ixgbe_watchdog_link_is_down(struct ixgbe_adapter *adapter)
>  
>  	e_info(drv, "NIC Link is Down\n");
>  	netif_carrier_off(netdev);
> -
> -	/* ping all the active vfs to let them know link has changed */
> -	ixgbe_ping_all_vfs(adapter);
>  }

Without this ping, will older VFs miss the mailbox interrupt needed to set
mac->get_link_status = 1, causing them to remain stuck reporting the link as
up indefinitely and silently drop traffic when the cable is unplugged?

This isn't a bug, but removing the calls from the link handlers appears to
leave ixgbe_ping_all_vfs() as dead code without any remaining callers in
the kernel. Should the function definition in ixgbe_sriov.c and its
declaration in ixgbe_sriov.h be removed as well?

^ permalink raw reply

* Re: [PATCH net] net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
From: patchwork-bot+netdevbpf @ 2026-04-14 13:20 UTC (permalink / raw)
  To: Lorenzo Bianconi
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, linux-arm-kernel,
	linux-mediatek, netdev
In-Reply-To: <20260412-airoha_ppe_hw_init-missing-bits-v1-1-06ac670819e3@kernel.org>

Hello:

This patch was applied to netdev/net.git (main)
by Paolo Abeni <pabeni@redhat.com>:

On Sun, 12 Apr 2026 10:43:26 +0200 you wrote:
> Add the following PPE configuration in airoha_ppe_hw_init routine:
> - 6RD hw offloading is currently not supported by Netfilter flowtable.
>   Explicitly disable PPE 6RD offloading in order to prevent the PPE from
>   learning 6RD flows and eventually interrupting the traffic.
> - Add missing PPE bind rate configuration for L3 and L2 traffic.
>   PPE bind rate configuration specifies the pps threshold to move a PPE
>   entry state from UNBIND to BIND. Without this configuration this value
>   is random.
> - Set ageing thresholds to the values used in the vendor SDK in order to
>   improve connection stability under load and avoid packet loss caused by
>   fast aging.
> 
> [...]

Here is the summary with links:
  - [net] net: airoha: Add missing PPE configurations in airoha_ppe_hw_init()
    https://git.kernel.org/netdev/net/c/b9d8b856689d

You are awesome, thank you!
-- 
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html



^ permalink raw reply

* Re: [PATCH] net: wwan: t7xx: validate port_count against message length in t7xx_port_enum_msg_handler
From: Willy Tarreau @ 2026-04-14 13:17 UTC (permalink / raw)
  To: Paolo Abeni
  Cc: Pavitra Jha, chandrashekar.devegowda, linux-wwan, netdev, stable
In-Reply-To: <3b67dedb-3472-4322-9a30-32bf8e3cef99@redhat.com>

On Tue, Apr 14, 2026 at 11:41:54AM +0200, Paolo Abeni wrote:
> On 4/11/26 10:39 AM, Pavitra Jha wrote:
> > t7xx_port_enum_msg_handler() uses the modem-supplied port_count field as
> > a loop bound over port_msg->data[] without checking that the message buffer
> > contains sufficient data. A modem sending port_count=65535 in a 12-byte
> > buffer triggers a slab-out-of-bounds read of up to 262140 bytes.
> > 
> > Add a struct_size() check after extracting port_count and before the loop.
> > Pass msg_len from both call sites: skb->len at the DPMAIF path after
> > skb_pull(), and the captured rt_feature->data_len at the handshake path.
> > 
> > Fixes: 1e3e8eb9b6e3 ("net: wwan: t7xx: Add control DMA interface")
> 
> Wrong fixes tag:
> 
> fatal: ambiguous argument '1e3e8eb9b6e3': unknown revision or path not
> in the working tree.

Interesting, there isn't a single digit correct here! The matching one
I'm finding based on the subject is:

  39d439047f1d ("net: wwan: t7xx: Add control DMA interface")

Willy

> > diff --git a/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c b/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c
> > index ae632ef96..d984a688d 100644
> > --- a/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c
> > +++ b/drivers/net/wwan/t7xx/t7xx_port_ctrl_msg.c
> > @@ -124,7 +124,7 @@ static int fsm_ee_message_handler(struct t7xx_port *port, struct t7xx_fsm_ctl *c
> >   * * 0		- Success.
> >   * * -EFAULT	- Message check failure.
> >   */
> > -int t7xx_port_enum_msg_handler(struct t7xx_modem *md, void *msg)
> > +int t7xx_port_enum_msg_handler(struct t7xx_modem *md, void *msg, size_t msg_len)
> 
> Undocumented new argument
> 
> /P
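
The length validation discussed in the quoted commit message can be
sketched in userspace as below. The layout is an assumption modelled on
the numbers in the commit message (a 12-byte header followed by 4-byte
entries); struct port_msg_sketch and port_count_fits() are hypothetical
names, and the kernel patch itself uses struct_size() for this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layout: 12-byte header, then port_count 4-byte entries. */
struct port_msg_sketch {
	uint32_t head_pattern;
	uint32_t info;		/* assumed to carry port_count */
	uint32_t tail_pattern;
	uint32_t data[];
};

/*
 * Return 1 if a message of msg_len bytes is large enough to hold
 * port_count entries, 0 otherwise. The multiplication cannot overflow
 * size_t because port_count is at most 65535.
 */
int port_count_fits(size_t msg_len, uint16_t port_count)
{
	size_t need;

	/* Layout assumption of this sketch. */
	assert(offsetof(struct port_msg_sketch, data) == 12);

	need = offsetof(struct port_msg_sketch, data) +
	       (size_t)port_count * sizeof(uint32_t);

	return msg_len >= need;
}
```

With a 12-byte buffer and port_count=65535 the check fails, so the loop
over data[] would never start.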

^ permalink raw reply

* Re: [PATCH net] net: airoha: Wait for NPU PPE configuration to complete in airoha_ppe_offload_setup()
From: Lorenzo Bianconi @ 2026-04-14 13:15 UTC (permalink / raw)
  To: Simon Horman
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, linux-arm-kernel,
	linux-mediatek, netdev
In-Reply-To: <20260414124641.754059-1-horms@kernel.org>

> From: 'Simon Horman' <horms@kernel.org>
> 
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
>     net: airoha: Wait for NPU PPE configuration to complete in airoha_ppe_offload_setup()
> 
> This commit adds a poll loop in airoha_ppe_offload_setup() to wait for the NPU
> PPE configuration to complete before running airoha_ppe_hw_init().
> 
> > diff --git a/drivers/net/ethernet/airoha/airoha_ppe.c b/drivers/net/ethernet/airoha/airoha_ppe.c
> > index c2c32b6833df9..52199f6b39f9b 100644
> > --- a/drivers/net/ethernet/airoha/airoha_ppe.c
> > +++ b/drivers/net/ethernet/airoha/airoha_ppe.c
> [ ... ]
> > @@ -1340,6 +1341,13 @@ static int airoha_ppe_offload_setup(struct airoha_eth *eth)
> >  	if (err)
> >  		goto error_npu_put;
> >  
> > +	/* Wait for NPU PPE configuration to complete */
> > +	err = read_poll_timeout(airoha_fe_rr, val, val, USEC_PER_MSEC,
> > +				100 * USEC_PER_MSEC, false, eth,
> > +				REG_PPE_PPE_FLOW_CFG(0));
> 
> If the SoC has multiple PPE instances, is it sufficient to only wait for
> REG_PPE_PPE_FLOW_CFG(0)?
> 
> Later in airoha_ppe_hw_init(), the code iterates up to eth->soc->num_ppe,
> initializing all available instances:
> 
> drivers/net/ethernet/airoha/airoha_ppe.c:airoha_ppe_hw_init() {
>     ...
> 	for (i = 0; i < eth->soc->num_ppe; i++) {
> 		...
> 		airoha_fe_wr(eth, REG_PPE_TB_BASE(i),
> 			     ppe->foe_dma + sram_tb_size);
>     ...
> }

ack, let me look into it, I will post a v2.

> 
> If we only poll PPE 0, could the driver race with the NPU firmware's
> initialization of subsequent PPEs?
> 
> Additionally, does this poll condition properly ensure the hardware is ready?
> Since the break condition is simply `val`, it exits immediately if the register
> contains any non-zero value.
> 
> If the register retains a non-zero value from a previous session (like across
> driver reloads), or if an unclocked domain returns 0xFFFFFFFF, this poll might
> succeed immediately and bypass the wait.
> 
> Could we clear the register before polling, or wait for a specific bit instead?

I guess checking val != 0 is enough, since we reset the FE block and the
default register value is 0 for both REG_PPE_PPE_FLOW_CFG(0) and
REG_PPE_PPE_FLOW_CFG(1).

Regards,
Lorenzo

> 
> > +	if (err)
> > +		goto error_npu_put;
> > +
> >  	ppe_num_stats_entries = airoha_ppe_get_total_num_stats_entries(ppe);


^ permalink raw reply

* RE: [Intel-wired-lan] [PATCH iwl-next] ice: call netif_keep_dst() once when entering switchdev mode
From: Holda, Patryk @ 2026-04-14 13:13 UTC (permalink / raw)
  To: Simon Horman, Loktionov, Aleksandr
  Cc: intel-wired-lan@lists.osuosl.org, Nguyen, Anthony L,
	netdev@vger.kernel.org, Szycik, Marcin
In-Reply-To: <20260403124133.GA94926@horms.kernel.org>

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Simon Horman
> Sent: Friday, April 3, 2026 2:42 PM
> To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; netdev@vger.kernel.org; Szycik, Marcin
> <marcin.szycik@intel.com>
> Subject: Re: [Intel-wired-lan] [PATCH iwl-next] ice: call netif_keep_dst() once
> when entering switchdev mode
> 
> On Fri, Mar 27, 2026 at 08:22:36AM +0100, Aleksandr Loktionov wrote:
> > From: Marcin Szycik <marcin.szycik@intel.com>
> >
> > netif_keep_dst() only needs to be called once for the uplink VSI, not
> > once for each port representor.  Move it from ice_eswitch_setup_repr()
> > to ice_eswitch_enable_switchdev().
> >
> > Fixes: defd52455aee ("ice: do Tx through PF netdev in slow-path")
> 
> This problem seems to predate the cited commit.
> 
> > Signed-off-by: Marcin Szycik <marcin.szycik@intel.com>
> > Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>

Tested-by: Patryk Holda <patryk.holda@intel.com> 



^ permalink raw reply

* Re: [PATCH RFC bpf-next 1/8] kasan: expose generic kasan helpers
From: Alexis Lothoré @ 2026-04-14 13:12 UTC (permalink / raw)
  To: Andrey Konovalov, Alexis Lothoré (eBPF Foundation)
  Cc: Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko,
	Martin KaFai Lau, Eduard Zingerman, Kumar Kartikeya Dwivedi,
	Song Liu, Yonghong Song, Jiri Olsa, John Fastabend,
	David S. Miller, David Ahern, Thomas Gleixner, Ingo Molnar,
	Borislav Petkov, Dave Hansen, x86, H. Peter Anvin, Shuah Khan,
	Maxime Coquelin, Alexandre Torgue, Andrey Ryabinin,
	Alexander Potapenko, Dmitry Vyukov, Vincenzo Frascino,
	Andrew Morton, ebpf, Bastien Curutchet, Thomas Petazzoni,
	Xu Kuohai, bpf, linux-kernel, netdev, linux-kselftest,
	linux-stm32, linux-arm-kernel, kasan-dev, linux-mm
In-Reply-To: <CA+fCnZfubV6LgRjO3NQvhrG2Q5o0ftkFFupLWVYS50XDnmCaog@mail.gmail.com>

Hi Andrey, thanks for the prompt review !

On Tue Apr 14, 2026 at 12:19 AM CEST, Andrey Konovalov wrote:
> On Mon, Apr 13, 2026 at 8:29 PM Alexis Lothoré (eBPF Foundation)
> <alexis.lothore@bootlin.com> wrote:
>>

[...]

>> +#ifdef CONFIG_KASAN_GENERIC
>> +void __asan_load1(void *p);
>> +void __asan_store1(void *p);
>> +void __asan_load2(void *p);
>> +void __asan_store2(void *p);
>> +void __asan_load4(void *p);
>> +void __asan_store4(void *p);
>> +void __asan_load8(void *p);
>> +void __asan_store8(void *p);
>> +void __asan_load16(void *p);
>> +void __asan_store16(void *p);
>> +#endif /* CONFIG_KASAN_GENERIC */
>
> This looks ugly, let's not do this unless it's really required.
>
> You can just use kasan_check_read/write() instead - these are public
> wrappers around the same shadow memory checking functions. And they
> also work with the SW_TAGS mode, in case the BPF would want to use
> that mode at some point. (For HW_TAGS, we only have kasan_check_byte()
> that checks a single byte, but it can be extended in the future if
> required to be used by BPF.)

ACK, I'll try to use those kasan_check_read and kasan_check_write rather
than __asan_{load,store}X.

Alexis

-- 
Alexis Lothoré, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com


^ permalink raw reply

* RE: [Intel-wired-lan] [PATCH iwl-next v1 1/3] i40e: prepare for XDP metadata ops support
From: Holda, Patryk @ 2026-04-14 13:12 UTC (permalink / raw)
  To: Loktionov, Aleksandr, Kohei Enju,
	intel-wired-lan@lists.osuosl.org, netdev@vger.kernel.org
  Cc: Nguyen, Anthony L, Kitszel, Przemyslaw, Andrew Lunn,
	David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	kohei.enju@gmail.com
In-Reply-To: <IA3PR11MB89861AD556C1C4D863DD4F3EE54CA@IA3PR11MB8986.namprd11.prod.outlook.com>

> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of
> Loktionov, Aleksandr
> Sent: Friday, March 20, 2026 7:57 AM
> To: Kohei Enju <kohei@enjuk.jp>; intel-wired-lan@lists.osuosl.org;
> netdev@vger.kernel.org
> Cc: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw
> <przemyslaw.kitszel@intel.com>; Andrew Lunn <andrew+netdev@lunn.ch>;
> David S. Miller <davem@davemloft.net>; Eric Dumazet
> <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni
> <pabeni@redhat.com>; kohei.enju@gmail.com
> Subject: Re: [Intel-wired-lan] [PATCH iwl-next v1 1/3] i40e: prepare for XDP
> metadata ops support
> 
> 
> 
> > -----Original Message-----
> > From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf
> > Of Kohei Enju
> > Sent: Thursday, March 19, 2026 6:17 PM
> > To: intel-wired-lan@lists.osuosl.org; netdev@vger.kernel.org
> > Cc: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel,
> > Przemyslaw <przemyslaw.kitszel@intel.com>; Andrew Lunn
> > <andrew+netdev@lunn.ch>; David S. Miller <davem@davemloft.net>; Eric
> > Dumazet <edumazet@google.com>; Jakub Kicinski <kuba@kernel.org>;
> Paolo
> > Abeni <pabeni@redhat.com>; kohei.enju@gmail.com; Kohei Enju
> > <kohei@enjuk.jp>
> > Subject: [Intel-wired-lan] [PATCH iwl-next v1 1/3] i40e: prepare for
> > XDP metadata ops support
> >
> > Prepare 'struct i40e_xdp_buff' that contains an xdp_buff and a pointer
> > to i40e_rx_desc in order to pass the RX descriptor to the XDP kfuncs.
> > Also in ZC path, use XSK_CHECK_PRIV_TYPE() to ensure i40e_xdp_buff
> > doesn't exceed the offset of cb in xdp_buff_xsk.
> >
> > No functional changes.
> >
> > Signed-off-by: Kohei Enju <kohei@enjuk.jp>
> > ---
> >  drivers/net/ethernet/intel/i40e/i40e_main.c |  2 +-
> > drivers/net/ethernet/intel/i40e/i40e_txrx.c |  5 ++++-
> > drivers/net/ethernet/intel/i40e/i40e_txrx.h |  7 ++++++-
> > drivers/net/ethernet/intel/i40e/i40e_xsk.c  | 12 ++++++++++++
> >  4 files changed, 23 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_main.c
> > b/drivers/net/ethernet/intel/i40e/i40e_main.c
> > index 31a42ee18aa0..7966d9cb8009 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_main.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_main.c
> > @@ -3619,7 +3619,7 @@ static int i40e_configure_rx_ring(struct
> > i40e_ring *ring)
> >  	}
> >
> >  skip:
> > -	xdp_init_buff(&ring->xdp, xdp_frame_sz, &ring->xdp_rxq);
> > +	xdp_init_buff(&ring->xdp_ctx.xdp, xdp_frame_sz, &ring->xdp_rxq);
> >
> >  	rx_ctx.dbuff = DIV_ROUND_UP(ring->rx_buf_len,
> >  				    BIT_ULL(I40E_RXQ_CTX_DBUFF_SHIFT));
> > diff --git a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > index 4ffdb007c41a..cfaf724ee7ff 100644
> > --- a/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > +++ b/drivers/net/ethernet/intel/i40e/i40e_txrx.c
> > @@ -2438,10 +2438,11 @@ static int i40e_clean_rx_irq(struct i40e_ring
> > *rx_ring, int budget,
> >  			     unsigned int *rx_cleaned)
> >  {
> >  	unsigned int total_rx_bytes = 0, total_rx_packets = 0;
> 
> ...
> 
> >  		xdp_res = i40e_run_xdp_zc(rx_ring, first, xdp_prog);
> >  		i40e_handle_xdp_result_zc(rx_ring, first, rx_desc, &rx_packets,
> >  					  &rx_bytes, xdp_res, &failure);
> > --
> > 2.51.0
> 
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>


Tested-by: Patryk Holda <patryk.holda@intel.com> 



^ permalink raw reply

* Re: [PATCH net] net: airoha: Fix max TX packet length configuration
From: Paolo Abeni @ 2026-04-14 13:04 UTC (permalink / raw)
  To: Lorenzo Bianconi, Andrew Lunn, David S. Miller, Eric Dumazet,
	Jakub Kicinski, Simon Horman
  Cc: linux-arm-kernel, linux-mediatek, netdev
In-Reply-To: <20260412-airoha-fix-max-mtu-v1-1-333030d0a564@kernel.org>

On 4/12/26 10:09 AM, Lorenzo Bianconi wrote:
> According to the Airoha documentation, REG_GDM_LEN_CFG() register does not
> include FCS length. Fix MTU configuration removing ETH_FCS_LEN from
> maximum TX packet length configuration.
> 
> Fixes: 54d989d58d2ac ("net: airoha: Move min/max packet len configuration in airoha_dev_open()")
> Fixes: 03b1b69f0662c ("net: airoha: Introduce airoha_dev_change_mtu callback")
> Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
> ---
>  drivers/net/ethernet/airoha/airoha_eth.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 
> diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> index c14cdce588a7..a81ffda72b39 100644
> --- a/drivers/net/ethernet/airoha/airoha_eth.c
> +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> @@ -1629,9 +1629,9 @@ static void airoha_update_hw_stats(struct airoha_gdm_port *port)
>  
>  static int airoha_dev_open(struct net_device *dev)
>  {
> -	int err, len = ETH_HLEN + dev->mtu + ETH_FCS_LEN;
>  	struct airoha_gdm_port *port = netdev_priv(dev);
>  	struct airoha_qdma *qdma = port->qdma;
> +	int err, len = ETH_HLEN + dev->mtu;

Sashiko noted that the above may cause a regression: max-MTU VLAN-tagged
frames could be dropped.

/P


^ permalink raw reply

* Re: [PATCH net] net: airoha: Fix max TX packet length configuration
From: Lorenzo Bianconi @ 2026-04-14 13:04 UTC (permalink / raw)
  To: Simon Horman
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, linux-arm-kernel,
	linux-mediatek, netdev
In-Reply-To: <20260414124011.751922-1-horms@kernel.org>

> From: 'Simon Horman' <horms@kernel.org>
> 
> This is an AI-generated review of your patch. The human sending this
> email has considered the AI review valid, or at least plausible.
> Full review at: https://sashiko.dev
> ---
> net: airoha: Fix max TX packet length configuration
> 
> This commit fixes the MTU configuration for the Airoha driver by removing
> ETH_FCS_LEN from the maximum TX packet length, matching the hardware
> documentation which states the register does not include the FCS length.
> 
> > diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
> > index c14cdce588a7c..a81ffda72b39f 100644
> > --- a/drivers/net/ethernet/airoha/airoha_eth.c
> > +++ b/drivers/net/ethernet/airoha/airoha_eth.c
> > @@ -1629,9 +1629,9 @@ static void airoha_update_hw_stats(struct airoha_gdm_port *port)
> >  
> >  static int airoha_dev_open(struct net_device *dev)
> >  {
> > -	int err, len = ETH_HLEN + dev->mtu + ETH_FCS_LEN;
> >  	struct airoha_gdm_port *port = netdev_priv(dev);
> >  	struct airoha_qdma *qdma = port->qdma;
> > +	int err, len = ETH_HLEN + dev->mtu;
> 
> Does this change introduce a regression where MTU-sized VLAN-tagged frames
> are dropped by the hardware?
> 
> By limiting the frame size to ETH_HLEN + dev->mtu, the maximum allowed length
> becomes 1514 bytes for a standard 1500 MTU. An Ethernet frame containing a
> standard 802.1Q VLAN tag and a maximum-sized 1500-byte IP payload requires an
> 18-byte L2 header, making the total frame size 1518 bytes (excluding FCS).
> 
> Previously, the inclusion of ETH_FCS_LEN inadvertently provided the 4-byte
> headroom needed to accommodate a single VLAN tag. 
> 
> Could the configuration explicitly account for VLAN headers using VLAN_ETH_HLEN
> instead?

ack, thx for pointing this out. I will fix it in v2.

Regards,
Lorenzo

> 
> [ ... ]
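
For reference, the frame-length arithmetic from the review can be checked
with a small standalone sketch. This is not the driver code: the constants
mirror the kernel's if_ether.h and if_vlan.h values, while max_frame_len()
and vlan_frame_fits() are hypothetical helpers introduced only for
illustration.

```c
#include <assert.h>

/* Values as defined in the kernel's if_ether.h and if_vlan.h. */
#define ETH_HLEN	14	/* dst MAC + src MAC + EtherType */
#define VLAN_ETH_HLEN	18	/* ETH_HLEN plus one 4-byte 802.1Q tag */
#define ETH_FCS_LEN	4	/* frame check sequence */

/* Hypothetical helper: frame length (FCS excluded) for a given L2 header. */
static unsigned int max_frame_len(unsigned int l2_hdr_len, unsigned int mtu)
{
	return l2_hdr_len + mtu;
}

/* Hypothetical helper: does a single-tagged, full-MTU frame fit in limit? */
static int vlan_frame_fits(unsigned int limit, unsigned int mtu)
{
	assert(mtu > 0);
	return max_frame_len(VLAN_ETH_HLEN, mtu) <= limit;
}
```

With a 1500-byte MTU, the patched limit (ETH_HLEN + mtu) is 1514 and a
tagged frame of 1518 bytes does not fit. Both the old limit (which added
ETH_FCS_LEN) and a limit based on VLAN_ETH_HLEN come out at 1518, so the
tagged frame fits.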


^ permalink raw reply

* Re: [net,PATCH v3 2/2] net: ks8851: Avoid excess softirq scheduling
From: Sebastian Andrzej Siewior @ 2026-04-14 13:02 UTC (permalink / raw)
  To: Marek Vasut
  Cc: netdev, stable, David S. Miller, Andrew Lunn, Eric Dumazet,
	Jakub Kicinski, Nicolai Buchwitz, Paolo Abeni, Ronald Wahl,
	Yicong Hui, linux-kernel
In-Reply-To: <20260414103327.113500-2-marex@nabladev.com>

On 2026-04-14 12:32:53 [+0200], Marek Vasut wrote:
> The code injects a packet into netif_rx() repeatedly, which will add
> it to its internal NAPI and schedule a softirq, and process it. It is
> more efficient to queue multiple packets and process them all at the
> local_bh_enable() time.
> 
> Fixes: e0863634bf9f ("net: ks8851: Queue RX packets in IRQ handler instead of disabling BHs")
> Cc: stable@vger.kernel.org
> Signed-off-by: Marek Vasut <marex@nabladev.com>

Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>

Sebastian

^ permalink raw reply

* Re: [PATCH iwl-next v2 1/8] ixgbe: lower IXGBE_ITR_ADAPTIVE_MAX_USECS to prevent RX starvation
From: Simon Horman @ 2026-04-14 12:58 UTC (permalink / raw)
  To: aleksandr.loktionov
  Cc: 'Simon Horman', intel-wired-lan, anthony.l.nguyen, netdev
In-Reply-To: <20260408131216.2662245-2-aleksandr.loktionov@intel.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
ixgbe: lower IXGBE_ITR_ADAPTIVE_MAX_USECS to prevent RX starvation

The patch lowers IXGBE_ITR_ADAPTIVE_MAX_USECS from 126 to 84 to prevent
descriptor ring starvation under sustained full-line-rate bulk RX traffic.
It states that this provides enough headroom to drain the ring before it wraps.

> diff --git a/drivers/net/ethernet/intel/ixgbe/ixgbe.h b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> index 59a1cee40b436..eedee3b2e72c8 100644
> --- a/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> +++ b/drivers/net/ethernet/intel/ixgbe/ixgbe.h
> @@ -475,7 +475,7 @@ static inline unsigned int ixgbe_rx_pg_order(struct ixgbe_ring *ring)
>  
>  #define IXGBE_ITR_ADAPTIVE_MIN_INC	2
>  #define IXGBE_ITR_ADAPTIVE_MIN_USECS	10
> -#define IXGBE_ITR_ADAPTIVE_MAX_USECS	126
> +#define IXGBE_ITR_ADAPTIVE_MAX_USECS	84

Will the hardware still be configured with a 126 us delay during bulk traffic?

Looking at ixgbe_update_itr(), when traffic is categorized as bulk
(packets >= 256), itr is initialized to IXGBE_ITR_ADAPTIVE_BULK (0).
For a 10Gbps link, avg_wire_size plateaus at 32256:

drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:ixgbe_update_itr() {
    ...
	} else {
		/* plateau at a limit of 8K ints/sec */
		avg_wire_size = 32256;
	}
    ...
}

The itr is then calculated and updated directly:

drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:ixgbe_update_itr() {
    ...
	switch (q_vector->adapter->link_speed) {
	case IXGBE_LINK_SPEED_10GB_FULL:
	case IXGBE_LINK_SPEED_100_FULL:
	default:
		itr += DIV_ROUND_UP(avg_wire_size,
				    IXGBE_ITR_ADAPTIVE_MIN_INC * 256) *
		       IXGBE_ITR_ADAPTIVE_MIN_INC;
		break;
    ...
}

With IXGBE_ITR_ADAPTIVE_MIN_INC being 2, the calculation is
DIV_ROUND_UP(32256, 512) * 2, which equals exactly 126.

There is no clamping against IXGBE_ITR_ADAPTIVE_MAX_USECS applied after this
bulk calculation, so ring_container->itr becomes 126.

Later, in ixgbe_set_itr(), the final ITR is determined by taking the minimum
between the rx and tx values:

drivers/net/ethernet/intel/ixgbe/ixgbe_main.c:ixgbe_set_itr() {
    ...
	new_itr = min(q_vector->rx.itr, q_vector->tx.itr);
    ...
}

If the TX ring is idle or operating in latency mode, it will have the
IXGBE_ITR_ADAPTIVE_LATENCY flag (0x80) set.  Even though its base value is
clamped to 84, the latency flag makes it numerically 212.

Does this mean min(126, 212) will return the RX bulk value of 126, leaving
the interrupt delay at 126 us and descriptor ring starvation unresolved?
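
The arithmetic above can be reproduced with a simplified standalone model.
This is only a sketch of the calculation as quoted in this review, not the
driver itself: bulk_itr(), latency_itr() and pick_itr() are hypothetical
helper names, and the macro values are the ones shown in the patch context.

```c
#include <assert.h>

#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

#define IXGBE_ITR_ADAPTIVE_MIN_INC	2
#define IXGBE_ITR_ADAPTIVE_MAX_USECS	84	/* value proposed by the patch */
#define IXGBE_ITR_ADAPTIVE_LATENCY	0x80
#define IXGBE_ITR_ADAPTIVE_BULK		0x00

/* Bulk path as quoted: start at BULK (0) and add the scaled wire size. */
static unsigned int bulk_itr(unsigned int avg_wire_size)
{
	unsigned int itr = IXGBE_ITR_ADAPTIVE_BULK;

	assert(avg_wire_size > 0);
	itr += DIV_ROUND_UP(avg_wire_size,
			    IXGBE_ITR_ADAPTIVE_MIN_INC * 256) *
	       IXGBE_ITR_ADAPTIVE_MIN_INC;
	return itr;
}

/* Latency mode: the base usecs value carries the latency flag bit. */
static unsigned int latency_itr(unsigned int usecs)
{
	return usecs | IXGBE_ITR_ADAPTIVE_LATENCY;
}

/* Mirrors the min() quoted from ixgbe_set_itr(). */
static unsigned int pick_itr(unsigned int rx_itr, unsigned int tx_itr)
{
	return rx_itr < tx_itr ? rx_itr : tx_itr;
}
```

Under these assumptions, bulk_itr(32256) evaluates to 126 and
latency_itr(IXGBE_ITR_ADAPTIVE_MAX_USECS) to 212, so pick_itr() selects 126,
the unclamped RX bulk value.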

>  #define IXGBE_ITR_ADAPTIVE_LATENCY	0x80
>  #define IXGBE_ITR_ADAPTIVE_BULK		0x00

^ permalink raw reply
