Netdev List

Netdev List
 help / color / mirror / Atom feed

* Re: [patch 10/38] arcnet: Remove function timing code
From: David Woodhouse @ 2026-04-13 15:29 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Michael Grzeschik, netdev, Arnd Bergmann, x86, Lu Baolu, iommu,
	linux-wireless, Herbert Xu, linux-crypto, Vlastimil Babka,
	linux-mm, Bernie Thompson, linux-fbdev, Theodore Tso, linux-ext4,
	Andrew Morton, Uladzislau Rezki, Marco Elver, Dmitry Vyukov,
	kasan-dev, Andrey Ryabinin, Thomas Sailer, linux-hams,
	Jason A. Donenfeld, Richard Henderson, linux-alpha, Russell King,
	linux-arm-kernel, Catalin Marinas, Huacai Chen, loongarch,
	Geert Uytterhoeven, linux-m68k, Dinh Nguyen, Jonas Bonn,
	linux-openrisc, Helge Deller, linux-parisc, Michael Ellerman,
	linuxppc-dev, Paul Walmsley, linux-riscv, Heiko Carstens,
	linux-s390, David S. Miller, sparclinux
In-Reply-To: <20260410120318.253872322@kernel.org>

[-- Attachment #1: Type: text/plain, Size: 778 bytes --]

On Fri, 2026-04-10 at 14:19 +0200, Thomas Gleixner wrote:
> ARCNET is a museums piece and the function timing can be done with
> ftrace. Remove the cruft.
> 
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Michael Grzeschik <m.grzeschik@pengutronix.de>
> Cc: netdev@vger.kernel.org
> ---
>  drivers/net/arcnet/arc-rimi.c  |    4 ++--
>  drivers/net/arcnet/arcdevice.h |   20 +-------------------
>  drivers/net/arcnet/com20020.c  |    6 ++----
>  drivers/net/arcnet/com90io.c   |    6 ++----
>  drivers/net/arcnet/com90xx.c   |    4 ++--
>  5 files changed, 9 insertions(+), 31 deletions(-)

Acked-by: David Woodhouse <dwmw2@infradead.org>

By coincidence, I took the last of my ARCNET cards to the tip just this
morning...

[-- Attachment #2: smime.p7s --]
[-- Type: application/pkcs7-signature, Size: 5069 bytes --]

^ permalink raw reply

* Re: [PATCH 2/5] selftests: net: add multithread client support to iou-zcrx
From: Juanlu Herrero @ 2026-04-13 15:19 UTC (permalink / raw)
  To: David Wei; +Cc: netdev
In-Reply-To: <8fa08d73-28a3-4521-bcfb-ec81869c24f3@davidwei.uk>

On Thu, Apr 09, 2026 at 08:51:11AM -0600, David Wei wrote:
> On 2026-04-08 09:38, Juanlu Herrero wrote:
> > Add pthreads to the iou-zcrx client so that multiple connections can be
> > established simultaneously. Each client thread connects to the server
> > and sends its payload independently.
> > 
> > Introduce struct thread_ctx and the -t option to control the number of
> > threads (default 1), preserving backwards compatibility with existing
> > tests.
> > 
> > Signed-off-by: Juanlu Herrero <juanlu@fastmail.com>
> > ---
> >   .../testing/selftests/drivers/net/hw/Makefile |  2 +-
> >   .../selftests/drivers/net/hw/iou-zcrx.c       | 46 +++++++++++++++++--
> >   2 files changed, 44 insertions(+), 4 deletions(-)
> > 
> > diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
> > index deeca3f8d080..227adfec706c 100644
> > --- a/tools/testing/selftests/drivers/net/hw/Makefile
> > +++ b/tools/testing/selftests/drivers/net/hw/Makefile
> > @@ -80,5 +80,5 @@ include ../../../net/ynl.mk
> >   include ../../../net/bpf.mk
> >   ifeq ($(HAS_IOURING_ZCRX),y)
> > -$(OUTPUT)/iou-zcrx: LDLIBS += -luring
> > +$(OUTPUT)/iou-zcrx: LDLIBS += -luring -lpthread
> >   endif
> > diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.c b/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > index 334985083f61..de2eea78a5b6 100644
> > --- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > +++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > @@ -4,6 +4,7 @@
> >   #include <error.h>
> >   #include <fcntl.h>
> >   #include <limits.h>
> > +#include <pthread.h>
> >   #include <stdbool.h>
> >   #include <stdint.h>
> >   #include <stdio.h>
> > @@ -85,8 +86,14 @@ static int cfg_send_size = SEND_SIZE;
> >   static struct sockaddr_in6 cfg_addr;
> >   static unsigned int cfg_rx_buf_len;
> >   static bool cfg_dry_run;
> > +static int cfg_num_threads = 1;
> >   static char *payload;
> > +
> > +struct thread_ctx {
> > +	int			thread_id;
> 
> This is set here and in patch 4 but I don't see it being used.

Makes sense, will remove from v2.

^ permalink raw reply

* [PATCH v2] wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
From: Shardul Bankar @ 2026-04-13 15:12 UTC (permalink / raw)
  To: Jason, kuniyu
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, wireguard, netdev,
	linux-kernel, janak, kalpan.jani, shardulsb08, Shardul Bankar,
	syzbot+f2fbf7478a35a94c8b7c

wg_netns_pre_exit() manually acquires rtnl_lock() inside the
pernet .pre_exit callback.  This causes a hung task when another
thread holds rtnl_mutex - the cleanup_net workqueue (or the
setup_net failure rollback path) blocks indefinitely in
wg_netns_pre_exit() waiting to acquire the lock.

Convert to .exit_rtnl, introduced in commit 7a60d91c690b ("net:
Add ->exit_rtnl() hook to struct pernet_operations."), where the
framework already holds RTNL and batches all callbacks under a
single rtnl_lock()/rtnl_unlock() pair, eliminating the contention
window.

The rcu_assign_pointer(wg->creating_net, NULL) is safe to move
from .pre_exit to .exit_rtnl (which runs after synchronize_rcu())
because all RCU readers of creating_net either use maybe_get_net()
- which returns NULL for a dying namespace with zero refcount - or
access net->user_ns which remains valid throughout the entire
ops_undo_list sequence.

Reported-by: syzbot+f2fbf7478a35a94c8b7c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?id=cb64c22a492202ca929e18262fdb8cb89e635c70
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
---
v2: Fix incorrect Reported-by email address

 drivers/net/wireguard/device.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
index 46a71ec36af8..eb854c5294a3 100644
--- a/drivers/net/wireguard/device.c
+++ b/drivers/net/wireguard/device.c
@@ -411,12 +411,11 @@ static struct rtnl_link_ops link_ops __read_mostly = {
 	.newlink		= wg_newlink,
 };
 
-static void wg_netns_pre_exit(struct net *net)
+static void wg_netns_exit_rtnl(struct net *net, struct list_head *dev_kill_list)
 {
 	struct wg_device *wg;
 	struct wg_peer *peer;
 
-	rtnl_lock();
 	list_for_each_entry(wg, &device_list, device_list) {
 		if (rcu_access_pointer(wg->creating_net) == net) {
 			pr_debug("%s: Creating namespace exiting\n", wg->dev->name);
@@ -429,11 +428,10 @@ static void wg_netns_pre_exit(struct net *net)
 			mutex_unlock(&wg->device_update_lock);
 		}
 	}
-	rtnl_unlock();
 }
 
 static struct pernet_operations pernet_ops = {
-	.pre_exit = wg_netns_pre_exit
+	.exit_rtnl = wg_netns_exit_rtnl
 };
 
 int __init wg_device_init(void)
-- 
2.34.1


^ permalink raw reply related

* Re: [PATCH] wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
From: syzbot @ 2026-04-13 15:01 UTC (permalink / raw)
  To: shardul.b
  Cc: andrew, davem, edumazet, janak, jason, kalpan.jani, kuba, kuniyu,
	linux-kernel, netdev, pabeni, shardul.b, shardulsb08, wireguard
In-Reply-To: <20260413150024.1003490-1-shardul.b@mpiricsoftware.com>

> wg_netns_pre_exit() manually acquires rtnl_lock() inside the
> pernet .pre_exit callback.  This causes a hung task when another
> thread holds rtnl_mutex - the cleanup_net workqueue (or the
> setup_net failure rollback path) blocks indefinitely in
> wg_netns_pre_exit() waiting to acquire the lock.
>
> Convert to .exit_rtnl, introduced in commit 7a60d91c690b ("net:
> Add ->exit_rtnl() hook to struct pernet_operations."), where the
> framework already holds RTNL and batches all callbacks under a
> single rtnl_lock()/rtnl_unlock() pair, eliminating the contention
> window.
>
> The rcu_assign_pointer(wg->creating_net, NULL) is safe to move
> from .pre_exit to .exit_rtnl (which runs after synchronize_rcu())
> because all RCU readers of creating_net either use maybe_get_net()
> - which returns NULL for a dying namespace with zero refcount - or
> access net->user_ns which remains valid throughout the entire
> ops_undo_list sequence.
>
> Reported-by: syzbot+pFBD3bslSSshiJCd3rxy@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?id=cb64c22a492202ca929e18262fdb8cb89e635c70
> Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
> ---
>  drivers/net/wireguard/device.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
> index 46a71ec36af8..eb854c5294a3 100644
> --- a/drivers/net/wireguard/device.c
> +++ b/drivers/net/wireguard/device.c
> @@ -411,12 +411,11 @@ static struct rtnl_link_ops link_ops __read_mostly = {
>  	.newlink		= wg_newlink,
>  };
>  
> -static void wg_netns_pre_exit(struct net *net)
> +static void wg_netns_exit_rtnl(struct net *net, struct list_head *dev_kill_list)
>  {
>  	struct wg_device *wg;
>  	struct wg_peer *peer;
>  
> -	rtnl_lock();
>  	list_for_each_entry(wg, &device_list, device_list) {
>  		if (rcu_access_pointer(wg->creating_net) == net) {
>  			pr_debug("%s: Creating namespace exiting\n", wg->dev->name);
> @@ -429,11 +428,10 @@ static void wg_netns_pre_exit(struct net *net)
>  			mutex_unlock(&wg->device_update_lock);
>  		}
>  	}
> -	rtnl_unlock();
>  }
>  
>  static struct pernet_operations pernet_ops = {
> -	.pre_exit = wg_netns_pre_exit
> +	.exit_rtnl = wg_netns_exit_rtnl
>  };
>  
>  int __init wg_device_init(void)
> -- 
> 2.34.1
>

I see the command but can't find the corresponding bug.
The email is sent to  syzbot+HASH@syzkaller.appspotmail.com address
but the HASH does not correspond to any known bug.
Please double check the address.


^ permalink raw reply

* [PATCH] wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
From: Shardul Bankar @ 2026-04-13 15:00 UTC (permalink / raw)
  To: Jason, kuniyu
  Cc: andrew+netdev, davem, edumazet, kuba, pabeni, wireguard, netdev,
	linux-kernel, janak, kalpan.jani, shardulsb08, Shardul Bankar,
	syzbot+pFBD3bslSSshiJCd3rxy

wg_netns_pre_exit() manually acquires rtnl_lock() inside the
pernet .pre_exit callback.  This causes a hung task when another
thread holds rtnl_mutex - the cleanup_net workqueue (or the
setup_net failure rollback path) blocks indefinitely in
wg_netns_pre_exit() waiting to acquire the lock.

Convert to .exit_rtnl, introduced in commit 7a60d91c690b ("net:
Add ->exit_rtnl() hook to struct pernet_operations."), where the
framework already holds RTNL and batches all callbacks under a
single rtnl_lock()/rtnl_unlock() pair, eliminating the contention
window.

The rcu_assign_pointer(wg->creating_net, NULL) is safe to move
from .pre_exit to .exit_rtnl (which runs after synchronize_rcu())
because all RCU readers of creating_net either use maybe_get_net()
- which returns NULL for a dying namespace with zero refcount - or
access net->user_ns which remains valid throughout the entire
ops_undo_list sequence.

Reported-by: syzbot+pFBD3bslSSshiJCd3rxy@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?id=cb64c22a492202ca929e18262fdb8cb89e635c70
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
---
 drivers/net/wireguard/device.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
index 46a71ec36af8..eb854c5294a3 100644
--- a/drivers/net/wireguard/device.c
+++ b/drivers/net/wireguard/device.c
@@ -411,12 +411,11 @@ static struct rtnl_link_ops link_ops __read_mostly = {
 	.newlink		= wg_newlink,
 };
 
-static void wg_netns_pre_exit(struct net *net)
+static void wg_netns_exit_rtnl(struct net *net, struct list_head *dev_kill_list)
 {
 	struct wg_device *wg;
 	struct wg_peer *peer;
 
-	rtnl_lock();
 	list_for_each_entry(wg, &device_list, device_list) {
 		if (rcu_access_pointer(wg->creating_net) == net) {
 			pr_debug("%s: Creating namespace exiting\n", wg->dev->name);
@@ -429,11 +428,10 @@ static void wg_netns_pre_exit(struct net *net)
 			mutex_unlock(&wg->device_update_lock);
 		}
 	}
-	rtnl_unlock();
 }
 
 static struct pernet_operations pernet_ops = {
-	.pre_exit = wg_netns_pre_exit
+	.exit_rtnl = wg_netns_exit_rtnl
 };
 
 int __init wg_device_init(void)
-- 
2.34.1


^ permalink raw reply related

* Re: [PATCH iwl-next 1/1] igc: add support for forcing link speed without autonegotiation
From: Simon Horman @ 2026-04-13 14:52 UTC (permalink / raw)
  To: khai.wen.tan
  Cc: 'Simon Horman', anthony.l.nguyen, przemyslaw.kitszel,
	andrew+netdev, davem, edumazet, kuba, pabeni, intel-wired-lan,
	netdev, linux-kernel, faizal.abdul.rahim, hector.blanco.alcaine,
	hong.aun.looi, khai.wen.tan, faizal.abdul.rahim
In-Reply-To: <20260409072747.217836-1-khai.wen.tan@linux.intel.com>

From: 'Simon Horman' <horms@kernel.org>

This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
igc: add support for forcing link speed without autonegotiation

This commit allows forcing 10/100 Mb/s link speed and duplex settings via
ethtool when autonegotiation is disabled. It configures the MAC CTRL and
PHY registers to bypass autonegotiation and apply the requested settings.

> diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
> index 0122009bedd03..e14771532dad8 100644
> --- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
> +++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
[ ... ]
> +static int igc_handle_autoneg_disabled(struct net_device *netdev, u32 speed,
> +				       u8 duplex)
> +{
> +	struct igc_adapter *adapter = netdev_priv(netdev);
> +	struct igc_mac_info *mac = &adapter->hw.mac;
> +	enum igc_forced_speed_duplex forced_speed_duplex;
>  
> +	switch (speed) {
> +	case SPEED_10:
> +		forced_speed_duplex = (duplex == DUPLEX_FULL) ? IGC_FORCED_10F : IGC_FORCED_10H;
> +		break;
> +	case SPEED_100:
> +		forced_speed_duplex = (duplex == DUPLEX_FULL) ? IGC_FORCED_100F : IGC_FORCED_100H;
> +		break;

Does forcing a half-duplex configuration here also require setting flow
control to none (e.g., hw->fc.requested_mode = igc_fc_none)?

> diff --git a/drivers/net/ethernet/intel/igc/igc_mac.c b/drivers/net/ethernet/intel/igc/igc_mac.c
> index 7ac6637f8db73..16b0f2db06896 100644
> --- a/drivers/net/ethernet/intel/igc/igc_mac.c
> +++ b/drivers/net/ethernet/intel/igc/igc_mac.c
> @@ -452,7 +452,7 @@ s32 igc_config_fc_after_link_up(struct igc_hw *hw)
>  	 * so we had to force link.  In this case, we need to force the
>  	 * configuration of the MAC to match the "fc" parameter.
>  	 */
> -	if (mac->autoneg_failed)
> +	if (mac->autoneg_failed || !mac->autoneg_enabled)
>  		ret_val = igc_force_mac_fc(hw);

If mac->autoneg_enabled is false, igc_force_mac_fc() will configure the MAC
using the existing hw->fc.current_mode. If current_mode defaults to full flow
control, this will enable PAUSE frames.

Since autonegotiation is disabled, the MII status check further down in this
same function will fail and return early:

drivers/net/ethernet/intel/igc/igc_mac.c:igc_config_fc_after_link_up() {
    ...
    if (!(mii_status_reg & MII_SR_AUTONEG_COMPLETE)) {
        hw_dbg("Copper PHY and Auto Neg has not completed.\n");
        goto out;
    }
    ...
}

This early return skips the half-duplex safety check at the end of
igc_config_fc_after_link_up() which would normally clear the flow control
settings:

drivers/net/ethernet/intel/igc/igc_mac.c:igc_config_fc_after_link_up() {
    ...
    if (duplex == HALF_DUPLEX)
        hw->fc.current_mode = igc_fc_none;

    ret_val = igc_force_mac_fc(hw);
    ...
}

Does this leave the MAC incorrectly configured to send and receive PAUSE
frames on a half-duplex link, violating the IEEE 802.3 specification?

^ permalink raw reply

* Re: [GIT PULL] bluetooth-next 2026-04-13
From: Paolo Abeni @ 2026-04-13 14:56 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: davem, kuba, linux-bluetooth, netdev
In-Reply-To: <CABBYNZ+7zr1jQ7a-2p88zNqMdvn6MAB5NAZ7b4OW=P56DcMT5g@mail.gmail.com>

On 4/13/26 4:17 PM, Luiz Augusto von Dentz wrote:
> On Mon, Apr 13, 2026 at 10:11 AM Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> wrote:
>> On Mon, Apr 13, 2026 at 9:41 AM Paolo Abeni <pabeni@redhat.com> wrote:
>>>
>>> On 4/13/26 3:22 PM, Luiz Augusto von Dentz wrote:
>>>> The following changes since commit 42f9b4c6ef19e71d2c7d9bfd3c5037d4fe434ad7:
>>>>
>>>>   tools: ynl: tests: fix leading space on Makefile target (2026-04-09 20:41:40 -0700)
>>>>
>>>> are available in the Git repository at:
>>>>
>>>>   git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git tags/for-net-next-2026-04-13
>>>>
>>>> for you to fetch changes up to c347ca17d62a32c25564fee0ca3a2a7bc2d5fd6f:
>>>>
>>>>   Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling (2026-04-13 09:19:42 -0400)
>>>>
>>>> ----------------------------------------------------------------
>>>> bluetooth-next pull request for net-next:
>>>
>>> Net-next is closed for the merge window. I guess Jakub could still
>>> consider merging this, but unless you want it very, very badly, I hope
>>> it can just be postponed, as the PW queue is already long.
>>
>> This update includes quite a few new hardware supports. This is a
>> resend because the last one was dropped due to an invalid 'Fixes' tag.
>>
>>  Btw, I don't know why the entire PR needs to be dropped if only a few
>> items have invalid tags? Can't we just dropped those?
> 
> Maybe Im doing something wrong in my side, the issue with the Fixes
> that is that sometimes they become invalid once I rebase on top of
> net-next, which, afaik, is necessary to detect already applied
> patches. Or is rebasing is not really necessary and should only be
> done once when rc1 is tagged?

AFAICT, rebasing is needed if you have local patches not present in the
tree that you are pulling from.

If you e.g. send a PR to net-next, including all the patches present in
your devel tree ATM, you could avoid the later rebase not applying any
patch in your tree until you merge net-next back.

Would that doable for you?

Thanks,

Paolo


^ permalink raw reply

* Re: [PATCH net v1 2/2] selftests: fib_nexthops: test stale has_v4 on nexthop replace
From: David Ahern @ 2026-04-13 14:47 UTC (permalink / raw)
  To: Jiayuan Chen, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Shuah Khan, linux-kernel, linux-kselftest
In-Reply-To: <20260413114522.147784-2-jiayuan.chen@linux.dev>

On 4/13/26 5:45 AM, Jiayuan Chen wrote:
> Add test cases that exercise the scenario where an IPv6 nexthop is
> replaced with an IPv4 nexthop while being part of a group. The group's
> has_v4 flag must be updated so that subsequent IPv6 route additions are
> properly rejected.
> 
> Two cases are covered:
>   1. Gateway nexthop replaced across families with an existing IPv6
>      route on the group (rejected by fib6_check_nh_list).
>   2. Blackhole nexthop replaced across families with no existing IPv6
>      route on the group (fib6_check_nh_list returns early) — this is
>      the path that triggers a NULL ptr deref without the kernel fix.
> 
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>  tools/testing/selftests/net/fib_nexthops.sh | 22 +++++++++++++++++++++
>  1 file changed, 22 insertions(+)
> 

Reviewed-by: David Ahern <dsahern@kernel.org>


^ permalink raw reply

* Re: [patch 17/38] ext4: Replace get_cycles() usage with ktime_get()
From: Arnd Bergmann @ 2026-04-13 14:46 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Theodore Ts'o, linux-ext4, x86, Baolu Lu, iommu,
	Michael Grzeschik, Netdev, linux-wireless, Herbert Xu,
	linux-crypto, Vlastimil Babka (SUSE), linux-mm, David Woodhouse,
	Bernie Thompson, linux-fbdev, Andrew Morton,
	Uladzislau Rezki (Sony), Marco Elver, Dmitry Vyukov, kasan-dev,
	Andrey Ryabinin, Thomas Sailer, linux-hams, Jason A . Donenfeld,
	Richard Henderson, linux-alpha, Russell King, linux-arm-kernel,
	Catalin Marinas, Huacai Chen, loongarch, Geert Uytterhoeven,
	linux-m68k, Dinh Nguyen, Jonas Bonn,
	linux-openrisc@vger.kernel.org, Helge Deller, linux-parisc,
	Michael Ellerman, linuxppc-dev, Paul Walmsley, linux-riscv,
	Heiko Carstens, linux-s390, David S . Miller, sparclinux
In-Reply-To: <20260410120318.727211419@kernel.org>

On Fri, Apr 10, 2026, at 14:19, Thomas Gleixner wrote:
> get_cycles() is not guaranteed to be functional on all systems/platforms
> and the values returned are unitless and not easy to map to something
> useful.
>
> Use ktime_get() instead, which provides nanosecond timestamps and is
> functional everywhere.
>
> This is part of a larger effort to limit get_cycles() usage to low level
> architecture code.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: "Theodore Ts'o" <tytso@mit.edu>
> Cc: linux-ext4@vger.kernel.org

I think this is technically an ABI chance, since the time
difference gets exported through procfs, but the new version
is clearly the right thing to do since it replaces a hardware
specific value with a portable one.

       Arnd

^ permalink raw reply

* Re: [PATCH net v1 1/2] nexthop: fix IPv6 route referencing IPv4 nexthop
From: David Ahern @ 2026-04-13 14:46 UTC (permalink / raw)
  To: Jiayuan Chen, netdev
  Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
	Simon Horman, Shuah Khan, linux-kernel, linux-kselftest
In-Reply-To: <20260413114522.147784-1-jiayuan.chen@linux.dev>

On 4/13/26 5:45 AM, Jiayuan Chen wrote:
> syzbot reported a panic [1] [2].
> 
> When an IPv6 nexthop is replaced with an IPv4 nexthop, the has_v4 flag
> of all groups containing this nexthop is not updated. This is because
> nh_group_v4_update is only called when replacing AF_INET to AF_INET6,
> but the reverse direction (AF_INET6 to AF_INET) is missed.
> 
> This allows a stale has_v4=false to bypass fib6_check_nexthop, causing
> IPv6 routes to be attached to groups that effectively contain only AF_INET
> members. Subsequent route lookups then call nexthop_fib6_nh() which
> returns NULL for the AF_INET member, leading to a NULL pointer
> dereference.
> 
> Fix by calling nh_group_v4_update whenever the family changes, not just
> AF_INET to AF_INET6.
> 
> Reproducer:
> 	# AF_INET6 blackhole
> 	ip -6 nexthop add id 1 blackhole
> 	# group with has_v4=false
> 	ip nexthop add id 100 group 1
> 	# replace with AF_INET (no -6), has_v4 stays false
> 	ip nexthop replace id 1 blackhole
> 	# pass stale has_v4 check
> 	ip -6 route add 2001:db8::/64 nhid 100
> 	# panic
> 	ping -6 2001:db8::1
> 
> [1] https://syzkaller.appspot.com/bug?id=e17283eb2f8dcf3dd9b47fe6f67a95f71faadad0
> [2] https://syzkaller.appspot.com/bug?id=8699b6ae54c9f35837d925686208402949e12ef3
> Fixes: 7bf4796dd099 ("nexthops: add support for replace")
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
>  net/ipv4/nexthop.c | 4 ++--
>  1 file changed, 2 insertions(+), 2 deletions(-)
> 


Reviewed-by: David Ahern <dsahern@kernel.org>


^ permalink raw reply

* Re: [PATCH v4] net/mlx5: Fix OOB access and stack information leak in PTP event handling
From: Leon Romanovsky @ 2026-04-13 14:46 UTC (permalink / raw)
  To: Prathamesh Deshpande
  Cc: Carolina Jubran, Saeed Mahameed, Richard Cochran, Tariq Toukan,
	Mark Bloch, netdev, linux-rdma, linux-kernel
In-Reply-To: <20260412000418.8415-1-prathameshdeshpande7@gmail.com>

On Sun, Apr 12, 2026 at 01:04:10AM +0100, Prathamesh Deshpande wrote:
> In mlx5_pps_event(), several critical issues were identified:
> 
> 1. The 'pin' index from the hardware event was used without bounds
>    checking to index 'pin_config' and 'pps_info->start'. Check against
>    MAX_PIN_NUM to prevent out-of-bounds access.

You were told more than once that this is impossible.

<...>

> +	if (WARN_ON_ONCE(pin >= MAX_PIN_NUM))
> +		return NOTIFY_OK;

Let's not add useless checks in fast path.

Thanks

^ permalink raw reply

* Re: [patch 38/38] treewide: Remove asm/timex.h includes from generic code
From: Arnd Bergmann @ 2026-04-13 14:45 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: x86, Baolu Lu, iommu, Michael Grzeschik, Netdev, linux-wireless,
	Herbert Xu, linux-crypto, Vlastimil Babka (SUSE), linux-mm,
	David Woodhouse, Bernie Thompson, linux-fbdev, Theodore Ts'o,
	linux-ext4, Andrew Morton, Uladzislau Rezki (Sony), Marco Elver,
	Dmitry Vyukov, kasan-dev, Andrey Ryabinin, Thomas Sailer,
	linux-hams, Jason A . Donenfeld, Richard Henderson, linux-alpha,
	Russell King, linux-arm-kernel, Catalin Marinas, Huacai Chen,
	loongarch, Geert Uytterhoeven, linux-m68k, Dinh Nguyen,
	Jonas Bonn, linux-openrisc@vger.kernel.org, Helge Deller,
	linux-parisc, Michael Ellerman, linuxppc-dev, Paul Walmsley,
	linux-riscv, Heiko Carstens, linux-s390, David S . Miller,
	sparclinux
In-Reply-To: <20260410120320.163559629@kernel.org>

On Fri, Apr 10, 2026, at 14:21, Thomas Gleixner wrote:
> asm/timex.h does not provide any functionality for non-architecture code
> anymore.
>
> Remove the asm-generic fallback and all references in include and source
> files along with the random_get_entropy() #ifdeffery in timex.h.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> ---
>  include/asm-generic/Kbuild  |    1 -
>  include/asm-generic/timex.h |   15 ---------------
>  include/linux/random.h      |    3 +++
>  include/linux/timex.h       |   26 --------------------------

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply

* Re: [patch 32/38] powerpc/spufs: Use mftb() directly
From: Arnd Bergmann @ 2026-04-13 14:43 UTC (permalink / raw)
  To: Thomas Gleixner, LKML
  Cc: Michael Ellerman, linuxppc-dev, x86, Baolu Lu, iommu,
	Michael Grzeschik, Netdev, linux-wireless, Herbert Xu,
	linux-crypto, Vlastimil Babka (SUSE), linux-mm, David Woodhouse,
	Bernie Thompson, linux-fbdev, Theodore Ts'o, linux-ext4,
	Andrew Morton, Uladzislau Rezki (Sony), Marco Elver,
	Dmitry Vyukov, kasan-dev, Andrey Ryabinin, Thomas Sailer,
	linux-hams, Jason A . Donenfeld, Richard Henderson, linux-alpha,
	Russell King, linux-arm-kernel, Catalin Marinas, Huacai Chen,
	loongarch, Geert Uytterhoeven, linux-m68k, Dinh Nguyen,
	Jonas Bonn, linux-openrisc@vger.kernel.org, Helge Deller,
	linux-parisc, Paul Walmsley, linux-riscv, Heiko Carstens,
	linux-s390, David S . Miller, sparclinux
In-Reply-To: <20260410120319.723429844@kernel.org>

On Fri, Apr 10, 2026, at 14:21, Thomas Gleixner wrote:
> There is no reason to indirect via get_cycles(), which is about to be
> removed.
>
> Use mftb() directly.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: linuxppc-dev@lists.ozlabs.org

Acked-by: Arnd Bergmann <arnd@arndb.de>

^ permalink raw reply

* Re: [PATCH 5.10.y] Bluetooth: SCO: Fix use-after-free in sco_recv_frame() due to missing sock_hold
From: Greg KH @ 2026-04-13 14:43 UTC (permalink / raw)
  To: Jianqiang kang
  Cc: stable, imv4bel, patches, linux-kernel, marcel, johan.hedberg,
	luiz.dentz, davem, kuba, linux-bluetooth, netdev, luiz.von.dentz
In-Reply-To: <20260409074429.740279-1-jianqkang@sina.cn>

On Thu, Apr 09, 2026 at 03:44:29PM +0800, Jianqiang kang wrote:
> From: Hyunwoo Kim <imv4bel@gmail.com>
> 
> [ Upstream commit 598dbba9919c5e36c54fe1709b557d64120cb94b ]
> 
> sco_recv_frame() reads conn->sk under sco_conn_lock() but immediately
> releases the lock without holding a reference to the socket. A concurrent
> close() can free the socket between the lock release and the subsequent
> sk->sk_state access, resulting in a use-after-free.
> 
> Other functions in the same file (sco_sock_timeout(), sco_conn_del())
> correctly use sco_sock_hold() to safely hold a reference under the lock.
> 
> Fix by using sco_sock_hold() to take a reference before releasing the
> lock, and adding sock_put() on all exit paths.
> 
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Hyunwoo Kim <imv4bel@gmail.com>
> Signed-off-by: Luiz Augusto von Dentz <luiz.von.dentz@intel.com>
> Signed-off-by: Jianqiang kang <jianqkang@sina.cn>
> ---
>  net/bluetooth/sco.c | 10 +++++++---
>  1 file changed, 7 insertions(+), 3 deletions(-)

This breaks the build, how was it tested?

confused,

greg k-h

^ permalink raw reply

* Re: [PATCH v2] net/sched: sch_cake: fix NAT destination port not being updated in cake_update_flowkeys
From: Toke Høiland-Jørgensen @ 2026-04-13 14:40 UTC (permalink / raw)
  To: Dudu Lu, netdev; +Cc: jhs, jiri, Dudu Lu
In-Reply-To: <20260413110041.44704-1-phx0fer@gmail.com>

Dudu Lu <phx0fer@gmail.com> writes:

> cake_update_flowkeys() is supposed to update the flow dissector keys
> with the NAT-translated addresses and ports from conntrack, so that
> CAKE's per-flow fairness correctly identifies post-NAT flows as
> belonging to the same connection.
>
> For the source port, this works correctly:
>     keys->ports.src = port;
>
> But for the destination port, the assignment is reversed:
>     port = keys->ports.dst;
>
> This means the NAT destination port is never updated in the flow keys.
> As a result, when multiple connections are NATed to the same destination,
> CAKE treats them as separate flows because the original (pre-NAT)
> destination ports differ. This breaks CAKE's NAT-aware flow isolation
> when using the "nat" mode.
>
> The bug was introduced in commit b0c19ed6088a ("sch_cake: Take advantage
> of skb->hash where appropriate") which refactored the original direct
> assignment into a compare-and-conditionally-update pattern, but wrote
> the destination port update backwards.
>
> Fix by reversing the assignment direction to match the source port
> pattern.
>
> Fixes: b0c19ed6088a ("sch_cake: Take advantage of skb->hash where appropriate")
> Signed-off-by: Dudu Lu <phx0fer@gmail.com>

Thank you for the fix!

Acked-by: Toke Høiland-Jørgensen <toke@toke.dk>

^ permalink raw reply

* [PATCH ipsec v2] xfrm: cleanup error path in xfrm_add_policy()
From: Deepanshu Kartikey @ 2026-04-13 14:35 UTC (permalink / raw)
  To: steffen.klassert, herbert, davem, edumazet, kuba, pabeni, horms,
	sd
  Cc: netdev, linux-kernel, Deepanshu Kartikey,
	syzbot+901d48e0b95aed4a2548

Replace the open-coded manual cleanup in the error path of
xfrm_add_policy() with xfrm_policy_destroy(), which already
handles all the necessary cleanup internally. This is consistent
with how xfrm_policy_construct() handles its own error paths.

The walk.dead flag must be set before calling xfrm_policy_destroy()
as required by BUG_ON(!policy->walk.dead).

Tested-by: syzbot+901d48e0b95aed4a2548@syzkaller.appspotmail.com
Signed-off-by: Deepanshu Kartikey <kartikey406@gmail.com>
---
v2:
  - Reworded commit message to reflect cleanup rather than bugfix
    as suggested by Sabrina Dubroca
  - Removed incorrect Fixes: and Closes: tags
  - Corrected subject prefix to "PATCH ipsec"
---
 net/xfrm/xfrm_user.c | 5 ++---
 1 file changed, 2 insertions(+), 3 deletions(-)

diff --git a/net/xfrm/xfrm_user.c b/net/xfrm/xfrm_user.c
index d56450f61669..ae144d1e4a65 100644
--- a/net/xfrm/xfrm_user.c
+++ b/net/xfrm/xfrm_user.c
@@ -2267,9 +2267,8 @@ static int xfrm_add_policy(struct sk_buff *skb, struct nlmsghdr *nlh,
 
 	if (err) {
 		xfrm_dev_policy_delete(xp);
-		xfrm_dev_policy_free(xp);
-		security_xfrm_policy_free(xp->security);
-		kfree(xp);
+		xp->walk.dead = 1;
+		xfrm_policy_destroy(xp);
 		return err;
 	}
 
-- 
2.43.0


^ permalink raw reply related

* Re: [PATCH] xfrm: fix memory leak in xfrm_add_policy()
From: Sabrina Dubroca @ 2026-04-13 14:34 UTC (permalink / raw)
  To: Deepanshu Kartikey
  Cc: steffen.klassert, herbert, davem, edumazet, kuba, pabeni, horms,
	leon, netdev, linux-kernel, syzbot+901d48e0b95aed4a2548
In-Reply-To: <CADhLXY5a7QCdp7NXC07fma905O8YJ4jPcbbdK0=kFgD3d2AF-A@mail.gmail.com>

2026-04-13, 19:58:53 +0530, Deepanshu Kartikey wrote:
> On Mon, Apr 13, 2026 at 7:03 PM Sabrina Dubroca <sd@queasysnail.net> wrote:
> >
> 
> > What is missing in the current code? "we have a better way to do this"
> > is not a bugfix, it's a clean up. The kmemleak report says that we're
> > leaking the xfrm_policy struct on this codepath, which doesn't make
> > sense, that's covered by the existing kfree(xp).
> >
> > Also, please use "PATCH ipsec" for fixes to net/xfrm and the rest of
> > the IPsec implementation.
> >
> > --
> > Sabrina
> 
> Hi Sabrina,
> 
> Thanks for the review!
> 
> You are right, the existing kfree(xp) already covers the struct
> itself, so my commit message was incorrect in claiming a memory
> leak fix. I will resend this as a cleanup patch to replace the
> open-coded manual cleanup with xfrm_policy_destroy(), which is
> more consistent with xfrm_policy_construct() error handling.

Ok. Then you should wait 2 weeks until the merge window is over:
https://lore.kernel.org/netdev/20260412142250.131bf997@kernel.org/

and use "[PATCH ipsec-next]" as prefix for the cleanup patch (+ drop
the syzbot references).

> I am also separately investigating the root cause of the actual
> kmemleak report and will send a proper fix once identified.

Ok, thanks.

-- 
Sabrina

^ permalink raw reply

* Re: [GIT PULL] bluetooth-next 2026-04-13
From: Paolo Abeni @ 2026-04-13 14:34 UTC (permalink / raw)
  To: Luiz Augusto von Dentz; +Cc: davem, kuba, linux-bluetooth, netdev
In-Reply-To: <CABBYNZK5mMbuzvjAquy49vSk0pLRnk_s8oqEut1se4DToURn9A@mail.gmail.com>

On 4/13/26 4:11 PM, Luiz Augusto von Dentz wrote:
> On Mon, Apr 13, 2026 at 9:41 AM Paolo Abeni <pabeni@redhat.com> wrote:
>> On 4/13/26 3:22 PM, Luiz Augusto von Dentz wrote:
>>> The following changes since commit 42f9b4c6ef19e71d2c7d9bfd3c5037d4fe434ad7:
>>>
>>>   tools: ynl: tests: fix leading space on Makefile target (2026-04-09 20:41:40 -0700)
>>>
>>> are available in the Git repository at:
>>>
>>>   git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git tags/for-net-next-2026-04-13
>>>
>>> for you to fetch changes up to c347ca17d62a32c25564fee0ca3a2a7bc2d5fd6f:
>>>
>>>   Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling (2026-04-13 09:19:42 -0400)
>>>
>>> ----------------------------------------------------------------
>>> bluetooth-next pull request for net-next:
>>
>> Net-next is closed for the merge window. I guess Jakub could still
>> consider merging this, but unless you want it very, very badly, I hope
>> it can just be postponed, as the PW queue is already long.
> 
> This update includes quite a few new hardware supports. This is a
> resend because the last one was dropped due to an invalid 'Fixes' tag.
> 
>  Btw, I don't know why the entire PR needs to be dropped if only a few
> items have invalid tags? Can't we just dropped those?

That's the problem with PR: AFAIK we can just pull whatever the
submitter published. To do any edit, including dropping patches, we must
not do a git pull, and instead applying the patches individually
(meaning they will get the net maintainer SoB, too, which in turn will
make little sense).

/P


^ permalink raw reply

* Re: [PATCH net 1/2] netfilter: skip recording stale or retransmitted INIT
From: Florian Westphal @ 2026-04-13 14:23 UTC (permalink / raw)
  To: Xin Long
  Cc: network dev, linux-sctp, davem, kuba, Eric Dumazet, Paolo Abeni,
	Simon Horman, Marcelo Ricardo Leitner, Yi Chen
In-Reply-To: <CADvbK_f1Cvqx0_-J1jGaT865eWiW2ZHsJT8EkN6kr21j88Y9kQ@mail.gmail.com>

Xin Long <lucien.xin@gmail.com> wrote:
> On Sat, Apr 11, 2026 at 4:16 PM Florian Westphal <fw@strlen.de> wrote:
> > Xin Long <lucien.xin@gmail.com> wrote:
> >
> > > diff --git a/net/netfilter/nf_conntrack_proto_sctp.c b/net/netfilter/nf_conntrack_proto_sctp.c
> > > index 645d2c43ebf7..7e10fa65cbdd 100644
> > > --- a/net/netfilter/nf_conntrack_proto_sctp.c
> > > +++ b/net/netfilter/nf_conntrack_proto_sctp.c
> > > @@ -466,9 +466,13 @@ int nf_conntrack_sctp_packet(struct nf_conn *ct,
> > >                       if (!ih)
> > >                               goto out_unlock;
> > >
> > > -                     if (ct->proto.sctp.init[dir] && ct->proto.sctp.init[!dir])
> > > -                             ct->proto.sctp.init[!dir] = 0;
> > > -                     ct->proto.sctp.init[dir] = 1;
> > > +                     /* Do not record INIT matching peer vtag (stale or retransmitted INIT). */
> > > +                     if (old_state == SCTP_CONNTRACK_NONE ||
> > > +                         ct->proto.sctp.vtag[!dir] != ih->init_tag) {
> >
> > Should    ct->proto.sctp.vtag[!dir] == ih->init_tag case also
> > set ignore = true?
> 
> It should for a stale INIT, but not for a retransmitted one. At this point,
> though, we don't reliably distinguish between the two.
> 
> Also, as this patch only aims to prevent updating ct->proto.sctp.init[]
> (introduced in 8e56b063c865) in this scenario, it’s safer to avoid
> changing other behavior.

Alright. I'm fine with this going in via net directly:

Acked-by: Florian Westphal <fw@strlen.de>


^ permalink raw reply

* Re: [PATCH] xfrm: fix memory leak in xfrm_add_policy()
From: Deepanshu Kartikey @ 2026-04-13 14:28 UTC (permalink / raw)
  To: Sabrina Dubroca
  Cc: steffen.klassert, herbert, davem, edumazet, kuba, pabeni, horms,
	leon, netdev, linux-kernel, syzbot+901d48e0b95aed4a2548
In-Reply-To: <adzwiKOPptwbNp7Y@krikkit>

On Mon, Apr 13, 2026 at 7:03 PM Sabrina Dubroca <sd@queasysnail.net> wrote:
>

> What is missing in the current code? "we have a better way to do this"
> is not a bugfix, it's a clean up. The kmemleak report says that we're
> leaking the xfrm_policy struct on this codepath, which doesn't make
> sense, that's covered by the existing kfree(xp).
>
> Also, please use "PATCH ipsec" for fixes to net/xfrm and the rest of
> the IPsec implementation.
>
> --
> Sabrina

Hi Sabrina,

Thanks for the review!

You are right, the existing kfree(xp) already covers the struct
itself, so my commit message was incorrect in claiming a memory
leak fix. I will resend this as a cleanup patch to replace the
open-coded manual cleanup with xfrm_policy_destroy(), which is
more consistent with xfrm_policy_construct() error handling.

I am also separately investigating the root cause of the actual
kmemleak report and will send a proper fix once identified.

Sorry for the noise and thanks again for the guidance.

Deepanshu

^ permalink raw reply

* Re: [syzbot] [mptcp?] possible deadlock in mptcp_pm_nl_set_flags
From: Matthieu Baerts @ 2026-04-13 14:26 UTC (permalink / raw)
  To: syzbot
  Cc: davem, edumazet, geliang, horms, kuba, linux-kernel, martineau,
	mptcp, netdev, pabeni, syzkaller-bugs
In-Reply-To: <69d90c16.050a0220.adfb3.000a.GAE@google.com>

Hello,

On 10/04/2026 16:41, syzbot wrote:
> Hello,
> 
> syzbot found the following issue on:
> 
> HEAD commit:    591cd656a1bf Linux 7.0-rc7
> git tree:       upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=1779ce06580000
> kernel config:  https://syzkaller.appspot.com/x/.config?x=5a3e5e8c17cc174e
> dashboard link: https://syzkaller.appspot.com/bug?extid=dfa28bb6444d8f169cbb
> compiler:       gcc (Debian 14.2.0-19) 14.2.0, GNU ld (GNU Binutils for Debian) 2.44
> 
> Unfortunately, I don't have any reproducer for this issue yet.
> 
> Downloadable assets:
> disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/d900f083ada3/non_bootable_disk-591cd656.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/3e99d0e29df0/vmlinux-591cd656.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/5877fee0a056/bzImage-591cd656.xz
> 
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: syzbot+dfa28bb6444d8f169cbb@syzkaller.appspotmail.com
> 
> netlink: 8 bytes leftover after parsing attributes in process `syz.8.6915'.
> netlink: 8 bytes leftover after parsing attributes in process `syz.8.6915'.
> ======================================================
> WARNING: possible circular locking dependency detected
> syzkaller #0 Tainted: G             L     
> ------------------------------------------------------
> syz.8.6915/29783 is trying to acquire lock:
> ffffffff8e9ab8a0 (fs_reclaim){+.+.}-{0:0}, at: might_alloc include/linux/sched/mm.h:317 [inline]
> ffffffff8e9ab8a0 (fs_reclaim){+.+.}-{0:0}, at: slab_pre_alloc_hook mm/slub.c:4489 [inline]
> ffffffff8e9ab8a0 (fs_reclaim){+.+.}-{0:0}, at: slab_alloc_node mm/slub.c:4843 [inline]
> ffffffff8e9ab8a0 (fs_reclaim){+.+.}-{0:0}, at: kmem_cache_alloc_lru_noprof+0x51/0x6e0 mm/slub.c:4885
> 
> but task is already holding lock:
> ffff888034761ae0 (sk_lock-AF_INET){+.+.}-{0:0}, at: lock_sock include/net/sock.h:1709 [inline]
> ffff888034761ae0 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_pm_nl_set_flags_all net/mptcp/pm_kernel.c:1482 [inline]
> ffff888034761ae0 (sk_lock-AF_INET){+.+.}-{0:0}, at: mptcp_pm_nl_set_flags+0x605/0xd30 net/mptcp/pm_kernel.c:1551
> 
> which lock already depends on the new lock.
> 
> 
> the existing dependency chain (in reverse order) is:
> 
> -> #6 (sk_lock-AF_INET){+.+.}-{0:0}:
>        lock_sock_nested+0x41/0xf0 net/core/sock.c:3780
>        lock_sock include/net/sock.h:1709 [inline]
>        inet_shutdown+0x67/0x410 net/ipv4/af_inet.c:919
>        nbd_mark_nsock_dead+0xae/0x5c0 drivers/block/nbd.c:318
>        recv_work+0x5fb/0x8c0 drivers/block/nbd.c:1021
>        process_one_work+0xa23/0x19a0 kernel/workqueue.c:3276
>        process_scheduled_works kernel/workqueue.c:3359 [inline]
>        worker_thread+0x5ef/0xe50 kernel/workqueue.c:3440
>        kthread+0x370/0x450 kernel/kthread.c:436
>        ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
>        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> 
> -> #5 (&nsock->tx_lock){+.+.}-{4:4}:
>        __mutex_lock_common kernel/locking/mutex.c:614 [inline]
>        __mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
>        nbd_handle_cmd drivers/block/nbd.c:1143 [inline]
>        nbd_queue_rq+0x428/0x1080 drivers/block/nbd.c:1207
>        blk_mq_dispatch_rq_list+0x422/0x1e70 block/blk-mq.c:2148
>        __blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
>        blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
>        __blk_mq_sched_dispatch_requests+0xcea/0x1620 block/blk-mq-sched.c:307
>        blk_mq_sched_dispatch_requests+0xd7/0x1c0 block/blk-mq-sched.c:329
>        blk_mq_run_hw_queue+0x23c/0x670 block/blk-mq.c:2386
>        blk_mq_dispatch_list+0x51d/0x1360 block/blk-mq.c:2949
>        blk_mq_flush_plug_list block/blk-mq.c:2997 [inline]
>        blk_mq_flush_plug_list+0x130/0x600 block/blk-mq.c:2969
>        __blk_flush_plug+0x2c4/0x4b0 block/blk-core.c:1230
>        blk_finish_plug block/blk-core.c:1257 [inline]
>        __submit_bio+0x584/0x6c0 block/blk-core.c:649
>        __submit_bio_noacct_mq block/blk-core.c:722 [inline]
>        submit_bio_noacct_nocheck+0x562/0xc10 block/blk-core.c:753
>        submit_bio_noacct+0xd17/0x2010 block/blk-core.c:884
>        blk_crypto_submit_bio include/linux/blk-crypto.h:203 [inline]
>        submit_bh_wbc+0x59c/0x770 fs/buffer.c:2821
>        submit_bh fs/buffer.c:2826 [inline]
>        block_read_full_folio+0x4c8/0x8e0 fs/buffer.c:2458
>        filemap_read_folio+0xfc/0x3b0 mm/filemap.c:2501
>        do_read_cache_folio+0x2d7/0x6b0 mm/filemap.c:4101
>        read_mapping_folio include/linux/pagemap.h:1017 [inline]
>        read_part_sector+0xd1/0x370 block/partitions/core.c:723
>        adfspart_check_ICS+0x93/0x910 block/partitions/acorn.c:360
>        check_partition block/partitions/core.c:142 [inline]
>        blk_add_partitions block/partitions/core.c:590 [inline]
>        bdev_disk_changed+0x7f8/0xc80 block/partitions/core.c:694
>        blkdev_get_whole+0x187/0x290 block/bdev.c:764
>        bdev_open+0x2c7/0xe40 block/bdev.c:973
>        blkdev_open+0x34e/0x4f0 block/fops.c:697
>        do_dentry_open+0x6d8/0x1660 fs/open.c:949
>        vfs_open+0x82/0x3f0 fs/open.c:1081
>        do_open fs/namei.c:4677 [inline]
>        path_openat+0x208c/0x31a0 fs/namei.c:4836
>        do_file_open+0x20e/0x430 fs/namei.c:4865
>        do_sys_openat2+0x10d/0x1e0 fs/open.c:1366
>        do_sys_open fs/open.c:1372 [inline]
>        __do_sys_openat fs/open.c:1388 [inline]
>        __se_sys_openat fs/open.c:1383 [inline]
>        __x64_sys_openat+0x12d/0x210 fs/open.c:1383
>        do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>        do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
>        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> -> #4 (&cmd->lock){+.+.}-{4:4}:
>        __mutex_lock_common kernel/locking/mutex.c:614 [inline]
>        __mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
>        nbd_queue_rq+0xba/0x1080 drivers/block/nbd.c:1199
>        blk_mq_dispatch_rq_list+0x422/0x1e70 block/blk-mq.c:2148
>        __blk_mq_do_dispatch_sched block/blk-mq-sched.c:168 [inline]
>        blk_mq_do_dispatch_sched block/blk-mq-sched.c:182 [inline]
>        __blk_mq_sched_dispatch_requests+0xcea/0x1620 block/blk-mq-sched.c:307
>        blk_mq_sched_dispatch_requests+0xd7/0x1c0 block/blk-mq-sched.c:329
>        blk_mq_run_hw_queue+0x23c/0x670 block/blk-mq.c:2386
>        blk_mq_dispatch_list+0x51d/0x1360 block/blk-mq.c:2949
>        blk_mq_flush_plug_list block/blk-mq.c:2997 [inline]
>        blk_mq_flush_plug_list+0x130/0x600 block/blk-mq.c:2969
>        __blk_flush_plug+0x2c4/0x4b0 block/blk-core.c:1230
>        blk_finish_plug block/blk-core.c:1257 [inline]
>        __submit_bio+0x584/0x6c0 block/blk-core.c:649
>        __submit_bio_noacct_mq block/blk-core.c:722 [inline]
>        submit_bio_noacct_nocheck+0x562/0xc10 block/blk-core.c:753
>        submit_bio_noacct+0xd17/0x2010 block/blk-core.c:884
>        blk_crypto_submit_bio include/linux/blk-crypto.h:203 [inline]
>        submit_bh_wbc+0x59c/0x770 fs/buffer.c:2821
>        submit_bh fs/buffer.c:2826 [inline]
>        block_read_full_folio+0x4c8/0x8e0 fs/buffer.c:2458
>        filemap_read_folio+0xfc/0x3b0 mm/filemap.c:2501
>        do_read_cache_folio+0x2d7/0x6b0 mm/filemap.c:4101
>        read_mapping_folio include/linux/pagemap.h:1017 [inline]
>        read_part_sector+0xd1/0x370 block/partitions/core.c:723
>        adfspart_check_ICS+0x93/0x910 block/partitions/acorn.c:360
>        check_partition block/partitions/core.c:142 [inline]
>        blk_add_partitions block/partitions/core.c:590 [inline]
>        bdev_disk_changed+0x7f8/0xc80 block/partitions/core.c:694
>        blkdev_get_whole+0x187/0x290 block/bdev.c:764
>        bdev_open+0x2c7/0xe40 block/bdev.c:973
>        blkdev_open+0x34e/0x4f0 block/fops.c:697
>        do_dentry_open+0x6d8/0x1660 fs/open.c:949
>        vfs_open+0x82/0x3f0 fs/open.c:1081
>        do_open fs/namei.c:4677 [inline]
>        path_openat+0x208c/0x31a0 fs/namei.c:4836
>        do_file_open+0x20e/0x430 fs/namei.c:4865
>        do_sys_openat2+0x10d/0x1e0 fs/open.c:1366
>        do_sys_open fs/open.c:1372 [inline]
>        __do_sys_openat fs/open.c:1388 [inline]
>        __se_sys_openat fs/open.c:1383 [inline]
>        __x64_sys_openat+0x12d/0x210 fs/open.c:1383
>        do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>        do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
>        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> -> #3 (set->srcu){.+.+}-{0:0}:
>        srcu_lock_sync include/linux/srcu.h:199 [inline]
>        __synchronize_srcu+0xa2/0x300 kernel/rcu/srcutree.c:1481
>        blk_mq_wait_quiesce_done block/blk-mq.c:284 [inline]
>        blk_mq_wait_quiesce_done block/blk-mq.c:281 [inline]
>        blk_mq_quiesce_queue block/blk-mq.c:304 [inline]
>        blk_mq_quiesce_queue+0x149/0x1c0 block/blk-mq.c:299
>        elevator_switch+0x17b/0x7e0 block/elevator.c:576
>        elevator_change+0x352/0x530 block/elevator.c:681
>        elevator_set_default+0x29e/0x360 block/elevator.c:754
>        blk_register_queue+0x412/0x590 block/blk-sysfs.c:946
>        __add_disk+0x73f/0xe40 block/genhd.c:528
>        add_disk_fwnode+0x118/0x5c0 block/genhd.c:597
>        add_disk include/linux/blkdev.h:785 [inline]
>        nbd_dev_add+0x77a/0xb10 drivers/block/nbd.c:1984
>        nbd_init+0x291/0x2b0 drivers/block/nbd.c:2692
>        do_one_initcall+0x11d/0x760 init/main.c:1382
>        do_initcall_level init/main.c:1444 [inline]
>        do_initcalls init/main.c:1460 [inline]
>        do_basic_setup init/main.c:1479 [inline]
>        kernel_init_freeable+0x6e5/0x7a0 init/main.c:1692
>        kernel_init+0x1f/0x1e0 init/main.c:1582
>        ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
>        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> 
> -> #2 (&q->elevator_lock){+.+.}-{4:4}:
>        __mutex_lock_common kernel/locking/mutex.c:614 [inline]
>        __mutex_lock+0x1a2/0x1b90 kernel/locking/mutex.c:776
>        elevator_change+0x1bc/0x530 block/elevator.c:679
>        elevator_set_none+0x92/0xf0 block/elevator.c:769
>        blk_mq_elv_switch_none block/blk-mq.c:5110 [inline]
>        __blk_mq_update_nr_hw_queues block/blk-mq.c:5155 [inline]
>        blk_mq_update_nr_hw_queues+0x4c1/0x15f0 block/blk-mq.c:5220
>        nbd_start_device+0x1a6/0xbd0 drivers/block/nbd.c:1489
>        nbd_genl_connect+0xff2/0x1a40 drivers/block/nbd.c:2239
>        genl_family_rcv_msg_doit+0x214/0x300 net/netlink/genetlink.c:1114
>        genl_family_rcv_msg net/netlink/genetlink.c:1194 [inline]
>        genl_rcv_msg+0x560/0x800 net/netlink/genetlink.c:1209
>        netlink_rcv_skb+0x159/0x420 net/netlink/af_netlink.c:2550
>        genl_rcv+0x28/0x40 net/netlink/genetlink.c:1218
>        netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
>        netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
>        netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
>        sock_sendmsg_nosec net/socket.c:727 [inline]
>        __sock_sendmsg net/socket.c:742 [inline]
>        ____sys_sendmsg+0x9e1/0xb70 net/socket.c:2592
>        ___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
>        __sys_sendmsg+0x170/0x220 net/socket.c:2678
>        do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>        do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
>        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> -> #1 (&q->q_usage_counter(io)#51){++++}-{0:0}:
>        blk_alloc_queue+0x610/0x790 block/blk-core.c:461
>        blk_mq_alloc_queue+0x174/0x290 block/blk-mq.c:4429
>        __blk_mq_alloc_disk+0x29/0x120 block/blk-mq.c:4476
>        nbd_dev_add+0x492/0xb10 drivers/block/nbd.c:1954
>        nbd_init+0x291/0x2b0 drivers/block/nbd.c:2692
>        do_one_initcall+0x11d/0x760 init/main.c:1382
>        do_initcall_level init/main.c:1444 [inline]
>        do_initcalls init/main.c:1460 [inline]
>        do_basic_setup init/main.c:1479 [inline]
>        kernel_init_freeable+0x6e5/0x7a0 init/main.c:1692
>        kernel_init+0x1f/0x1e0 init/main.c:1582
>        ret_from_fork+0x754/0xd80 arch/x86/kernel/process.c:158
>        ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:245
> 
> -> #0 (fs_reclaim){+.+.}-{0:0}:
>        check_prev_add kernel/locking/lockdep.c:3165 [inline]
>        check_prevs_add kernel/locking/lockdep.c:3284 [inline]
>        validate_chain kernel/locking/lockdep.c:3908 [inline]
>        __lock_acquire+0x14b8/0x2630 kernel/locking/lockdep.c:5237
>        lock_acquire kernel/locking/lockdep.c:5868 [inline]
>        lock_acquire+0x1cf/0x380 kernel/locking/lockdep.c:5825
>        __fs_reclaim_acquire mm/page_alloc.c:4348 [inline]
>        fs_reclaim_acquire+0xc4/0x100 mm/page_alloc.c:4362
>        might_alloc include/linux/sched/mm.h:317 [inline]
>        slab_pre_alloc_hook mm/slub.c:4489 [inline]
>        slab_alloc_node mm/slub.c:4843 [inline]
>        kmem_cache_alloc_lru_noprof+0x51/0x6e0 mm/slub.c:4885
>        sock_alloc_inode+0x25/0x1c0 net/socket.c:322
>        alloc_inode+0x68/0x250 fs/inode.c:347
>        new_inode_pseudo include/linux/fs.h:3003 [inline]
>        sock_alloc+0x44/0x280 net/socket.c:637
>        __sock_create+0xc2/0x860 net/socket.c:1569
>        mptcp_subflow_create_socket+0xec/0xa30 net/mptcp/subflow.c:1790
>        __mptcp_subflow_connect+0x3c6/0x1480 net/mptcp/subflow.c:1631
>        mptcp_pm_create_subflow_or_signal_addr+0xc3e/0x18d0 net/mptcp/pm_kernel.c:416
>        mptcp_pm_nl_fullmesh net/mptcp/pm_kernel.c:1460 [inline]
>        mptcp_pm_nl_set_flags_all net/mptcp/pm_kernel.c:1487 [inline]
>        mptcp_pm_nl_set_flags+0x814/0xd30 net/mptcp/pm_kernel.c:1551
>        mptcp_pm_set_flags net/mptcp/pm_netlink.c:277 [inline]
>        mptcp_pm_nl_set_flags_doit+0x1b0/0x290 net/mptcp/pm_netlink.c:282
>        genl_family_rcv_msg_doit+0x214/0x300 net/netlink/genetlink.c:1114
>        genl_family_rcv_msg net/netlink/genetlink.c:1194 [inline]
>        genl_rcv_msg+0x560/0x800 net/netlink/genetlink.c:1209
>        netlink_rcv_skb+0x159/0x420 net/netlink/af_netlink.c:2550
>        genl_rcv+0x28/0x40 net/netlink/genetlink.c:1218
>        netlink_unicast_kernel net/netlink/af_netlink.c:1318 [inline]
>        netlink_unicast+0x5aa/0x870 net/netlink/af_netlink.c:1344
>        netlink_sendmsg+0x8b0/0xda0 net/netlink/af_netlink.c:1894
>        sock_sendmsg_nosec net/socket.c:727 [inline]
>        __sock_sendmsg net/socket.c:742 [inline]
>        ____sys_sendmsg+0x9e1/0xb70 net/socket.c:2592
>        ___sys_sendmsg+0x190/0x1e0 net/socket.c:2646
>        __sys_sendmsg+0x170/0x220 net/socket.c:2678
>        do_syscall_x64 arch/x86/entry/syscall_64.c:63 [inline]
>        do_syscall_64+0x106/0xf80 arch/x86/entry/syscall_64.c:94
>        entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> other info that might help us debug this:
> 
> Chain exists of:
>   fs_reclaim --> &nsock->tx_lock --> sk_lock-AF_INET
> 
>  Possible unsafe locking scenario:
> 
>        CPU0                    CPU1
>        ----                    ----
>   lock(sk_lock-AF_INET);
>                                lock(&nsock->tx_lock);
>                                lock(sk_lock-AF_INET);
>   lock(fs_reclaim);
> 
>  *** DEADLOCK ***

Same here, if I'm not mistaken, it looks like this issue is also due to
nbd introducing a lockdep dependency between reclaim and af_socket, and
this is similar to a previous report:

#syz dup: [syzbot] [mptcp?] possible deadlock in mptcp_subflow_create_socket (2)

If that's not correct, please induplicate it.

I'm not sure how MPTCP is being involved here, I guess it should not and
a simple fix is to forbid it. But maybe this issue is not specific to
MPTCP?

Cheers,
Matt
-- 
Sponsored by the NGI0 Core fund.


^ permalink raw reply

* Re: [PATCH v3 net-next 13/15] net/sched: sch_cake: annotate data-races in cake_dump_stats()
From: Toke Høiland-Jørgensen @ 2026-04-13 14:23 UTC (permalink / raw)
  To: Eric Dumazet
  Cc: David S . Miller, Jakub Kicinski, Paolo Abeni, Simon Horman,
	Jamal Hadi Salim, Jiri Pirko, netdev, eric.dumazet
In-Reply-To: <CANn89iJbFUm5NF1Q6WnB17Ufdg8PyFX4msJLVVs4PKRz6XW6yw@mail.gmail.com>

Eric Dumazet <edumazet@google.com> writes:

> On Mon, Apr 13, 2026 at 5:07 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Eric Dumazet <edumazet@google.com> writes:
>>
>> > cake_dump_stats() and cake_dump_class_stats() run without qdisc
>> > spinlock being held.
>> >
>> > Add READ_ONCE()/WRITE_ONCE() annotations.
>> >
>> > Fixes: 046f6fd5daef ("sched: Add Common Applications Kept Enhanced (cake) qdisc")
>> > Signed-off-by: Eric Dumazet <edumazet@google.com>
>> > Cc: "Toke Høiland-Jørgensen" <toke@toke.dk>
>> > ---
>> >  net/sched/sch_cake.c | 404 ++++++++++++++++++++++++-------------------
>> >  1 file changed, 225 insertions(+), 179 deletions(-)
>>
>> One of these diffstats is not like the others - thanks for tackling this :)
>>
>> A few nits below:
>>
>> > diff --git a/net/sched/sch_cake.c b/net/sched/sch_cake.c
>> > index 32e672820c00a88c6d8fe77a6308405e016525ea..f523f0aa4d830e9d3ec4d43bb123e1dc4f8f289d 100644
>> > --- a/net/sched/sch_cake.c
>> > +++ b/net/sched/sch_cake.c
>> > @@ -399,14 +399,14 @@ static void cake_configure_rates(struct Qdisc *sch, u64 rate, bool rate_adjust);
>> >   * Here, invsqrt is a fixed point number (< 1.0), 32bit mantissa, aka Q0.32
>> >   */
>> >
>> > -static void cobalt_newton_step(struct cobalt_vars *vars)
>> > +static void cobalt_newton_step(struct cobalt_vars *vars, u32 count)
>> >  {
>> >       u32 invsqrt, invsqrt2;
>> >       u64 val;
>> >
>> >       invsqrt = vars->rec_inv_sqrt;
>> >       invsqrt2 = ((u64)invsqrt * invsqrt) >> 32;
>> > -     val = (3LL << 32) - ((u64)vars->count * invsqrt2);
>> > +     val = (3LL << 32) - ((u64)count * invsqrt2);
>> >
>> >       val >>= 2; /* avoid overflow in following multiply */
>> >       val = (val * invsqrt) >> (32 - 2 + 1);
>> > @@ -414,12 +414,12 @@ static void cobalt_newton_step(struct cobalt_vars *vars)
>> >       vars->rec_inv_sqrt = val;
>> >  }
>> >
>> > -static void cobalt_invsqrt(struct cobalt_vars *vars)
>> > +static void cobalt_invsqrt(struct cobalt_vars *vars, u32 count)
>> >  {
>> > -     if (vars->count < REC_INV_SQRT_CACHE)
>> > -             vars->rec_inv_sqrt = inv_sqrt_cache[vars->count];
>> > +     if (count < REC_INV_SQRT_CACHE)
>> > +             vars->rec_inv_sqrt = inv_sqrt_cache[count];
>> >       else
>> > -             cobalt_newton_step(vars);
>> > +             cobalt_newton_step(vars, count);
>> >  }
>> >
>> >  static void cobalt_vars_init(struct cobalt_vars *vars)
>> > @@ -449,16 +449,19 @@ static bool cobalt_queue_full(struct cobalt_vars *vars,
>> >       bool up = false;
>> >
>> >       if (ktime_to_ns(ktime_sub(now, vars->blue_timer)) > p->target) {
>> > -             up = !vars->p_drop;
>> > -             vars->p_drop += p->p_inc;
>> > -             if (vars->p_drop < p->p_inc)
>> > -                     vars->p_drop = ~0;
>> > -             vars->blue_timer = now;
>> > -     }
>> > -     vars->dropping = true;
>> > -     vars->drop_next = now;
>> > +             u32 p_drop = vars->p_drop;
>> > +
>> > +             up = !p_drop;
>> > +             p_drop += p->p_inc;
>> > +             if (p_drop < p->p_inc)
>> > +                     p_drop = ~0;
>> > +             WRITE_ONCE(vars->p_drop, p_drop);
>> > +             WRITE_ONCE(vars->blue_timer, now);
>> > +     }
>> > +     WRITE_ONCE(vars->dropping, true);
>> > +     WRITE_ONCE(vars->drop_next, now);
>> >       if (!vars->count)
>> > -             vars->count = 1;
>> > +             WRITE_ONCE(vars->count, 1);
>> >
>> >       return up;
>> >  }
>> > @@ -474,21 +477,25 @@ static bool cobalt_queue_empty(struct cobalt_vars *vars,
>> >
>> >       if (vars->p_drop &&
>> >           ktime_to_ns(ktime_sub(now, vars->blue_timer)) > p->target) {
>> > -             if (vars->p_drop < p->p_dec)
>> > -                     vars->p_drop = 0;
>> > +             u32 p_drop = vars->p_drop;
>> > +
>> > +             if (p_drop < p->p_dec)
>> > +                     p_drop = 0;
>> >               else
>> > -                     vars->p_drop -= p->p_dec;
>> > -             vars->blue_timer = now;
>> > -             down = !vars->p_drop;
>> > +                     p_drop -= p->p_dec;
>> > +             WRITE_ONCE(vars->p_drop, p_drop);
>> > +             WRITE_ONCE(vars->blue_timer, now);
>> > +             down = !p_drop;
>> >       }
>> > -     vars->dropping = false;
>> > +     WRITE_ONCE(vars->dropping, false);
>> >
>> >       if (vars->count && ktime_to_ns(ktime_sub(now, vars->drop_next)) >= 0) {
>> > -             vars->count--;
>> > -             cobalt_invsqrt(vars);
>> > -             vars->drop_next = cobalt_control(vars->drop_next,
>> > -                                              p->interval,
>> > -                                              vars->rec_inv_sqrt);
>> > +             WRITE_ONCE(vars->count, vars->count - 1);
>> > +             cobalt_invsqrt(vars, vars->count);
>> > +             WRITE_ONCE(vars->drop_next,
>> > +                        cobalt_control(vars->drop_next,
>> > +                                       p->interval,
>> > +                                       vars->rec_inv_sqrt));
>> >       }
>> >
>> >       return down;
>> > @@ -507,6 +514,7 @@ static enum qdisc_drop_reason cobalt_should_drop(struct cobalt_vars *vars,
>> >       bool next_due, over_target;
>> >       ktime_t schedule;
>> >       u64 sojourn;
>> > +     u32 count;
>> >
>> >  /* The 'schedule' variable records, in its sign, whether 'now' is before or
>> >   * after 'drop_next'.  This allows 'drop_next' to be updated before the next
>> > @@ -528,45 +536,50 @@ static enum qdisc_drop_reason cobalt_should_drop(struct cobalt_vars *vars,
>> >       over_target = sojourn > p->target &&
>> >                     sojourn > p->mtu_time * bulk_flows * 2 &&
>> >                     sojourn > p->mtu_time * 4;
>> > -     next_due = vars->count && ktime_to_ns(schedule) >= 0;
>> > +     count = vars->count;
>> > +     next_due = count && ktime_to_ns(schedule) >= 0;
>> >
>> >       vars->ecn_marked = false;
>> >
>> >       if (over_target) {
>> >               if (!vars->dropping) {
>> > -                     vars->dropping = true;
>> > -                     vars->drop_next = cobalt_control(now,
>> > -                                                      p->interval,
>> > -                                                      vars->rec_inv_sqrt);
>> > +                     WRITE_ONCE(vars->dropping, true);
>> > +                     WRITE_ONCE(vars->drop_next,
>> > +                                cobalt_control(now,
>> > +                                               p->interval,
>> > +                                               vars->rec_inv_sqrt));
>> >               }
>> > -             if (!vars->count)
>> > -                     vars->count = 1;
>> > +             if (!count)
>> > +                     count = 1;
>> >       } else if (vars->dropping) {
>> > -             vars->dropping = false;
>> > +             WRITE_ONCE(vars->dropping, false);
>> >       }
>> >
>> >       if (next_due && vars->dropping) {
>> >               /* Use ECN mark if possible, otherwise drop */
>> > -             if (!(vars->ecn_marked = INET_ECN_set_ce(skb)))
>> > +             vars->ecn_marked = INET_ECN_set_ce(skb);
>> > +             if (!vars->ecn_marked)
>> >                       reason = QDISC_DROP_CONGESTED;
>> >
>> > -             vars->count++;
>> > -             if (!vars->count)
>> > -                     vars->count--;
>> > -             cobalt_invsqrt(vars);
>> > -             vars->drop_next = cobalt_control(vars->drop_next,
>> > -                                              p->interval,
>> > -                                              vars->rec_inv_sqrt);
>> > +             count++;
>> > +             if (!count)
>> > +                     count--;
>> > +             cobalt_invsqrt(vars, count);
>> > +             WRITE_ONCE(vars->drop_next,
>> > +                        cobalt_control(vars->drop_next,
>> > +                                       p->interval,
>> > +                                       vars->rec_inv_sqrt));
>> >               schedule = ktime_sub(now, vars->drop_next);
>> >       } else {
>> >               while (next_due) {
>> > -                     vars->count--;
>> > -                     cobalt_invsqrt(vars);
>> > -                     vars->drop_next = cobalt_control(vars->drop_next,
>> > -                                                      p->interval,
>> > -                                                      vars->rec_inv_sqrt);
>> > +                     count--;
>> > +                     cobalt_invsqrt(vars, count);
>> > +                     WRITE_ONCE(vars->drop_next,
>> > +                                cobalt_control(vars->drop_next,
>> > +                                               p->interval,
>> > +                                               vars->rec_inv_sqrt));
>> >                       schedule = ktime_sub(now, vars->drop_next);
>> > -                     next_due = vars->count && ktime_to_ns(schedule) >= 0;
>> > +                     next_due = count && ktime_to_ns(schedule) >= 0;
>> >               }
>> >       }
>> >
>> > @@ -575,11 +588,12 @@ static enum qdisc_drop_reason cobalt_should_drop(struct cobalt_vars *vars,
>> >           get_random_u32() < vars->p_drop)
>> >               reason = QDISC_DROP_FLOOD_PROTECTION;
>> >
>> > +     WRITE_ONCE(vars->count, count);
>> >       /* Overload the drop_next field as an activity timeout */
>> > -     if (!vars->count)
>> > -             vars->drop_next = ktime_add_ns(now, p->interval);
>> > +     if (count)
>>
>> This seems to reverse the conditional?
>
> Ah right, thanks !
>
>>
>> > +             WRITE_ONCE(vars->drop_next, ktime_add_ns(now, p->interval));
>> >       else if (ktime_to_ns(schedule) > 0 && reason == QDISC_DROP_UNSPEC)
>> > -             vars->drop_next = now;
>> > +             WRITE_ONCE(vars->drop_next, now);
>> >
>> >       return reason;
>> >  }
>> > @@ -813,7 +827,7 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
>> >                    i++, k = (k + 1) % CAKE_SET_WAYS) {
>> >                       if (q->tags[outer_hash + k] == flow_hash) {
>> >                               if (i)
>> > -                                     q->way_hits++;
>> > +                                     WRITE_ONCE(q->way_hits, q->way_hits + 1);
>> >
>> >                               if (!q->flows[outer_hash + k].set) {
>> >                                       /* need to increment host refcnts */
>> > @@ -831,7 +845,7 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
>> >               for (i = 0; i < CAKE_SET_WAYS;
>> >                        i++, k = (k + 1) % CAKE_SET_WAYS) {
>> >                       if (!q->flows[outer_hash + k].set) {
>> > -                             q->way_misses++;
>> > +                             WRITE_ONCE(q->way_misses, q->way_misses + 1);
>> >                               allocate_src = cake_dsrc(flow_mode);
>> >                               allocate_dst = cake_ddst(flow_mode);
>> >                               goto found;
>> > @@ -841,7 +855,7 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
>> >               /* With no empty queues, default to the original
>> >                * queue, accept the collision, update the host tags.
>> >                */
>> > -             q->way_collisions++;
>> > +             WRITE_ONCE(q->way_collisions, q->way_collisions + 1);
>> >               allocate_src = cake_dsrc(flow_mode);
>> >               allocate_dst = cake_ddst(flow_mode);
>> >
>> > @@ -875,7 +889,8 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
>> >                       q->flows[reduced_hash].srchost = srchost_idx;
>> >
>> >                       if (q->flows[reduced_hash].set == CAKE_SET_BULK)
>> > -                             cake_inc_srchost_bulk_flow_count(q, &q->flows[reduced_hash], flow_mode);
>> > +                             cake_inc_srchost_bulk_flow_count(q, &q->flows[reduced_hash],
>> > +                                                              flow_mode);
>> >               }
>> >
>> >               if (allocate_dst) {
>> > @@ -899,7 +914,8 @@ static u32 cake_hash(struct cake_tin_data *q, const struct sk_buff *skb,
>> >                       q->flows[reduced_hash].dsthost = dsthost_idx;
>> >
>> >                       if (q->flows[reduced_hash].set == CAKE_SET_BULK)
>> > -                             cake_inc_dsthost_bulk_flow_count(q, &q->flows[reduced_hash], flow_mode);
>> > +                             cake_inc_dsthost_bulk_flow_count(q, &q->flows[reduced_hash],
>> > +                                                              flow_mode);
>> >               }
>> >       }
>> >
>> > @@ -1379,9 +1395,9 @@ static u32 cake_calc_overhead(struct cake_sched_data *qd, u32 len, u32 off)
>> >               len -= off;
>> >
>> >       if (qd->max_netlen < len)
>> > -             qd->max_netlen = len;
>> > +             WRITE_ONCE(qd->max_netlen, len);
>> >       if (qd->min_netlen > len)
>> > -             qd->min_netlen = len;
>> > +             WRITE_ONCE(qd->min_netlen, len);
>> >
>> >       len += q->rate_overhead;
>> >
>> > @@ -1401,9 +1417,9 @@ static u32 cake_calc_overhead(struct cake_sched_data *qd, u32 len, u32 off)
>> >       }
>> >
>> >       if (qd->max_adjlen < len)
>> > -             qd->max_adjlen = len;
>> > +             WRITE_ONCE(qd->max_adjlen, len);
>> >       if (qd->min_adjlen > len)
>> > -             qd->min_adjlen = len;
>> > +             WRITE_ONCE(qd->min_adjlen, len);
>> >
>> >       return len;
>> >  }
>> > @@ -1416,7 +1432,7 @@ static u32 cake_overhead(struct cake_sched_data *q, const struct sk_buff *skb)
>> >       u16 segs = qdisc_pkt_segs(skb);
>> >       u32 len = qdisc_pkt_len(skb);
>> >
>> > -     q->avg_netoff = cake_ewma(q->avg_netoff, off << 16, 8);
>> > +     WRITE_ONCE(q->avg_netoff, cake_ewma(q->avg_netoff, off << 16, 8));
>> >
>> >       if (segs == 1)
>> >               return cake_calc_overhead(q, len, off);
>> > @@ -1590,16 +1606,17 @@ static unsigned int cake_drop(struct Qdisc *sch, struct sk_buff **to_free)
>> >       }
>> >
>> >       if (cobalt_queue_full(&flow->cvars, &b->cparams, now))
>> > -             b->unresponsive_flow_count++;
>> > +             WRITE_ONCE(b->unresponsive_flow_count,
>> > +                        b->unresponsive_flow_count + 1);
>> >
>> >       len = qdisc_pkt_len(skb);
>> >       q->buffer_used      -= skb->truesize;
>> > -     b->backlogs[idx]    -= len;
>> > -     b->tin_backlog      -= len;
>> > +     WRITE_ONCE(b->backlogs[idx], b->backlogs[idx] - len);
>> > +     WRITE_ONCE(b->tin_backlog, b->tin_backlog - len);
>> >       qstats_backlog_sub(sch, len);
>> >
>> > -     flow->dropped++;
>> > -     b->tin_dropped++;
>> > +     WRITE_ONCE(flow->dropped, flow->dropped + 1);
>> > +     WRITE_ONCE(b->tin_dropped, b->tin_dropped + 1);
>> >
>> >       if (q->config->rate_flags & CAKE_FLAG_INGRESS)
>> >               cake_advance_shaper(q, b, skb, now, true);
>> > @@ -1795,7 +1812,7 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>> >       }
>> >
>> >       if (unlikely(len > b->max_skblen))
>> > -             b->max_skblen = len;
>> > +             WRITE_ONCE(b->max_skblen, len);
>> >
>> >       if (qdisc_pkt_segs(skb) > 1 && q->config->rate_flags & CAKE_FLAG_SPLIT_GSO) {
>> >               struct sk_buff *segs, *nskb;
>> > @@ -1819,13 +1836,13 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>> >                       numsegs++;
>> >                       slen += segs->len;
>> >                       q->buffer_used += segs->truesize;
>> > -                     b->packets++;
>>
>> Right above this hunk we do sch->q.qlen++; - does that need changing as
>> well?
>
> This was changed to qdisc_qlen_inc() in a prior commit in this series.
> ( net/sched: add qdisc_qlen_inc() and qdisc_qlen_dec() )

Ah, right, missed that; sorry!

>>
>> >               }
>> >
>> >               /* stats */
>> > -             b->bytes            += slen;
>> > -             b->backlogs[idx]    += slen;
>> > -             b->tin_backlog      += slen;
>> > +             WRITE_ONCE(b->bytes, b->bytes + slen);
>> > +             WRITE_ONCE(b->packets, b->packets + numsegs);
>> > +             WRITE_ONCE(b->backlogs[idx], b->backlogs[idx] + slen);
>> > +             WRITE_ONCE(b->tin_backlog, b->tin_backlog + slen);
>> >               qstats_backlog_add(sch, slen);
>> >               q->avg_window_bytes += slen;
>> >
>> > @@ -1843,10 +1860,10 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>> >                       ack = cake_ack_filter(q, flow);
>> >
>> >               if (ack) {
>> > -                     b->ack_drops++;
>> > +                     WRITE_ONCE(b->ack_drops, b->ack_drops + 1);
>> >                       qdisc_qstats_drop(sch);
>> >                       ack_pkt_len = qdisc_pkt_len(ack);
>> > -                     b->bytes += ack_pkt_len;
>> > +                     WRITE_ONCE(b->bytes, b->bytes + ack_pkt_len);
>> >                       q->buffer_used += skb->truesize - ack->truesize;
>> >                       if (q->config->rate_flags & CAKE_FLAG_INGRESS)
>> >                               cake_advance_shaper(q, b, ack, now, true);
>> > @@ -1859,10 +1876,10 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>> >               }
>> >
>> >               /* stats */
>> > -             b->packets++;
>> > -             b->bytes            += len - ack_pkt_len;
>> > -             b->backlogs[idx]    += len - ack_pkt_len;
>> > -             b->tin_backlog      += len - ack_pkt_len;
>> > +             WRITE_ONCE(b->packets, b->packets + 1);
>> > +             WRITE_ONCE(b->bytes, b->bytes + len - ack_pkt_len);
>> > +             WRITE_ONCE(b->backlogs[idx], b->backlogs[idx] + len - ack_pkt_len);
>> > +             WRITE_ONCE(b->tin_backlog, b->tin_backlog + len - ack_pkt_len);
>> >               qstats_backlog_add(sch, len - ack_pkt_len);
>> >               q->avg_window_bytes += len - ack_pkt_len;
>> >       }
>> > @@ -1894,9 +1911,9 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>> >                       u64 b = q->avg_window_bytes * (u64)NSEC_PER_SEC;
>> >
>> >                       b = div64_u64(b, window_interval);
>> > -                     q->avg_peak_bandwidth =
>> > -                             cake_ewma(q->avg_peak_bandwidth, b,
>> > -                                       b > q->avg_peak_bandwidth ? 2 : 8);
>> > +                     WRITE_ONCE(q->avg_peak_bandwidth,
>> > +                                cake_ewma(q->avg_peak_bandwidth, b,
>> > +                                          b > q->avg_peak_bandwidth ? 2 : 8));
>> >                       q->avg_window_bytes = 0;
>> >                       q->avg_window_begin = now;
>> >
>> > @@ -1917,27 +1934,30 @@ static s32 cake_enqueue(struct sk_buff *skb, struct Qdisc *sch,
>> >               if (!flow->set) {
>> >                       list_add_tail(&flow->flowchain, &b->new_flows);
>> >               } else {
>> > -                     b->decaying_flow_count--;
>> > +                     WRITE_ONCE(b->decaying_flow_count,
>> > +                                b->decaying_flow_count - 1);
>> >                       list_move_tail(&flow->flowchain, &b->new_flows);
>> >               }
>> >               flow->set = CAKE_SET_SPARSE;
>> > -             b->sparse_flow_count++;
>> > +             WRITE_ONCE(b->sparse_flow_count,
>> > +                        b->sparse_flow_count + 1);
>> >
>> > -             flow->deficit = cake_get_flow_quantum(b, flow, q->config->flow_mode);
>> > +             WRITE_ONCE(flow->deficit,
>> > +                        cake_get_flow_quantum(b, flow, q->config->flow_mode));
>> >       } else if (flow->set == CAKE_SET_SPARSE_WAIT) {
>> >               /* this flow was empty, accounted as a sparse flow, but actually
>> >                * in the bulk rotation.
>> >                */
>> >               flow->set = CAKE_SET_BULK;
>> > -             b->sparse_flow_count--;
>> > -             b->bulk_flow_count++;
>> > +             WRITE_ONCE(b->sparse_flow_count, b->sparse_flow_count - 1);
>> > +             WRITE_ONCE(b->bulk_flow_count, b->bulk_flow_count + 1);
>> >
>> >               cake_inc_srchost_bulk_flow_count(b, flow, q->config->flow_mode);
>> >               cake_inc_dsthost_bulk_flow_count(b, flow, q->config->flow_mode);
>> >       }
>> >
>> >       if (q->buffer_used > q->buffer_max_used)
>> > -             q->buffer_max_used = q->buffer_used;
>> > +             WRITE_ONCE(q->buffer_max_used, q->buffer_used);
>> >
>> >       if (q->buffer_used <= q->buffer_limit)
>> >               return NET_XMIT_SUCCESS;
>> > @@ -1976,8 +1996,8 @@ static struct sk_buff *cake_dequeue_one(struct Qdisc *sch)
>> >       if (flow->head) {
>> >               skb = dequeue_head(flow);
>> >               len = qdisc_pkt_len(skb);
>> > -             b->backlogs[q->cur_flow] -= len;
>> > -             b->tin_backlog           -= len;
>> > +             WRITE_ONCE(b->backlogs[q->cur_flow], b->backlogs[q->cur_flow] - len);
>> > +             WRITE_ONCE(b->tin_backlog, b->tin_backlog - len);
>> >               qstats_backlog_sub(sch, len);
>> >               q->buffer_used           -= skb->truesize;
>> >               qdisc_qlen_dec(sch);
>> > @@ -2042,7 +2062,7 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
>> >
>> >               cake_configure_rates(sch, new_rate, true);
>> >               q->last_checked_active = now;
>> > -             q->active_queues = num_active_qs;
>> > +             WRITE_ONCE(q->active_queues, num_active_qs);
>> >       }
>> >
>> >  begin:
>> > @@ -2149,8 +2169,10 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
>> >                */
>> >               if (flow->set == CAKE_SET_SPARSE) {
>> >                       if (flow->head) {
>> > -                             b->sparse_flow_count--;
>> > -                             b->bulk_flow_count++;
>> > +                             WRITE_ONCE(b->sparse_flow_count,
>> > +                                        b->sparse_flow_count - 1);
>> > +                             WRITE_ONCE(b->bulk_flow_count,
>> > +                                        b->bulk_flow_count + 1);
>> >
>> >                               cake_inc_srchost_bulk_flow_count(b, flow, q->config->flow_mode);
>> >                               cake_inc_dsthost_bulk_flow_count(b, flow, q->config->flow_mode);
>> > @@ -2165,7 +2187,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
>> >                       }
>> >               }
>> >
>> > -             flow->deficit += cake_get_flow_quantum(b, flow, q->config->flow_mode);
>> > +             WRITE_ONCE(flow->deficit,
>> > +                        flow->deficit + cake_get_flow_quantum(b, flow, q->config->flow_mode));
>> >               list_move_tail(&flow->flowchain, &b->old_flows);
>> >
>> >               goto retry;
>> > @@ -2177,7 +2200,8 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
>> >               if (!skb) {
>> >                       /* this queue was actually empty */
>> >                       if (cobalt_queue_empty(&flow->cvars, &b->cparams, now))
>> > -                             b->unresponsive_flow_count--;
>> > +                             WRITE_ONCE(b->unresponsive_flow_count,
>> > +                                        b->unresponsive_flow_count - 1);
>> >
>> >                       if (flow->cvars.p_drop || flow->cvars.count ||
>> >                           ktime_before(now, flow->cvars.drop_next)) {
>> > @@ -2187,16 +2211,22 @@ static struct sk_buff *cake_dequeue(struct Qdisc *sch)
>> >                               list_move_tail(&flow->flowchain,
>> >                                              &b->decaying_flows);
>> >                               if (flow->set == CAKE_SET_BULK) {
>> > -                                     b->bulk_flow_count--;
>> > +                                     WRITE_ONCE(b->bulk_flow_count,
>> > +                                                b->bulk_flow_count - 1);
>> >
>> > -                                     cake_dec_srchost_bulk_flow_count(b, flow, q->config->flow_mode);
>> > -                                     cake_dec_dsthost_bulk_flow_count(b, flow, q->config->flow_mode);
>> > +                                     cake_dec_srchost_bulk_flow_count(b, flow,
>> > +                                                                      q->config->flow_mode);
>> > +                                     cake_dec_dsthost_bulk_flow_count(b, flow,
>> > +                                                                      q->config->flow_mode);
>>
>> These seem like unnecessary whitespace changes?
>
> Line length was 105 ... a bit over the recommended limit.

Right, do realise that, but I personally find the one-liners more
readable even if they are a bit long. Won't insist, though; it's just
whitespace, after all :)

-Toke


^ permalink raw reply

* Re: [GIT PULL] bluetooth-next 2026-04-13
From: Luiz Augusto von Dentz @ 2026-04-13 14:17 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: davem, kuba, linux-bluetooth, netdev
In-Reply-To: <CABBYNZK5mMbuzvjAquy49vSk0pLRnk_s8oqEut1se4DToURn9A@mail.gmail.com>

Hi,

On Mon, Apr 13, 2026 at 10:11 AM Luiz Augusto von Dentz
<luiz.dentz@gmail.com> wrote:
>
> Hi Paolo,
>
> On Mon, Apr 13, 2026 at 9:41 AM Paolo Abeni <pabeni@redhat.com> wrote:
> >
> > On 4/13/26 3:22 PM, Luiz Augusto von Dentz wrote:
> > > The following changes since commit 42f9b4c6ef19e71d2c7d9bfd3c5037d4fe434ad7:
> > >
> > >   tools: ynl: tests: fix leading space on Makefile target (2026-04-09 20:41:40 -0700)
> > >
> > > are available in the Git repository at:
> > >
> > >   git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git tags/for-net-next-2026-04-13
> > >
> > > for you to fetch changes up to c347ca17d62a32c25564fee0ca3a2a7bc2d5fd6f:
> > >
> > >   Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling (2026-04-13 09:19:42 -0400)
> > >
> > > ----------------------------------------------------------------
> > > bluetooth-next pull request for net-next:
> >
> > Net-next is closed for the merge window. I guess Jakub could still
> > consider merging this, but unless you want it very, very badly, I hope
> > it can just be postponed, as the PW queue is already long.
>
> This update includes quite a few new hardware supports. This is a
> resend because the last one was dropped due to an invalid 'Fixes' tag.
>
>  Btw, I don't know why the entire PR needs to be dropped if only a few
> items have invalid tags? Can't we just dropped those?

Maybe Im doing something wrong in my side, the issue with the Fixes
that is that sometimes they become invalid once I rebase on top of
net-next, which, afaik, is necessary to detect already applied
patches. Or is rebasing is not really necessary and should only be
done once when rc1 is tagged?

> > Thanks,
> >
> > /P
> >
>
>
> --
> Luiz Augusto von Dentz



-- 
Luiz Augusto von Dentz

^ permalink raw reply

* Re: [GIT PULL] bluetooth-next 2026-04-13
From: Luiz Augusto von Dentz @ 2026-04-13 14:11 UTC (permalink / raw)
  To: Paolo Abeni; +Cc: davem, kuba, linux-bluetooth, netdev
In-Reply-To: <580f7972-56d5-4968-9dcc-32adc31e0673@redhat.com>

Hi Paolo,

On Mon, Apr 13, 2026 at 9:41 AM Paolo Abeni <pabeni@redhat.com> wrote:
>
> On 4/13/26 3:22 PM, Luiz Augusto von Dentz wrote:
> > The following changes since commit 42f9b4c6ef19e71d2c7d9bfd3c5037d4fe434ad7:
> >
> >   tools: ynl: tests: fix leading space on Makefile target (2026-04-09 20:41:40 -0700)
> >
> > are available in the Git repository at:
> >
> >   git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git tags/for-net-next-2026-04-13
> >
> > for you to fetch changes up to c347ca17d62a32c25564fee0ca3a2a7bc2d5fd6f:
> >
> >   Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling (2026-04-13 09:19:42 -0400)
> >
> > ----------------------------------------------------------------
> > bluetooth-next pull request for net-next:
>
> Net-next is closed for the merge window. I guess Jakub could still
> consider merging this, but unless you want it very, very badly, I hope
> it can just be postponed, as the PW queue is already long.

This update includes quite a few new hardware supports. This is a
resend because the last one was dropped due to an invalid 'Fixes' tag.

 Btw, I don't know why the entire PR needs to be dropped if only a few
items have invalid tags? Can't we just dropped those?

> Thanks,
>
> /P
>


-- 
Luiz Augusto von Dentz

^ permalink raw reply

* Re: [PATCH net] ice: Fix missing 1's complement negation in GCS raw checksum
From: Simon Horman @ 2026-04-13 14:11 UTC (permalink / raw)
  To: Matt Fleming
  Cc: Tony Nguyen, Przemek Kitszel, Andrew Lunn, David S . Miller,
	Eric Dumazet, Jakub Kicinski, Paolo Abeni, intel-wired-lan,
	netdev, linux-kernel, kernel-team, Matt Fleming
In-Reply-To: <20260408190214.1287708-1-matt@readmodwrite.com>

On Wed, Apr 08, 2026 at 08:02:14PM +0100, Matt Fleming wrote:
> From: Matt Fleming <mfleming@cloudflare.com>
> 
> Commit 905d1a220e8d ("ice: Add E830 checksum offload support") added
> Generic Checksum (GCS) support for E830 NICs but omitted the 1's
> complement negation (~) when converting the hardware raw_csum to
> skb->csum for CHECKSUM_COMPLETE.
> 
> Without the negation, every CHECKSUM_COMPLETE packet fails the
> fast-path validation in nf_ip_checksum() and falls through to software
> checksumming via __skb_checksum_complete(), which triggers the
> rate-limited "hw csum failure" warning. Packets are still accepted
> (the software recheck passes) but hardware checksum offload is
> effectively disabled and the warning floods dmesg on systems running
> nf_conntrack on VLAN sub-interfaces.
> 
> Multiple other drivers (idpf, ehea, iwlwifi, cassini, sunhme, enetc)
> also apply ~ for CHECKSUM_COMPLETE. The ice driver was the only in-tree
> user of csum_unfold() for CHECKSUM_COMPLETE that omitted it.
> 
> Fixes: 905d1a220e8d ("ice: Add E830 checksum offload support")
> Signed-off-by: Matt Fleming <mfleming@cloudflare.com>

Reviewed-by: Simon Horman <horms@kernel.org>


^ permalink raw reply

page: next (older) | prev (newer) | latest
- recent:[subjects (threaded)|topics (new)|topics (active)]

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox