* Re: [PATCH v4] net/mlx5: Fix OOB access and stack information leak in PTP event handling
From: Leon Romanovsky @ 2026-04-13 14:46 UTC (permalink / raw)
To: Prathamesh Deshpande
Cc: Carolina Jubran, Saeed Mahameed, Richard Cochran, Tariq Toukan,
Mark Bloch, netdev, linux-rdma, linux-kernel
In-Reply-To: <20260412000418.8415-1-prathameshdeshpande7@gmail.com>
On Sun, Apr 12, 2026 at 01:04:10AM +0100, Prathamesh Deshpande wrote:
> In mlx5_pps_event(), several critical issues were identified:
>
> 1. The 'pin' index from the hardware event was used without bounds
> checking to index 'pin_config' and 'pps_info->start'. Check against
> MAX_PIN_NUM to prevent out-of-bounds access.
You were told more than once that this is impossible.
<...>
> + if (WARN_ON_ONCE(pin >= MAX_PIN_NUM))
> + return NOTIFY_OK;
Let's not add useless checks in fast path.
Thanks
^ permalink raw reply
* Re: [PATCH net v1 1/2] nexthop: fix IPv6 route referencing IPv4 nexthop
From: David Ahern @ 2026-04-13 14:46 UTC (permalink / raw)
To: Jiayuan Chen, netdev
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Shuah Khan, linux-kernel, linux-kselftest
In-Reply-To: <20260413114522.147784-1-jiayuan.chen@linux.dev>
On 4/13/26 5:45 AM, Jiayuan Chen wrote:
> syzbot reported a panic [1] [2].
>
> When an IPv6 nexthop is replaced with an IPv4 nexthop, the has_v4 flag
> of all groups containing this nexthop is not updated. This is because
> nh_group_v4_update is only called when replacing AF_INET to AF_INET6,
> but the reverse direction (AF_INET6 to AF_INET) is missed.
>
> This allows a stale has_v4=false to bypass fib6_check_nexthop, causing
> IPv6 routes to be attached to groups that effectively contain only AF_INET
> members. Subsequent route lookups then call nexthop_fib6_nh() which
> returns NULL for the AF_INET member, leading to a NULL pointer
> dereference.
>
> Fix by calling nh_group_v4_update whenever the family changes, not just
> AF_INET to AF_INET6.
>
> Reproducer:
> # AF_INET6 blackhole
> ip -6 nexthop add id 1 blackhole
> # group with has_v4=false
> ip nexthop add id 100 group 1
> # replace with AF_INET (no -6), has_v4 stays false
> ip nexthop replace id 1 blackhole
> # pass stale has_v4 check
> ip -6 route add 2001:db8::/64 nhid 100
> # panic
> ping -6 2001:db8::1
>
> [1] https://syzkaller.appspot.com/bug?id=e17283eb2f8dcf3dd9b47fe6f67a95f71faadad0
> [2] https://syzkaller.appspot.com/bug?id=8699b6ae54c9f35837d925686208402949e12ef3
> Fixes: 7bf4796dd099 ("nexthops: add support for replace")
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
> net/ipv4/nexthop.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
Reviewed-by: David Ahern <dsahern@kernel.org>
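For readers following along, the stale-flag problem described in the commit message can be sketched as a small userspace model. This is only an illustration under stated assumptions: the types and helpers below (nh_model, nh_group_model, nh_replace_buggy, nh_replace_fixed) are hypothetical names and are not the code in net/ipv4/nexthop.c.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model; names are illustrative, not the kernel's. */
enum nh_family { NH_INET, NH_INET6 };

struct nh_model {
	enum nh_family family;
};

struct nh_group_model {
	struct nh_model *members[4];
	size_t count;
	bool has_v4;	/* cached: does any member use NH_INET? */
};

/* Recompute the cached flag from the current members. */
static void group_v4_update(struct nh_group_model *g)
{
	bool has_v4 = false;

	assert(g->count <= 4);
	for (size_t i = 0; i < g->count; i++)
		if (g->members[i]->family == NH_INET)
			has_v4 = true;
	g->has_v4 = has_v4;
}

/* Models the reported behaviour: only the v4 -> v6 direction refreshes. */
static void nh_replace_buggy(struct nh_group_model *g, struct nh_model *nh,
			     enum nh_family new_family)
{
	enum nh_family old_family = nh->family;

	nh->family = new_family;
	if (old_family == NH_INET && new_family == NH_INET6)
		group_v4_update(g);
}

/* Models the described fix: refresh whenever the family changes. */
static void nh_replace_fixed(struct nh_group_model *g, struct nh_model *nh,
			     enum nh_family new_family)
{
	enum nh_family old_family = nh->family;

	nh->family = new_family;
	if (old_family != new_family)
		group_v4_update(g);
}

/* An IPv6 route may only use a group with no v4 members. */
static bool group_accepts_v6_route(const struct nh_group_model *g)
{
	return !g->has_v4;
}
```

In this model, replacing a v6 member with a v4 one through nh_replace_buggy() leaves has_v4 false, so the group still accepts an IPv6 route. The same replacement through nh_replace_fixed() refreshes the flag and the route is rejected.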
^ permalink raw reply
* Re: [patch 17/38] ext4: Replace get_cycles() usage with ktime_get()
From: Arnd Bergmann @ 2026-04-13 14:46 UTC (permalink / raw)
To: Thomas Gleixner, LKML
Cc: Theodore Ts'o, linux-ext4, x86, Baolu Lu, iommu,
Michael Grzeschik, Netdev, linux-wireless, Herbert Xu,
linux-crypto, Vlastimil Babka (SUSE), linux-mm, David Woodhouse,
Bernie Thompson, linux-fbdev, Andrew Morton,
Uladzislau Rezki (Sony), Marco Elver, Dmitry Vyukov, kasan-dev,
Andrey Ryabinin, Thomas Sailer, linux-hams, Jason A . Donenfeld,
Richard Henderson, linux-alpha, Russell King, linux-arm-kernel,
Catalin Marinas, Huacai Chen, loongarch, Geert Uytterhoeven,
linux-m68k, Dinh Nguyen, Jonas Bonn,
linux-openrisc@vger.kernel.org, Helge Deller, linux-parisc,
Michael Ellerman, linuxppc-dev, Paul Walmsley, linux-riscv,
Heiko Carstens, linux-s390, David S . Miller, sparclinux
In-Reply-To: <20260410120318.727211419@kernel.org>
On Fri, Apr 10, 2026, at 14:19, Thomas Gleixner wrote:
> get_cycles() is not guaranteed to be functional on all systems/platforms
> and the values returned are unitless and not easy to map to something
> useful.
>
> Use ktime_get() instead, which provides nanosecond timestamps and is
> functional everywhere.
>
> This is part of a larger effort to limit get_cycles() usage to low level
> architecture code.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: "Theodore Ts'o" <tytso@mit.edu>
> Cc: linux-ext4@vger.kernel.org
I think this is technically an ABI change, since the time
difference gets exported through procfs, but the new version
is clearly the right thing to do since it replaces a hardware
specific value with a portable one.
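As a rough userspace analogy of why a nanosecond clock is the more portable choice, the sketch below measures an interval with clock_gettime(CLOCK_MONOTONIC). It is not the ext4 code; monotonic_ns() and elapsed_ns() are hypothetical helper names, and the POSIX clock stands in for ktime_get() only by assumption.

```c
#define _POSIX_C_SOURCE 199309L

#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Hypothetical helper: monotonic time in nanoseconds, a defined unit. */
static uint64_t monotonic_ns(void)
{
	struct timespec ts;
	int ret = clock_gettime(CLOCK_MONOTONIC, &ts);

	assert(ret == 0);
	(void)ret;
	return (uint64_t)ts.tv_sec * 1000000000ull + (uint64_t)ts.tv_nsec;
}

/* Interval since 'start', in nanoseconds on every platform. */
static uint64_t elapsed_ns(uint64_t start)
{
	return monotonic_ns() - start;
}
```

A caller records start = monotonic_ns() before the operation and reports elapsed_ns(start) afterwards. The value has the same meaning on any machine, unlike a raw cycle count.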
Arnd
^ permalink raw reply
* Re: [PATCH net v1 2/2] selftests: fib_nexthops: test stale has_v4 on nexthop replace
From: David Ahern @ 2026-04-13 14:47 UTC (permalink / raw)
To: Jiayuan Chen, netdev
Cc: David S. Miller, Eric Dumazet, Jakub Kicinski, Paolo Abeni,
Simon Horman, Shuah Khan, linux-kernel, linux-kselftest
In-Reply-To: <20260413114522.147784-2-jiayuan.chen@linux.dev>
On 4/13/26 5:45 AM, Jiayuan Chen wrote:
> Add test cases that exercise the scenario where an IPv6 nexthop is
> replaced with an IPv4 nexthop while being part of a group. The group's
> has_v4 flag must be updated so that subsequent IPv6 route additions are
> properly rejected.
>
> Two cases are covered:
> 1. Gateway nexthop replaced across families with an existing IPv6
> route on the group (rejected by fib6_check_nh_list).
> 2. Blackhole nexthop replaced across families with no existing IPv6
> route on the group (fib6_check_nh_list returns early) — this is
> the path that triggers a NULL ptr deref without the kernel fix.
>
> Signed-off-by: Jiayuan Chen <jiayuan.chen@linux.dev>
> ---
> tools/testing/selftests/net/fib_nexthops.sh | 22 +++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
Reviewed-by: David Ahern <dsahern@kernel.org>
^ permalink raw reply
* Re: [GIT PULL] bluetooth-next 2026-04-13
From: Paolo Abeni @ 2026-04-13 14:56 UTC (permalink / raw)
To: Luiz Augusto von Dentz; +Cc: davem, kuba, linux-bluetooth, netdev
In-Reply-To: <CABBYNZ+7zr1jQ7a-2p88zNqMdvn6MAB5NAZ7b4OW=P56DcMT5g@mail.gmail.com>
On 4/13/26 4:17 PM, Luiz Augusto von Dentz wrote:
> On Mon, Apr 13, 2026 at 10:11 AM Luiz Augusto von Dentz
> <luiz.dentz@gmail.com> wrote:
>> On Mon, Apr 13, 2026 at 9:41 AM Paolo Abeni <pabeni@redhat.com> wrote:
>>>
>>> On 4/13/26 3:22 PM, Luiz Augusto von Dentz wrote:
>>>> The following changes since commit 42f9b4c6ef19e71d2c7d9bfd3c5037d4fe434ad7:
>>>>
>>>> tools: ynl: tests: fix leading space on Makefile target (2026-04-09 20:41:40 -0700)
>>>>
>>>> are available in the Git repository at:
>>>>
>>>> git://git.kernel.org/pub/scm/linux/kernel/git/bluetooth/bluetooth-next.git tags/for-net-next-2026-04-13
>>>>
>>>> for you to fetch changes up to c347ca17d62a32c25564fee0ca3a2a7bc2d5fd6f:
>>>>
>>>> Bluetooth: hci_qca: Fix missing wakeup during SSR memdump handling (2026-04-13 09:19:42 -0400)
>>>>
>>>> ----------------------------------------------------------------
>>>> bluetooth-next pull request for net-next:
>>>
>>> Net-next is closed for the merge window. I guess Jakub could still
>>> consider merging this, but unless you want it very, very badly, I hope
>>> it can just be postponed, as the PW queue is already long.
>>
>> This update includes support for quite a bit of new hardware. This is a
>> resend because the last one was dropped due to an invalid 'Fixes' tag.
>>
>> Btw, I don't know why the entire PR needs to be dropped if only a few
>> items have invalid tags? Can't we just drop those?
>
> Maybe I'm doing something wrong on my side. The issue with the Fixes
> tags is that sometimes they become invalid once I rebase on top of
> net-next, which, afaik, is necessary to detect already applied
> patches. Or is rebasing not really necessary, and should it only be
> done once when rc1 is tagged?
AFAICT, rebasing is needed if you have local patches not present in the
tree that you are pulling from.
If you e.g. send a PR to net-next, including all the patches present in
your devel tree ATM, you could avoid the later rebase by not applying any
patch to your tree until you merge net-next back.
Would that be doable for you?
Thanks,
Paolo
^ permalink raw reply
* Re: [PATCH iwl-next 1/1] igc: add support for forcing link speed without autonegotiation
From: Simon Horman @ 2026-04-13 14:52 UTC (permalink / raw)
To: khai.wen.tan
Cc: 'Simon Horman', anthony.l.nguyen, przemyslaw.kitszel,
andrew+netdev, davem, edumazet, kuba, pabeni, intel-wired-lan,
netdev, linux-kernel, faizal.abdul.rahim, hector.blanco.alcaine,
hong.aun.looi, khai.wen.tan, faizal.abdul.rahim
In-Reply-To: <20260409072747.217836-1-khai.wen.tan@linux.intel.com>
From: 'Simon Horman' <horms@kernel.org>
This is an AI-generated review of your patch. The human sending this
email has considered the AI review valid, or at least plausible.
Full review at: https://sashiko.dev
---
igc: add support for forcing link speed without autonegotiation
This commit allows forcing 10/100 Mb/s link speed and duplex settings via
ethtool when autonegotiation is disabled. It configures the MAC CTRL and
PHY registers to bypass autonegotiation and apply the requested settings.
> diff --git a/drivers/net/ethernet/intel/igc/igc_ethtool.c b/drivers/net/ethernet/intel/igc/igc_ethtool.c
> index 0122009bedd03..e14771532dad8 100644
> --- a/drivers/net/ethernet/intel/igc/igc_ethtool.c
> +++ b/drivers/net/ethernet/intel/igc/igc_ethtool.c
[ ... ]
> +static int igc_handle_autoneg_disabled(struct net_device *netdev, u32 speed,
> + u8 duplex)
> +{
> + struct igc_adapter *adapter = netdev_priv(netdev);
> + struct igc_mac_info *mac = &adapter->hw.mac;
> + enum igc_forced_speed_duplex forced_speed_duplex;
>
> + switch (speed) {
> + case SPEED_10:
> + forced_speed_duplex = (duplex == DUPLEX_FULL) ? IGC_FORCED_10F : IGC_FORCED_10H;
> + break;
> + case SPEED_100:
> + forced_speed_duplex = (duplex == DUPLEX_FULL) ? IGC_FORCED_100F : IGC_FORCED_100H;
> + break;
Does forcing a half-duplex configuration here also require setting flow
control to none (e.g., hw->fc.requested_mode = igc_fc_none)?
> diff --git a/drivers/net/ethernet/intel/igc/igc_mac.c b/drivers/net/ethernet/intel/igc/igc_mac.c
> index 7ac6637f8db73..16b0f2db06896 100644
> --- a/drivers/net/ethernet/intel/igc/igc_mac.c
> +++ b/drivers/net/ethernet/intel/igc/igc_mac.c
> @@ -452,7 +452,7 @@ s32 igc_config_fc_after_link_up(struct igc_hw *hw)
> * so we had to force link. In this case, we need to force the
> * configuration of the MAC to match the "fc" parameter.
> */
> - if (mac->autoneg_failed)
> + if (mac->autoneg_failed || !mac->autoneg_enabled)
> ret_val = igc_force_mac_fc(hw);
If mac->autoneg_enabled is false, igc_force_mac_fc() will configure the MAC
using the existing hw->fc.current_mode. If current_mode defaults to full flow
control, this will enable PAUSE frames.
Since autonegotiation is disabled, the MII status check further down in this
same function will fail and return early:
drivers/net/ethernet/intel/igc/igc_mac.c:igc_config_fc_after_link_up() {
...
if (!(mii_status_reg & MII_SR_AUTONEG_COMPLETE)) {
hw_dbg("Copper PHY and Auto Neg has not completed.\n");
goto out;
}
...
}
This early return skips the half-duplex safety check at the end of
igc_config_fc_after_link_up() which would normally clear the flow control
settings:
drivers/net/ethernet/intel/igc/igc_mac.c:igc_config_fc_after_link_up() {
...
if (duplex == HALF_DUPLEX)
hw->fc.current_mode = igc_fc_none;
ret_val = igc_force_mac_fc(hw);
...
}
Does this leave the MAC incorrectly configured to send and receive PAUSE
frames on a half-duplex link, violating the IEEE 802.3 specification?
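The control-flow concern raised above can be sketched as a small userspace model. This is a simplified reading of the review, not the driver: link_model, force_mac_fc() and config_fc_after_link_up() here are hypothetical stand-ins, and the real igc_config_fc_after_link_up() does considerably more.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the state the review talks about. */
enum fc_mode { FC_NONE, FC_FULL };

struct link_model {
	bool autoneg_enabled;
	bool autoneg_failed;
	bool autoneg_complete;	/* stands in for MII_SR_AUTONEG_COMPLETE */
	bool half_duplex;
	enum fc_mode current_mode;
	enum fc_mode mac_fc;	/* what the MAC was last programmed with */
};

/* Program the MAC with whatever current_mode says. */
static void force_mac_fc(struct link_model *l)
{
	assert(l);
	l->mac_fc = l->current_mode;
}

static void config_fc_after_link_up(struct link_model *l)
{
	/* Patched condition: also taken when autoneg is disabled. */
	if (l->autoneg_failed || !l->autoneg_enabled)
		force_mac_fc(l);

	/* With autoneg disabled this check fails and we leave early. */
	if (!l->autoneg_complete)
		return;

	/* Half-duplex safety check, skipped on the early return above. */
	if (l->half_duplex)
		l->current_mode = FC_NONE;
	force_mac_fc(l);
}
```

In this model a forced half-duplex link with current_mode set to FC_FULL ends up with mac_fc == FC_FULL, which is the situation the question above asks about. Whether the real driver reaches that state depends on what current_mode holds at that point, which the model does not settle.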
^ permalink raw reply
* [PATCH] wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
From: Shardul Bankar @ 2026-04-13 15:00 UTC (permalink / raw)
To: Jason, kuniyu
Cc: andrew+netdev, davem, edumazet, kuba, pabeni, wireguard, netdev,
linux-kernel, janak, kalpan.jani, shardulsb08, Shardul Bankar,
syzbot+pFBD3bslSSshiJCd3rxy
wg_netns_pre_exit() manually acquires rtnl_lock() inside the
pernet .pre_exit callback. This causes a hung task when another
thread holds rtnl_mutex - the cleanup_net workqueue (or the
setup_net failure rollback path) blocks indefinitely in
wg_netns_pre_exit() waiting to acquire the lock.
Convert to .exit_rtnl, introduced in commit 7a60d91c690b ("net:
Add ->exit_rtnl() hook to struct pernet_operations."), where the
framework already holds RTNL and batches all callbacks under a
single rtnl_lock()/rtnl_unlock() pair, eliminating the contention
window.
The rcu_assign_pointer(wg->creating_net, NULL) is safe to move
from .pre_exit to .exit_rtnl (which runs after synchronize_rcu())
because all RCU readers of creating_net either use maybe_get_net()
- which returns NULL for a dying namespace with zero refcount - or
access net->user_ns which remains valid throughout the entire
ops_undo_list sequence.
Reported-by: syzbot+pFBD3bslSSshiJCd3rxy@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?id=cb64c22a492202ca929e18262fdb8cb89e635c70
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
---
drivers/net/wireguard/device.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
index 46a71ec36af8..eb854c5294a3 100644
--- a/drivers/net/wireguard/device.c
+++ b/drivers/net/wireguard/device.c
@@ -411,12 +411,11 @@ static struct rtnl_link_ops link_ops __read_mostly = {
.newlink = wg_newlink,
};
-static void wg_netns_pre_exit(struct net *net)
+static void wg_netns_exit_rtnl(struct net *net, struct list_head *dev_kill_list)
{
struct wg_device *wg;
struct wg_peer *peer;
- rtnl_lock();
list_for_each_entry(wg, &device_list, device_list) {
if (rcu_access_pointer(wg->creating_net) == net) {
pr_debug("%s: Creating namespace exiting\n", wg->dev->name);
@@ -429,11 +428,10 @@ static void wg_netns_pre_exit(struct net *net)
mutex_unlock(&wg->device_update_lock);
}
}
- rtnl_unlock();
}
static struct pernet_operations pernet_ops = {
- .pre_exit = wg_netns_pre_exit
+ .exit_rtnl = wg_netns_exit_rtnl
};
int __init wg_device_init(void)
--
2.34.1
^ permalink raw reply related
* Re: [PATCH] wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
From: syzbot @ 2026-04-13 15:01 UTC (permalink / raw)
To: shardul.b
Cc: andrew, davem, edumazet, janak, jason, kalpan.jani, kuba, kuniyu,
linux-kernel, netdev, pabeni, shardul.b, shardulsb08, wireguard
In-Reply-To: <20260413150024.1003490-1-shardul.b@mpiricsoftware.com>
> wg_netns_pre_exit() manually acquires rtnl_lock() inside the
> pernet .pre_exit callback. This causes a hung task when another
> thread holds rtnl_mutex - the cleanup_net workqueue (or the
> setup_net failure rollback path) blocks indefinitely in
> wg_netns_pre_exit() waiting to acquire the lock.
>
> Convert to .exit_rtnl, introduced in commit 7a60d91c690b ("net:
> Add ->exit_rtnl() hook to struct pernet_operations."), where the
> framework already holds RTNL and batches all callbacks under a
> single rtnl_lock()/rtnl_unlock() pair, eliminating the contention
> window.
>
> The rcu_assign_pointer(wg->creating_net, NULL) is safe to move
> from .pre_exit to .exit_rtnl (which runs after synchronize_rcu())
> because all RCU readers of creating_net either use maybe_get_net()
> - which returns NULL for a dying namespace with zero refcount - or
> access net->user_ns which remains valid throughout the entire
> ops_undo_list sequence.
>
> Reported-by: syzbot+pFBD3bslSSshiJCd3rxy@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?id=cb64c22a492202ca929e18262fdb8cb89e635c70
> Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
> ---
> drivers/net/wireguard/device.c | 6 ++----
> 1 file changed, 2 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
> index 46a71ec36af8..eb854c5294a3 100644
> --- a/drivers/net/wireguard/device.c
> +++ b/drivers/net/wireguard/device.c
> @@ -411,12 +411,11 @@ static struct rtnl_link_ops link_ops __read_mostly = {
> .newlink = wg_newlink,
> };
>
> -static void wg_netns_pre_exit(struct net *net)
> +static void wg_netns_exit_rtnl(struct net *net, struct list_head *dev_kill_list)
> {
> struct wg_device *wg;
> struct wg_peer *peer;
>
> - rtnl_lock();
> list_for_each_entry(wg, &device_list, device_list) {
> if (rcu_access_pointer(wg->creating_net) == net) {
> pr_debug("%s: Creating namespace exiting\n", wg->dev->name);
> @@ -429,11 +428,10 @@ static void wg_netns_pre_exit(struct net *net)
> mutex_unlock(&wg->device_update_lock);
> }
> }
> - rtnl_unlock();
> }
>
> static struct pernet_operations pernet_ops = {
> - .pre_exit = wg_netns_pre_exit
> + .exit_rtnl = wg_netns_exit_rtnl
> };
>
> int __init wg_device_init(void)
> --
> 2.34.1
>
I see the command but can't find the corresponding bug.
The email is sent to syzbot+HASH@syzkaller.appspotmail.com address
but the HASH does not correspond to any known bug.
Please double check the address.
^ permalink raw reply
* [PATCH v2] wireguard: device: use exit_rtnl callback instead of manual rtnl_lock in pre_exit
From: Shardul Bankar @ 2026-04-13 15:12 UTC (permalink / raw)
To: Jason, kuniyu
Cc: andrew+netdev, davem, edumazet, kuba, pabeni, wireguard, netdev,
linux-kernel, janak, kalpan.jani, shardulsb08, Shardul Bankar,
syzbot+f2fbf7478a35a94c8b7c
wg_netns_pre_exit() manually acquires rtnl_lock() inside the
pernet .pre_exit callback. This causes a hung task when another
thread holds rtnl_mutex - the cleanup_net workqueue (or the
setup_net failure rollback path) blocks indefinitely in
wg_netns_pre_exit() waiting to acquire the lock.
Convert to .exit_rtnl, introduced in commit 7a60d91c690b ("net:
Add ->exit_rtnl() hook to struct pernet_operations."), where the
framework already holds RTNL and batches all callbacks under a
single rtnl_lock()/rtnl_unlock() pair, eliminating the contention
window.
The rcu_assign_pointer(wg->creating_net, NULL) is safe to move
from .pre_exit to .exit_rtnl (which runs after synchronize_rcu())
because all RCU readers of creating_net either use maybe_get_net()
- which returns NULL for a dying namespace with zero refcount - or
access net->user_ns which remains valid throughout the entire
ops_undo_list sequence.
Reported-by: syzbot+f2fbf7478a35a94c8b7c@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?id=cb64c22a492202ca929e18262fdb8cb89e635c70
Signed-off-by: Shardul Bankar <shardul.b@mpiricsoftware.com>
---
v2: Fix incorrect Reported-by email address
drivers/net/wireguard/device.c | 6 ++----
1 file changed, 2 insertions(+), 4 deletions(-)
diff --git a/drivers/net/wireguard/device.c b/drivers/net/wireguard/device.c
index 46a71ec36af8..eb854c5294a3 100644
--- a/drivers/net/wireguard/device.c
+++ b/drivers/net/wireguard/device.c
@@ -411,12 +411,11 @@ static struct rtnl_link_ops link_ops __read_mostly = {
.newlink = wg_newlink,
};
-static void wg_netns_pre_exit(struct net *net)
+static void wg_netns_exit_rtnl(struct net *net, struct list_head *dev_kill_list)
{
struct wg_device *wg;
struct wg_peer *peer;
- rtnl_lock();
list_for_each_entry(wg, &device_list, device_list) {
if (rcu_access_pointer(wg->creating_net) == net) {
pr_debug("%s: Creating namespace exiting\n", wg->dev->name);
@@ -429,11 +428,10 @@ static void wg_netns_pre_exit(struct net *net)
mutex_unlock(&wg->device_update_lock);
}
}
- rtnl_unlock();
}
static struct pernet_operations pernet_ops = {
- .pre_exit = wg_netns_pre_exit
+ .exit_rtnl = wg_netns_exit_rtnl
};
int __init wg_device_init(void)
--
2.34.1
^ permalink raw reply related
* Re: [PATCH 2/5] selftests: net: add multithread client support to iou-zcrx
From: Juanlu Herrero @ 2026-04-13 15:19 UTC (permalink / raw)
To: David Wei; +Cc: netdev
In-Reply-To: <8fa08d73-28a3-4521-bcfb-ec81869c24f3@davidwei.uk>
On Thu, Apr 09, 2026 at 08:51:11AM -0600, David Wei wrote:
> On 2026-04-08 09:38, Juanlu Herrero wrote:
> > Add pthreads to the iou-zcrx client so that multiple connections can be
> > established simultaneously. Each client thread connects to the server
> > and sends its payload independently.
> >
> > Introduce struct thread_ctx and the -t option to control the number of
> > threads (default 1), preserving backwards compatibility with existing
> > tests.
> >
> > Signed-off-by: Juanlu Herrero <juanlu@fastmail.com>
> > ---
> > .../testing/selftests/drivers/net/hw/Makefile | 2 +-
> > .../selftests/drivers/net/hw/iou-zcrx.c | 46 +++++++++++++++++--
> > 2 files changed, 44 insertions(+), 4 deletions(-)
> >
> > diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
> > index deeca3f8d080..227adfec706c 100644
> > --- a/tools/testing/selftests/drivers/net/hw/Makefile
> > +++ b/tools/testing/selftests/drivers/net/hw/Makefile
> > @@ -80,5 +80,5 @@ include ../../../net/ynl.mk
> > include ../../../net/bpf.mk
> > ifeq ($(HAS_IOURING_ZCRX),y)
> > -$(OUTPUT)/iou-zcrx: LDLIBS += -luring
> > +$(OUTPUT)/iou-zcrx: LDLIBS += -luring -lpthread
> > endif
> > diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.c b/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > index 334985083f61..de2eea78a5b6 100644
> > --- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > +++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > @@ -4,6 +4,7 @@
> > #include <error.h>
> > #include <fcntl.h>
> > #include <limits.h>
> > +#include <pthread.h>
> > #include <stdbool.h>
> > #include <stdint.h>
> > #include <stdio.h>
> > @@ -85,8 +86,14 @@ static int cfg_send_size = SEND_SIZE;
> > static struct sockaddr_in6 cfg_addr;
> > static unsigned int cfg_rx_buf_len;
> > static bool cfg_dry_run;
> > +static int cfg_num_threads = 1;
> > static char *payload;
> > +
> > +struct thread_ctx {
> > + int thread_id;
>
> This is set here and in patch 4 but I don't see it being used.
Makes sense, will remove from v2.
^ permalink raw reply
* Re: [patch 10/38] arcnet: Remove function timing code
From: David Woodhouse @ 2026-04-13 15:29 UTC (permalink / raw)
To: Thomas Gleixner, LKML
Cc: Michael Grzeschik, netdev, Arnd Bergmann, x86, Lu Baolu, iommu,
linux-wireless, Herbert Xu, linux-crypto, Vlastimil Babka,
linux-mm, Bernie Thompson, linux-fbdev, Theodore Tso, linux-ext4,
Andrew Morton, Uladzislau Rezki, Marco Elver, Dmitry Vyukov,
kasan-dev, Andrey Ryabinin, Thomas Sailer, linux-hams,
Jason A. Donenfeld, Richard Henderson, linux-alpha, Russell King,
linux-arm-kernel, Catalin Marinas, Huacai Chen, loongarch,
Geert Uytterhoeven, linux-m68k, Dinh Nguyen, Jonas Bonn,
linux-openrisc, Helge Deller, linux-parisc, Michael Ellerman,
linuxppc-dev, Paul Walmsley, linux-riscv, Heiko Carstens,
linux-s390, David S. Miller, sparclinux
In-Reply-To: <20260410120318.253872322@kernel.org>
On Fri, 2026-04-10 at 14:19 +0200, Thomas Gleixner wrote:
> ARCNET is a museum piece and the function timing can be done with
> ftrace. Remove the cruft.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Michael Grzeschik <m.grzeschik@pengutronix.de>
> Cc: netdev@vger.kernel.org
> ---
> drivers/net/arcnet/arc-rimi.c | 4 ++--
> drivers/net/arcnet/arcdevice.h | 20 +-------------------
> drivers/net/arcnet/com20020.c | 6 ++----
> drivers/net/arcnet/com90io.c | 6 ++----
> drivers/net/arcnet/com90xx.c | 4 ++--
> 5 files changed, 9 insertions(+), 31 deletions(-)
Acked-by: David Woodhouse <dwmw2@infradead.org>
By coincidence, I took the last of my ARCNET cards to the tip just this
morning...
^ permalink raw reply
* Re: [net,PATCH v2] net: ks8851: Reinstate disabling of BHs around IRQ handler
From: Marek Vasut @ 2026-04-13 15:31 UTC (permalink / raw)
To: Sebastian Andrzej Siewior, Jakub Kicinski
Cc: netdev, stable, David S. Miller, Andrew Lunn, Eric Dumazet,
Nicolai Buchwitz, Paolo Abeni, Ronald Wahl, Yicong Hui,
linux-kernel, Thomas Gleixner
In-Reply-To: <20260413125744.TVKkZcEK@linutronix.de>
On 4/13/26 2:57 PM, Sebastian Andrzej Siewior wrote:
> On 2026-04-12 10:51:25 [-0700], Jakub Kicinski wrote:
>>> Does the backtrace make the problem clearer, with the annotation above ?
>>
>> Sebastian, do you have any recommendation here? tl;dr is that the driver does
> …
>
> What about this:
>
> --- a/drivers/net/ethernet/micrel/ks8851_par.c
> +++ b/drivers/net/ethernet/micrel/ks8851_par.c
> @@ -63,7 +63,7 @@ static void ks8851_lock_par(struct ks8851_net *ks, unsigned long *flags)
> {
> struct ks8851_net_par *ksp = to_ks8851_par(ks);
>
> - spin_lock_irqsave(&ksp->lock, *flags);
> + spin_lock_bh(&ksp->lock);
> }
>
> /**
> @@ -77,7 +77,7 @@ static void ks8851_unlock_par(struct ks8851_net *ks, unsigned long *flags)
> {
> struct ks8851_net_par *ksp = to_ks8851_par(ks);
>
> - spin_unlock_irqrestore(&ksp->lock, *flags);
> + spin_unlock_bh(&ksp->lock);
> }
>
> /**
>
>
> I don't see why it needs to disable interrupts.
Because when the lock is held, the PAR code shouldn't be interrupted by
an interrupt, otherwise it would completely mess up the state of the
KS8851 MAC. The spinlock does not protect only the IRQ handler; it also
protects ks8851_start_xmit_par(), ks8851_write_mac_addr(),
ks8851_read_mac_addr(), ks8851_net_open(), ks8851_net_stop() and
other sites which call ks8851_lock()/ks8851_unlock(), which cannot be
executed concurrently, but where BHs can be enabled.
> ? This seems to be used by
> the _par driver and the _common part. The comments refer to DMA but I
> see only FIFO access.
The KS8851 does its own internal DMA into the SRAM, from which the data
are copied by the driver into system DRAM.
> And while at it, I would recommend to
>
> diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
> index 8048770958d60..f1c662887646c 100644
> --- a/drivers/net/ethernet/micrel/ks8851_common.c
> +++ b/drivers/net/ethernet/micrel/ks8851_common.c
> @@ -378,9 +378,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
> if (status & IRQ_LCI)
> mii_check_link(&ks->mii);
>
> - if (status & IRQ_RXI)
> + if (status & IRQ_RXI) {
> + local_bh_disable();
> while ((skb = __skb_dequeue(&rxq)))
> netif_rx(skb);
> + local_bh_enable();
> + }
>
> return IRQ_HANDLED;
> }
>
> Because otherwise it will kick-off backlog NAPI after every packet if
> multiple packets are available.
I think this patch will do the same, but shouldn't the above be done for
the SPI part?
^ permalink raw reply
* Re: [patch 15/38] ptp: ptp_vmclock: Replace get_cycles() usage
From: David Woodhouse @ 2026-04-13 15:33 UTC (permalink / raw)
To: Thomas Gleixner, LKML
Cc: Arnd Bergmann, x86, Lu Baolu, iommu, Michael Grzeschik, netdev,
linux-wireless, Herbert Xu, linux-crypto, Vlastimil Babka,
linux-mm, Bernie Thompson, linux-fbdev, Theodore Tso, linux-ext4,
Andrew Morton, Uladzislau Rezki, Marco Elver, Dmitry Vyukov,
kasan-dev, Andrey Ryabinin, Thomas Sailer, linux-hams,
Jason A. Donenfeld, Richard Henderson, linux-alpha, Russell King,
linux-arm-kernel, Catalin Marinas, Huacai Chen, loongarch,
Geert Uytterhoeven, linux-m68k, Dinh Nguyen, Jonas Bonn,
linux-openrisc, Helge Deller, linux-parisc, Michael Ellerman,
linuxppc-dev, Paul Walmsley, linux-riscv, Heiko Carstens,
linux-s390, David S. Miller, sparclinux
In-Reply-To: <20260410120318.592237447@kernel.org>
On Fri, 2026-04-10 at 14:19 +0200, Thomas Gleixner wrote:
> get_cycles() is not really well defined and similar to other usage of the
> underlying hardware CPU counters the PTP vmclock should use an explicit
> interface as well.
>
> Implement ptp_vmclock_read_cpu_counter() in arm64 and x86 and simplify the
> Kconfig selection while at it.
>
> No functional change.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: David Woodhouse <dwmw2@infradead.org>
Acked-by: David Woodhouse <dwmw@amazon.co.uk>
Although I might follow up with a change to make this...
> +static inline u64 ptp_vmclock_read_cpu_counter(void)
> +{
> + return cpu_feature_enabled(X86_FEATURE_TSC) ? rdtsc() : 0;
> +}
> +
... depend on TSC_RELIABLE¹, since if the guest doesn't believe that it
is, then the guest shouldn't be trying to use it as the basis for
precise timing.
¹ (Or... one of the other zoo of TSC flags for the gradually reducing
brokenness over the years...)
^ permalink raw reply
* Re: [net-next PATCH v5 1/4] octeontx2-af: npa: cn20k: Add NPA Halo support
From: Alexander Lobakin @ 2026-04-13 15:32 UTC (permalink / raw)
To: Subbaraya Sundeep
Cc: andrew+netdev, davem, edumazet, kuba, pabeni, sgoutham, gakula,
bbhushan2, netdev, linux-kernel, Linu Cherian
In-Reply-To: <20260410101150.GA1783722@kernel-ep2>
From: Subbaraya Sundeep <sbhatta@marvell.com>
Date: Fri, 10 Apr 2026 15:41:50 +0530
> On 2026-04-10 at 15:06:56, Alexander Lobakin (aleksander.lobakin@intel.com) wrote:
>> From: Subbaraya Sundeep <sbhatta@marvell.com>
>> Date: Fri, 10 Apr 2026 15:05:36 +0530
>>
>>> On 2026-04-09 at 20:39:02, Alexander Lobakin (aleksander.lobakin@intel.com) wrote:
>>>> From: Subbaraya Sundeep <sbhatta@marvell.com>
>>>> Date: Thu, 9 Apr 2026 15:23:21 +0530
>>>>
>>>>> From: Linu Cherian <lcherian@marvell.com>
>>>>>
>>>>> CN20K silicon implements unified aura and pool context
>>>>> type called Halo for better resource usage. Add support to
>>>>> handle Halo context type operations.
>>>>>
>>>>> Signed-off-by: Linu Cherian <lcherian@marvell.com>
>>>>> Signed-off-by: Subbaraya Sundeep <sbhatta@marvell.com>
>>>>
>>>> [...]
>>>>
>>>>> diff --git a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/struct.h b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/struct.h
>>>>> index 763f6cabd7c2..2364bafd329d 100644
>>>>> --- a/drivers/net/ethernet/marvell/octeontx2/af/cn20k/struct.h
>>>>> +++ b/drivers/net/ethernet/marvell/octeontx2/af/cn20k/struct.h
>>>>> @@ -377,4 +377,85 @@ struct npa_cn20k_pool_s {
>>>>>
>>>>> static_assert(sizeof(struct npa_cn20k_pool_s) == NIX_MAX_CTX_SIZE);
>>>>>
>>>>> +struct npa_cn20k_halo_s {
>>>>> + u64 stack_base : 64;
>>>>
>>>> It's redundant to add : 64 to a 64-bit field.
>>> Agreed. But this is for readability, it helps when checking HRM. For
>>> instance HRM says [703:640] and we define as u64 reserved_640_703 : 64;
>>> so that we do not have to count bits in mind.
>>>> Moreover, on 32-bit systems, the compilers sometimes complain on
>>>> bitfields > 32 bits.
>>> This driver depends on 64BIT.
>>>>
>>>>> + u64 ena : 1;
>>>>> + u64 nat_align : 1;
>>>>> + u64 reserved_66_67 : 2;
>>>>> + u64 stack_caching : 1;
>>>>> + u64 reserved_69_71 : 3;
>>>>> + u64 aura_drop_ena : 1;
>>>>> + u64 reserved_73_79 : 7;
>>>>> + u64 aura_drop : 8;
>>>>> + u64 buf_offset : 12;
>>>>> + u64 reserved_100_103 : 4;
>>>>> + u64 buf_size : 12;
>>>>> + u64 reserved_116_119 : 4;
>>>>> + u64 ref_cnt_prof : 3;
>>>>> + u64 reserved_123_127 : 5;
>>>>> + u64 stack_max_pages : 32;
>>>>> + u64 stack_pages : 32;
>>>>> + u64 bp_0 : 7;
>>>>> + u64 bp_1 : 7;
>>>>> + u64 bp_2 : 7;
>>>>> + u64 bp_3 : 7;
>>>>> + u64 bp_4 : 7;
>>>>> + u64 bp_5 : 7;
>>>>> + u64 bp_6 : 7;
>>>>> + u64 bp_7 : 7;
>>>>> + u64 bp_ena_0 : 1;
>>>>> + u64 bp_ena_1 : 1;
>>>>> + u64 bp_ena_2 : 1;
>>>>> + u64 bp_ena_3 : 1;
>>>>> + u64 bp_ena_4 : 1;
>>>>> + u64 bp_ena_5 : 1;
>>>>> + u64 bp_ena_6 : 1;
>>>>> + u64 bp_ena_7 : 1;
>>>>> + u64 stack_offset : 4;
>>>>> + u64 reserved_260_263 : 4;
>>>>> + u64 shift : 6;
>>>>> + u64 reserved_270_271 : 2;
>>>>> + u64 avg_level : 8;
>>>>> + u64 avg_con : 9;
>>>>> + u64 fc_ena : 1;
>>>>> + u64 fc_stype : 2;
>>>>> + u64 fc_hyst_bits : 4;
>>>>> + u64 fc_up_crossing : 1;
>>>>> + u64 reserved_297_299 : 3;
>>>>> + u64 update_time : 16;
>>>>> + u64 reserved_316_319 : 4;
>>>>> + u64 fc_addr : 64;
>>>>> + u64 ptr_start : 64;
>>>>> + u64 ptr_end : 64;
>>>>> + u64 bpid_0 : 12;
>>>>> + u64 reserved_524_535 : 12;
>>>>> + u64 err_int : 8;
>>>>> + u64 err_int_ena : 8;
>>>>> + u64 thresh_int : 1;
>>>>> + u64 thresh_int_ena : 1;
>>>>> + u64 thresh_up : 1;
>>>>> + u64 reserved_555 : 1;
>>>>> + u64 thresh_qint_idx : 7;
>>>>> + u64 reserved_563 : 1;
>>>>> + u64 err_qint_idx : 7;
>>>>> + u64 reserved_571_575 : 5;
>>>>> + u64 thresh : 36;
>>>>> + u64 reserved_612_615 : 4;
>>>>> + u64 fc_msh_dst : 11;
>>>>> + u64 reserved_627_630 : 4;
>>>>> + u64 op_dpc_ena : 1;
>>>>> + u64 op_dpc_set : 5;
>>>>> + u64 reserved_637_637 : 1;
>>>>> + u64 stream_ctx : 1;
>>>>> + u64 unified_ctx : 1;
>>>>> + u64 reserved_640_703 : 64;
>>>>> + u64 reserved_704_767 : 64;
>>>>> + u64 reserved_768_831 : 64;
>>>>> + u64 reserved_832_895 : 64;
>>>>> + u64 reserved_896_959 : 64;
>>>>> + u64 reserved_960_1023 : 64;
>>>>> +};
>>>>> +
>>>>> +static_assert(sizeof(struct npa_cn20k_halo_s) == NIX_MAX_CTX_SIZE);
>>>>
>>>> Now the main question:
>>>>
>>>> Is mailbox's Endianness fixed (LE/BE)? Or is it always the same as the
>>>> host's ones (I doubt so)?
>>>> If not, these need to be __le{8,16,32,64} (or __be if it's Big Endian)
>>>> and you need to handle the conversions manually.
>>>>
>>> Yes endianness is LE and fixed. This is NOT a host side driver for an
>>> endpoint card. This is driver for on chip PCI device of CN20K soc.
>>> Hope I answered your question wrt host.
>>
>> But the mailbox is shared between the SoC and the host or HW or not? Is
> In hardware it is just shared DDR region between two on chip devices and both
> devices access shared region using their BARs.
>> it possible that one client of the mailbox will have LE and the second
>> will have BE?
> No not possible.
Okay, so it seems safe to use endianness-agnostic types without
messing with `__le`/`__be`. Thanks for explaining.
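For readers following the layout discussion, here is a minimal userspace sketch of the two conventions defended above: reserved fields named after their HRM bit ranges, and a compile-time size check. The struct `demo_ctx_s`, its fields, `DEMO_CTX_SIZE` and `demo_ctx_word1()` are hypothetical and not taken from the patch. The sketch assumes a little-endian 64-bit target and the GCC/Clang bitfield layout (64-bit wide bitfields are a compiler extension).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define DEMO_CTX_SIZE 16 /* hypothetical context size: two 64-bit words */

/* Hypothetical two-word context; names mirror the bit ranges they cover. */
struct demo_ctx_s {
	uint64_t addr            : 64; /* W0 [63:0]   */
	uint64_t ena             : 1;  /* W1 [64]     */
	uint64_t reserved_65_67  : 3;  /* W1 [67:65]  */
	uint64_t buf_size        : 12; /* W1 [79:68]  */
	uint64_t reserved_80_127 : 48; /* W1 [127:80] */
};

/* Fails the build if a field width is miscounted. */
static_assert(sizeof(struct demo_ctx_s) == DEMO_CTX_SIZE,
	      "demo_ctx_s layout drifted from DEMO_CTX_SIZE");

/* Return the raw second 64-bit word, as it would sit in shared memory. */
static uint64_t demo_ctx_word1(const struct demo_ctx_s *ctx)
{
	uint64_t words[2];

	memcpy(words, ctx, sizeof(words));
	return words[1];
}
```

On such a target, setting `ena = 1` and `buf_size = 0x123` places `ena` in bit 0 and `buf_size` in bits 4..15 of the second word. A big-endian build would lay the bits out differently, which is why the fixed-endianness answer matters.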
Thanks,
Olek
^ permalink raw reply
* Re: [PATCH 2/5] selftests: net: add multithread client support to iou-zcrx
From: Juanlu Herrero @ 2026-04-13 15:44 UTC (permalink / raw)
To: David Wei; +Cc: netdev
In-Reply-To: <8fa08d73-28a3-4521-bcfb-ec81869c24f3@davidwei.uk>
On Thu, Apr 09, 2026 at 08:51:11AM -0600, David Wei wrote:
> On 2026-04-08 09:38, Juanlu Herrero wrote:
> > Add pthreads to the iou-zcrx client so that multiple connections can be
> > established simultaneously. Each client thread connects to the server
> > and sends its payload independently.
> >
> > Introduce struct thread_ctx and the -t option to control the number of
> > threads (default 1), preserving backwards compatibility with existing
> > tests.
> >
> > Signed-off-by: Juanlu Herrero <juanlu@fastmail.com>
> > ---
> > .../testing/selftests/drivers/net/hw/Makefile | 2 +-
> > .../selftests/drivers/net/hw/iou-zcrx.c | 46 +++++++++++++++++--
> > 2 files changed, 44 insertions(+), 4 deletions(-)
> >
> > diff --git a/tools/testing/selftests/drivers/net/hw/Makefile b/tools/testing/selftests/drivers/net/hw/Makefile
> > index deeca3f8d080..227adfec706c 100644
> > --- a/tools/testing/selftests/drivers/net/hw/Makefile
> > +++ b/tools/testing/selftests/drivers/net/hw/Makefile
> > @@ -80,5 +80,5 @@ include ../../../net/ynl.mk
> > include ../../../net/bpf.mk
> > ifeq ($(HAS_IOURING_ZCRX),y)
> > -$(OUTPUT)/iou-zcrx: LDLIBS += -luring
> > +$(OUTPUT)/iou-zcrx: LDLIBS += -luring -lpthread
> > endif
> > diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.c b/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > index 334985083f61..de2eea78a5b6 100644
> > --- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > +++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > @@ -4,6 +4,7 @@
> > #include <error.h>
> > #include <fcntl.h>
> > #include <limits.h>
> > +#include <pthread.h>
> > #include <stdbool.h>
> > #include <stdint.h>
> > #include <stdio.h>
> > @@ -85,8 +86,14 @@ static int cfg_send_size = SEND_SIZE;
> > static struct sockaddr_in6 cfg_addr;
> > static unsigned int cfg_rx_buf_len;
> > static bool cfg_dry_run;
> > +static int cfg_num_threads = 1;
> > static char *payload;
> > +
> > +struct thread_ctx {
> > + int thread_id;
>
> This is set here and in patch 4 but I don't see it being used.
Makes sense, will remove in v2.
^ permalink raw reply
* Re: [net,PATCH v2] net: ks8851: Reinstate disabling of BHs around IRQ handler
From: Jakub Kicinski @ 2026-04-13 15:44 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Marek Vasut, netdev, stable, David S. Miller, Andrew Lunn,
Eric Dumazet, Nicolai Buchwitz, Paolo Abeni, Ronald Wahl,
Yicong Hui, linux-kernel, Thomas Gleixner
In-Reply-To: <20260413125744.TVKkZcEK@linutronix.de>
On Mon, 13 Apr 2026 14:57:44 +0200 Sebastian Andrzej Siewior wrote:
> On 2026-04-12 10:51:25 [-0700], Jakub Kicinski wrote:
> > > Does the backtrace make the problem clearer, with the annotation above ?
> >
> > Sebastian, do you have any recommendation here? tl;dr is that the driver does
> …
>
> What about this:
Thanks for taking a look (according to your auto-reply, immediately after
a vacation ;))
TBH changing the driver feels like a workaround / invitation for a
whack-a-mole game. I'd prefer to fix the skb allocation.
Is there any way we can check if any locks which were _irq() on non-RT
are held?
^ permalink raw reply
* Re: [PATCH iwl-next v2] igb: use ktime_get_real helpers in igb_ptp_reset()
From: Simon Horman @ 2026-04-13 15:50 UTC (permalink / raw)
To: Aleksandr Loktionov
Cc: intel-wired-lan, anthony.l.nguyen, netdev, Jacob Keller,
Paul Menzel
In-Reply-To: <20260409075523.3728506-1-aleksandr.loktionov@intel.com>
On Thu, Apr 09, 2026 at 09:55:23AM +0200, Aleksandr Loktionov wrote:
> Replace ktime_to_ns(ktime_get_real()) with the direct equivalent
> ktime_get_real_ns() and ktime_to_timespec64(ktime_get_real()) with
> ktime_get_real_ts64() in igb_ptp_reset(). Using the combined helpers
> makes the intent clearer.
>
> Suggested-by: Jacob Keller <jacob.e.keller@intel.com>
> Suggested-by: Simon Horman <horms@kernel.org>
> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
> Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Reviewed-by: Simon Horman <horms@kernel.org>
^ permalink raw reply
* RE: [Intel-wired-lan] [PATCH iwl-next] ice: fix FDIR CTRL VSI resource leak in ice_reset_all_vfs()
From: Romanowski, Rafal @ 2026-04-13 15:51 UTC (permalink / raw)
To: Simon Horman, Loktionov, Aleksandr
Cc: intel-wired-lan@lists.osuosl.org, Nguyen, Anthony L,
netdev@vger.kernel.org, Dawid Osuchowski
In-Reply-To: <20260403115255.GA60103@horms.kernel.org>
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Simon
> Horman
> Sent: Friday, April 3, 2026 1:57 PM
> To: Loktionov, Aleksandr <aleksandr.loktionov@intel.com>
> Cc: intel-wired-lan@lists.osuosl.org; Nguyen, Anthony L
> <anthony.l.nguyen@intel.com>; netdev@vger.kernel.org; Dawid Osuchowski
> <dawid.osuchowski@linux.intel.com>
> Subject: Re: [Intel-wired-lan] [PATCH iwl-next] ice: fix FDIR CTRL VSI resource
> leak in ice_reset_all_vfs()
>
> On Fri, Mar 27, 2026 at 08:22:32AM +0100, Aleksandr Loktionov wrote:
> > From: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
> >
> > Resetting all VFs causes resource leak on VFs with FDIR filters
> > enabled as CTRL VSIs are only invalidated and not freed. Fix by using
> > ice_vf_ctrl_vsi_release() instead of ice_vf_ctrl_invalidate_vsi()
> > which aligns behavior with the ice_reset_vf() function.
> >
> > Reproduction:
> > echo 1 > /sys/class/net/$pf/device/sriov_numvfs
> > ethtool -N $vf flow-type ether proto 0x9000 action 0
> > echo 1 > /sys/class/net/$pf/device/reset
> >
> > Fixes: da62c5ff9dcd ("ice: Add support for per VF ctrl VSI enabling")
> > Signed-off-by: Dawid Osuchowski <dawid.osuchowski@linux.intel.com>
> > Signed-off-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
>
> Reviewed-by: Simon Horman <horms@kernel.org>
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
^ permalink raw reply
* RE: [Intel-wired-lan] [PATCH iwl-net] ice: fix NULL pointer dereference in ice_reset_all_vfs()
From: Romanowski, Rafal @ 2026-04-13 15:53 UTC (permalink / raw)
To: Oros, Petr, netdev@vger.kernel.org
Cc: Kitszel, Przemyslaw, Brett Creeley, Eric Dumazet,
linux-kernel@vger.kernel.org, Andrew Lunn, Nguyen, Anthony L,
intel-wired-lan@lists.osuosl.org, Jakub Kicinski, Paolo Abeni,
David S. Miller
In-Reply-To: <20260401110937.83497-1-poros@redhat.com>
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Petr
> Oros
> Sent: Wednesday, April 1, 2026 1:10 PM
> To: netdev@vger.kernel.org
> Cc: Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>; Brett Creeley
> <brett.creeley@intel.com>; Eric Dumazet <edumazet@google.com>; linux-
> kernel@vger.kernel.org; Andrew Lunn <andrew+netdev@lunn.ch>; Nguyen,
> Anthony L <anthony.l.nguyen@intel.com>; intel-wired-lan@lists.osuosl.org;
> Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; David S.
> Miller <davem@davemloft.net>
> Subject: [Intel-wired-lan] [PATCH iwl-net] ice: fix NULL pointer dereference in
> ice_reset_all_vfs()
>
> ice_reset_all_vfs() ignores the return value of ice_vf_rebuild_vsi().
> When the VSI rebuild fails (e.g. during NVM firmware update via nvmupdate64e),
> ice_vsi_rebuild() tears down the VSI on its error path, leaving txq_map and
> rxq_map as NULL. The subsequent unconditional call to ice_vf_post_vsi_rebuild()
> leads to a NULL pointer dereference in
> ice_ena_vf_q_mappings() when it accesses vsi->txq_map[0].
>
> The single-VF reset path in ice_reset_vf() already handles this correctly by
> checking the return value of ice_vf_reconfig_vsi() and skipping
> ice_vf_post_vsi_rebuild() on failure.
>
> Apply the same pattern to ice_reset_all_vfs(): check the return value of
> ice_vf_rebuild_vsi() and skip ice_vf_post_vsi_rebuild() and
> ice_eswitch_attach_vf() on failure. The VF is left safely disabled
> (ICE_VF_STATE_INIT not set, VFGEN_RSTAT not set to VFACTIVE) and can be
> recovered via a VFLR triggered by a PCI reset of the VF (sysfs reset or driver
> rebind).
>
> Note that this patch does not prevent the VF VSI rebuild from failing during NVM
> update — the underlying cause is firmware being in a transitional state while the
> EMP reset is processed, which can cause Admin Queue commands (ice_add_vsi,
> ice_cfg_vsi_lan) to fail. This patch only prevents the subsequent NULL pointer
> dereference that crashes the kernel when the rebuild does fail.
>
> crash> bt
> PID: 50795 TASK: ff34c9ee708dc680 CPU: 1 COMMAND: "kworker/u512:5"
> #0 [ff72159bcfe5bb50] machine_kexec at ffffffffaa8850ee
> #1 [ff72159bcfe5bba8] __crash_kexec at ffffffffaaa15fba
> #2 [ff72159bcfe5bc68] crash_kexec at ffffffffaaa16540
> #3 [ff72159bcfe5bc70] oops_end at ffffffffaa837eda
> #4 [ff72159bcfe5bc90] page_fault_oops at ffffffffaa893997
> #5 [ff72159bcfe5bce8] exc_page_fault at ffffffffab528595
> #6 [ff72159bcfe5bd10] asm_exc_page_fault at ffffffffab600bb2
> [exception RIP: ice_ena_vf_q_mappings+0x79]
> RIP: ffffffffc0a85b29 RSP: ff72159bcfe5bdc8 RFLAGS: 00010206
> RAX: 00000000000f0000 RBX: ff34c9efc9c00000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: 0000000000000010 RDI: ff34c9efc9c00000
> RBP: ff34c9efc27d4828 R8: 0000000000000093 R9: 0000000000000040
> R10: ff34c9efc27d4828 R11: 0000000000000040 R12: 0000000000100000
> R13: 0000000000000010 R14: R15:
> ORIG_RAX: ffffffffffffffff CS: 0010 SS: 0018
> #7 [ff72159bcfe5bdf8] ice_sriov_post_vsi_rebuild at ffffffffc0a85e2e [ice]
> #8 [ff72159bcfe5be08] ice_reset_all_vfs at ffffffffc0a920b4 [ice]
> #9 [ff72159bcfe5be48] ice_service_task at ffffffffc0a31519 [ice]
> #10 [ff72159bcfe5be88] process_one_work at ffffffffaa93dca4
> #11 [ff72159bcfe5bec8] worker_thread at ffffffffaa93e9de
> #12 [ff72159bcfe5bf18] kthread at ffffffffaa946663
> #13 [ff72159bcfe5bf50] ret_from_fork at ffffffffaa8086b9
>
> The panic occurs attempting to dereference the NULL pointer in RDX at
> ice_sriov.c:294, which loads vsi->txq_map (offset 0x4b8 in ice_vsi).
>
> The faulting VSI is an allocated slab object but not fully initialized after a failed
> ice_vsi_rebuild():
>
> crash> struct ice_vsi 0xff34c9efc27d4828
> netdev = 0x0,
> rx_rings = 0x0,
> tx_rings = 0x0,
> q_vectors = 0x0,
> txq_map = 0x0,
> rxq_map = 0x0,
> alloc_txq = 0x10,
> num_txq = 0x10,
> alloc_rxq = 0x10,
> num_rxq = 0x10,
>
> The nvmupdate64e process was performing NVM firmware update:
>
> crash> bt 0xff34c9edd1a30000
> PID: 49858 TASK: ff34c9edd1a30000 CPU: 1 COMMAND: "nvmupdate64e"
> #0 [ff72159bcd617618] __schedule at ffffffffab5333f8
> #4 [ff72159bcd617750] ice_sq_send_cmd at ffffffffc0a35347 [ice]
> #5 [ff72159bcd6177a8] ice_sq_send_cmd_retry at ffffffffc0a35b47 [ice]
> #6 [ff72159bcd617810] ice_aq_send_cmd at ffffffffc0a38018 [ice]
> #7 [ff72159bcd617848] ice_aq_read_nvm at ffffffffc0a40254 [ice]
> #8 [ff72159bcd6178b8] ice_read_flat_nvm at ffffffffc0a4034c [ice]
> #9 [ff72159bcd617918] ice_devlink_nvm_snapshot at ffffffffc0a6ffa5 [ice]
>
> dmesg:
> ice 0000:13:00.0: firmware recommends not updating fw.mgmt, as it
> may result in a downgrade. continuing anyways
> ice 0000:13:00.1: ice_init_nvm failed -5
> ice 0000:13:00.1: Rebuild failed, unload and reload driver
>
> Fixes: 12bb018c538c ("ice: Refactor VF reset")
> Signed-off-by: Petr Oros <poros@redhat.com>
> ---
> drivers/net/ethernet/intel/ice/ice_vf_lib.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/intel/ice/ice_vf_lib.c b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> index c8bc952f05cdb5..51259a4fdda4b9 100644
> --- a/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> +++ b/drivers/net/ethernet/intel/ice/ice_vf_lib.c
> @@ -804,7 +804,12 @@ void ice_reset_all_vfs(struct ice_pf *pf)
> ice_vf_ctrl_invalidate_vsi(vf);
>
> ice_vf_pre_vsi_rebuild(vf);
> - ice_vf_rebuild_vsi(vf);
> + if (ice_vf_rebuild_vsi(vf)) {
> + dev_err(dev, "VF %u VSI rebuild failed, leaving VF disabled\n",
> + vf->vf_id);
> + mutex_unlock(&vf->cfg_lock);
> + continue;
> + }
> ice_vf_post_vsi_rebuild(vf);
>
> ice_eswitch_attach_vf(pf, vf);
> --
> 2.52.0
Tested-by: Rafal Romanowski <rafal.romanowski@intel.com>
^ permalink raw reply
* Re: [PATCH 5/5] selftests: net: add rss_multiqueue test variant to iou-zcrx
From: Juanlu Herrero @ 2026-04-13 16:01 UTC (permalink / raw)
To: David Wei; +Cc: netdev
In-Reply-To: <d3777e86-2859-491d-8071-77871df13c08@davidwei.uk>
On Fri, Apr 10, 2026 at 03:26:54PM -0600, David Wei wrote:
> On 2026-04-08 09:38, Juanlu Herrero wrote:
> > Add multi-port support to the iou-zcrx test binary and a new
> > rss_multiqueue Python test variant that exercises multi-queue zero-copy
> > receive with per-port flow rule steering.
> >
> > In multi-port mode, the server creates N listening sockets on
> > consecutive ports (cfg_port, cfg_port+1, ...) and uses epoll to accept
> > one connection per socket. Each client thread connects to its
> > corresponding port. Per-port ntuple flow rules steer traffic to
> > different NIC hardware queues, each with its own zcrx instance.
> >
> > For single-thread mode (the default), behavior is unchanged: one socket
> > on cfg_port, one thread, one queue.
> >
> > Signed-off-by: Juanlu Herrero <juanlu@fastmail.com>
> > ---
> > .../selftests/drivers/net/hw/iou-zcrx.c | 81 ++++++++++++++-----
> > .../selftests/drivers/net/hw/iou-zcrx.py | 45 ++++++++++-
> > 2 files changed, 104 insertions(+), 22 deletions(-)
> >
> > diff --git a/tools/testing/selftests/drivers/net/hw/iou-zcrx.c b/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > index 646682167bb0..1f33d7127185 100644
> > --- a/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
> > +++ b/tools/testing/selftests/drivers/net/hw/iou-zcrx.c
>
> Please make all changes in iou-zcrx.c in a single patch. Then patch 5
> only changes the Python selftest.
>
> [...]
> > @@ -397,12 +410,36 @@ static void run_server(void)
> > if (cfg_dry_run)
> > goto join;
> > + epfd = epoll_create1(0);
> > + if (epfd < 0)
> > + error(1, 0, "epoll_create1()");
> > +
> > for (i = 0; i < cfg_num_threads; i++) {
> > - ctxs[i].connfd = accept(fd, NULL, NULL);
> > - if (ctxs[i].connfd < 0)
> > - error(1, 0, "accept()");
> > + ev.events = EPOLLIN;
> > + ev.data.u32 = i;
> > + if (epoll_ctl(epfd, EPOLL_CTL_ADD, fds[i], &ev) < 0)
> > + error(1, 0, "epoll_ctl()");
> > }
> > + accepted = 0;
> > + while (accepted < cfg_num_threads) {
>
> You're using epoll here but it is still accepting a fixed nr of
> connections. The server should be able to accept an arbitrary nr of
> connections, dispatching them to the server worker threads.
>
> Also with multiple queues, connections must be dispatched according to
> their NAPI IDs to the correct server workers.
>
> > + nfds = epoll_wait(epfd, events, 64, 5000);
> > + if (nfds < 0)
> > + error(1, 0, "epoll_wait()");
> > + if (nfds == 0)
> > + error(1, 0, "epoll_wait() timeout");
> > +
> > + for (i = 0; i < nfds; i++) {
> > + int idx = events[i].data.u32;
> > +
> > + ctxs[idx].connfd = accept(fds[idx], NULL, NULL);
> > + if (ctxs[idx].connfd < 0)
> > + error(1, 0, "accept()");
> > + accepted++;
> > + }
> > + }
> > +
> > + close(epfd);
> > pthread_barrier_wait(&barrier);
> > join:
Makes sense, I will rework the patches and address the epoll and NAPI
ID feedback. Thanks!
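As a rough illustration of the NAPI ID dispatch requested in the review, the sketch below reads the NAPI ID of an accepted connection with the `SO_INCOMING_NAPI_ID` socket option and looks up the matching worker. `struct demo_worker`, `demo_conn_napi_id()` and `demo_pick_worker()` are hypothetical names, and how each worker learns the NAPI ID of its queue is left open. This is an assumption about one possible shape for the rework, not the planned implementation.

```c
#include <stddef.h>
#include <sys/socket.h>

/* Hypothetical per-queue server worker. */
struct demo_worker {
	unsigned int napi_id;	/* NAPI ID of the queue this worker serves */
	int nr_conns;		/* connections dispatched to it so far */
};

/*
 * Return the NAPI ID recorded for connfd, or 0 if it is unknown
 * (option unsupported, bad fd, or no packet seen yet).
 */
static unsigned int demo_conn_napi_id(int connfd)
{
	unsigned int napi_id = 0;
#ifdef SO_INCOMING_NAPI_ID
	socklen_t len = sizeof(napi_id);

	if (getsockopt(connfd, SOL_SOCKET, SO_INCOMING_NAPI_ID,
		       &napi_id, &len) < 0)
		return 0;
#else
	(void)connfd;
#endif
	return napi_id;
}

/* Return the index of the worker serving napi_id, or -1 if none does. */
static int demo_pick_worker(struct demo_worker *workers, size_t nr,
			    unsigned int napi_id)
{
	size_t i;

	for (i = 0; i < nr; i++) {
		if (workers[i].napi_id == napi_id) {
			workers[i].nr_conns++;
			return (int)i;
		}
	}
	return -1;
}
```

An accept loop built this way would call `demo_conn_napi_id()` on each new connection and hand the fd to the worker returned by `demo_pick_worker()`, so the number of accepted connections no longer has to equal the number of threads.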
^ permalink raw reply
* Re: [net,PATCH v2] net: ks8851: Reinstate disabling of BHs around IRQ handler
From: Sebastian Andrzej Siewior @ 2026-04-13 16:03 UTC (permalink / raw)
To: Marek Vasut
Cc: Jakub Kicinski, netdev, stable, David S. Miller, Andrew Lunn,
Eric Dumazet, Nicolai Buchwitz, Paolo Abeni, Ronald Wahl,
Yicong Hui, linux-kernel, Thomas Gleixner
In-Reply-To: <16fdeec9-9208-4c9b-b228-d6c6e045e116@nabladev.com>
On 2026-04-13 17:31:34 [+0200], Marek Vasut wrote:
> > I don't see why it needs to disable interrupts.
>
> Because when the lock is held, the PAR code shouldn't be interrupted by an
> interrupt, otherwise it would completely mess up the state of the KS8851
> MAC. The spinlock does not protect only the IRQ handler, it protects also
> ks8851_start_xmit_par() and ks8851_write_mac_addr() and
> ks8851_read_mac_addr() and ks8851_net_open() and ks8851_net_stop() and other
> sites which call ks8851_lock()/ks8851_unlock() which cannot be executed
> concurrently, but where BHs can be enabled.
I need to check this once my brain is at full power again. But which
interrupt? Your interrupt is threaded, so that should be okay.
> > ? This seems to be used by
> > the _par driver and the _common part. The comments refer to DMA but I
> > see only FIFO access.
>
> The KS8851 does its own internal DMA into the SRAM, from which the data are
> copied by the driver into system DRAM.
So there is no interrupt involved such as "DMA completed", and you do your
manual "memcpy".
> > And while at it, I would recommend to
> >
> > diff --git a/drivers/net/ethernet/micrel/ks8851_common.c b/drivers/net/ethernet/micrel/ks8851_common.c
> > index 8048770958d60..f1c662887646c 100644
> > --- a/drivers/net/ethernet/micrel/ks8851_common.c
> > +++ b/drivers/net/ethernet/micrel/ks8851_common.c
> > @@ -378,9 +378,12 @@ static irqreturn_t ks8851_irq(int irq, void *_ks)
> > if (status & IRQ_LCI)
> > mii_check_link(&ks->mii);
> > - if (status & IRQ_RXI)
> > + if (status & IRQ_RXI) {
> > + local_bh_disable();
> > while ((skb = __skb_dequeue(&rxq)))
> > netif_rx(skb);
> > + local_bh_enable();
> > + }
> > return IRQ_HANDLED;
> > }
> >
> > Because otherwise it will kick-off backlog NAPI after every packet if
> > multiple packets are available.
> I think this patch will do the same, but the above should be done for the
> SPI part ?
Yes, both. The SPI/mutex part does not matter. If you inject one
packet into netif_rx(), it will add it to its internal NAPI,
schedule a softirq and process it. It would be more efficient to queue
multiple packets and process them all at local_bh_enable() time.
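The batching argument can be modelled in plain userspace C. The sketch below is not kernel code: `struct model_cpu` and every `model_*()` function are hypothetical stand-ins. It assumes a simplified model in which netif_rx() called from thread context wraps its enqueue in its own BH-disable/enable pair, so the backlog is flushed whenever the outermost BH-enable runs.

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified per-CPU state for the model. */
struct model_cpu {
	unsigned int bh_disable_depth;	/* nesting depth of BH-disable */
	bool softirq_pending;		/* backlog NAPI has been scheduled */
	size_t backlog;			/* packets waiting in the backlog */
	unsigned int softirq_runs;	/* how often the backlog was processed */
};

/* Process the whole backlog in one go, if anything is pending. */
static void model_run_softirq(struct model_cpu *cpu)
{
	if (!cpu->softirq_pending)
		return;
	cpu->softirq_pending = false;
	cpu->backlog = 0;
	cpu->softirq_runs++;
}

static void model_bh_disable(struct model_cpu *cpu)
{
	cpu->bh_disable_depth++;
}

/* Pending softirqs only run when the outermost section ends. */
static void model_bh_enable(struct model_cpu *cpu)
{
	if (--cpu->bh_disable_depth == 0)
		model_run_softirq(cpu);
}

/* Model of netif_rx() from thread context: enqueue inside its own BH section. */
static void model_netif_rx(struct model_cpu *cpu)
{
	model_bh_disable(cpu);
	cpu->backlog++;
	cpu->softirq_pending = true;
	model_bh_enable(cpu);
}

/* Deliver npkts packets; return how many times the backlog was processed. */
static unsigned int model_deliver(size_t npkts, bool batch)
{
	struct model_cpu cpu = { 0 };
	size_t i;

	if (batch)
		model_bh_disable(&cpu);
	for (i = 0; i < npkts; i++)
		model_netif_rx(&cpu);
	if (batch)
		model_bh_enable(&cpu);
	return cpu.softirq_runs;
}
```

In this model, delivering four packets without the outer section processes the backlog four times, while wrapping the loop processes it once. That is the effect the suggested local_bh_disable()/local_bh_enable() pair around the dequeue loop aims for.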
Sebastian
^ permalink raw reply
* Re: [PATCH bpf-next v2 1/1] bpf: Refactor dynptr mutability tracking
From: Mykyta Yatsenko @ 2026-04-13 16:05 UTC (permalink / raw)
To: Amery Hung, bpf
Cc: netdev, alexei.starovoitov, andrii, daniel, memxor, eddyz87,
yatsenko, martin.lau, kernel-team
In-Reply-To: <20260402065013.884228-2-ameryhung@gmail.com>
On 4/2/26 7:50 AM, Amery Hung wrote:
> Redefine dynptr mutability and fix inconsistency in the verifier and
> kfunc signatures. Dynptr mutability is at two levels. The first is
> the bpf_dynptr structure and the second is the memory the dynptr points
> to. The verifier currently tracks the mutability of the bpf_dynptr struct
> through helper and kfunc prototypes, where "const struct bpf_dynptr *"
> means the structure itself is immutable. The second level is tracked
> in upper bit of bpf_dynptr->size in runtime and is not changed in this
> patch.
>
> There are two types of inconsistency in the verifier regarding the
> mutability of the bpf_dynptr struct. First, there are many existing
> kfuncs whose prototypes are wrong. For example, bpf_dynptr_adjust()
> mutates a dynptr's start and offset but marks the argument as a const
> pointer. At the same time, many other kfuncs do not mutate the
> dynptr but mark the argument as mutable. Second, the verifier currently
> does not honor the const qualifier in kfunc prototypes, as it decides
> whether to tag the arg_type with MEM_RDONLY based on the register
> state.
>
> Since all the verifier cares about is preventing CONST_PTR_TO_DYNPTR from
> being destroyed in callbacks and global subprograms, redefine the
> mutability at the bpf_dynptr level to just bpf_dynptr_kern->data. Then,
> explicitly prohibit passing CONST_PTR_TO_DYNPTR to an argument tagged
> with MEM_UNINIT or OBJ_RELEASE. The mutability of a dynptr's view is not
> really interesting, so drop the MEM_RDONLY annotation for dynptr from the
> helpers and kfuncs. Plus, if the mutability of the entire bpf_dynptr
> were to be done correctly, it would kill the bpf_dynptr_adjust() usage
> in callbacks and global subprograms.
>
> Implementation wise
>
> - First, make sure all kfunc arg are correctly tagged: Tag the dynptr
> argument of bpf_dynptr_file_discard() with OBJ_RELEASE.
> - Then, in process_dynptr_func(), make sure CONST_PTR_TO_DYNPTR cannot
> be passed to argument tagged with MEM_UNINIT or OBJ_RELEASE. For
> MEM_UNINIT, it is already checked by is_dynptr_reg_valid_uninit().
> For OBJ_RELEASE, check against OBJ_RELEASE instead of MEM_RDONLY and
> drop a now identical check in unmark_stack_slots_dynptr().
> - Remove the mutual exclusive check between MEM_UNINIT and MEM_RDONLY,
> but don't add a MEM_UNINIT and OBJ_RELEASE version as it is obviously
> wrong.
>
> Note that while this patch stops following the C semantic for the
> mutability of bpf_dynptr, the prototype of kfuncs are still fixed to
> maintain the correct C semantics in the helper implementation. Adding or
> removing the const qualifier does not break backward compatibility.
>
> In test_kfunc_dynptr_param.c, initialize dynptr to 0 to avoid
> -Wuninitialized-const-pointer warning.
>
> Signed-off-by: Amery Hung <ameryhung@gmail.com>
> ---
> fs/verity/measure.c | 2 +-
> include/linux/bpf.h | 8 +--
> kernel/bpf/btf.c | 2 +-
> kernel/bpf/helpers.c | 18 ++---
> kernel/bpf/verifier.c | 68 +++++--------------
> kernel/trace/bpf_trace.c | 18 ++---
> tools/testing/selftests/bpf/bpf_kfuncs.h | 8 +--
> .../selftests/bpf/progs/dynptr_success.c | 6 +-
> .../bpf/progs/test_kfunc_dynptr_param.c | 9 +--
> 9 files changed, 51 insertions(+), 88 deletions(-)
>
> diff --git a/fs/verity/measure.c b/fs/verity/measure.c
> index 6a35623ebdf0..3840436e4510 100644
> --- a/fs/verity/measure.c
> +++ b/fs/verity/measure.c
> @@ -118,7 +118,7 @@ __bpf_kfunc_start_defs();
> *
> * Return: 0 on success, a negative value on error.
> */
> -__bpf_kfunc int bpf_get_fsverity_digest(struct file *file, struct bpf_dynptr *digest_p)
> +__bpf_kfunc int bpf_get_fsverity_digest(struct file *file, const struct bpf_dynptr *digest_p)
> {
> struct bpf_dynptr_kern *digest_ptr = (struct bpf_dynptr_kern *)digest_p;
Maybe we can make digest_ptr const as well; otherwise it is a little
strange to introduce const but immediately cast to a non-const kernel
struct. I think the same applies to other kfuncs.
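A small userspace sketch of that suggestion, using hypothetical stand-in types (`struct demo_dynptr`, `struct demo_dynptr_kern`, `demo_fill()`) rather than the real kernel definitions: the internal pointer keeps the const qualifier, yet the memory behind `data` stays writable, because const applies only to the struct fields and not to what `data` points at.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical opaque, program-facing type. */
struct demo_dynptr {
	unsigned long long opaque[2];
};

/* Hypothetical kernel-side view of the same storage. */
struct demo_dynptr_kern {
	void *data;
	unsigned int size;
	unsigned int offset;
};

/*
 * Fill the dynptr's window with 'byte'. The cast keeps const: the function
 * may not change data/size/offset, but may still write through data.
 */
static int demo_fill(const struct demo_dynptr *p, unsigned char byte)
{
	const struct demo_dynptr_kern *kern =
		(const struct demo_dynptr_kern *)p;

	if (!kern->data)
		return -1;

	memset((char *)kern->data + kern->offset, byte, kern->size);
	return 0;
}
```

This mirrors the two levels of mutability described in the commit message: an assignment such as `kern->size = 0` would be rejected by the compiler here, while writing into the buffer is still allowed.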
> const struct inode *inode = file_inode(file);
> diff --git a/include/linux/bpf.h b/include/linux/bpf.h
> index 05b34a6355b0..329b78940b79 100644
> --- a/include/linux/bpf.h
> +++ b/include/linux/bpf.h
> @@ -3621,8 +3621,8 @@ static inline int bpf_fd_reuseport_array_update_elem(struct bpf_map *map,
> struct bpf_key *bpf_lookup_user_key(s32 serial, u64 flags);
> struct bpf_key *bpf_lookup_system_key(u64 id);
> void bpf_key_put(struct bpf_key *bkey);
> -int bpf_verify_pkcs7_signature(struct bpf_dynptr *data_p,
> - struct bpf_dynptr *sig_p,
> +int bpf_verify_pkcs7_signature(const struct bpf_dynptr *data_p,
> + const struct bpf_dynptr *sig_p,
> struct bpf_key *trusted_keyring);
>
> #else
...
> err = mark_stack_slots_dynptr(env, reg, arg_type, insn_idx, clone_ref_obj_id);
> - } else /* MEM_RDONLY and None case from above */ {
> + } else /* OBJ_RELEASE and None case from above */ {
> /* For the reg->type == PTR_TO_STACK case, bpf_dynptr is never const */
> - if (reg->type == CONST_PTR_TO_DYNPTR && !(arg_type & MEM_RDONLY)) {
> - verbose(env, "cannot pass pointer to const bpf_dynptr, the helper mutates it\n");
> + if (reg->type == CONST_PTR_TO_DYNPTR && (arg_type & OBJ_RELEASE)) {
> + verbose(env, "CONST_PTR_TO_DYNPTR cannot be released");
The trailing \n is missing in this verbose() message.
> return -EINVAL;
> }
>
> @@ -8958,7 +8929,7 @@ static int process_dynptr_func(struct bpf_verifier_env *env, int regno, int insn
> return -EINVAL;
> }
>
> - /* Fold modifiers (in this case, MEM_RDONLY) when checking expected type */
> + /* Fold modifiers (in this case, OBJ_RELEASE) when checking expected type */
> if (!is_dynptr_type_expected(env, reg, arg_type & ~MEM_RDONLY)) {
Do we need to update the `is_dynptr_type_expected(env, reg, arg_type &
~MEM_RDONLY)` call, since MEM_RDONLY is no longer applied to the dynptr?
> verbose(env,
> "Expected a dynptr of type %s as arg #%d\n",
> @@ -10803,7 +10774,7 @@ static int btf_check_func_arg_match(struct bpf_verifier_env *env, int subprog,
> bpf_log(log, "R%d is not a pointer to arena or scalar.\n", regno);
> return -EINVAL;
> }
> - } else if (arg->arg_type == (ARG_PTR_TO_DYNPTR | MEM_RDONLY)) {
> + } else if (arg->arg_type == ARG_PTR_TO_DYNPTR) {
> ret = check_func_arg_reg_off(env, reg, regno, ARG_PTR_TO_DYNPTR);
> if (ret)
> return ret;
> @@ -13718,9 +13689,6 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
> enum bpf_arg_type dynptr_arg_type = ARG_PTR_TO_DYNPTR;
> int clone_ref_obj_id = 0;
>
> - if (reg->type == CONST_PTR_TO_DYNPTR)
> - dynptr_arg_type |= MEM_RDONLY;
> -
> if (is_kfunc_arg_uninit(btf, &args[i]))
> dynptr_arg_type |= MEM_UNINIT;
>
> @@ -13733,7 +13701,7 @@ static int check_kfunc_args(struct bpf_verifier_env *env, struct bpf_kfunc_call_
> } else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_from_file]) {
> dynptr_arg_type |= DYNPTR_TYPE_FILE;
> } else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_file_discard]) {
> - dynptr_arg_type |= DYNPTR_TYPE_FILE;
> + dynptr_arg_type |= DYNPTR_TYPE_FILE | OBJ_RELEASE;
> meta->release_regno = regno;
> } else if (meta->func_id == special_kfunc_list[KF_bpf_dynptr_clone] &&
> (dynptr_arg_type & MEM_UNINIT)) {
> @@ -24785,7 +24753,7 @@ static int do_check_common(struct bpf_verifier_env *env, int subprog)
> } else if (arg->arg_type == ARG_ANYTHING) {
> reg->type = SCALAR_VALUE;
> mark_reg_unknown(env, regs, i);
> - } else if (arg->arg_type == (ARG_PTR_TO_DYNPTR | MEM_RDONLY)) {
> + } else if (arg->arg_type == ARG_PTR_TO_DYNPTR) {
> /* assume unspecial LOCAL dynptr type */
> __mark_dynptr_reg(reg, BPF_DYNPTR_TYPE_LOCAL, true, ++env->id_gen);
> } else if (base_type(arg->arg_type) == ARG_PTR_TO_MEM) {
> diff --git a/kernel/trace/bpf_trace.c b/kernel/trace/bpf_trace.c
> index 0b040a417442..5f35ecdd5341 100644
> --- a/kernel/trace/bpf_trace.c
> +++ b/kernel/trace/bpf_trace.c
> @@ -3393,7 +3393,7 @@ typedef int (*copy_fn_t)(void *dst, const void *src, u32 size, struct task_struc
> * direct calls into all the specific callback implementations
> * (copy_user_data_sleepable, copy_user_data_nofault, and so on)
> */
> -static __always_inline int __bpf_dynptr_copy_str(struct bpf_dynptr *dptr, u64 doff, u64 size,
> +static __always_inline int __bpf_dynptr_copy_str(const struct bpf_dynptr *dptr, u64 doff, u64 size,
> const void *unsafe_src,
> copy_fn_t str_copy_fn,
> struct task_struct *tsk)
> @@ -3535,49 +3535,49 @@ __bpf_kfunc int bpf_send_signal_task(struct task_struct *task, int sig, enum pid
> return bpf_send_signal_common(sig, type, task, value);
> }
>
> -__bpf_kfunc int bpf_probe_read_user_dynptr(struct bpf_dynptr *dptr, u64 off,
> +__bpf_kfunc int bpf_probe_read_user_dynptr(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void __user *unsafe_ptr__ign)
> {
> return __bpf_dynptr_copy(dptr, off, size, (const void __force *)unsafe_ptr__ign,
> copy_user_data_nofault, NULL);
> }
>
> -__bpf_kfunc int bpf_probe_read_kernel_dynptr(struct bpf_dynptr *dptr, u64 off,
> +__bpf_kfunc int bpf_probe_read_kernel_dynptr(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void *unsafe_ptr__ign)
> {
> return __bpf_dynptr_copy(dptr, off, size, unsafe_ptr__ign,
> copy_kernel_data_nofault, NULL);
> }
>
> -__bpf_kfunc int bpf_probe_read_user_str_dynptr(struct bpf_dynptr *dptr, u64 off,
> +__bpf_kfunc int bpf_probe_read_user_str_dynptr(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void __user *unsafe_ptr__ign)
> {
> return __bpf_dynptr_copy_str(dptr, off, size, (const void __force *)unsafe_ptr__ign,
> copy_user_str_nofault, NULL);
> }
>
> -__bpf_kfunc int bpf_probe_read_kernel_str_dynptr(struct bpf_dynptr *dptr, u64 off,
> +__bpf_kfunc int bpf_probe_read_kernel_str_dynptr(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void *unsafe_ptr__ign)
> {
> return __bpf_dynptr_copy_str(dptr, off, size, unsafe_ptr__ign,
> copy_kernel_str_nofault, NULL);
> }
>
> -__bpf_kfunc int bpf_copy_from_user_dynptr(struct bpf_dynptr *dptr, u64 off,
> +__bpf_kfunc int bpf_copy_from_user_dynptr(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void __user *unsafe_ptr__ign)
> {
> return __bpf_dynptr_copy(dptr, off, size, (const void __force *)unsafe_ptr__ign,
> copy_user_data_sleepable, NULL);
> }
>
> -__bpf_kfunc int bpf_copy_from_user_str_dynptr(struct bpf_dynptr *dptr, u64 off,
> +__bpf_kfunc int bpf_copy_from_user_str_dynptr(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void __user *unsafe_ptr__ign)
> {
> return __bpf_dynptr_copy_str(dptr, off, size, (const void __force *)unsafe_ptr__ign,
> copy_user_str_sleepable, NULL);
> }
>
> -__bpf_kfunc int bpf_copy_from_user_task_dynptr(struct bpf_dynptr *dptr, u64 off,
> +__bpf_kfunc int bpf_copy_from_user_task_dynptr(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void __user *unsafe_ptr__ign,
> struct task_struct *tsk)
> {
> @@ -3585,7 +3585,7 @@ __bpf_kfunc int bpf_copy_from_user_task_dynptr(struct bpf_dynptr *dptr, u64 off,
> copy_user_data_sleepable, tsk);
> }
>
> -__bpf_kfunc int bpf_copy_from_user_task_str_dynptr(struct bpf_dynptr *dptr, u64 off,
> +__bpf_kfunc int bpf_copy_from_user_task_str_dynptr(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void __user *unsafe_ptr__ign,
> struct task_struct *tsk)
> {
> diff --git a/tools/testing/selftests/bpf/bpf_kfuncs.h b/tools/testing/selftests/bpf/bpf_kfuncs.h
> index 7dad01439391..ae71e9b69051 100644
> --- a/tools/testing/selftests/bpf/bpf_kfuncs.h
> +++ b/tools/testing/selftests/bpf/bpf_kfuncs.h
> @@ -40,7 +40,7 @@ extern void *bpf_dynptr_slice(const struct bpf_dynptr *ptr, __u64 offset,
> extern void *bpf_dynptr_slice_rdwr(const struct bpf_dynptr *ptr, __u64 offset, void *buffer,
> __u64 buffer__szk) __ksym __weak;
>
> -extern int bpf_dynptr_adjust(const struct bpf_dynptr *ptr, __u64 start, __u64 end) __ksym __weak;
> +extern int bpf_dynptr_adjust(struct bpf_dynptr *ptr, __u64 start, __u64 end) __ksym __weak;
> extern bool bpf_dynptr_is_null(const struct bpf_dynptr *ptr) __ksym __weak;
> extern bool bpf_dynptr_is_rdonly(const struct bpf_dynptr *ptr) __ksym __weak;
> extern __u64 bpf_dynptr_size(const struct bpf_dynptr *ptr) __ksym __weak;
> @@ -70,13 +70,13 @@ extern void *bpf_rdonly_cast(const void *obj, __u32 btf_id) __ksym __weak;
>
> extern int bpf_get_file_xattr(struct file *file, const char *name,
> struct bpf_dynptr *value_ptr) __ksym;
> -extern int bpf_get_fsverity_digest(struct file *file, struct bpf_dynptr *digest_ptr) __ksym;
> +extern int bpf_get_fsverity_digest(struct file *file, const struct bpf_dynptr *digest_ptr) __ksym;
>
> extern struct bpf_key *bpf_lookup_user_key(__s32 serial, __u64 flags) __ksym;
> extern struct bpf_key *bpf_lookup_system_key(__u64 id) __ksym;
> extern void bpf_key_put(struct bpf_key *key) __ksym;
> -extern int bpf_verify_pkcs7_signature(struct bpf_dynptr *data_ptr,
> - struct bpf_dynptr *sig_ptr,
> +extern int bpf_verify_pkcs7_signature(const struct bpf_dynptr *data_ptr,
> + const struct bpf_dynptr *sig_ptr,
> struct bpf_key *trusted_keyring) __ksym;
>
> struct dentry;
> diff --git a/tools/testing/selftests/bpf/progs/dynptr_success.c b/tools/testing/selftests/bpf/progs/dynptr_success.c
> index e0d672d93adf..e0745b6e467e 100644
> --- a/tools/testing/selftests/bpf/progs/dynptr_success.c
> +++ b/tools/testing/selftests/bpf/progs/dynptr_success.c
> @@ -914,7 +914,7 @@ void *user_ptr;
> char expected_str[384];
> __u32 test_len[7] = {0/* placeholder */, 0, 1, 2, 255, 256, 257};
>
> -typedef int (*bpf_read_dynptr_fn_t)(struct bpf_dynptr *dptr, u64 off,
> +typedef int (*bpf_read_dynptr_fn_t)(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void *unsafe_ptr);
>
> /* Returns the offset just before the end of the maximum sized xdp fragment.
> @@ -1106,7 +1106,7 @@ int test_copy_from_user_str_dynptr(void *ctx)
> return 0;
> }
>
> -static int bpf_copy_data_from_user_task(struct bpf_dynptr *dptr, u64 off,
> +static int bpf_copy_data_from_user_task(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void *unsafe_ptr)
> {
> struct task_struct *task = bpf_get_current_task_btf();
> @@ -1114,7 +1114,7 @@ static int bpf_copy_data_from_user_task(struct bpf_dynptr *dptr, u64 off,
> return bpf_copy_from_user_task_dynptr(dptr, off, size, unsafe_ptr, task);
> }
>
> -static int bpf_copy_data_from_user_task_str(struct bpf_dynptr *dptr, u64 off,
> +static int bpf_copy_data_from_user_task_str(const struct bpf_dynptr *dptr, u64 off,
> u64 size, const void *unsafe_ptr)
> {
> struct task_struct *task = bpf_get_current_task_btf();
> diff --git a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
> index d249113ed657..1c6cfd0888ba 100644
> --- a/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
> +++ b/tools/testing/selftests/bpf/progs/test_kfunc_dynptr_param.c
> @@ -11,12 +11,7 @@
> #include <bpf/bpf_helpers.h>
> #include <bpf/bpf_tracing.h>
> #include "bpf_misc.h"
> -
> -extern struct bpf_key *bpf_lookup_system_key(__u64 id) __ksym;
> -extern void bpf_key_put(struct bpf_key *key) __ksym;
> -extern int bpf_verify_pkcs7_signature(struct bpf_dynptr *data_ptr,
> - struct bpf_dynptr *sig_ptr,
> - struct bpf_key *trusted_keyring) __ksym;
> +#include "bpf_kfuncs.h"
>
> struct {
> __uint(type, BPF_MAP_TYPE_RINGBUF);
> @@ -38,7 +33,7 @@ SEC("?lsm.s/bpf")
> __failure __msg("cannot pass in dynptr at an offset=-8")
> int BPF_PROG(not_valid_dynptr, int cmd, union bpf_attr *attr, unsigned int size, bool kernel)
> {
> - unsigned long val;
> + unsigned long val = 0;
>
> return bpf_verify_pkcs7_signature((struct bpf_dynptr *)&val,
> (struct bpf_dynptr *)&val, NULL);
^ permalink raw reply
* Re: [net,PATCH v2] net: ks8851: Reinstate disabling of BHs around IRQ handler
From: Sebastian Andrzej Siewior @ 2026-04-13 16:10 UTC (permalink / raw)
To: Jakub Kicinski
Cc: Marek Vasut, netdev, stable, David S. Miller, Andrew Lunn,
Eric Dumazet, Nicolai Buchwitz, Paolo Abeni, Ronald Wahl,
Yicong Hui, linux-kernel, Thomas Gleixner
In-Reply-To: <20260413084445.59fe28d6@kernel.org>
On 2026-04-13 08:44:45 [-0700], Jakub Kicinski wrote:
> On Mon, 13 Apr 2026 14:57:44 +0200 Sebastian Andrzej Siewior wrote:
> > On 2026-04-12 10:51:25 [-0700], Jakub Kicinski wrote:
> > > > Does the backtrace make the problem clearer, with the annotation above ?
> > >
> > > Sebastian, do you have any recommendation here? tl;dr is that the driver does
> > …
> >
> > What about this:
>
> Thanks for taking a look (according to your auto-reply immediately after
> a vacation ;))
;)
> TBH changing the driver feels like a workaround / invitation for a
> whack-a-mole game. I'd prefer to fix the skb allocation.
The problem is that on non-RT _irq() implicitly disables bh processing but
on RT this does not happen. Forcing this is possible but expensive.
However, I did remove lock from bh_disable() on RT.
Marek: from which kernel version was this backtrace?
> Is there any way we can check if any locks which were _irq() on non-RT
> are held?
lockdep has a list of locks which are acquired but it does not record
whether a lock was taken with _irq() or not. It only records that it was
acquired.
Sebastian
^ permalink raw reply
* Re: [PATCH net] net: usb: cdc_ncm: reject negative chained NDP offsets
From: Bjørn Mork @ 2026-04-13 16:20 UTC (permalink / raw)
To: Greg Kroah-Hartman
Cc: Oliver Neukum, linux-usb, netdev, linux-kernel, Oliver Neukum,
Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, stable
In-Reply-To: <2026041355-designate-spiritual-e785@gregkh>
Greg Kroah-Hartman <gregkh@linuxfoundation.org> writes:
> On Mon, Apr 13, 2026 at 02:11:50PM +0200, Oliver Neukum wrote:
>> On 13.04.26 12:43, Greg Kroah-Hartman wrote:
>> > On Mon, Apr 13, 2026 at 10:36:19AM +0200, Oliver Neukum wrote:
>> > >
>> > >
>> > > On 11.04.26 12:53, Greg Kroah-Hartman wrote:
>> > > > cdc_ncm_rx_fixup() reads dwNextNdpIndex from each NDP32 to chain to the
>> > > > next one. The 32-bit value from the device is stored into the signed
>> > > > int ndpoffset so that means values with the high bit set become
>> > >
>> > > Well, then isn't the problem rather that you should not store an
>> > > unsigned value in a signed variable?
>> >
>> > No. well, yes. but no.
>> >
>> > cdc_ncm_rx_verify_nth16() returns an int, and is negative if something
>> > went wrong, so we need it that way, and then we need to check it, like
>> > we properly do at the top of the loop, it's just that at the bottom of
>> > the loop we also need to do the same exact thing.
>>
>> Doesn't that suggest that cdc_ncm_rx_verify_nth16() is the problem?
>> To be precise, the way it indicates errors?
>> As this is an offset into a buffer and the header must be at the start
>> of the buffer, isn't 0 the natural indication of an error?
>
> Maybe? I really don't know, sorry, parsing the cdc_ncm buffer is not
> something I looked too deeply into :)
Oliver is correct AFAICS. These functions could use 0 to indicate
errors. This would make the code simpler and cleaner.
The negative error return is just a sloppy choice I made at a time when we
only supported the 16-bit versions. I didn't anticipate 32-bit support
since it is optional and pointless. But as usual, hardware vendors do
surprising things.
Note that cdc_mbim.c must be updated if cdc_ncm_rx_verify_nth16() is
changed.
Bjørn
^ permalink raw reply
* [PATCH v1 net 1/1] net/sched: sch_dualpi2: fix limit/memlimit enforcement when dequeueing L-queue
From: chia-yu.chang @ 2026-04-13 16:37 UTC (permalink / raw)
To: linux-hardening, kees, gustavoars, jhs, jiri, davem, edumazet,
kuba, pabeni, linux-kernel, netdev, horms, ij, ncardwell,
koen.de_schepper, g.white, ingemar.s.johansson, mirja.kuehlewind,
cheshire, rs.ietf, Jason_Livingood, vidhi_goel
Cc: Chia-Yu Chang
From: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
Fix dualpi2_change() to correctly enforce updated limit and memlimit values
after a configuration change of the dualpi2 qdisc.
Before this patch, dualpi2_change() always attempted to dequeue packets via
the root qdisc (C-queue) when reducing backlog or memory usage, and
unconditionally assumed that a valid skb would be returned. When traffic
classification results in packets being queued in the L-queue while the
C-queue is empty, this leads to a NULL skb dereference during limit or
memlimit enforcement.
This is fixed by first dequeuing from the C-queue path if it is non-empty.
Once the C-queue is empty, packets are dequeued directly from the L-queue.
Return values from qdisc_dequeue_internal() are checked for both queues. When
dequeuing from the L-queue, the parent qdisc qlen and backlog counters are
updated explicitly to keep overall qdisc statistics consistent.
Fixes: 320d031ad6e4 ("sched: Struct definition and parsing of dualpi2 qdisc")
Signed-off-by: Chia-Yu Chang <chia-yu.chang@nokia-bell-labs.com>
---
net/sched/sch_dualpi2.c | 24 +++++++++++++++++++-----
1 file changed, 19 insertions(+), 5 deletions(-)
diff --git a/net/sched/sch_dualpi2.c b/net/sched/sch_dualpi2.c
index 6d7e6389758d..56d4422970b6 100644
--- a/net/sched/sch_dualpi2.c
+++ b/net/sched/sch_dualpi2.c
@@ -872,11 +872,25 @@ static int dualpi2_change(struct Qdisc *sch, struct nlattr *opt,
 	old_backlog = sch->qstats.backlog;
 	while (qdisc_qlen(sch) > sch->limit ||
 	       q->memory_used > q->memory_limit) {
-		struct sk_buff *skb = qdisc_dequeue_internal(sch, true);
-
-		q->memory_used -= skb->truesize;
-		qdisc_qstats_backlog_dec(sch, skb);
-		rtnl_qdisc_drop(skb, sch);
+		int c_len = qdisc_qlen(sch) - qdisc_qlen(q->l_queue);
+		struct sk_buff *skb = NULL;
+
+		if (c_len) {
+			skb = qdisc_dequeue_internal(sch, true);
+			if (!skb)
+				break;
+			q->memory_used -= skb->truesize;
+			rtnl_qdisc_drop(skb, sch);
+		} else if (qdisc_qlen(q->l_queue)) {
+			skb = qdisc_dequeue_internal(q->l_queue, true);
+			if (!skb)
+				break;
+			q->memory_used -= skb->truesize;
+			rtnl_qdisc_drop(skb, q->l_queue);
+			/* Keep the overall qdisc stats consistent */
+			--sch->q.qlen;
+			qdisc_qstats_backlog_dec(sch, skb);
+		}
 	}
 	qdisc_tree_reduce_backlog(sch, old_qlen - qdisc_qlen(sch),
 				  old_backlog - sch->qstats.backlog);
--
2.34.1
^ permalink raw reply related