* [PATCH 6.1.y] net: dsa: clean up FDB, MDB, VLAN entries on unbind
From: Alva Lan @ 2026-04-21 7:36 UTC
To: gregkh, sashal, stable; +Cc: netdev, Vladimir Oltean, Jakub Kicinski, Alva Lan
From: Vladimir Oltean <vladimir.oltean@nxp.com>
[ Upstream commit 7afb5fb42d4950f33af2732b8147c552659f79b7 ]
As explained in many places such as commit b117e1e8a86d ("net: dsa:
delete dsa_legacy_fdb_add and dsa_legacy_fdb_del"), DSA is written given
the assumption that higher layers have balanced additions/deletions.
As such, it only makes sense to be extremely vocal when those
assumptions are violated and the driver unbinds with entries still
present.
But Ido Schimmel points out a very simple situation where that is wrong:
https://lore.kernel.org/netdev/ZDazSM5UsPPjQuKr@shredder/
(also briefly discussed by me in the aforementioned commit).
Basically, while the bridge bypass operations are not something that DSA
explicitly documents, and for the majority of DSA drivers this API
simply causes them to go to promiscuous mode, that isn't the case for
all drivers. Some have the necessary requirements for bridge bypass
operations to do something useful - see dsa_switch_supports_uc_filtering().
Although in tools/testing/selftests/net/forwarding/local_termination.sh,
we made an effort to popularize better mechanisms to manage address
filters on DSA interfaces from user space - namely macvlan for unicast,
and setsockopt(IP_ADD_MEMBERSHIP) - through mtools - for multicast, the
fact is that 'bridge fdb add ... self static local' also exists as
kernel UAPI, and might be useful to someone, even if only for a quick
hack.
It seems counter-productive to block that path by implementing shim
.ndo_fdb_add and .ndo_fdb_del operations which just return -EOPNOTSUPP
in order to prevent the ndo_dflt_fdb_add() and ndo_dflt_fdb_del() from
running, although we could do that.
Accepting that cleanup is necessary seems to be the only option.
Especially since we appear to be coming back at this from a different
angle as well. Russell King is noticing that the WARN_ON() triggers even
for VLANs:
https://lore.kernel.org/netdev/Z_li8Bj8bD4-BYKQ@shell.armlinux.org.uk/
What happens in the bug report above is that dsa_port_do_vlan_del() fails,
then the VLAN entry lingers on, and then we warn on unbind and leak it.
This is not a straight revert of the blamed commit, but we now add an
informational print to the kernel log (to still have a way to see
that bugs exist), and some extra comments gathered from past years'
experience, to justify the logic.
Fixes: 0832cd9f1f02 ("net: dsa: warn if port lists aren't empty in dsa_port_teardown")
Signed-off-by: Vladimir Oltean <vladimir.oltean@nxp.com>
Link: https://patch.msgid.link/20250414212930.2956310-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski <kuba@kernel.org>
[ Apply the patch to net/dsa/dsa2.c in v6.1 since commit
47d2ce03dcfb ("net: dsa: rename dsa2.c back into dsa.c and create its header")
renamed this file to net/dsa/dsa.c starting from v6.2. ]
Signed-off-by: Alva Lan <alvalan9@foxmail.com>
---
net/dsa/dsa2.c | 38 +++++++++++++++++++++++++++++++++++---
1 file changed, 35 insertions(+), 3 deletions(-)
diff --git a/net/dsa/dsa2.c b/net/dsa/dsa2.c
index 415e856ba0ac..9ecb5e34e484 100644
--- a/net/dsa/dsa2.c
+++ b/net/dsa/dsa2.c
@@ -1738,12 +1738,44 @@ static int dsa_switch_parse(struct dsa_switch *ds, struct dsa_chip_data *cd)
static void dsa_switch_release_ports(struct dsa_switch *ds)
{
+ struct dsa_mac_addr *a, *tmp;
struct dsa_port *dp, *next;
+ struct dsa_vlan *v, *n;
dsa_switch_for_each_port_safe(dp, next, ds) {
- WARN_ON(!list_empty(&dp->fdbs));
- WARN_ON(!list_empty(&dp->mdbs));
- WARN_ON(!list_empty(&dp->vlans));
+ /* These are either entries that upper layers lost track of
+ * (probably due to bugs), or installed through interfaces
+ * where one does not necessarily have to remove them, like
+ * ndo_dflt_fdb_add().
+ */
+ list_for_each_entry_safe(a, tmp, &dp->fdbs, list) {
+ dev_info(ds->dev,
+ "Cleaning up unicast address %pM vid %u from port %d\n",
+ a->addr, a->vid, dp->index);
+ list_del(&a->list);
+ kfree(a);
+ }
+
+ list_for_each_entry_safe(a, tmp, &dp->mdbs, list) {
+ dev_info(ds->dev,
+ "Cleaning up multicast address %pM vid %u from port %d\n",
+ a->addr, a->vid, dp->index);
+ list_del(&a->list);
+ kfree(a);
+ }
+
+ /* These are entries that upper layers have lost track of,
+ * probably due to bugs, but also due to dsa_port_do_vlan_del()
+ * having failed and the VLAN entry still lingering on.
+ */
+ list_for_each_entry_safe(v, n, &dp->vlans, list) {
+ dev_info(ds->dev,
+ "Cleaning up vid %u from port %d\n",
+ v->vid, dp->index);
+ list_del(&v->list);
+ kfree(v);
+ }
+
list_del(&dp->list);
kfree(dp);
}
--
2.43.0
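A minimal userspace sketch of the cleanup pattern in the patch above, for
readers without a kernel tree at hand. It is not the DSA code: struct
vlan_entry, vlan_add() and vlan_cleanup() are hypothetical names, a
hand-rolled singly linked list stands in for the kernel's list_head, and
printf() stands in for dev_info(). What it illustrates is the reason
list_for_each_entry_safe() is used: the successor is saved before the
current entry is freed.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical userspace stand-in for an entry on dp->vlans. */
struct vlan_entry {
	unsigned int vid;
	struct vlan_entry *next;
};

/* Push a new entry at the head; returns 0 on success, -1 on allocation failure. */
int vlan_add(struct vlan_entry **head, unsigned int vid)
{
	struct vlan_entry *v;

	assert(head);
	v = malloc(sizeof(*v));
	if (!v)
		return -1;
	v->vid = vid;
	v->next = *head;
	*head = v;
	return 0;
}

/* Free every leftover entry, logging each one; returns how many were freed. */
int vlan_cleanup(struct vlan_entry **head, int port)
{
	struct vlan_entry *v, *n;
	int freed = 0;

	assert(head);
	for (v = *head; v; v = n) {
		n = v->next;	/* save the successor before freeing v */
		printf("Cleaning up vid %u from port %d\n", v->vid, port);
		free(v);
		freed++;
	}
	*head = NULL;
	return freed;
}
```

Calling vlan_cleanup() on a list holding two entries logs both and returns
2; a second call on the now-empty list returns 0.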
* [PATCH] ieee802154: ca8210: fix cas_ctl leak on spi_async failure
From: Shitalkumar Gandhi @ 2026-04-21 7:32 UTC
To: alex.aring, stefan, miquel.raynal
Cc: andrew+netdev, davem, edumazet, kuba, pabeni, linux-wpan, netdev,
linux-kernel, stable, Shitalkumar Gandhi
ca8210_spi_transfer() allocates cas_ctl with kzalloc_obj(GFP_ATOMIC)
and relies entirely on the SPI completion callback
ca8210_spi_transfer_complete() to free it.
The spi_async() API only invokes the completion callback on successful
submission. On failure it returns a negative error code without ever
queuing the callback, which leaves cas_ctl and its embedded spi_message
and spi_transfer orphaned. Every kfree(cas_ctl) in the driver is
inside the completion callback, so there is no other reclamation path.
ca8210_spi_transfer() is called from ca8210_spi_exchange(), the
interrupt handler ca8210_interrupt_handler(), and from the retry path
inside the completion callback itself. The exchange and interrupt
handler paths loop on -EBUSY, so under sustained SPI bus contention
every retry iteration leaks a fresh cas_ctl (~600 bytes per
occurrence).
Fix it by freeing cas_ctl on the spi_async() error path. While here,
correct the misleading error string: the function calls spi_async(),
not spi_sync().
Fixes: ded845a781a5 ("ieee802154: Add CA8210 IEEE 802.15.4 device driver")
Cc: stable@vger.kernel.org
Signed-off-by: Shitalkumar Gandhi <shitalkumar.gandhi@cambiumnetworks.com>
---
drivers/net/ieee802154/ca8210.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/net/ieee802154/ca8210.c b/drivers/net/ieee802154/ca8210.c
index ed4178155a5d..bf837adfebb2 100644
--- a/drivers/net/ieee802154/ca8210.c
+++ b/drivers/net/ieee802154/ca8210.c
@@ -919,9 +919,10 @@ static int ca8210_spi_transfer(
if (status < 0) {
dev_crit(
&spi->dev,
- "status %d from spi_sync in write\n",
+ "status %d from spi_async in write\n",
status
);
+ kfree(cas_ctl);
}
return status;
--
2.25.1
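A minimal userspace sketch of the ownership rule the patch above relies on,
under stated assumptions. Nothing here is the ca8210 driver or the SPI core:
struct xfer_ctl, mock_submit_async(), do_transfer() and the live_ctls counter
are hypothetical, and the mock runs the completion callback immediately on
success, where a real bus would run it later. The contract modelled is the
one described in the commit message: on a failed submission the callback
never runs, so the caller must free the control block itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's control block. */
struct xfer_ctl {
	void (*complete)(struct xfer_ctl *ctl);
	unsigned char buf[64];
};

int live_ctls;	/* number of control blocks currently allocated */

/* Completion callback: the only place the unpatched code released the block. */
void xfer_complete(struct xfer_ctl *ctl)
{
	free(ctl);
	live_ctls--;
}

/*
 * Mock asynchronous submit: on success the callback runs (here at once),
 * on failure it is never invoked.
 */
int mock_submit_async(struct xfer_ctl *ctl, int bus_busy)
{
	assert(ctl && ctl->complete);
	if (bus_busy)
		return -EBUSY;
	ctl->complete(ctl);
	return 0;
}

/* Submit one transfer; free the block ourselves if submission fails. */
int do_transfer(int bus_busy)
{
	struct xfer_ctl *ctl = calloc(1, sizeof(*ctl));
	int status;

	if (!ctl)
		return -ENOMEM;
	live_ctls++;
	ctl->complete = xfer_complete;

	status = mock_submit_async(ctl, bus_busy);
	if (status < 0) {
		/* No callback will ever run for this block. */
		free(ctl);
		live_ctls--;
	}
	return status;
}
```

With the free on the error path, live_ctls returns to zero after both a
successful and a failed do_transfer(); without it, every -EBUSY would leave
one block behind.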
* Re: [PATCH net v2] ipv6: Apply max_dst_opts_cnt to ip6_tnl_parse_tlv_enc_lim
From: Daniel Borkmann @ 2026-04-21 7:33 UTC
To: Justin Iurman, Ido Schimmel
Cc: kuba, edumazet, dsahern, tom, willemdebruijn.kernel, pabeni,
netdev
In-Reply-To: <524def33-63e1-47c0-be38-dee68d859332@gmail.com>
On 4/20/26 8:55 PM, Justin Iurman wrote:
> On 4/19/26 16:31, Ido Schimmel wrote:
>> On Sun, Apr 19, 2026 at 12:37:35AM +0200, Justin Iurman wrote:
>>> Nope. But if it happens, users would be confused as max_dst_opts_cnt would
>>> not have the same meaning in two different code paths. OTOH, I agree that
>>> such a situation would look suspicious. I guess it's fine to keep your patch
>>> as is and to not over-complicate things unnecessarily.
>>
>> I agree that it's weird to reuse max_dst_opts_cnt here:
>>
>> 1. The meaning is different from the Rx path.
>>
>> 2. We only enforce max_dst_opts_cnt, but not max_dst_opts_len.
>>
>> 3. The default is derived from the initial netns, unlike in the Rx path.
>>
>> Given the above and that:
>>
>> 1. We believe that 8 options until the tunnel encapsulation limit option
>> is liberal enough.
>>
>> 2. We don't want to over-complicate things.
>>
>> Can we go with a hard-coded 8 and see if anyone complains? In the
>> unlikely case that someone complains we can at least gain some insight
>> into how this option is actually used with tunnels.
>
> In general, I'm not a big fan of hard-coded values, but I also think that in this context it would make sense to do so. This is not a strong +1, let's say it's more a "not against it".
Makes sense, I'll update it in a v3.
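A rough userspace sketch of what a hard-coded cap of 8 could look like when
scanning for the tunnel encapsulation limit option. This is an assumption-
laden illustration, not ip6_tnl_parse_tlv_enc_lim(): find_encap_limit() and
MAX_OPTS_BEFORE_LIMIT are hypothetical names, the function only walks the
options area of a single destination options header, and it assumes padding
options do not count towards the cap (the thread does not settle that). The
option type values (Pad1 = 0, PadN = 1, encapsulation limit = 4 with a data
length of 1) are the standard ones.

```c
#include <assert.h>
#include <stddef.h>

#define TLV_PAD1		0	/* one-byte padding option */
#define TLV_PADN		1	/* multi-byte padding option */
#define TLV_TNL_ENCAP_LIMIT	4	/* tunnel encapsulation limit option */
#define MAX_OPTS_BEFORE_LIMIT	8	/* hypothetical name for the cap */

/*
 * Walk a destination options area.
 * Returns the offset of the encapsulation limit option, or -1 if it is
 * absent, malformed, or preceded by more than MAX_OPTS_BEFORE_LIMIT
 * non-padding options.
 */
int find_encap_limit(const unsigned char *opts, size_t len)
{
	size_t off = 0;
	int seen = 0;

	assert(opts || len == 0);
	while (off < len) {
		size_t optlen;

		if (opts[off] == TLV_PAD1) {
			off++;
			continue;
		}
		if (off + 2 > len)
			return -1;	/* truncated type/length pair */
		optlen = (size_t)opts[off + 1] + 2;
		if (optlen > len - off)
			return -1;	/* option runs past the buffer */
		if (opts[off] == TLV_TNL_ENCAP_LIMIT)
			return opts[off + 1] == 1 ? (int)off : -1;
		if (opts[off] != TLV_PADN && ++seen > MAX_OPTS_BEFORE_LIMIT)
			return -1;	/* too many options before the limit */
		off += optlen;
	}
	return -1;
}
```

With eight unrelated options ahead of it the limit option is still found;
a ninth makes the walk give up.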
* Re: [PATCH v4 net] net: ax25: fix integer overflow in ax25_rx_fragment()
From: Paolo Abeni @ 2026-04-21 7:29 UTC
To: Mashiro Chen, netdev; +Cc: linux-hams, kuba, horms, davem, edumazet
In-Reply-To: <20260413204921.70463-1-mashiro.chen@mailbox.org>
On 4/13/26 10:49 PM, Mashiro Chen wrote:
> ax25_rx_fragment() accumulates fragment lengths into ax25_cb->fraglen,
> which is an unsigned short. When the total exceeds 65535, fraglen wraps
> around to a small value. The subsequent alloc_skb(fraglen) allocates a
> too-small buffer, and skb_put() in the copy loop triggers skb_over_panic().
>
> Add pskb_may_pull(skb, 1) at function entry to ensure the segmentation
> header byte is in the linear data area before dereferencing skb->data.
> This also rejects zero-length skbs, which the original code did not
> check for.
>
> Two issues in the overflow error path are also fixed:
> First, the current skb, after skb_pull(skb, 1), is neither enqueued
> nor freed before returning 1, leaking it. Add kfree_skb(skb) before
> the return.
> Second, ax25->fraglen is not reset after skb_queue_purge(). Add
> ax25->fraglen = 0 to restore a consistent state.
>
> Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
> Signed-off-by: Mashiro Chen <mashiro.chen@mailbox.org>
We are moving ax25 out of tree:
https://lore.kernel.org/netdev/20260421021824.1293976-1-kuba@kernel.org/
Please hold off until Thursday (after which our net PR will land in
mainline), and resend then if the code still exists in Linus's tree.
Thanks,
Paolo
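A minimal userspace sketch of the wraparound described in the quoted commit
message, assuming a 16-bit unsigned short. It is not the check from the
patch (whose diff is not quoted here): fraglen_add() and
fraglen_add_unchecked() are hypothetical helpers that only contrast an
unchecked 16-bit accumulation with one that refuses to overflow.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/*
 * Add len to *total unless the sum would exceed USHRT_MAX.
 * Returns true on success; on overflow *total is left unchanged.
 */
bool fraglen_add(unsigned short *total, unsigned int len)
{
	assert(total);
	if (len > (unsigned int)USHRT_MAX - *total)
		return false;
	*total = (unsigned short)(*total + len);
	return true;
}

/* The unchecked form: the sum is truncated to the width of unsigned short. */
unsigned short fraglen_add_unchecked(unsigned short total, unsigned int len)
{
	return (unsigned short)(total + len);
}
```

Starting from 65000, adding 1000 with the unchecked helper yields 464 on
such a platform (66000 modulo 65536), which is the "small value" that would
then size the buffer; the checked helper rejects the same addition.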
* Re: [EXTERNAL] Re: [PATCH net v2] hv_sock: Report EOF instead of -EIO for FIN
From: Stefano Garzarella @ 2026-04-21 7:28 UTC
To: Dexuan Cui
Cc: patchwork-bot+netdevbpf@kernel.org, kuba@kernel.org,
KY Srinivasan, Haiyang Zhang, wei.liu@kernel.org, Long Li,
davem@davemloft.net, edumazet@google.com, pabeni@redhat.com,
horms@kernel.org, niuxuewei.nxw@antgroup.com,
linux-hyperv@vger.kernel.org, virtualization@lists.linux.dev,
netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
stable@vger.kernel.org, Ben Hillis, levymitchell0@gmail.com
In-Reply-To: <SA1PR21MB69214CABCA0DCD597040F849BF2C2@SA1PR21MB6921.namprd21.prod.outlook.com>
On Tue, 21 Apr 2026 at 05:13, Dexuan Cui <DECUI@microsoft.com> wrote:
>
> > From: patchwork-bot+netdevbpf@kernel.org <patchwork-
> > bot+netdevbpf@kernel.org>
> > Sent: Monday, April 20, 2026 3:00 PM
> > > [...]
> >
> > Here is the summary with links:
> > - [net,v2] hv_sock: Report EOF instead of -EIO for FIN
> > https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git/commit/?id=f63152958994
>
> Hi Jakub, Stefano,
> I'm sorry -- I just posted v3
> https://lore.kernel.org/linux-hyperv/20260421025950.1099495-1-decui@microsoft.com/T/#u
> and then I realized that the v2 had been merged into the main branch :-(
>
> Should I post a new delta patch (with a Fixes tag against the v2) based on the main branch?
Ehm, I'm not sure about the process, but if it's merged in the net tree,
maybe we need a follow-up patch.
Anyway, let's wait for Jakub's or other net maintainers' suggestions.
Thanks,
Stefano
* Re: [PATCH net v2 2/2] selftests/bpf: check epoll readiness after reuseport migration
From: Kuniyuki Iwashima @ 2026-04-21 7:15 UTC
To: jt26wzz
Cc: davem, dsahern, edumazet, horms, kuba, kuniyu, linux-kernel,
linux-kselftest, ncardwell, netdev, pabeni, shuah, tamird
In-Reply-To: <20260418181333.1713389-3-jt26wzz@gmail.com>
From: Zhenzhong Wu <jt26wzz@gmail.com>
Date: Sun, 19 Apr 2026 02:13:33 +0800
> After migrate_dance() moves established children to the target
> listener, add it to an epoll set and verify that epoll_wait(..., 0)
> reports it ready before accept().
>
> This adds epoll coverage for the TCP_ESTABLISHED reuseport migration
> case in migrate_reuseport.
>
> Keep the check limited to TCP_ESTABLISHED cases. TCP_SYN_RECV and
> TCP_NEW_SYN_RECV still depend on asynchronous handshake completion,
> so a zero-timeout epoll_wait() would race there.
>
> Signed-off-by: Zhenzhong Wu <jt26wzz@gmail.com>
> ---
> .../bpf/prog_tests/migrate_reuseport.c | 32 ++++++++++++++++++-
> 1 file changed, 31 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c b/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c
> index 653b0a20f..580a53424 100644
> --- a/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c
> +++ b/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c
> @@ -18,13 +18,16 @@
> * 9. call shutdown() for the second server
> * and migrate the requests in the accept queue
> * to the last server socket.
> - * 10. call accept() for the last server socket.
> + * 10. for TCP_ESTABLISHED cases, call epoll_wait(..., 0)
> + * for the last server socket.
> + * 11. call accept() for the last server socket.
> *
> * Author: Kuniyuki Iwashima <kuniyu@amazon.co.jp>
> */
>
> #include <bpf/bpf.h>
> #include <bpf/libbpf.h>
> +#include <sys/epoll.h>
>
> #include "test_progs.h"
> #include "test_migrate_reuseport.skel.h"
> @@ -522,6 +525,33 @@ static void run_test(struct migrate_reuseport_test_case *test_case,
> goto close_clients;
> }
>
> + /* Only TCP_ESTABLISHED has already-migrated accept-queue entries
> + * here. Later states still depend on follow-up handshake work.
> + */
> + if (test_case->state == BPF_TCP_ESTABLISHED) {
> + struct epoll_event ev = {
> + .events = EPOLLIN,
> + };
> + int epfd;
> + int nfds;
> +
> + epfd = epoll_create1(EPOLL_CLOEXEC);
> + if (!ASSERT_NEQ(epfd, -1, "epoll_create1"))
> + goto close_clients;
> +
> + ev.data.fd = test_case->servers[MIGRATED_TO];
> + if (!ASSERT_OK(epoll_ctl(epfd, EPOLL_CTL_ADD,
> + test_case->servers[MIGRATED_TO], &ev),
> + "epoll_ctl"))
> + goto close_epfd;
> +
> + nfds = epoll_wait(epfd, &ev, 1, 0);
> + ASSERT_EQ(nfds, 1, "epoll_wait");
Thanks for the update, but the test passes without patch 1.
I think it would be best to test just after shutdown()
where migration happens.
Also, TCP_SYN_RECV should be covered in the same way.
---8<---
diff --git a/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c b/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c
index 580a534249a7..66fea936649e 100644
--- a/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c
+++ b/tools/testing/selftests/bpf/prog_tests/migrate_reuseport.c
@@ -353,8 +353,29 @@ static int update_maps(struct migrate_reuseport_test_case *test_case,
static int migrate_dance(struct migrate_reuseport_test_case *test_case)
{
+ struct epoll_event ev = {
+ .events = EPOLLIN,
+ };
+ int epoll, nfds;
int i, err;
+ if (test_case->state != BPF_TCP_NEW_SYN_RECV) {
+ epoll = epoll_create1(0);
+ if (!ASSERT_NEQ(epoll, -1, "epoll_create1"))
+ return -1;
+
+ ev.data.fd = test_case->servers[MIGRATED_TO];
+ if (!ASSERT_OK(epoll_ctl(epoll, EPOLL_CTL_ADD,
+ test_case->servers[MIGRATED_TO], &ev),
+ "epoll_ctl")) {
+ goto close_epoll;
+ }
+
+ nfds = epoll_wait(epoll, &ev, 1, 0);
+ if (!ASSERT_EQ(nfds, 0, "epoll_wait 1"))
+ goto close_epoll;
+ }
+
/* Migrate TCP_ESTABLISHED and TCP_SYN_RECV requests
* to the last listener based on eBPF.
*/
@@ -368,6 +389,15 @@ static int migrate_dance(struct migrate_reuseport_test_case *test_case)
if (test_case->state == BPF_TCP_NEW_SYN_RECV)
return 0;
+ nfds = epoll_wait(epoll, &ev, 1, 0);
+ if (!ASSERT_EQ(nfds, 1, "epoll_wait 2")) {
+close_epoll:
+ close(epoll);
+ return -1;
+ }
+
+ close(epoll);
+
/* Note that we use the second listener instead of the
* first one here.
*
@@ -525,33 +555,6 @@ static void run_test(struct migrate_reuseport_test_case *test_case,
goto close_clients;
}
- /* Only TCP_ESTABLISHED has already-migrated accept-queue entries
- * here. Later states still depend on follow-up handshake work.
- */
- if (test_case->state == BPF_TCP_ESTABLISHED) {
- struct epoll_event ev = {
- .events = EPOLLIN,
- };
- int epfd;
- int nfds;
-
- epfd = epoll_create1(EPOLL_CLOEXEC);
- if (!ASSERT_NEQ(epfd, -1, "epoll_create1"))
- goto close_clients;
-
- ev.data.fd = test_case->servers[MIGRATED_TO];
- if (!ASSERT_OK(epoll_ctl(epfd, EPOLL_CTL_ADD,
- test_case->servers[MIGRATED_TO], &ev),
- "epoll_ctl"))
- goto close_epfd;
-
- nfds = epoll_wait(epfd, &ev, 1, 0);
- ASSERT_EQ(nfds, 1, "epoll_wait");
-
-close_epfd:
- close(epfd);
- }
-
count_requests(test_case, skel);
close_clients:
---8<---
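A small standalone sketch of the "not ready before, ready after" check that
the suggested diff performs with a zero-timeout epoll_wait(). It is only an
illustration under loud assumptions: ready_now() is a hypothetical helper, a
pipe stands in for the listening socket, and a fresh epoll instance is
created per call, whereas the diff above keeps one instance across both
waits.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/*
 * Poll fd once for readability with a zero timeout.
 * Returns the epoll_wait() result (0 = not ready, 1 = ready), or -1 on error.
 */
int ready_now(int fd)
{
	struct epoll_event ev = { .events = EPOLLIN };
	int epfd, nfds;

	assert(fd >= 0);
	epfd = epoll_create1(0);
	if (epfd < 0)
		return -1;

	ev.data.fd = fd;
	if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) < 0) {
		close(epfd);
		return -1;
	}

	/* Timeout 0: report the current state without blocking. */
	nfds = epoll_wait(epfd, &ev, 1, 0);
	close(epfd);
	return nfds;
}
```

On an empty pipe ready_now() on the read end returns 0; after one byte is
written to the write end it returns 1. The selftest applies the same
before/after pairing around shutdown().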
* [PATCH 6.6.y] i40e: Fix preempt count leak in napi poll tracepoint
From: charles_xu @ 2026-04-21 7:18 UTC
To: tglx, anthony.l.nguyen, przemyslaw.kitszel, intel-wired-lan,
netdev, joe, aleksandr.loktionov, stable
From: Thomas Gleixner <tglx@kernel.org>
[ Upstream commit 4b3d54a85bd37ebf2d9836f0d0de775c0ff21af9 ]
Using get_cpu() in the tracepoint assignment causes an obvious preempt
count leak because nothing invokes put_cpu() to undo it:
softirq: huh, entered softirq 3 NET_RX with preempt_count 00000100, exited with 00000101?
This clearly has seen a lot of testing in the last 3+ years...
Use smp_processor_id() instead.
Fixes: 6d4d584a7ea8 ("i40e: Add i40e_napi_poll tracepoint")
Signed-off-by: Thomas Gleixner <tglx@kernel.org>
Cc: Tony Nguyen <anthony.l.nguyen@intel.com>
Cc: Przemek Kitszel <przemyslaw.kitszel@intel.com>
Cc: intel-wired-lan@lists.osuosl.org
Cc: netdev@vger.kernel.org
Reviewed-by: Joe Damato <joe@dama.to>
Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
Signed-off-by: Tony Nguyen <anthony.l.nguyen@intel.com>
Signed-off-by: Charles Xu <charles_xu@189.cn>
---
drivers/net/ethernet/intel/i40e/i40e_trace.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/net/ethernet/intel/i40e/i40e_trace.h b/drivers/net/ethernet/intel/i40e/i40e_trace.h
index 33b4e30f5e00..9b735a9e2114 100644
--- a/drivers/net/ethernet/intel/i40e/i40e_trace.h
+++ b/drivers/net/ethernet/intel/i40e/i40e_trace.h
@@ -88,7 +88,7 @@ TRACE_EVENT(i40e_napi_poll,
__entry->rx_clean_complete = rx_clean_complete;
__entry->tx_clean_complete = tx_clean_complete;
__entry->irq_num = q->irq_num;
- __entry->curr_cpu = get_cpu();
+ __entry->curr_cpu = smp_processor_id();
__assign_str(qname, q->name);
__assign_str(dev_name, napi->dev ? napi->dev->name : NO_DEV);
__assign_bitmask(irq_affinity, cpumask_bits(&q->affinity_mask),
--
2.35.3
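A toy userspace model of why the patch above matters. In the kernel,
get_cpu() disables preemption before returning the CPU number and must be
paired with put_cpu(), while smp_processor_id() only reads the number. The
model_* functions and the two global variables below are hypothetical
stand-ins that mimic just that bookkeeping; nothing here touches real
preemption.

```c
#include <assert.h>

int preempt_count_model;	/* stand-in for the preempt count */
int current_cpu_model = 2;	/* arbitrary CPU number for the model */

/* Models get_cpu(): bump the preempt count, then return the CPU number. */
int model_get_cpu(void)
{
	preempt_count_model++;
	return current_cpu_model;
}

/* Models put_cpu(): drop the preempt count taken by model_get_cpu(). */
void model_put_cpu(void)
{
	assert(preempt_count_model > 0);
	preempt_count_model--;
}

/* Models smp_processor_id(): read the CPU number with no side effect. */
int model_smp_processor_id(void)
{
	return current_cpu_model;
}
```

A lone model_get_cpu() leaves the counter one higher than it found it, which
matches the 00000100 to 00000101 jump in the softirq warning quoted above;
model_smp_processor_id() returns the same number and leaves the counter
alone.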
* Re: [patch 32/38] powerpc/spufs: Use mftb() directly
From: Mukesh Kumar Chaurasiya @ 2026-04-21 6:48 UTC
To: Thomas Gleixner
Cc: LKML, Michael Ellerman, linuxppc-dev, Arnd Bergmann, x86,
Lu Baolu, iommu, Michael Grzeschik, netdev, linux-wireless,
Herbert Xu, linux-crypto, Vlastimil Babka, linux-mm,
David Woodhouse, Bernie Thompson, linux-fbdev, Theodore Tso,
linux-ext4, Andrew Morton, Uladzislau Rezki, Marco Elver,
Dmitry Vyukov, kasan-dev, Andrey Ryabinin, Thomas Sailer,
linux-hams, Jason A. Donenfeld, Richard Henderson, linux-alpha,
Russell King, linux-arm-kernel, Catalin Marinas, Huacai Chen,
loongarch, Geert Uytterhoeven, linux-m68k, Dinh Nguyen,
Jonas Bonn, linux-openrisc, Helge Deller, linux-parisc,
Paul Walmsley, linux-riscv, Heiko Carstens, linux-s390,
David S. Miller, sparclinux
In-Reply-To: <20260410120319.723429844@kernel.org>
On Fri, Apr 10, 2026 at 02:21:04PM +0200, Thomas Gleixner wrote:
> There is no reason to indirect via get_cycles(), which is about to be
> removed.
>
> Use mftb() directly.
>
> Signed-off-by: Thomas Gleixner <tglx@kernel.org>
> Cc: Michael Ellerman <mpe@ellerman.id.au>
> Cc: linuxppc-dev@lists.ozlabs.org
> ---
> arch/powerpc/platforms/cell/spufs/switch.c | 5 +++--
> 1 file changed, 3 insertions(+), 2 deletions(-)
>
> --- a/arch/powerpc/platforms/cell/spufs/switch.c
> +++ b/arch/powerpc/platforms/cell/spufs/switch.c
> @@ -34,6 +34,7 @@
> #include <asm/spu_priv1.h>
> #include <asm/spu_csa.h>
> #include <asm/mmu_context.h>
> +#include <asm/time.h>
>
> #include "spufs.h"
>
> @@ -279,7 +280,7 @@ static inline void save_timebase(struct
> * Read PPE Timebase High and Timebase low registers
> * and save in CSA. TBD.
> */
> - csa->suspend_time = get_cycles();
> + csa->suspend_time = mftb();
> }
>
> static inline void remove_other_spu_access(struct spu_state *csa,
> @@ -1261,7 +1262,7 @@ static inline void setup_decr(struct spu
> * in LSCSA.
> */
> if (csa->priv2.mfc_control_RW & MFC_CNTL_DECREMENTER_RUNNING) {
> - cycles_t resume_time = get_cycles();
> + cycles_t resume_time = mftb();
> cycles_t delta_time = resume_time - csa->suspend_time;
>
> csa->lscsa->decr_status.slot[0] = SPU_DECR_STATUS_RUNNING;
>
Reviewed-by: Mukesh Kumar Chaurasiya (IBM) <mkchauras@gmail.com>
* [PATCH net] net: airoha: stop net_device TX queue before updating CPU index
From: Lorenzo Bianconi @ 2026-04-21 6:43 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni
Cc: Simon Horman, linux-arm-kernel, linux-mediatek, netdev,
Lorenzo Bianconi
Currently, the airoha_eth driver updates the TX CPU index register before
checking whether the number of free descriptors has fallen below the
threshold.
Move the free-descriptor check ahead of the TX CPU index update, so that
the index is still written when more packets are pending
(netdev_xmit_more()) but the net_device TX queue is about to be stopped
because of the inflight packets.
Fixes: 1d304174106c ("net: airoha: Implement BQL support")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
drivers/net/ethernet/airoha/airoha_eth.c | 9 ++++-----
1 file changed, 4 insertions(+), 5 deletions(-)
diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index 19f67c7dd8e1..5d327237e274 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -2058,17 +2058,16 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
skb_tx_timestamp(skb);
netdev_tx_sent_queue(txq, skb->len);
+ if (q->ndesc - q->queued < q->free_thr) {
+ netif_tx_stop_queue(txq);
+ q->txq_stopped = true;
+ }
if (netif_xmit_stopped(txq) || !netdev_xmit_more())
airoha_qdma_rmw(qdma, REG_TX_CPU_IDX(qid),
TX_RING_CPU_IDX_MASK,
FIELD_PREP(TX_RING_CPU_IDX_MASK, index));
- if (q->ndesc - q->queued < q->free_thr) {
- netif_tx_stop_queue(txq);
- q->txq_stopped = true;
- }
-
spin_unlock_bh(&q->lock);
return NETDEV_TX_OK;
---
base-commit: a663bac71a2f0b3ac6c373168ca57b2a6e6381aa
change-id: 20260421-airoha-xmit-stop-condition-344dc0292a19
Best regards,
--
Lorenzo Bianconi <lorenzo@kernel.org>
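A simplified userspace model of the ordering change in the patch above. The
field names follow the diff, but struct ring_model, maybe_stop() and
xmit_tail() are hypothetical, and a plain bool stands in for
netif_xmit_stopped(). The function reports whether the TX CPU index register
would have been written at the end of one transmit call.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, simplified TX ring state. */
struct ring_model {
	int ndesc;	/* ring size */
	int queued;	/* descriptors in flight */
	int free_thr;	/* stop threshold */
	bool stopped;	/* models netif_xmit_stopped(txq) */
};

/* Stop the queue when the free descriptors drop below the threshold. */
void maybe_stop(struct ring_model *q)
{
	assert(q);
	if (q->ndesc - q->queued < q->free_thr)
		q->stopped = true;
}

/*
 * Returns true if the CPU index register would be written.
 * check_first selects the patched order (threshold check before the write).
 */
bool xmit_tail(struct ring_model *q, bool xmit_more, bool check_first)
{
	bool wrote;

	if (check_first)
		maybe_stop(q);
	wrote = q->stopped || !xmit_more;
	if (!check_first)
		maybe_stop(q);
	return wrote;
}
```

With 2 free descriptors, a threshold of 4 and more packets pending, the old
order stops the queue without writing the index, while the patched order
stops the queue and writes it.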
* RE: [Intel-wired-lan] [PATCH iwl-next v5] igb: Retrieve Tx timestamp from BH workqueue
From: Rinitha, SX @ 2026-04-21 6:37 UTC
To: Kurt Kanzenbach, Nguyen, Anthony L, Kitszel, Przemyslaw
Cc: Paul Menzel, Vadim Fedorenko, Gomes, Vinicius,
netdev@vger.kernel.org, Richard Cochran,
linux-kernel@vger.kernel.org, Loktionov, Aleksandr, Andrew Lunn,
Eric Dumazet, intel-wired-lan@lists.osuosl.org, Keller, Jacob E,
Jakub Kicinski, Paolo Abeni, David S. Miller,
Sebastian Andrzej Siewior
In-Reply-To: <20260305-igb_irq_ts-v5-1-d3b96828ab5b@linutronix.de>
> -----Original Message-----
> From: Intel-wired-lan <intel-wired-lan-bounces@osuosl.org> On Behalf Of Kurt Kanzenbach
> Sent: 05 March 2026 15:56
> To: Nguyen, Anthony L <anthony.l.nguyen@intel.com>; Kitszel, Przemyslaw <przemyslaw.kitszel@intel.com>
> Cc: Paul Menzel <pmenzel@molgen.mpg.de>; Vadim Fedorenko <vadim.fedorenko@linux.dev>; Gomes, Vinicius <vinicius.gomes@intel.com>; netdev@vger.kernel.org; Richard Cochran <richardcochran@gmail.com>; Kurt Kanzenbach <kurt@linutronix.de>; linux-kernel@vger.kernel.org; Loktionov, Aleksandr <aleksandr.loktionov@intel.com>; Andrew Lunn <andrew+netdev@lunn.ch>; Eric Dumazet <edumazet@google.com>; intel-wired-lan@lists.osuosl.org; Keller, Jacob E <jacob.e.keller@intel.com>; Jakub Kicinski <kuba@kernel.org>; Paolo Abeni <pabeni@redhat.com>; David S. Miller <davem@davemloft.net>; Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Subject: [Intel-wired-lan] [PATCH iwl-next v5] igb: Retrieve Tx timestamp from BH workqueue
>
> Retrieve Tx timestamp from system BH instead of regular system workqueue.
>
> The current implementation uses schedule_work() which is executed by the system work queue and kworkers to retrieve Tx timestamps. This increases latency and can lead to timeouts in case of heavy system load. i210 is often used in industrial systems, where timestamp timeouts can be fatal.
>
> Therefore, switch to the system BH workqueues which are executed in softirq context shortly after the IRQ handler returns.
>
> Tested between Intel i210 and i350 with ptp4l gPTP profile:
>
> |ptp4l[30.405]: rms 4 max 7 freq +12825 +/- 3 delay 247 +/- 0
> |ptp4l[31.406]: rms 2 max 3 freq +12829 +/- 3 delay 248 +/- 0
> |ptp4l[32.406]: rms 3 max 3 freq +12827 +/- 3 delay 248 +/- 0
> |ptp4l[33.406]: rms 2 max 3 freq +12827 +/- 3 delay 248 +/- 0
> |ptp4l[34.407]: rms 3 max 6 freq +12825 +/- 4 delay 248 +/- 0
> |ptp4l[35.407]: rms 3 max 6 freq +12822 +/- 4 delay 246 +/- 0
> |ptp4l[36.407]: rms 7 max 10 freq +12812 +/- 5 delay 248 +/- 0
> |ptp4l[37.408]: rms 5 max 8 freq +12808 +/- 3 delay 248 +/- 0
>
> Furthermore, Miroslav Lichvar tested with ntpperf and chrony on Intel i350:
>
> Without the patch:
>
> | | responses | response time (ns)
> |rate clients | lost invalid basic xleave | min mean max stddev
> |150000 15000 0.00% 0.00% 0.00% 100.00% +4188 +36475 +193328 16179
> |157500 15750 0.02% 0.00% 0.02% 99.96% +6373 +42969 +683894 22682
> |165375 16384 0.03% 0.00% 0.00% 99.97% +7911 +43960 +692471 24454
> |173643 16384 0.06% 0.00% 0.00% 99.94% +8323 +45627 +707240 28452
> |182325 16384 0.06% 0.00% 0.00% 99.94% +8404 +47292 +722524 26936
> |191441 16384 0.00% 0.00% 0.00% 100.00% +8930 +51738 +223727 14272
> |201013 16384 0.05% 0.00% 0.00% 99.95% +9634 +53696 +776445 23783
> |211063 16384 0.00% 0.00% 0.00% 100.00% +14393 +54558 +329546 20473
> |221616 16384 2.59% 0.00% 0.05% 97.36% +23924 +321205 +518192 21838
> |232696 16384 7.00% 0.00% 0.10% 92.90% +33396 +337709 +575661 21017
> |244330 16384 10.82% 0.00% 0.15% 89.03% +34188 +340248 +556237 20880
>
> With the patch:
>
> |150000 15000 5.11% 0.00% 0.00% 94.88% +4426 +460642 +640884 83746
> |157500 15750 11.54% 0.00% 0.26% 88.20% +14434 +543656 +738355 30349
> |165375 16384 15.61% 0.00% 0.31% 84.08% +35822 +515304 +833859 25596
> |173643 16384 19.58% 0.00% 0.37% 80.05% +20762 +568962 +900100 28118
> |182325 16384 23.46% 0.00% 0.42% 76.13% +41829 +547974 +804170 27890
> |191441 16384 27.23% 0.00% 0.46% 72.31% +15182 +557920 +798212 28868
> |201013 16384 30.51% 0.00% 0.49% 69.00% +15980 +560764 +805576 29979
> |211063 16384 0.06% 0.00% 0.00% 99.94% +12668 +80487 +410555 62182
> |221616 16384 2.94% 0.00% 0.05% 97.00% +21587 +342769 +517566 23359
> |232696 16384 6.94% 0.00% 0.10% 92.96% +16581 +336068 +484574 18453
> |244330 16384 11.45% 0.00% 0.14% 88.41% +23608 +345023 +564130 19177
>
> There are some minor differences at lower rates, but no performance regressions at higher ones.
>
> Reviewed-by: Paul Menzel <pmenzel@molgen.mpg.de>
> Reviewed-by: Aleksandr Loktionov <aleksandr.loktionov@intel.com>
> Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de>
> ---
> Changes in v5:
> - Adjust changelog wording (Aleksandr Loktionov)
> - Include measurement numbers in changelog (Paul Menzel)
> - Link to v4: https://patch.msgid.link/20260303-igb_irq_ts-v4-1-cbae7f127061@linutronix.de
>
> Changes in v4:
> - Use BH workqueue (tasklet) instead of doing timestamping in IRQ path (Jakub Kicinski)
> - Link to v3: https://patch.msgid.link/20260205-igb_irq_ts-v3-1-2efc7bc4b885@linutronix.de
>
> Changes in v3:
> - Switch back to IRQ, but for i210 only
> - Keep kworker for all other NICs like i350 (Miroslav)
> - Link to v2: https://lore.kernel.org/r/20250822-igb_irq_ts-v2-1-1ac37078a7a4@linutronix.de
>
> Changes in v2:
> - Switch from IRQ to PTP aux worker due to NTP performance regression (Miroslav)
> - Link to v1: https://lore.kernel.org/r/20250815-igb_irq_ts-v1-1-8c6fc0353422@linutronix.de
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
> drivers/net/ethernet/intel/igb/igb_ptp.c | 2 +-
> 2 files changed, 3 insertions(+), 3 deletions(-)
>
Tested-by: Rinitha S <sx.rinitha@intel.com> (A Contingent worker at Intel)
* [PATCH net] net: airoha: fix BQL imbalance in TX path
From: Lorenzo Bianconi @ 2026-04-21 6:35 UTC
To: Andrew Lunn, David S. Miller, Eric Dumazet, Jakub Kicinski,
Paolo Abeni, Hariprasad Kelam
Cc: Simon Horman, linux-arm-kernel, linux-mediatek, netdev,
Lorenzo Bianconi
Fix a possible BQL imbalance in airoha_dev_xmit(), where inflight
packets are accounted only for the AIROHA_NUM_TX_RING netdev TX
queues. The queue index is computed as:
qid = skb_get_queue_mapping(skb) % ARRAY_SIZE(qdma->q_tx);
txq = netdev_get_tx_queue(dev, qid);
However, airoha_qdma_tx_napi_poll() accounts completions across all
netdev TX queues (num_tx_queues), leading to inconsistent BQL
accounting.
Also reset all netdev TX queues in the ndo_stop callback.
Fixes: 1d304174106c ("net: airoha: Implement BQL support")
Fixes: c9f947769b77 ("net: airoha: Reset BQL stopping the netdevice")
Signed-off-by: Lorenzo Bianconi <lorenzo@kernel.org>
---
drivers/net/ethernet/airoha/airoha_eth.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)
diff --git a/drivers/net/ethernet/airoha/airoha_eth.c b/drivers/net/ethernet/airoha/airoha_eth.c
index 19f67c7dd8e1..6c7390f0de5d 100644
--- a/drivers/net/ethernet/airoha/airoha_eth.c
+++ b/drivers/net/ethernet/airoha/airoha_eth.c
@@ -929,10 +929,9 @@ static int airoha_qdma_tx_napi_poll(struct napi_struct *napi, int budget)
q->queued--;
if (skb) {
- u16 queue = skb_get_queue_mapping(skb);
struct netdev_queue *txq;
- txq = netdev_get_tx_queue(skb->dev, queue);
+ txq = skb_get_tx_queue(skb->dev, skb);
netdev_tx_completed_queue(txq, 1, skb->len);
dev_kfree_skb_any(skb);
}
@@ -1711,7 +1710,7 @@ static int airoha_dev_stop(struct net_device *dev)
if (err)
return err;
- for (i = 0; i < ARRAY_SIZE(qdma->q_tx); i++)
+ for (i = 0; i < dev->num_tx_queues; i++)
netdev_tx_reset_subqueue(dev, i);
airoha_set_gdm_port_fwd_cfg(qdma->eth, REG_GDM_FWD_CFG(port->id),
@@ -2002,7 +2001,7 @@ static netdev_tx_t airoha_dev_xmit(struct sk_buff *skb,
spin_lock_bh(&q->lock);
- txq = netdev_get_tx_queue(dev, qid);
+ txq = skb_get_tx_queue(dev, skb);
nr_frags = 1 + skb_shinfo(skb)->nr_frags;
if (q->queued + nr_frags >= q->ndesc) {
---
base-commit: a663bac71a2f0b3ac6c373168ca57b2a6e6381aa
change-id: 20260421-airoha-fix-bql-7fff7cebbc9a
Best regards,
--
Lorenzo Bianconi <lorenzo@kernel.org>
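A toy userspace model of the accounting mismatch described in the patch
above. The constants NUM_TX_RING and NUM_TX_QUEUES are made-up values chosen
only so that the modulo folds something, and inflight[] is a hypothetical
per-queue byte counter standing in for BQL state; the three account_*
functions are likewise illustrative names.

```c
#include <assert.h>

#define NUM_TX_RING	4	/* hypothetical ring count */
#define NUM_TX_QUEUES	8	/* hypothetical netdev TX queue count */

long inflight[NUM_TX_QUEUES];	/* per-queue sent minus completed bytes */

/* Old xmit side: account on the queue selected by the folded index. */
void account_sent_folded(unsigned int mapping, long len)
{
	assert(mapping < NUM_TX_QUEUES);
	inflight[mapping % NUM_TX_RING] += len;
}

/* Patched xmit side: account on the queue the skb is mapped to. */
void account_sent_mapped(unsigned int mapping, long len)
{
	assert(mapping < NUM_TX_QUEUES);
	inflight[mapping] += len;
}

/* Completion side: uses the skb's queue mapping. */
void account_completed(unsigned int mapping, long len)
{
	assert(mapping < NUM_TX_QUEUES);
	inflight[mapping] -= len;
}
```

For a packet mapped to queue 6, the folded variant charges queue 2 on send
and credits queue 6 on completion, so neither counter returns to zero; with
the mapped variant both sides hit queue 6 and the counters balance.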
* Re: Discuss: Future of AX25, NETROM and ROSE in the kernel ?
From: Hugh Blemings @ 2026-04-21 6:28 UTC
To: Steve Conklin; +Cc: Stuart Longland VK4MSL, Dan Cross, linux-hams, netdev
In-Reply-To: <CALJxBtX_T8sv11ank-arLfoVqJqXegi3D_ZT9ZFbMHkuf-2RAg@mail.gmail.com>
Hi All,
Just to note in this thread (top posting as it's a bit orthogonal to the
rest of this discussion) that events have preceded us somewhat here.
A patch just recently submitted removes the AX25, NETROM and ROSE code
from the kernel, moving it to the mod-orphan subtree of netdev:
https://lore.kernel.org/netdev/20260421021824.1293976-1-kuba@kernel.org/T/#u
A shame but perhaps inevitable - but I think we have a good plan
unfolding to both take care of medium term maintenance of the kernel
code (in tree or out as it may be) as well as a move to userspace in the
longer term.
For the benefit of the netdev readership - we had a thread over in
linux-hams on this but that may not have been visible to folks in
netdev. TL;DR: we think we have a way forward but appreciate this may
not be quick enough to meet the requirements/concerns put forward.
If we can delay removal, that'd be grand, but appreciate that moment may
have passed.
Cheers/73
Hugh
VK3YYZ/AD5RV
On 20/4/2026 02:18, Steve Conklin wrote:
> On Sun, Apr 19, 2026 at 4:36 AM Hugh Blemings <hugh@blemings.org> wrote:
>> Hi All,
>>
>> On 19/4/2026 14:01, Stuart Longland VK4MSL wrote:
>>> On 19/4/26 05:28, Dan Cross wrote:
>>>> [Top-posting to make meta-commentary]
>>>>
>>>> I wonder if other folks have thoughts, here? It doesn't bode well that
>>>> the discussion hasn't progressed. 🙁
>>> I haven't had a chance to fully review what you've posted… there was a
>>> lot of historical information in there including detail on the
>>> protocols in question. I've earmarked it to go through closely
>>> however. (e.g. I had heard of "ROSE" but never seen a spec for it.)
>>>
>>> My situation was wanting a library that I could use to do AX.25
>>> networking from userspace without having applications having to
>>> elevate to `root` to achieve it. There was also a maintenance
>>> concern. Rather than try and work out the AX.25 kernel stack, I opted
>>> to instead build my own.
>>>
>>> Instructive, but difficult as the documentation is sketchy in places.
>>>
>>> My implementation was written in Python 3.5+ for ease of development.
>>> Probably not the best option, but it got the job done. `aioax25`
>>> allowed me to deliver a project for an emergency comms group and
>>> provides a reasonable foundation for simple tasks. The stack is also
>>> portable to other platforms. (I mostly only care about Linux and
>>> *BSD, but well written software should work elsewhere too. Apparently
>>> it works fine on Apple MacOS X.)
>>>
>>> A userspace AX.25 daemon which implements the stack would seem to be
>>> the best course of action, but the elephant in the room is what the
>>> API would look like.
>>>
>>> The only thing I've seen close to achieving something like that would
>>> be the AGWPE protocol, however the author of that AX.25 stack has
>>> categorically stated that he "owns" that protocol. I don't feel like
>>> going to court to argue copyright of interfaces for the sake of a hobby.
>>>
>>> The AGWPE protocol is also very limiting: a lot of fields in the AX.25
>>> frame are not accessible via this protocol, either for reading or
>>> setting. Want to use the two reserved bits to signal something in a
>>> custom protocol? Too bad.
>>>
>>> I was therefore pondering a "stream"-like protocol using KISS-style
>>> framing (to re-use existing code). The frames would serve as an RPC
>>> mechanism for implementing something like the `libax25` API, exposing
>>> the same functionality and allowing an application to interact with
>>> the AX.25 stack without having to implement the whole protocol (as
>>> they'd have to do with KISS).
>>>
>>> Client applications could connect either via Unix domain sockets or TCP.
>>>
>>> You mention the performance hit of crossing the kernel/user-space
>>> boundary… I think Direwolf experimentally can work as high as
>>> 38400bps. A turn-of-the-century desktop PC was easily able to keep up
>>> with that for PPP links (with `pppd` running in userspace). ARMv7
>>> single board computers made 15 years later can deliver similar
>>> performance. I don't think this will be much of a bottleneck in
>>> practice.
>>>
>>> I think userspace is the right way forward given the niche use case here.
>> Apologies, I kicked off this thread and life intervened a bit and have
>> only now had a chance to get back to read through the excellent
>> discourse since.
>>
>> I did have one off list conversation about this which was similarly
>> leaning towards a well managed/discussed shift to a userspace approach.
>> The individual in question has had quite a lot of both ham radio and
>> FOSS experience - I'll give them a nudge and see if they would be
>> willing to weigh in here too as I think they'd add a lot to the thread.
>>
>> I wonder if anyone on list feels they have the right skills to put
>> together the shim/compatibility library that would be needed to allow
>> the kernel code to be removed? Seems like that might be the next thing
>> to explore ?
>>
>> Hoping things settle down a bit and will be able to contribute more to
>> ongoing discussion
>>
>> vy 73
>> Hugh
>> VK1YYZ/AD5RV
>>
>>
>>
>> --
>> I am slowly moving to hugh@blemings.id.au as my main email address.
>> If you're using hugh@blemings.org please update your address book accordingly.
>> Thank you :)
>>
>>
> Hi all,
>
> I'm another former maintainer who has been less active in the FOSS
> ham community for a while. Thanks for all the history, and it's been
> great to see familiar names and callsigns again.
> /wave
>
> My personal take is that moving to userspace is the right long-term
> goal. As for deciding the exact form that takes, this thread is a good
> start.
>
> I'd like to make an offer for a long-term home for these components.
>
> I'm a director at ORI (Open Research Institute), and ORI could be a
> home for these. We're a project-based, completely open and volunteer
> nonprofit focused on ham radio and communications. We have volunteers
> with FOSS and ham radio experience, and we're already set up with a
> GitHub org, slack chat, and the infrastructure to manage projects.
>
> https://www.openresearch.institute/
> https://www.openresearch.institute/your-project-is-welcome/
>
> Anyone who currently maintains or has an interest in this is welcome
> to participate there.
> I personally volunteer to help move development infrastructure to ORI,
> if that's the wish of the community.
>
> Steve Conklin AI4QR
>
--
I am slowly moving to hugh@blemings.id.au as my main email address.
If you're using hugh@blemings.org please update your address book accordingly.
Thank you :)
* Re: [PATCH v5 net] ax25: fix OOB read after address header strip in ax25_rcv()
From: Ashutosh Desai @ 2026-04-21 6:08 UTC (permalink / raw)
To: netdev
Cc: linux-hams, jreuter, davem, edumazet, kuba, pabeni, horms,
linux-kernel
In-Reply-To: <20260421054858.732939-1-ashutoshdesai993@gmail.com>
On Tue, Apr 21, 2026 at 05:48:58 +0000, Ashutosh Desai wrote:
> [PATCH v5 net] ax25: fix OOB read after address header strip in ax25_rcv()
Please ignore this patch. A net-deletions patch removing the entire ax25
and amateur radio subsystem was posted earlier today. This fix is no
longer needed.
* Re: [PATCH net-deletions] net: remove ax25 and amateur radio (hamradio) subsystem
From: Greg KH @ 2026-04-21 6:04 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, corbet,
skhan, federico.vaga, carlos.bilbao, avadhut.naik, alexs,
si.yanteng, dzm91, 2023002089, tsbogend, dsahern, jani.nikula,
mchehab+huawei, jirislaby, tytso, herbert, ebiggers,
johannes.berg, geert, pablo, tglx, mashiro.chen, mingo, dqfext,
jreuter, sdf, pkshih, enelsonmoore, mkl, toke, kees, crossd,
jlayton, wangliang74, aha310510, takamitz, kuniyu, linux-doc,
linux-mips
In-Reply-To: <20260421021824.1293976-1-kuba@kernel.org>
On Mon, Apr 20, 2026 at 07:18:23PM -0700, Jakub Kicinski wrote:
> Remove the amateur radio (AX.25, NET/ROM, ROSE) protocol implementation
> and all associated hamradio device drivers from the kernel tree.
> This set of protocols has long been a huge bug/syzbot magnet,
> and since nobody stepped up to help us deal with the influx
> of the AI-generated bug reports we need to move it out of tree
> to protect our sanity.
>
> The code is moved to an out-of-tree repo:
> https://github.com/linux-netdev/mod-orphan
> if it's cleaned up and reworked there we can accept it back.
>
> Minimal stub headers are kept for include/net/ax25.h (AX25_P_IP,
> AX25_ADDR_LEN, ax25_address) and include/net/rose.h (ROSE_ADDR_LEN)
> so that the conditional integration code in arp.c and tun.c continues
> to compile and work when the out-of-tree modules are loaded.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* Re: [PATCH net-deletions] net: remove ISDN subsystem and Bluetooth CMTP
From: Greg KH @ 2026-04-21 6:03 UTC (permalink / raw)
To: Jakub Kicinski
Cc: davem, netdev, edumazet, pabeni, andrew+netdev, horms, corbet,
skhan, marcel, luiz.dentz, mchehab+huawei, jani.nikula, demarchi,
rdunlap, justonli, ivecera, jonathan.cameron, kees,
marco.crivellari, ferr.lambarginio, nihaal, mingo, tglx, linmq006,
linux-doc, linux-bluetooth
In-Reply-To: <20260421022108.1299678-1-kuba@kernel.org>
On Mon, Apr 20, 2026 at 07:21:07PM -0700, Jakub Kicinski wrote:
> Remove the ISDN (mISDN, CAPI) subsystem and Bluetooth CMTP protocol
> from the kernel tree.
>
> ISDN is a pretty old technology and it's unclear whether anyone still
> uses it. I went over the last few years of git history and all the
> commits are either tree-wide conversions or syzbot/static analyzer
> fixes.
>
> When we discussed removal in the past IIRC there were some concerns
> about ISDN still being used in parts of Germany. Unfortunately, the
> code base is quite old, none of the current maintainers are familiar
> with it and AI tools will have a field day finding bugs here.
>
> Delete this code and preserve it in an out-of-tree repository
> for any remaining users:
> https://github.com/linux-netdev/mod-orphan
>
> UAPI constants AF_ISDN/PF_ISDN and the SELinux isdn_socket class
> are preserved for ABI stability, but the rest of uAPI is removed.
>
> Signed-off-by: Jakub Kicinski <kuba@kernel.org>
Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
* [PATCH v3] llc: Return -EINPROGRESS from llc_ui_connect()
From: Ernestas Kulik @ 2026-04-21 6:02 UTC (permalink / raw)
To: netdev; +Cc: kuba, linux-kernel, Ernestas Kulik
In-Reply-To: <20260415063457.1008868-1-ernestas.k@iconn-networks.com>
Given a zero sk_sndtimeo, llc_ui_connect() skips waiting for state
change and returns 0, confusing userspace applications that will assume
the socket is connected, making e.g. getpeername() calls error out.
More specifically, the issue was discovered in libcoap, where
newly-added AF_LLC socket support was behaving differently from AF_INET
connections due to EINPROGRESS handling being skipped.
Set rc to -EINPROGRESS if connect() would not block, akin to AF_INET
sockets.
Signed-off-by: Ernestas Kulik <ernestas.k@iconn-networks.com>
---
v2:
- Add note about discovering the issue
- Make rc assignment conditional
v3:
- Fix commit message after v2 changes
---
net/llc/af_llc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 59d593bb5d18..1b210db3119e 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -518,12 +518,14 @@ static int llc_ui_connect(struct socket *sock, struct sockaddr_unsized *uaddr,
}
if (sk->sk_state == TCP_SYN_SENT) {
const long timeo = sock_sndtimeo(sk, flags & O_NONBLOCK);
- if (!timeo || !llc_ui_wait_for_conn(sk, timeo))
+ if (!timeo || !llc_ui_wait_for_conn(sk, timeo)) {
+ rc = -EINPROGRESS;
goto out;
+ }
rc = sock_intr_errno(timeo);
if (signal_pending(current))
goto out;
}
--
2.53.0
* [PATCH v2] llc: Return -EINPROGRESS from llc_ui_connect()
From: Ernestas Kulik @ 2026-04-21 5:54 UTC (permalink / raw)
To: netdev; +Cc: kuba, linux-kernel, Ernestas Kulik
In-Reply-To: <20260415063457.1008868-1-ernestas.k@iconn-networks.com>
Given a zero sk_sndtimeo, llc_ui_connect() skips waiting for state
change and returns 0, confusing userspace applications that will assume
the socket is connected, making e.g. getpeername() calls error out.
More specifically, the issue was discovered in libcoap, where
newly-added AF_LLC socket support was behaving differently from AF_INET
connections due to EINPROGRESS handling being skipped.
Set rc to -EINPROGRESS before considering blocking, akin to AF_INET
sockets.
Signed-off-by: Ernestas Kulik <ernestas.k@iconn-networks.com>
---
net/llc/af_llc.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
index 59d593bb5d18..1b210db3119e 100644
--- a/net/llc/af_llc.c
+++ b/net/llc/af_llc.c
@@ -518,12 +518,14 @@ static int llc_ui_connect(struct socket *sock, struct sockaddr_unsized *uaddr,
}
if (sk->sk_state == TCP_SYN_SENT) {
const long timeo = sock_sndtimeo(sk, flags & O_NONBLOCK);
- if (!timeo || !llc_ui_wait_for_conn(sk, timeo))
+ if (!timeo || !llc_ui_wait_for_conn(sk, timeo)) {
+ rc = -EINPROGRESS;
goto out;
+ }
rc = sock_intr_errno(timeo);
if (signal_pending(current))
goto out;
}
--
2.53.0
* Re: [PATCH net] ipv6: rpl: expand skb head when recompressed SRH grows, not only on last segment
From: Greg KH @ 2026-04-21 5:50 UTC (permalink / raw)
To: Kuniyuki Iwashima
Cc: davem, dsahern, edumazet, horms, kuba, linux-kernel, netdev,
pabeni, stable
In-Reply-To: <20260421045510.1546375-1-kuniyu@google.com>
On Tue, Apr 21, 2026 at 04:52:52AM +0000, Kuniyuki Iwashima wrote:
> From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> Date: Mon, 20 Apr 2026 21:32:25 +0200
> > ipv6_rpl_srh_rcv() processes a Routing Protocol for LLNs Source Routing
> > Header by decompressing it, swapping the next segment address into
> > ipv6_hdr->daddr, recompressing, and pushing the new header back. The
> > recompressed header can be larger than the original when the
> > address-elision opportunities are worse after the swap.
> >
> > The function pulls (hdr->hdrlen + 1) << 3 bytes (the old header) and
> > pushes (chdr->hdrlen + 1) << 3 + sizeof(ipv6hdr) bytes (the new header
> > plus the IPv6 header). pskb_expand_head() is called to guarantee
> > headroom only when segments_left == 0.
> >
> > A crafted SRH that loops back to the local host (each segment is a local
> > address, so ip6_route_input() delivers it back to ipv6_rpl_srh_rcv())
> > with chdr growing on each pass exhausts headroom over several
> > iterations.
>
> How could this occur? Did AI generate a repro or just
> flag the possibility?
It generated a reproducer that caused a crash, which is why I had to
create this patch. I'll dig it out of the huge pile of mess that was
sent to me and get it into a form that I can post here.
thanks,
greg k-h
* [PATCH v5 net] ax25: fix OOB read after address header strip in ax25_rcv()
From: Ashutosh Desai @ 2026-04-21 5:48 UTC (permalink / raw)
To: netdev, linux-hams
Cc: jreuter, davem, edumazet, kuba, pabeni, horms, linux-kernel,
stable, Ashutosh Desai
A crafted AX.25 frame with a valid address header but no control byte
causes skb->len to reach zero after skb_pull() strips the header.
The subsequent reads of skb->data[0] (control) and skb->data[1] (PID)
are then out of bounds.
Linearize the skb after confirming the device is an AX.25 interface.
Guard with skb->len < 1 after the pull - one byte suffices for LAPB
control frames which have no PID byte. Add a separate skb->len < 2
check inside the UI branch before accessing the PID byte.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
---
v5:
- Move skb_linearize() to after ax25_dev_ax25dev() check; avoids
unnecessary allocation for frames on non-AX.25 interfaces
- Lower general guard from skb->len < 2 to skb->len < 1; the stricter
limit incorrectly dropped valid 1-byte LAPB control frames (SABM,
DISC, UA, DM, RR) which carry no PID byte
- Add explicit skb->len < 2 check inside UI branch before the PID
byte (skb->data[1]) access
v4:
- Linearize skb at entry to ax25_rcv(); replace pskb_may_pull() with
skb->len < 2 check (per David Laight review)
v3:
- Remove incorrect Suggested-by; add Fixes:, Cc: stable@
v2:
- Replace skb->len check with pskb_may_pull(skb, 2)
Link to v4: https://lore.kernel.org/netdev/20260417065407.206499-1-ashutoshdesai993@gmail.com/
Link to v3: https://lore.kernel.org/netdev/20260415063654.3831353-1-ashutoshdesai993@gmail.com/
Link to v2: https://lore.kernel.org/netdev/20260409152400.2219716-1-ashutoshdesai993@gmail.com/
Link to v1: https://lore.kernel.org/netdev/20260409012235.2049389-1-ashutoshdesai993@gmail.com/
net/ax25/ax25_in.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/net/ax25/ax25_in.c b/net/ax25/ax25_in.c
index d75b3e9ed93d..c81d6830af48 100644
--- a/net/ax25/ax25_in.c
+++ b/net/ax25/ax25_in.c
@@ -199,6 +199,9 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
if ((ax25_dev = ax25_dev_ax25dev(dev)) == NULL)
goto free;
+ if (skb_linearize(skb))
+ goto free;
+
/*
* Parse the address header.
*/
@@ -217,6 +220,9 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
*/
skb_pull(skb, ax25_addr_size(&dp));
+ if (skb->len < 1)
+ goto free;
+
/* For our port addresses ? */
if (ax25cmp(&dest, dev_addr) == 0 && dp.lastrepeat + 1 == dp.ndigi)
mine = 1;
@@ -227,6 +233,9 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
/* UI frame - bypass LAPB processing */
if ((*skb->data & ~0x10) == AX25_UI && dp.lastrepeat + 1 == dp.ndigi) {
+ if (skb->len < 2)
+ goto free;
+
skb_set_transport_header(skb, 2); /* skip control and pid */
ax25_send_to_raw(&dest, skb, skb->data[1]);
--
2.34.1
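The two length guards described in the commit message can be sketched in plain userspace C. This is only an illustration, not the kernel code: the buffer stands in for the skb payload after the address header has been pulled, `classify_payload()` and the `FRAME_*` names are hypothetical, `SKETCH_AX25_UI` is assumed to mirror the kernel's `AX25_UI` value, and the digipeater condition from the real `if` is left out.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to mirror the kernel's AX25_UI control value (hypothetical name). */
#define SKETCH_AX25_UI 0x03

/* Hypothetical verdicts for a payload that follows the address header. */
enum frame_verdict {
	FRAME_DROP,	/* too short to parse safely */
	FRAME_LAPB,	/* control byte only is enough */
	FRAME_UI,	/* control byte plus PID byte present */
};

/*
 * data/len describe what is left after the address header was stripped.
 * One byte is enough for LAPB control frames; UI frames also need the
 * PID byte at data[1], so they get a second, stricter check.
 */
static enum frame_verdict classify_payload(const uint8_t *data, size_t len)
{
	if (len < 1)
		return FRAME_DROP;		/* no control byte to read */

	if ((data[0] & ~0x10) == SKETCH_AX25_UI) {	/* ignore the P/F bit */
		if (len < 2)
			return FRAME_DROP;	/* PID byte would be out of bounds */
		return FRAME_UI;
	}

	return FRAME_LAPB;
}
```

A one-byte frame such as SABM passes the first guard and is handed to LAPB processing, while a one-byte UI frame is dropped before `data[1]` is touched.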
* Re: [PATCH] llc: Return -EINPROGRESS from llc_ui_connect()
From: Ernestas Kulik @ 2026-04-21 5:48 UTC (permalink / raw)
To: Jakub Kicinski; +Cc: netdev, linux-kernel
In-Reply-To: <20260420114138.1aa52551@kernel.org>
On 2026-04-20 21:41, Jakub Kicinski wrote:
> On Wed, 15 Apr 2026 09:34:57 +0300 Ernestas Kulik wrote:
>> Given a zero sk_sndtimeo, llc_ui_connect() skips waiting for state
>> change and returns 0, confusing userspace applications that will assume
>> the socket is connected, making e.g. getpeername() calls error out.
>>
>> Set rc to -EINPROGRESS before considering blocking, akin to AF_INET
>> sockets.
>
> Please add a note on how you discovered this issue.
> Including whether you're actively using this code or just scanning it
> for bugs.
Will do. It was discovered while adding support for AF_LLC sockets in
libcoap, the usual code path for client connections was failing due to
this specific issue, so I figured the behavior should be analogous to
AF_INET.
>> diff --git a/net/llc/af_llc.c b/net/llc/af_llc.c
>> index 59d593bb5d18..9317d092ba84 100644
>> --- a/net/llc/af_llc.c
>> +++ b/net/llc/af_llc.c
>> @@ -515,10 +515,12 @@ static int llc_ui_connect(struct socket *sock, struct sockaddr_unsized *uaddr,
>> sock->state = SS_UNCONNECTED;
>> sk->sk_state = TCP_CLOSE;
>> goto out;
>> }
>>
>> + rc = -EINPROGRESS;
>
> Isn't this a bit of an odd placement? ..
>
>> if (sk->sk_state == TCP_SYN_SENT) {
>> const long timeo = sock_sndtimeo(sk, flags & O_NONBLOCK);
>>
>> if (!timeo || !llc_ui_wait_for_conn(sk, timeo))
>> goto out;
>
> .. I suspect you mean to target this branch, right?
I can’t remember now why I put it there, but you’re right, that would be
the better place.
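For readers following the thread, the userspace side of this behavior can be sketched as below. It is a hypothetical helper, not code from libcoap or the kernel: `classify_connect()` and the `CONNECT_*` names are made up for illustration. It shows why a return of 0 from a connect() that has not actually completed is misleading: the caller skips the "wait for writability" step and goes straight to calls such as getpeername().

```c
#include <errno.h>

/* How a caller typically interprets connect() on a non-blocking socket. */
enum connect_outcome {
	CONNECT_DONE,		/* connected immediately, safe to use */
	CONNECT_PENDING,	/* handshake in flight, poll for writability */
	CONNECT_FAILED,		/* hard error, give up or retry */
};

/*
 * rc is the return value of connect(2), err is the errno value read
 * right after the call. Hypothetical helper for illustration only.
 */
static enum connect_outcome classify_connect(int rc, int err)
{
	if (rc == 0)
		return CONNECT_DONE;
	if (err == EINPROGRESS)
		return CONNECT_PENDING;
	return CONNECT_FAILED;
}
```

With the pre-patch behavior described above, an AF_LLC socket with a zero send timeout would land in the `CONNECT_DONE` branch even though the connection was still being set up.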
* [PATCH v5 net] ax25: fix OOB read after address header strip in ax25_rcv()
From: Ashutosh Desai @ 2026-04-21 5:46 UTC (permalink / raw)
To: netdev, linux-hams
Cc: jreuter, davem, edumazet, kuba, pabeni, horms, linux-kernel,
stable, Ashutosh Desai
In-Reply-To: <20260417065407.206499-1-ashutoshdesai993@gmail.com>
A crafted AX.25 frame with a valid address header but no control byte
causes skb->len to reach zero after skb_pull() strips the header.
The subsequent reads of skb->data[0] (control) and skb->data[1] (PID)
are then out of bounds.
Linearize the skb after confirming the device is an AX.25 interface.
Guard with skb->len < 1 after the pull - one byte suffices for LAPB
control frames which have no PID byte. Add a separate skb->len < 2
check inside the UI branch before accessing the PID byte.
Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Cc: stable@vger.kernel.org
Signed-off-by: Ashutosh Desai <ashutoshdesai993@gmail.com>
---
v5:
- Move skb_linearize() to after ax25_dev_ax25dev() check; avoids
unnecessary allocation for frames on non-AX.25 interfaces
- Lower general guard from skb->len < 2 to skb->len < 1; the stricter
limit incorrectly dropped valid 1-byte LAPB control frames (SABM,
DISC, UA, DM, RR) which carry no PID byte
- Add explicit skb->len < 2 check inside UI branch before the PID
byte (skb->data[1]) access
v4:
- Linearize skb at entry to ax25_rcv(); replace pskb_may_pull() with
skb->len < 2 check (per David Laight review)
v3:
- Remove incorrect Suggested-by; add Fixes:, Cc: stable@
v2:
- Replace skb->len check with pskb_may_pull(skb, 2)
Link to v4: https://lore.kernel.org/netdev/20260417065407.206499-1-ashutoshdesai993@gmail.com/
Link to v3: https://lore.kernel.org/netdev/20260415063654.3831353-1-ashutoshdesai993@gmail.com/
Link to v2: https://lore.kernel.org/netdev/20260409152400.2219716-1-ashutoshdesai993@gmail.com/
Link to v1: https://lore.kernel.org/netdev/20260409012235.2049389-1-ashutoshdesai993@gmail.com/
net/ax25/ax25_in.c | 9 +++++++++
1 file changed, 9 insertions(+)
diff --git a/net/ax25/ax25_in.c b/net/ax25/ax25_in.c
index d75b3e9ed93d..c81d6830af48 100644
--- a/net/ax25/ax25_in.c
+++ b/net/ax25/ax25_in.c
@@ -199,6 +199,9 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
if ((ax25_dev = ax25_dev_ax25dev(dev)) == NULL)
goto free;
+ if (skb_linearize(skb))
+ goto free;
+
/*
* Parse the address header.
*/
@@ -217,6 +220,9 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
*/
skb_pull(skb, ax25_addr_size(&dp));
+ if (skb->len < 1)
+ goto free;
+
/* For our port addresses ? */
if (ax25cmp(&dest, dev_addr) == 0 && dp.lastrepeat + 1 == dp.ndigi)
mine = 1;
@@ -227,6 +233,9 @@ static int ax25_rcv(struct sk_buff *skb, struct net_device *dev,
/* UI frame - bypass LAPB processing */
if ((*skb->data & ~0x10) == AX25_UI && dp.lastrepeat + 1 == dp.ndigi) {
+ if (skb->len < 2)
+ goto free;
+
skb_set_transport_header(skb, 2); /* skip control and pid */
ax25_send_to_raw(&dest, skb, skb->data[1]);
--
2.34.1
* [PATCH iwl-net v2] idpf: do not perform flow ops when netdev is detached
From: Li Li @ 2026-04-21 5:16 UTC (permalink / raw)
To: Tony Nguyen, Przemek Kitszel, David S. Miller, Jakub Kicinski,
Eric Dumazet, intel-wired-lan
Cc: netdev, linux-kernel, David Decotigny, Anjali Singhai,
Sridhar Samudrala, Brian Vazquez, Li Li, emil.s.tantilov, stable
Even though commit 2e281e1155fc ("idpf: detach and close netdevs while
handling a reset") prevents ethtool -N/-n operations from operating on
detached netdevs, we found that out-of-tree workflows like OpenOnload
can bypass ethtool core locks and call idpf_set_rxnfc directly during
an idpf HW reset. When this happens, we could get kernel crashes like
the following:
[ 4045.787439] BUG: kernel NULL pointer dereference, address: 0000000000000070
[ 4045.794420] #PF: supervisor read access in kernel mode
[ 4045.799580] #PF: error_code(0x0000) - not-present page
[ 4045.804739] PGD 0
[ 4045.806772] Oops: Oops: 0000 [#1] SMP NOPTI
...
[ 4045.836425] Workqueue: onload-wqueue oof_do_deferred_work_fn [onload]
[ 4045.842926] RIP: 0010:idpf_del_flow_steer+0x24/0x170 [idpf]
...
[ 4045.946323] Call Trace:
[ 4045.948796] <TASK>
[ 4045.950915] ? show_trace_log_lvl+0x1b0/0x2f0
[ 4045.955293] ? show_trace_log_lvl+0x1b0/0x2f0
[ 4045.959672] ? idpf_set_rxnfc+0x6f/0x80 [idpf]
[ 4046.063613] </TASK>
To prevent this, we need to add checks in idpf_set_rxnfc and
idpf_get_rxnfc to error out if the netdev is already detached.
Tested: synthetically forced idpf into a HW reset by introducing module
parameters to simulate a Tx timeout and force virtual channel
initialization failure. This was done by skipping completion cleaning for
specific queues and returning -EIO during core initialization.
The failure was then triggered by writing 1 to the corresponding sysfs
parameters and calling idpf_get_rxnfc() during the reset process.
Without the patch: encountered NULL pointer and kernel crash.
With the patch: no crashes.
Fixes: 2e281e1155fc ("idpf: detach and close netdevs while handling a reset")
Cc: stable@vger.kernel.org
Signed-off-by: Li Li <boolli@google.com>
---
v2:
- Removed the raw code block from the commit message and replaced it with
a textual description of the test modifications.
drivers/net/ethernet/intel/idpf/idpf_ethtool.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
index bb99d9e7c65d..8368a7e6a754 100644
--- a/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
+++ b/drivers/net/ethernet/intel/idpf/idpf_ethtool.c
@@ -43,6 +43,9 @@ static int idpf_get_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd,
unsigned int cnt = 0;
int err = 0;
+ if (!netdev || !netif_device_present(netdev))
+ return -ENODEV;
+
idpf_vport_ctrl_lock(netdev);
vport = idpf_netdev_to_vport(netdev);
vport_config = np->adapter->vport_config[np->vport_idx];
@@ -349,6 +352,9 @@ static int idpf_set_rxnfc(struct net_device *netdev, struct ethtool_rxnfc *cmd)
{
int ret = -EOPNOTSUPP;
+ if (!netdev || !netif_device_present(netdev))
+ return -ENODEV;
+
idpf_vport_ctrl_lock(netdev);
switch (cmd->cmd) {
case ETHTOOL_SRXCLSRLINS:
--
2.54.0.rc1.555.g9c883467ad-goog
* Re: [net-next v2 2/5] dt-bindings: net: starfive,jh7110-dwmac: Add JHB100 support
From: Minda Chen @ 2026-04-21 3:30 UTC (permalink / raw)
To: Krzysztof Kozlowski
Cc: Alexandre Torgue, Andrew Lunn, David S . Miller, Eric Dumazet,
Jakub Kicinski, Paolo Abeni, Maxime Coquelin,
Emil Renner Berthing, Rob Herring, Krzysztof Kozlowski,
Conor Dooley, netdev@vger.kernel.org,
linux-kernel@vger.kernel.org,
linux-stm32@st-md-mailman.stormreply.com,
devicetree@vger.kernel.org
In-Reply-To: <20260420-messy-elite-panther-a7ffbc@quoll>
>
> On Fri, Apr 17, 2026 at 10:45:20AM +0800, Minda Chen wrote:
> > Add StarFive JHB100 dwmac support and compatible.
> > The JHB100 dwmac shares the same driver code as the JH7110 dwmac,
>
> Please describe the hardware or programming interface, not driver code.
>
> > which contains 2 SGMII interfaces, 1 RGMII/RMII interface and
> > 1 RMII interface.
> > JHB100 dwmac has only one reset signal and one main interrupt line.
>
>
> Drop all below, not relevant.
>
> >
> > Please refer to below:
> >
> > JHB100: reset-names = "stmmaceth";
> >
> > Example usage of JHB100 in the device tree:
> >
> > gmac0: ethernet@11b80000 {
> > compatible = "starfive,jhb100-dwmac",
> > "snps,dwmac-5.20";
> > interrupts = <225>;
> > interrupt-names = "macirq";
> > ...
> > };
> >
> > Signed-off-by: Minda Chen <minda.chen@starfivetech.com>
> > ---
> > .../devicetree/bindings/net/snps,dwmac.yaml | 1 +
> > .../bindings/net/starfive,jh7110-dwmac.yaml | 23 +++++++++++++++++++
> > 2 files changed, 24 insertions(+)
> >
> > diff --git a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> > b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> > index 38bc34dc4f09..85cd3252e8b1 100644
> > --- a/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> > +++ b/Documentation/devicetree/bindings/net/snps,dwmac.yaml
> > @@ -115,6 +115,7 @@ properties:
> > - sophgo,sg2044-dwmac
> > - starfive,jh7100-dwmac
> > - starfive,jh7110-dwmac
> > + - starfive,jhb100-dwmac
> > - tesla,fsd-ethqos
> > - thead,th1520-gmac
> >
> > diff --git
> > a/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
> > b/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
> > index 0d1962980f57..edc246a71ce3 100644
> > --- a/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
> > +++ b/Documentation/devicetree/bindings/net/starfive,jh7110-dwmac.yaml
> > @@ -18,6 +18,7 @@ select:
> > enum:
> > - starfive,jh7100-dwmac
> > - starfive,jh7110-dwmac
> > + - starfive,jhb100-dwmac
> > required:
> > - compatible
> >
> > @@ -30,6 +31,9 @@ properties:
> > - items:
> > - const: starfive,jh7110-dwmac
> > - const: snps,dwmac-5.20
> > + - items:
> > + - const: starfive,jhb100-dwmac
>
> So that's an enum in previous "items" list.... but your commit msg said your
> devices are compatible, so confusing.
>
> Best regards,
> Krzysztof
Got it. I will correct the commit message. Thanks.
* Re: [PATCH net] tcp: make probe0 timer handle expired user timeout
From: Eric Dumazet @ 2026-04-21 4:57 UTC (permalink / raw)
To: Altan Hacigumus
Cc: Neal Cardwell, Kuniyuki Iwashima, David S . Miller, David Ahern,
Jakub Kicinski, Paolo Abeni, Simon Horman, netdev, linux-kernel,
Enke Chen
In-Reply-To: <20260414013634.43997-1-ahacigu.linux@gmail.com>
On Mon, Apr 13, 2026 at 6:36 PM Altan Hacigumus <ahacigu.linux@gmail.com> wrote:
>
> tcp_clamp_probe0_to_user_timeout() computes remaining time in jiffies
> using subtraction with an unsigned lvalue. If elapsed probing time
> already exceeds the configured TCP_USER_TIMEOUT, the subtraction
> underflows and yields a large value.
>
> Handle this expiration case similarly to tcp_clamp_rto_to_user_timeout().
>
> Fixes: 344db93ae3ee ("tcp: make TCP_USER_TIMEOUT accurate for zero window probes")
> Signed-off-by: Altan Hacigumus <ahacigu.linux@gmail.com>
> ---
> net/ipv4/tcp_timer.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv4/tcp_timer.c b/net/ipv4/tcp_timer.c
> index 5a14a53a3c9e..4a43356a4e06 100644
> --- a/net/ipv4/tcp_timer.c
> +++ b/net/ipv4/tcp_timer.c
> @@ -50,7 +50,8 @@ static u32 tcp_clamp_rto_to_user_timeout(const struct sock *sk)
> u32 tcp_clamp_probe0_to_user_timeout(const struct sock *sk, u32 when)
> {
> const struct inet_connection_sock *icsk = inet_csk(sk);
> - u32 remaining, user_timeout;
> + u32 user_timeout;
> + s32 remaining;
> s32 elapsed;
>
> user_timeout = READ_ONCE(icsk->icsk_user_timeout);
> @@ -61,6 +62,8 @@ u32 tcp_clamp_probe0_to_user_timeout(const struct sock *sk, u32 when)
> if (unlikely(elapsed < 0))
> elapsed = 0;
> remaining = msecs_to_jiffies(user_timeout) - elapsed;
> + if (remaining <= 0)
> + return 1;
I do not think this chunk is needed ?
If @remaining is signed, then perhaps change the following line to:
remaining = max_t(int, remaining, TCP_TIMEOUT_MIN);
Also, it would be great to have a new packetdrill test.
> remaining = max_t(u32, remaining, TCP_TIMEOUT_MIN);
>
> return min_t(u32, remaining, when);
> --
> 2.43.0
>
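The underflow under discussion can be reproduced with a small userspace model. This is a sketch under stated assumptions, not the kernel function: `SKETCH_TIMEOUT_MIN` stands in for `TCP_TIMEOUT_MIN` with an assumed value, all times are plain integers in jiffies, and both function names are hypothetical. The signed variant follows the spirit of the `max_t(int, ...)` suggestion above; it relies on the usual two's-complement conversion from `uint32_t` to `int32_t`.

```c
#include <stdint.h>

/* Stand-in for the kernel's TCP_TIMEOUT_MIN; the value is an assumption. */
#define SKETCH_TIMEOUT_MIN 2

/* Pre-patch shape: "remaining" is unsigned, so an expired timeout wraps. */
static uint32_t probe0_clamp_unsigned(uint32_t when, uint32_t timeout,
				      int32_t elapsed)
{
	uint32_t remaining = timeout - (uint32_t)elapsed;

	if (remaining < SKETCH_TIMEOUT_MIN)
		remaining = SKETCH_TIMEOUT_MIN;
	return remaining < when ? remaining : when;
}

/* Signed shape: a negative "remaining" is clamped up to the minimum. */
static uint32_t probe0_clamp_signed(uint32_t when, uint32_t timeout,
				    int32_t elapsed)
{
	int32_t remaining = (int32_t)(timeout - (uint32_t)elapsed);

	if (remaining < SKETCH_TIMEOUT_MIN)
		remaining = SKETCH_TIMEOUT_MIN;
	return (uint32_t)remaining < when ? (uint32_t)remaining : when;
}
```

With a user timeout of 100 jiffies, 150 jiffies already elapsed and a 500-jiffy probe interval, the unsigned version hands back the full 500 because the wrapped value is huge, while the signed version returns the minimum so the timer fires almost immediately.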
* Re: [PATCH net] ipv6: rpl: expand skb head when recompressed SRH grows, not only on last segment
From: Kuniyuki Iwashima @ 2026-04-21 4:52 UTC (permalink / raw)
To: gregkh
Cc: davem, dsahern, edumazet, horms, kuba, linux-kernel, netdev,
pabeni, stable
In-Reply-To: <2026042024-cabbie-gills-9371@gregkh>
From: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Date: Mon, 20 Apr 2026 21:32:25 +0200
> ipv6_rpl_srh_rcv() processes a Routing Protocol for LLNs Source Routing
> Header by decompressing it, swapping the next segment address into
> ipv6_hdr->daddr, recompressing, and pushing the new header back. The
> recompressed header can be larger than the original when the
> address-elision opportunities are worse after the swap.
>
> The function pulls (hdr->hdrlen + 1) << 3 bytes (the old header) and
> pushes (chdr->hdrlen + 1) << 3 + sizeof(ipv6hdr) bytes (the new header
> plus the IPv6 header). pskb_expand_head() is called to guarantee
> headroom only when segments_left == 0.
>
> A crafted SRH that loops back to the local host (each segment is a local
> address, so ip6_route_input() delivers it back to ipv6_rpl_srh_rcv())
> with chdr growing on each pass exhausts headroom over several
> iterations.
How could this occur? Did AI generate a repro or just
flag the possibility?
ipv6_rpl_sr_hdr.hdrlen << 3 is the size of addresses in the
header and 1 << 3 is the size of ipv6_rpl_sr_hdr itself, which
is pulled into skb_headroom in ipv6_rthdr_rcv().
In ipv6_rpl_srh_rcv(), the number of addresses is calculated
based on ipv6_rpl_sr_hdr.hdrlen, and when hdr->segments_left
is not zero in the "if" below, the new header has exactly the same
size as the old header, so there should be no overflow.
Also, before the "if", ipv6_rpl_srh_rcv() calls
skb_pull(skb, ((hdr->hdrlen + 1) << 3));
and after that,
skb_push(skb, ((chdr->hdrlen + 1) << 3) + sizeof(struct ipv6hdr));
and if we jump to the looped_back label, it calls
skb_pull(skb, sizeof(struct ipv6hdr));
So, I think the same number of bytes is pulled and pushed on each
iteration (except in the segments_left == 0 case), even with local
addresses.
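The pull/push accounting can be modelled with a toy function. This is
only a sketch: headroom_after_pass() and IPV6_HDR_LEN are hypothetical
names, headroom is tracked as a plain integer, and nothing else about
the skb is modelled. The old and new hdrlen values are separate
parameters so the model can show both the balanced case and what would
happen if they differed:

```c
/* Size of the fixed IPv6 header, i.e. sizeof(struct ipv6hdr). */
#define IPV6_HDR_LEN 40

/* Headroom after one pass through the three calls quoted above. */
static long headroom_after_pass(long headroom, unsigned int old_hdrlen,
				unsigned int new_hdrlen)
{
	/* skb_pull() of the old routing header frees headroom. */
	headroom += (long)((old_hdrlen + 1) << 3);
	/* skb_push() of the new routing header plus the IPv6 header. */
	headroom -= (long)((new_hdrlen + 1) << 3) + IPV6_HDR_LEN;
	/* skb_pull() of the IPv6 header at the looped_back label. */
	headroom += IPV6_HDR_LEN;
	return headroom;
}
```

With equal hdrlen values the headroom is unchanged after a pass; in this
model the net change per pass is (old_hdrlen - new_hdrlen) << 3, so it
would shrink only if the new header were larger.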
> When skb_push() lands skb->data exactly at skb->head,
> skb_reset_network_header() stores 0, and skb_mac_header_rebuild()'s
> skb_set_mac_header(skb, -skb->mac_len) computes 0 + (u16)(-14) = 65522.
> The subsequent memmove writes 14 bytes at skb->head + 65522.
>
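The 16-bit wraparound quoted above can be checked in isolation. This
sketch assumes the header offset is held in a 16-bit unsigned field, as
the (u16) cast in the description suggests; mac_header_offset() is a
hypothetical helper, not a kernel function:

```c
#include <stdint.h>

/* Add a negative MAC length to a 16-bit header offset, modulo 2^16. */
static uint16_t mac_header_offset(uint16_t network_header, int mac_len)
{
	return (uint16_t)(network_header + (uint16_t)(-mac_len));
}
```

With network_header = 0 and mac_len = 14 the result is 65522, whereas
any offset of at least 14 gives the expected offset minus 14.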
> Expand the head whenever there is insufficient room for the push, not
> only on the final segment.
>
> Cc: "David S. Miller" <davem@davemloft.net>
> Cc: David Ahern <dsahern@kernel.org>
> Cc: Eric Dumazet <edumazet@google.com>
> Cc: Jakub Kicinski <kuba@kernel.org>
> Cc: Paolo Abeni <pabeni@redhat.com>
> Cc: Simon Horman <horms@kernel.org>
> Reported-by: Anthropic
> Cc: stable <stable@kernel.org>
> Assisted-by: gkh_clanker_t1000
> Signed-off-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> ---
> net/ipv6/exthdrs.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/net/ipv6/exthdrs.c b/net/ipv6/exthdrs.c
> index 95558fd6f447..d866ab011e0a 100644
> --- a/net/ipv6/exthdrs.c
> +++ b/net/ipv6/exthdrs.c
> @@ -592,7 +592,9 @@ static int ipv6_rpl_srh_rcv(struct sk_buff *skb)
> skb_pull(skb, ((hdr->hdrlen + 1) << 3));
> skb_postpull_rcsum(skb, oldhdr,
> sizeof(struct ipv6hdr) + ((hdr->hdrlen + 1) << 3));
> - if (unlikely(!hdr->segments_left)) {
> + if (unlikely(!hdr->segments_left ||
> + skb_headroom(skb) < sizeof(struct ipv6hdr) +
> + ((chdr->hdrlen + 1) << 3))) {
> if (pskb_expand_head(skb, sizeof(struct ipv6hdr) + ((chdr->hdrlen + 1) << 3), 0,
> GFP_ATOMIC)) {
> __IP6_INC_STATS(net, ip6_dst_idev(skb_dst(skb)), IPSTATS_MIB_OUTDISCARDS);
> --
> 2.53.0
>